Skip to content

Add smithy-xml package#655

Open
arandito wants to merge 1 commit intosmithy-lang:feat/add-query-protocolfrom
arandito:add-smithy-xml
Open

Add smithy-xml package#655
arandito wants to merge 1 commit intosmithy-lang:feat/add-query-protocolfrom
arandito:add-smithy-xml

Conversation

@arandito
Copy link
Contributor

@arandito arandito commented Mar 11, 2026

Description

This PR adds an initial implementation of the new smithy-xml runtime package, which provides generic XML serialization and deserialization support for Smithy clients and servers.

This work primarily supports the XML deserialization path needed by the awsQuery protocol. The serializer is also included as an initial implementation but will not be used immediately. If necessary, we could omit the serializer for now and introduce it later alongside the restXml protocol, where XML request serialization is required and the behavior can be validated more thoroughly.

The implementation follows the XML binding rules defined in the Smithy awsQuery and restXml protocol specifications, including support for the following traits: @xmlName, @xmlAttribute, @xmlFlattened, and @xmlNamespace (defined in smithy-core).

Important

This implementation follows the Smithy Python Protocol Serialization and Deserialization design. I strongly recommend reviewing this document first to better understand the implementation.

Testing

Added round trip serde test cases that follow the same testing strategy as smithy-json.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@arandito arandito requested a review from a team as a code owner March 11, 2026 16:53
@arandito
Copy link
Contributor Author

arandito commented Mar 12, 2026

Reviewer Note

The current XML serializer builds a DOM tree in memory and only writes it to the output sink when serializer.flush() is called. This means any serializer.write_*() call must be followed by flush(). Our current HTTPRequestSerializer does not call flush() today because the smithy-json serializer streams directly to the sink instead of buffering. To support XML serialization, the generic ShapeSerializer must now be flushed at all call sites.

I explored implementing a streaming XML writer to avoid this but ran into an issue with XML attributes. In XML, members marked with @xmlAttribute must appear on the structure’s opening tag (e.g. <OperationInput attr="...">). In Smithy, however, attributes are modeled as normal structure members, so they may appear anywhere in the member list during serialization. As a result, the opening tag cannot be written until all attribute values are known, which effectively requires delaying writes until all members have been visited.

One possible solution is a two-pass approach: first serialize attribute members normally while sending others to a null serializer, then serialize non-attribute members while sending attribute members to a null serializer. This allows the opening tag and its attributes to be written before child elements. However, this introduces additional complexity compared to the current approach without a substantial benefit.

This creates some asymmetry with the current XML deserializer, which reads incrementally from the byte source rather than a DOM. I am open to switching the serializer to a streaming implementation if parity or avoiding the intermediate DOM is preferred.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant