quick_xml

Module de

Serde Deserializer module.

Due to the complexity of the XML standard and the fact that Serde was developed with JSON in mind, not all Serde concepts apply smoothly to XML. As a result, some XML concepts cannot be expressed with Serde derives and may require manual deserialization.

The most notable restriction concerns distinguishing between elements and attributes, because no other format used by serde has such a concept.

Because of that, the mapping is performed on a best-effort basis.

§Mapping XML to Rust types

Type names are never considered when deserializing, so you can name your types as you wish. Other general rules:

  • a struct field name can be represented in XML only as an attribute name or an element name;
  • an enum variant name can be represented in XML only as an attribute name or an element name;
  • the unit struct, the unit type () and a unit enum variant can be deserialized from any valid XML content:
    • attribute and element names;
    • attribute and element values;
    • text or CDATA content (including mixed text and CDATA content).

NOTE: All tests are marked with an ignore option, even though they do compile. This is because rustdoc marks such blocks with an information icon unlike no_run blocks.

§Basics

Each entry below first shows the XML to parse, then the Rust type(s) to use for it.
Content of attributes and text / CDATA content of elements (including mixed text and CDATA content):
<... ...="content" />
<...>content</...>
<...><![CDATA[content]]></...>
<...>text<![CDATA[cdata]]>text</...>

Mixed text / CDATA content represents one logical string, "textcdatatext" in that case.

You can use any type that can be deserialized from an &str, for example:

  • String and &str
  • Cow<str>
  • u32, f32 and other numeric types
  • enums, like
    #[derive(Deserialize)]
    enum Language {
      Rust,
      Cpp,
      #[serde(other)]
      Other,
    }

NOTE: deserialization to non-owned types that borrow from the input, such as &str, is possible only if all of the following hold: the document is parsed in the UTF-8 encoding; the content does not contain entity references such as &amp; or character references such as &#xD;; and the text content is represented by a single piece of text or a single CDATA element.
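
The borrowing restriction can be pictured with a std-only sketch. The helper unescape_amp below is hypothetical (it is not a quick-xml API) and resolves only &amp; to stay short. When the input contains no reference it can return a borrowed slice; otherwise it has to build a new string, which is why a &str field cannot be filled from such content:

```rust
use std::borrow::Cow;

/// Hypothetical helper, not a quick-xml API: resolves only `&amp;`.
/// Returns borrowed data when nothing has to be rewritten.
fn unescape_amp(raw: &str) -> Cow<'_, str> {
    if raw.contains("&amp;") {
        // The unescaped text differs from the input, so it must be owned.
        Cow::Owned(raw.replace("&amp;", "&"))
    } else {
        // The input can be used as is, so borrowing is possible.
        Cow::Borrowed(raw)
    }
}
```

Under these assumptions, unescape_amp("plain") is a Cow::Borrowed, while unescape_amp("a &amp; b") is a Cow::Owned holding "a & b".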

Content of attributes and text / CDATA content of elements (including mixed text and CDATA content) that represents a space-delimited list, as specified in the XML Schema specification for the xs:list simpleType:

<... ...="element1 element2 ..." />
<...>
  element1
  element2
  ...
</...>
<...><![CDATA[
  element1
  element2
  ...
]]></...>

Use any type that is deserialized using a deserialize_seq() call, for example:

type List = Vec<u32>;

See the next row to learn where in your struct definition you should use that type.

According to the XML Schema specification, list elements are delimited by one or more whitespace characters (' ', '\r', '\n', and '\t').

NOTE: according to the XML Schema restrictions, you cannot escape those whitespace characters, so list elements will never contain them. In practice you will usually use xs:list for lists of numbers or enumerated values that look like identifiers in many languages, for example, item, some_item or some-item, so that should not be a problem.

NOTE: according to the XML Schema specification, list elements can be delimited only by spaces. Other delimiters (for example, commas) are not allowed.
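
The delimiter rules above can be sketched with the standard library alone. This is only an illustration of the splitting rule, not quick-xml's implementation; the function names split_xs_list and parse_u32_list are made up for this example:

```rust
/// Splits the content of an xs:list on the four whitespace characters
/// named above; runs of delimiters never produce empty items.
fn split_xs_list(content: &str) -> Vec<&str> {
    content
        .split(|c: char| matches!(c, ' ' | '\t' | '\r' | '\n'))
        .filter(|item| !item.is_empty())
        .collect()
}

/// Parses every item of the list as a `u32`.
fn parse_u32_list(content: &str) -> Result<Vec<u32>, std::num::ParseIntError> {
    split_xs_list(content)
        .into_iter()
        .map(|item| item.parse::<u32>())
        .collect()
}
```

With these helpers, "1 2\n3" yields three numbers, while "1,2,3" is treated as a single item and fails to parse as a u32, because a comma is not a delimiter.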

A typical XML with attributes. The root tag name does not matter:
<any-tag one="..." two="..."/>

A structure where each XML attribute is mapped to a field with a name starting with @. Because Rust identifiers do not permit the @ character, you should use the #[serde(rename = "@...")] attribute to rename it. The name of the struct itself does not matter:

// Get both attributes
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@one")]
  one: T,

  #[serde(rename = "@two")]
  two: U,
}
// Get only the one attribute, ignore the other
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@one")]
  one: T,
}
// Ignore all attributes
// You can also use the `()` type (unit type)
#[derive(Deserialize)]
struct AnyName;

All these structs can be used to deserialize the XML above, depending on the amount of information that you want to get. Of course, you can combine them with element extractor structs (see below).

NOTE: XML allows you to have an attribute and an element with the same name inside the same element. quick-xml deals with that by prepending a @ prefix to the names of attributes.

A typical XML with child elements. The root tag name does not matter:
<any-tag>
  <one>...</one>
  <two>...</two>
</any-tag>
A structure where each XML child element is mapped to a field. Each element name becomes the name of a field. The name of the struct itself does not matter:
// Get both elements
#[derive(Deserialize)]
struct AnyName {
  one: T,
  two: U,
}
// Get only the one element, ignore the other
#[derive(Deserialize)]
struct AnyName {
  one: T,
}
// Ignore all elements
// You can also use the `()` type (unit type)
#[derive(Deserialize)]
struct AnyName;

All these structs can be used to deserialize the XML above, depending on the amount of information that you want to get. Of course, you can combine them with attribute extractor structs (see above).

NOTE: XML allows you to have an attribute and an element with the same name inside the same element. quick-xml deals with that by prepending a @ prefix to the names of attributes.

An XML with an attribute and a child element that have the same name:
<any-tag field="...">
  <field>...</field>
</any-tag>

You MUST specify #[serde(rename = "@field")] on a field that will be used for an attribute:

#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@field")]
  attribute: T,
  field: U,
}
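
Why the @ prefix removes the ambiguity can be shown with a simplified std-only model. The function keys_for is hypothetical and collects keys into a HashMap purely for illustration; the real deserializer streams events instead of building a map:

```rust
use std::collections::HashMap;

/// Hypothetical model of the keys that a struct sees for
/// `<any-tag field="a"><field>b</field></any-tag>`.
fn keys_for(attributes: &[(&str, &str)], elements: &[(&str, &str)]) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for (name, value) in attributes {
        // Attribute names get the `@` prefix.
        map.insert(format!("@{name}"), value.to_string());
    }
    for (name, value) in elements {
        // Element names are used unchanged.
        map.insert(name.to_string(), value.to_string());
    }
    map
}
```

In this model the attribute ends up under the key "@field" and the element under "field", so both values survive side by side.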

§Optional attributes and elements

Each entry below first shows the XML to parse, then the Rust type(s) to use for it.
An optional XML attribute that you want to capture. The root tag name does not matter:
<any-tag optional="..."/>
<any-tag/>

A structure with an optional field, renamed according to the requirements for attributes:

#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@optional")]
  optional: Option<T>,
}

When the XML attribute is present, type T will be deserialized from the attribute value (which is a string). Note that if T is String or another string type, an empty attribute is mapped to Some(""), whereas None represents a missing attribute:

<any-tag optional="..."/><!-- Some("...") -->
<any-tag optional=""/>   <!-- Some("") -->
<any-tag/>               <!-- None -->
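
The three cases above can be modelled with a plain lookup. The function attribute below is a hypothetical std-only stand-in that works on already parsed name/value pairs; it is not how quick-xml reads attributes:

```rust
/// Hypothetical lookup over already parsed attributes, not a quick-xml API.
fn attribute<'a>(attributes: &[(&str, &'a str)], name: &str) -> Option<&'a str> {
    attributes
        .iter()
        .find(|(key, _)| *key == name)
        .map(|(_, value)| *value)
}
```

An attribute that is present but empty gives Some(""), and only an absent attribute gives None, mirroring the mapping described above for string types.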
An optional XML element that you want to capture. The root tag name does not matter:
<any-tag>
  <optional>...</optional>
</any-tag>
<any-tag>
  <optional/>
</any-tag>
<any-tag/>

A structure with an optional field:

#[derive(Deserialize)]
struct AnyName {
  optional: Option<T>,
}

When the XML element is present, type T will be deserialized from an element (which is a string or a multi-mapping, i.e. a mapping which can have duplicated keys).

Currently some edge cases exist; they are described in issue #497.

§Choices (xs:choice XML Schema type)

Each entry below first shows the XML to parse, then the Rust type(s) to use for it.
An XML with different root tag names, as well as text / CDATA content:
<one field1="...">...</one>
<two>
  <field2>...</field2>
</two>
Text <![CDATA[or (mixed)
CDATA]]> content

An enum where each variant has the name of a possible root tag. The name of the enum itself does not matter.

If you need to get the textual content, mark a variant with #[serde(rename = "$text")].

All these enums can be used to deserialize any of the XML documents above, depending on the amount of information that you want to get:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum AnyName {
  One { #[serde(rename = "@field1")] field1: T },
  Two { field2: U },

  /// Use a unit variant if you do not care about the content.
  /// You can use a tuple variant if you want to parse
  /// textual content as an xs:list.
  /// Struct variants are not supported and will return
  /// Err(Unsupported)
  #[serde(rename = "$text")]
  Text(String),
}
#[derive(Deserialize)]
struct Two {
  field2: T,
}
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum AnyName {
  // `field1` content discarded
  One,
  Two(Two),
  #[serde(rename = "$text")]
  Text,
}
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum AnyName {
  One,
  // the <two> and textual content will be mapped to this
  #[serde(other)]
  Other,
}

NOTE: You should have variants for all possible tag names in your enum or have a #[serde(other)] variant.

An <xs:choice> embedded in another element, where you also want access to attributes that can appear in the same container (<any-tag>). Put differently, you want to choose the Rust enum variant based on a tag name:

<any-tag field="...">
  <one>...</one>
</any-tag>
<any-tag field="...">
  <two>...</two>
</any-tag>
<any-tag field="...">
  Text <![CDATA[or (mixed)
  CDATA]]> content
</any-tag>

A structure with a field whose type is an enum.

If you need to get the textual content, mark a variant with #[serde(rename = "$text")].

The names of the enum, the struct, and the struct field of type Choice do not matter:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,

  /// Use a unit variant if you do not care about the content.
  /// You can use a tuple variant if you want to parse
  /// textual content as an xs:list.
  /// Struct variants are not supported and will return
  /// Err(Unsupported)
  #[serde(rename = "$text")]
  Text(String),
}
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@field")]
  field: T,

  #[serde(rename = "$value")]
  any_name: Choice,
}

An <xs:choice> embedded in another element, where you also want access to other elements that can appear in the same container (<any-tag>). Put differently, you want to choose the Rust enum variant based on a tag name:

<any-tag>
  <field>...</field>
  <one>...</one>
</any-tag>
<any-tag>
  <two>...</two>
  <field>...</field>
</any-tag>

A structure with a field whose type is an enum.

The names of the enum, the struct, and the struct field of type Choice do not matter:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
}
#[derive(Deserialize)]
struct AnyName {
  field: T,

  #[serde(rename = "$value")]
  any_name: Choice,
}

NOTE: if your Choice enum contains a #[serde(other)] variant, the <field> element will be mapped to the field and not to the enum variant.

An <xs:choice> encapsulated in another element with a fixed name:

<any-tag field="...">
  <choice>
    <one>...</one>
  </choice>
</any-tag>
<any-tag field="...">
  <choice>
    <two>...</two>
  </choice>
</any-tag>

A structure with a field of an intermediate type that has one field of an enum type. Strictly speaking, this example is not necessary, because you can construct it yourself using the composition rules described above. However, the XML construction described here is very common, so it is shown explicitly.

The names of the enum and the structs do not matter:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
}
#[derive(Deserialize)]
struct Holder {
  #[serde(rename = "$value")]
  any_name: Choice,
}
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@field")]
  field: T,

  choice: Holder,
}

An <xs:choice> encapsulated in another element with a fixed name:

<any-tag>
  <field>...</field>
  <choice>
    <one>...</one>
  </choice>
</any-tag>
<any-tag>
  <choice>
    <two>...</two>
  </choice>
  <field>...</field>
</any-tag>

A structure with a field of an intermediate type that has one field of an enum type. Strictly speaking, this example is not necessary, because you can construct it yourself using the composition rules described above. However, the XML construction described here is very common, so it is shown explicitly.

The names of the enum and the structs do not matter:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
}
#[derive(Deserialize)]
struct Holder {
  #[serde(rename = "$value")]
  any_name: Choice,
}
#[derive(Deserialize)]
struct AnyName {
  field: T,

  choice: Holder,
}

§Sequences (xs:all and xs:sequence XML Schema types)

Each entry below first shows the XML to parse, then the Rust type(s) to use for it.
A sequence inside of a tag without a dedicated name:
<any-tag/>
<any-tag>
  <item/>
</any-tag>
<any-tag>
  <item/>
  <item/>
  <item/>
</any-tag>

A structure with a field which is a sequence type, for example, Vec. Because XML syntax does not distinguish between an empty sequence and a missing element, you have to indicate that on the Rust side, because serde requires that the field item exists. You can do that in two possible ways:

Use the #[serde(default)] attribute for a field or the entire struct:

#[derive(Deserialize)]
struct AnyName {
  #[serde(default)]
  item: Vec<Item>,
}

Use Option. In that case the inner array will always contain at least one element after deserialization:

#[derive(Deserialize)]
struct AnyName {
  item: Option<Vec<Item>>,
}

See also Frequently Used Patterns.

A sequence with a strict order, possibly with mixed content (text / CDATA and tags):
<one>...</one>
text
<![CDATA[cdata]]>
<two>...</two>
<one>...</one>

NOTE: this is just an example for showing mapping. XML does not allow multiple root tags – you should wrap the sequence into a tag.

All elements are mapped to a heterogeneous sequential type: a tuple or a named tuple. Each element of the tuple should be deserializable from the nested element content (...), except for enum types, which are deserialized from the full element (<one>...</one>) so that they can use the element name to choose the right variant:

// Alternative 1: a named tuple
type One = ...;
type Two = ...;
#[derive(Deserialize)]
struct AnyName(One, String, Two, One);

// Alternative 2: a plain tuple; the enum picks its variant by element name
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
}
type Two = ...;
type AnyName = (Choice, String, Two, Choice);

NOTE: consecutive text and CDATA nodes are merged into one text node, so you cannot have two adjacent string types in your sequence.
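
The merging rule can be illustrated with a small std-only model. The Node and Item enums and the merge_text function are hypothetical names for this sketch and do not correspond to quick-xml types; whitespace handling is deliberately left out:

```rust
/// Hypothetical input nodes, as they appear in the document.
enum Node {
    Element(String),
    Text(String),
    CData(String),
}

/// Hypothetical items that a sequence gets to see after merging.
#[derive(Debug, PartialEq)]
enum Item {
    Element(String),
    Text(String),
}

/// Concatenates adjacent text and CDATA nodes into one text item.
fn merge_text(nodes: Vec<Node>) -> Vec<Item> {
    let mut items: Vec<Item> = Vec::new();
    for node in nodes {
        match node {
            Node::Element(name) => items.push(Item::Element(name)),
            Node::Text(content) | Node::CData(content) => match items.last_mut() {
                // The previous item is text as well: extend it.
                Some(Item::Text(previous)) => previous.push_str(&content),
                // Otherwise start a new text item.
                _ => items.push(Item::Text(content)),
            },
        }
    }
    items
}
```

In this model a text node followed by a CDATA node always produces a single Item::Text, so a tuple such as (String, String) could never receive two separate strings from them.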

NOTE: If the elements of the list may be interleaved with elements that do not belong to the list, you should enable the overlapped-lists feature.

A sequence with a non-strict order, possibly with mixed content (text / CDATA and tags):
<one>...</one>
text
<![CDATA[cdata]]>
<two>...</two>
<one>...</one>

NOTE: this is just an example for showing mapping. XML does not allow multiple root tags – you should wrap the sequence into a tag.

A homogeneous sequence of elements with a fixed or dynamic size:
// Alternative 1: a fixed-size array
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
  #[serde(other)]
  Other,
}
type AnyName = [Choice; 4];

// Alternative 2: a dynamically sized Vec that also captures text
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
  #[serde(rename = "$text")]
  Other(String),
}
type AnyName = Vec<Choice>;

NOTE: consecutive text and CDATA nodes are merged into one text node, so you cannot have two adjacent string types in your sequence.

A sequence with a strict order, possibly with mixed content (text and tags), inside another element:
<any-tag attribute="...">
  <one>...</one>
  text
  <![CDATA[cdata]]>
  <two>...</two>
  <one>...</one>
</any-tag>

A structure where all child elements are mapped to one field that has a heterogeneous sequential type: a tuple or a named tuple. Each element of the tuple should be deserializable from the full element (<one>...</one>).

You MUST specify #[serde(rename = "$value")] on that field:

type One = ...;
type Two = ...;

#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@attribute")]
  attribute: ...,
  // Not (yet?) supported by serde
  // https://github.com/serde-rs/serde/issues/1905
  // #[serde(flatten)]
  #[serde(rename = "$value")]
  any_name: (One, String, Two, One),
}
type One = ...;
type Two = ...;

#[derive(Deserialize)]
struct NamedTuple(One, String, Two, One);

#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@attribute")]
  attribute: ...,
  // Not (yet?) supported by serde
  // https://github.com/serde-rs/serde/issues/1905
  // #[serde(flatten)]
  #[serde(rename = "$value")]
  any_name: NamedTuple,
}

NOTE: consecutive text and CDATA nodes are merged into one text node, so you cannot have two adjacent string types in your sequence.

A sequence with a non-strict order, possibly with mixed content (text / CDATA and tags), inside another element:
<any-tag>
  <one>...</one>
  text
  <![CDATA[cdata]]>
  <two>...</two>
  <one>...</one>
</any-tag>

A structure where all child elements are mapped to one field that has a homogeneous sequential type: an array-like container. The item type T of the container should be deserializable from the nested element content (...), unless it is an enum type, which is deserialized from the full element (<one>...</one>).

You MUST specify #[serde(rename = "$value")] on that field:

#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
  #[serde(rename = "$text")]
  Other(String),
}
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@attribute")]
  attribute: ...,
  // Not (yet?) supported by serde
  // https://github.com/serde-rs/serde/issues/1905
  // #[serde(flatten)]
  #[serde(rename = "$value")]
  any_name: [Choice; 4],
}
#[derive(Deserialize)]
#[serde(rename_all = "snake_case")]
enum Choice {
  One,
  Two,
  #[serde(rename = "$text")]
  Other(String),
}
#[derive(Deserialize)]
struct AnyName {
  #[serde(rename = "@attribute")]
  attribute: ...,
  // Not (yet?) supported by serde
  // https://github.com/serde-rs/serde/issues/1905
  // #[serde(flatten)]
  #[serde(rename = "$value")]
  any_name: Vec<Choice>,
}

NOTE: consecutive text and CDATA nodes are merged into one text node, so you cannot have two adjacent string types in your sequence.

§Composition Rules

The XML format is very different from other formats supported by serde. One such difference is how data in the serialized form relates to the Rust type. Usually each byte in the data can be associated with only one field in the data structure. However, XML is an exception.

For example, take this XML:

<any>
  <key attr="value"/>
</any>

and try to deserialize it to the struct AnyName:

#[derive(Deserialize)]
struct AnyName { // AnyName calls `deserialize_struct` on `<any><key attr="value"/></any>`
                 //                         Used data:          ^^^^^^^^^^^^^^^^^^^
  key: Inner,    // Inner   calls `deserialize_struct` on `<key attr="value"/>`
                 //                         Used data:          ^^^^^^^^^^^^
}
#[derive(Deserialize)]
struct Inner {
  #[serde(rename = "@attr")]
  attr: String,  // String  calls `deserialize_string` on `value`
                 //                         Used data:     ^^^^^
}

The comments show which Deserializer method is called by each struct's deserialize method and which input it sees. "Used data" shows which content is actually used for deserializing. As you can see, the inner <key> tag is used both as a map key / outer struct field name and as part of the inner struct (although the tag name itself, i.e. key, is not used by the inner struct).

§Enum Representations

quick-xml represents enums differently in normal fields, $text fields and $value fields. The normal representation is compatible with serde’s adjacently and internally tagged enums: the tag for adjacently and internally tagged enums is serialized using Serializer::serialize_unit_variant and deserialized using Deserializer::deserialize_enum.

Use these simple rules to remember how an enum is represented in XML:

  • In a $value field the representation is always the same as the top-level representation;
  • In a $text field the representation is always the same as in a normal field, but the surrounding tags with the field name are removed;
  • In a normal field the representation always contains a tag with the field name.
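
For a unit variant, the three rules can be written down as a small std-only function. This is a sketch of the resulting text only, under the assumption of a variant that is not renamed; Position and unit_variant_xml are names invented for this example and are not part of quick-xml:

```rust
/// Where the enum value sits; the names are made up for this sketch.
enum Position<'a> {
    /// Top level or a `$value` field.
    Value,
    /// A normal field with the given name.
    Normal(&'a str),
    /// A `$text` field.
    Text,
}

/// Renders a unit variant according to the three rules above.
fn unit_variant_xml(variant: &str, position: Position<'_>) -> String {
    match position {
        // The variant name becomes the tag name.
        Position::Value => format!("<{variant}/>"),
        // The field name becomes the tag name, the variant name its content.
        Position::Normal(field) => format!("<{field}>{variant}</{field}>"),
        // Same as a normal field, with the surrounding tags removed.
        Position::Text => variant.to_string(),
    }
}
```

For a variant named Unit in a field named field, this gives <Unit/>, <field>Unit</field> and Unit respectively.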

§Normal enum variant

To model an xs:choice XML construct, use a $value field. To model a top-level xs:choice, just use the enum type.

Kind    | Top-level and in $value field           | In normal field     | In $text field
--------|-----------------------------------------|---------------------|-----------------
Unit    | <Unit/>                                 | <field>Unit</field> | Unit
Newtype | <Newtype>42</Newtype>                   | Err(Unsupported)    | Err(Unsupported)
Tuple   | <Tuple>42</Tuple><Tuple>answer</Tuple>  | Err(Unsupported)    | Err(Unsupported)
Struct  | <Struct><q>42</q><a>answer</a></Struct> | Err(Unsupported)    | Err(Unsupported)

§$text enum variant

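Kind    | Top-level and in $value field | In normal field      | In $text field
--------|-------------------------------|----------------------|---------------------
Unit    | (empty)                       | <field/>             | (empty)
Newtype | 42                            | Err(Unsupported) [1] | Err(Unsupported) [2]
Tuple   | 42 answer                     | Err(Unsupported) [3] | Err(Unsupported) [4]
Struct  | Err(Unsupported)              | Err(Unsupported)     | Err(Unsupported)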

§Difference between $text and $value special names

quick-xml supports two special names for fields: $text and $value. Although they may seem the same, there is a distinction. Two different names are required mostly for serialization, because quick-xml needs to know how you want to serialize certain constructs that could be represented in XML in several different ways.

The only difference is in how complex types and sequences are serialized. If in doubt about which one to choose, begin with $value.

§$text

$text is used when you want to write your XML as text or CDATA content. More formally, a field with that name represents a simple type definition with {variety} = atomic or {variety} = union whose basic members are all atomic, as described in the specification.

As a result, not all types of such fields can be serialized. Only the following types are supported:

  • all primitive types (strings, numbers, booleans)
  • unit variants of enumerations (serializes to a name of a variant)
  • newtypes (delegates serialization to inner type)
  • Option of above (None serializes to nothing)
  • sequences (including tuples and tuple variants of enumerations) of above, excluding None and empty string elements (because it will not be possible to deserialize them back). The elements are separated by space(s)
  • unit type () and unit structs (serializes to nothing)

Complex types, such as structs and maps, are not supported in this field. If you want them, you should use $value.

Sequences are serialized to a space-delimited string, which is why only certain types are allowed in this mode:

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct AnyName {
    #[serde(rename = "$text")]
    field: Vec<usize>,
}

let obj = AnyName { field: vec![1, 2, 3] };
let xml = to_string(&obj).unwrap();
assert_eq!(xml, "<AnyName>1 2 3</AnyName>");

let object: AnyName = from_str(&xml).unwrap();
assert_eq!(object, obj);

§$value

NOTE: the name #content would better explain the purpose of that field, but $value is used for compatibility with other XML serde crates, which use that name. This allows you to switch XML crates more smoothly if required.

The representation of primitive types in a $value field does not differ from their representation in a $text field. The difference is in how sequences are serialized. $value serializes each sequence item as a separate XML element. The name of that element is taken from the serialized type, and because only enums provide such a name (their variant name), only they should be used for such fields.

$value fields do not support struct types with fields; serialization of such types ends with an Err(Unsupported). Unit structs and the unit type () serialize to nothing and can be deserialized from any content.

Serialization and deserialization of a $value field are performed as usual, except that the name of the XML element is given by the serialized type instead of the field. This allows you to serialize enumerated types, where the variant is encoded as a tag name, and thus to represent an XSD xs:choice schema by a Rust enum.

In the example below, field will be serialized as <field>A</field>, <field>B</field> or <field>C</field>, because the element gets its name from the field name. It cannot be deserialized from the elements <A/>, <B/> or <C/> that model an xs:choice, because AnyName looks only for <field>:

#[derive(Deserialize, Serialize)]
enum Enum { A, B, C }

#[derive(Deserialize, Serialize)]
struct AnyName {
    // <field>A</field>, <field>B</field>, or <field>C</field>
    field: Enum,
}

If you rename field to $value, then field will be serialized as <A/>, <B/> or <C/>, depending on its content. It is also possible to deserialize it from the same elements:

#[derive(Deserialize, Serialize)]
struct AnyName {
    // <A/>, <B/> or <C/>
    #[serde(rename = "$value")]
    field: Enum,
}

§Primitives and sequences of primitives

Sequences are serialized to a list of elements. Note that types that do not produce their own tag (i.e. primitives) are written as is, without delimiters:

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct AnyName {
    #[serde(rename = "$value")]
    field: Vec<usize>,
}

let obj = AnyName { field: vec![1, 2, 3] };
let xml = to_string(&obj).unwrap();
// Note that types that do not produce their own tag are written as is!
assert_eq!(xml, "<AnyName>123</AnyName>");

let object: AnyName = from_str("<AnyName>123</AnyName>").unwrap();
assert_eq!(object, AnyName { field: vec![123] });

// `1 2 3` is mapped to a single `usize` element
// It is impossible to deserialize list of primitives to such field
from_str::<AnyName>("<AnyName>1 2 3</AnyName>").unwrap_err();
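
The difference between the two special names for a sequence of numbers can be reproduced with the standard library alone. The functions below are a simplified model with made-up names, not quick-xml code; they only mimic the textual result shown in the examples for $text and $value:

```rust
/// `$text`: items are joined with a single space.
fn text_content(items: &[usize]) -> String {
    items
        .iter()
        .map(|n| n.to_string())
        .collect::<Vec<_>>()
        .join(" ")
}

/// `$value`: items without their own tag are written back to back.
fn value_content(items: &[usize]) -> String {
    items.iter().map(|n| n.to_string()).collect()
}

/// Reads a whitespace-delimited list back into numbers.
fn read_list(content: &str) -> Vec<usize> {
    content
        .split_whitespace()
        .filter_map(|item| item.parse().ok())
        .collect()
}
```

In this model text_content(&[1, 2, 3]) produces "1 2 3", which reads back as three items, while value_content(&[1, 2, 3]) produces "123", which reads back as the single number 123. That is the reason a list of primitives does not round-trip through a $value field.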

A particular case of that example is a string $value field, which is probably the most common use of that name:

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct AnyName {
    #[serde(rename = "$value")]
    field: String,
}

let obj = AnyName { field: "content".to_string() };
let xml = to_string(&obj).unwrap();
assert_eq!(xml, "<AnyName>content</AnyName>");

§Structs and sequences of structs

Note that structures do not have a serializable name either (the name of the type is never used), so it is impossible to serialize a non-unit struct or a sequence of non-unit structs in a $value field. Unit structs (and sequences of them) are serialized as an empty string, because units themselves serialize to nothing:

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct Unit;

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct AnyName {
    // #[serde(default)] is required for deserialization of empty lists
    // This is a general note, not related to $value
    #[serde(rename = "$value", default)]
    field: Vec<Unit>,
}

let obj = AnyName { field: vec![Unit, Unit, Unit] };
let xml = to_string(&obj).unwrap();
assert_eq!(xml, "<AnyName/>");

let object: AnyName = from_str("<AnyName/>").unwrap();
assert_eq!(object, AnyName { field: vec![] });

let object: AnyName = from_str("<AnyName></AnyName>").unwrap();
assert_eq!(object, AnyName { field: vec![] });

let object: AnyName = from_str("<AnyName><A/><B/><C/></AnyName>").unwrap();
assert_eq!(object, AnyName { field: vec![Unit, Unit, Unit] });

§Enums and sequences of enums

Enumerations use the variant name as an element name:

#[derive(Deserialize, Serialize, PartialEq, Debug)]
struct AnyName {
    #[serde(rename = "$value")]
    field: Vec<Enum>,
}

#[derive(Deserialize, Serialize, PartialEq, Debug)]
enum Enum { A, B, C }

let obj = AnyName { field: vec![Enum::A, Enum::B, Enum::C] };
let xml = to_string(&obj).unwrap();
assert_eq!(
    xml,
    "<AnyName>\
        <A/>\
        <B/>\
        <C/>\
     </AnyName>"
);

let object: AnyName = from_str(&xml).unwrap();
assert_eq!(object, obj);

You can have either a $text or a $value field in your structs. Unfortunately, that is not enforced, so you can theoretically have both, but you should avoid that.

§Frequently Used Patterns

Some XML constructs are used so frequently that it is worth documenting the recommended way to represent them in Rust. The sections below describe them.

§<element> lists

Many XML formats wrap lists of elements in an additional container, although this is not required by the XML rules:

<root>
  <field1/>
  <field2/>
  <list><!-- Container -->
    <element/>
    <element/>
    <element/>
  </list>
  <field3/>
</root>

In this case, it is tempting to describe this XML in this way:

/// Represents <element/>
type Element = ();

/// Represents <root>...</root>
struct AnyName {
    // Incorrect
    list: Vec<Element>,
}

This will not work, because the <list> element can potentially have attributes and other elements inside. You should define the struct for the <list> explicitly, as you would in the XSD for that XML:

/// Represents <element/>
type Element = ();

/// Represents <root>...</root>
struct AnyName {
    // Correct
    list: List,
}
/// Represents <list>...</list>
struct List {
    element: Vec<Element>,
}

If you want to simplify your API, you could write a simple function for unwrapping inner list and apply it via deserialize_with:

use quick_xml::de::from_str;
use serde::{Deserialize, Deserializer};

/// Represents <element/>
type Element = ();

/// Represents <root>...</root>
#[derive(Deserialize, Debug, PartialEq)]
struct AnyName {
    #[serde(deserialize_with = "unwrap_list")]
    list: Vec<Element>,
}

fn unwrap_list<'de, D>(deserializer: D) -> Result<Vec<Element>, D::Error>
where
    D: Deserializer<'de>,
{
    /// Represents <list>...</list>
    #[derive(Deserialize)]
    struct List {
        // default allows empty list
        #[serde(default)]
        element: Vec<Element>,
    }
    Ok(List::deserialize(deserializer)?.element)
}

assert_eq!(
    AnyName { list: vec![(), (), ()] },
    from_str("
        <root>
          <list>
            <element/>
            <element/>
            <element/>
          </list>
        </root>
    ").unwrap(),
);

Instead of writing such functions manually, you could also try https://lib.rs/crates/serde-query.

§Overlapped (Out-of-Order) Elements

If the elements of the list may be interleaved with elements that do not belong to the list (this is a usual case in XML documents), like this:

<any-name>
  <item/>
  <another-item/>
  <item/>
  <item/>
</any-name>

you should enable the overlapped-lists feature to make it possible to deserialize this to:

#[derive(Deserialize)]
#[serde(rename_all = "kebab-case")]
struct AnyName {
    item: Vec<()>,
    another_item: (),
}
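
Whether a document has this shape can be checked with a short std-only function. This is only a sketch for recognising the pattern described above, with the invented name has_overlapped_list; it says nothing about how the overlapped-lists feature is implemented:

```rust
use std::collections::HashSet;

/// Returns true if some element name occurs in two runs that are
/// separated by elements with a different name.
fn has_overlapped_list(children: &[&str]) -> bool {
    let mut finished: HashSet<&str> = HashSet::new();
    let mut current: Option<&str> = None;
    for &name in children {
        if current == Some(name) {
            // Still inside the same run of equally named elements.
            continue;
        }
        if let Some(previous) = current {
            // The previous run has ended.
            finished.insert(previous);
        }
        if finished.contains(name) {
            // This name already had a run earlier: the list is interrupted.
            return true;
        }
        current = Some(name);
    }
    false
}
```

For the children item, another-item, item, item from the XML above, the function returns true; for item, item, another-item it returns false.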

§Internally Tagged Enums

Tagged enums are currently not supported because of an issue in the Serde design (see serde#1183 and quick-xml#586) and missing optimizations in Serde which could be useful for XML parsing (serde#1495). This can be worked around by manually implementing deserialization with #[serde(deserialize_with = "func")] or by implementing Deserialize, but this can get tedious very quickly for files with a large number of tagged enums. To help with this issue, quick-xml provides the macro impl_deserialize_for_internally_tagged_enum!. See the macro documentation for details.


  1. If this were serialized as <field>42</field>, deserialization would be ambiguous, because it would clash with the Unit representation in a normal field.

  2. If this were serialized as 42, deserialization would be ambiguous, because it would clash with the Unit representation in a $text field.

  3. If this were serialized as <field>42 answer</field>, deserialization would be ambiguous, because it would clash with the Unit representation in a normal field.

  4. If this were serialized as 42 answer, deserialization would be ambiguous, because it would clash with the Unit representation in a $text field.

Structs§

  • A structure that deserializes XML into Rust values.
  • XML input source that reads from a std::io input stream.
  • An EntityResolver that does nothing and always returns None.
  • XML input source that reads from a slice of bytes and can borrow from it.
  • Decoded and concatenated content of consecutive Text and CData events. Consecutive means that the events follow each other or are delimited only by (any number of) Comment or PI events.

Enums§

  • (De)serialization error
  • Simplified event which contains only those variants that are used by the deserializer
  • Simplified event which contains only those variants that are used by the deserializer, but with Text events not yet fully processed.

Traits§

  • Used to resolve unknown entities while parsing
  • Trait used by the deserializer for iterating over input. This is manually “specialized” for iterating over &[u8].

Functions§

  • Deserialize from a reader. This method will make internal copies of the data read from the reader. If you have a &str input and want to borrow as much as possible, use from_str.
  • Deserialize an instance of type T from a string of XML text.