The XML output method serializes the normalized sequence as an XML entity that MUST satisfy the rules for either a well-formed XML document entity, a well-formed XML external general parsed entity, or both. A serialization error [err:SERE0003] results if the serializer is unable to satisfy those rules, except for content modified by the character expansion phase of serialization, as described in 4 Phases of Serialization. The effects of the character expansion phase could result in the serialized output being not well-formed, but will not result in a serialization error. If a serialization error results, the serializer MUST signal the error.
If the document node of the normalized sequence has a single element node child and no text node children, then the serialized output is a well-formed XML document entity, and the serialized output MUST conform to the appropriate version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1]. If the normalized sequence does not take this form, then the serialized output is a well-formed XML external general parsed entity, which, when referenced within a trivial XML document wrapper like this:
<?xml version="version"?>
<!DOCTYPE doc [
<!ENTITY e SYSTEM "entity-URI">
]>
<doc>&e;</doc>
where entity-URI is a URI for the entity, and the value of the version pseudo-attribute is the value of the version parameter, produces a document which MUST itself be a well-formed XML document conforming to the corresponding version of the XML Namespaces Recommendation [XML Names] or [XML Names 1.1].
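The distinction between the two kinds of output entity can be illustrated with a minimal Python sketch. It is not part of this specification. The function names are hypothetical, and the sketch assumes that the entity carries no text declaration, so that inlining its text inside a doc element is equivalent, for this check only, to referencing it through &e; in the wrapper shown above.

```python
import xml.etree.ElementTree as ET


def is_document_entity(entity_text: str) -> bool:
    """Return True if entity_text parses on its own as an XML document."""
    try:
        ET.fromstring(entity_text)
    except ET.ParseError:
        return False
    return True


def parses_inside_wrapper(entity_text: str) -> bool:
    """Return True if entity_text is well-formed as the content of <doc>.

    Assumption: the entity has no text declaration, so it can be inlined
    instead of being referenced as an external entity.
    """
    try:
        ET.fromstring("<doc>" + entity_text + "</doc>")
    except ET.ParseError:
        return False
    return True
```

Under these assumptions, a sequence with a text node beside two elements, such as intro<a/><b/>, fails the first check but passes the second, which corresponds to the external general parsed entity case.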
[Definition: A reconstructed tree may be constructed by parsing the XML document and converting it into an instance of the data model as specified in [XQuery and XPath Data Model (XDM) 3.1].] The result of serialization MUST be such that the reconstructed tree is the same as the result tree except for the following permitted differences:
If the document was produced by adding a document wrapper, as described above, then it will contain an extra doc element as the document element.
The order of attribute and namespace nodes in the two trees MAY be different.
The following properties of corresponding nodes in the two trees MAY be different:
the base-uri property of document nodes and element nodes;
the document-uri and unparsed-entities properties of document nodes;
the type-name and typed-value properties of element and attribute nodes;
the nilled property of element nodes;
the content property of text nodes, due to the effect of the indent and use-character-maps parameters.
The reconstructed tree MAY contain additional attributes and text nodes resulting from the expansion of default and fixed values in its DTD or schema; also, in the presence of a DTD, non-CDATA attributes may lose whitespace characters as a result of attribute value normalization.
The type annotations of the nodes in the two trees MAY be different. Type annotations in a result tree are discarded when the tree is serialized. Any new type annotations obtained by parsing the document will depend on whether the serialized XML document is assessed against a schema, and this MAY result in type annotations that are different from those in the original result tree.
Note:
In order to influence the type annotations in the instance of the data model that would result from processing a serialized XML document, the author of the XSLT stylesheet, XQuery expression or other process might wish to create the instance of the data model that is input to the serialization process so that it makes use of mechanisms provided by [XML Schema], such as xsi:type and xsi:schemaLocation attributes. The serialization process will not automatically create such attributes in the serialized document if those attributes were not part of the result tree that is to be serialized.
Similarly, it is possible that an element node in the instance of the data model that is to be serialized has the nilled property with the value true, but no xsi:nil attribute. The serialization process will not create such an attribute in the serialized document simply to reflect the value of the property. The value of the nilled property has no direct effect on the serialized result.
Additional namespace nodes MAY be present in the reconstructed tree if the serialization process did not undeclare one or more namespaces, as described in 5.1.8 XML Output Method: the undeclare-prefixes Parameter, and the starting instance of the data model contained an element node with a namespace node that declared some prefix, but a child element of that node did not have any namespace node that declared the same prefix.
The result tree MAY contain namespace nodes that are not present in the reconstructed tree, as the process of creating an instance of the data model MAY ignore namespace declarations in some circumstances. See Section 6.2.3 Construction from an Infoset and Section 6.2.4 Construction from a PSVI of [XQuery and XPath Data Model (XDM) 3.1] for additional information.
If the indent parameter has one of the values yes, true or 1,
additional text nodes consisting of whitespace characters MAY be present in the reconstructed tree; and
text nodes in the result tree that contained only whitespace characters MAY correspond to text nodes in the reconstructed tree that contain additional whitespace characters that were not present in the result tree.
See 5.1.4 XML Output Method: the indent and suppress-indentation Parameters for more information on the indent parameter.
Additional nodes MAY be present in the reconstructed tree due to the effect of character mapping in the character expansion phase, and the values of attribute nodes and text nodes in the reconstructed tree MAY be different from those in the result tree, due to the effects of URI expansion, character mapping and Unicode Normalization in the character expansion phase of serialization.
Note:
The use-character-maps parameter can cause arbitrary characters to be inserted into the serialized XML document in an unescaped form, including characters that would be considered to be part of XML markup. Such characters could result in arbitrary new element nodes, attribute nodes, and so on, in the reconstructed tree that results from processing the serialized XML document.
A consequence of this rule is that certain characters MUST be output as character references, to ensure that they survive the round trip through serialization and parsing. Specifically, CR, NEL and LINE SEPARATOR characters in text nodes MUST be output respectively as "&#xD;", "&#x85;", and "&#x2028;", or their equivalents; while CR, NL, TAB, NEL and LINE SEPARATOR characters in attribute nodes MUST be output respectively as "&#xD;", "&#xA;", "&#x9;", "&#x85;", and "&#x2028;", or their equivalents. In addition, the non-whitespace control characters #x1 through #x1F and #x7F through #x9F in text nodes and attribute nodes MUST be output as character references.
For example, an attribute with the value "x" followed by "y" separated by a newline will result in the output "x&#xA;y" (or with any equivalent character reference). The XML output cannot be "x" followed by a literal newline followed by a "y" because after parsing, the attribute value would be "x y" as a consequence of the XML attribute normalization rules.
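The escaping rules above can be sketched in Python as follows. This is an illustrative sketch, not a normative algorithm: the function name escape and its in_attribute flag are hypothetical, the escaping of markup characters is included only so that the output can be parsed, and the sketch does not check whether a character is permitted at all by the selected version of XML.

```python
# Characters that must become character references in text nodes.
TEXT_ESCAPES = {"\r": "&#xD;", "\u0085": "&#x85;", "\u2028": "&#x2028;"}
# Attribute values additionally need NL and TAB escaped.
ATTR_ESCAPES = {**TEXT_ESCAPES, "\n": "&#xA;", "\t": "&#x9;"}
MARKUP = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}


def _is_control(ch: str) -> bool:
    """Non-whitespace control characters #x1-#x1F and #x7F-#x9F."""
    cp = ord(ch)
    return (0x1 <= cp <= 0x1F and ch not in "\t\n\r") or 0x7F <= cp <= 0x9F


def escape(value: str, in_attribute: bool = False) -> str:
    """Escape a text node or attribute value for round-trip safety."""
    table = ATTR_ESCAPES if in_attribute else TEXT_ESCAPES
    out = []
    for ch in value:
        if ch in table:
            out.append(table[ch])
        elif ch in MARKUP:
            out.append(MARKUP[ch])
        elif ch == '"' and in_attribute:
            out.append("&quot;")
        elif _is_control(ch):
            out.append(f"&#x{ord(ch):X};")
        else:
            out.append(ch)
    return "".join(out)
```

With this sketch, escape("x\ny", in_attribute=True) yields x&#xA;y, while the same string as a text node is left with its literal newline.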
Note:
XML 1.0 did not permit an XML processor to normalize NEL or LINE SEPARATOR characters to a LINE FEED character. However, if a document entity that specifies version 1.1 invokes an external general parsed entity with no text declaration or a text declaration that specifies version 1.0, the external parsed entity is processed according to the rules of XML 1.1. For this reason, NEL and LINE SEPARATOR characters in text and attribute nodes MUST always be escaped using character references, regardless of the value of the version parameter.
XML 1.0 permitted control characters in the range #x7F through #x9F to appear as literal characters in an XML document, but XML 1.1 requires such characters, other than NEL, to be escaped as character references. An external general parsed entity with no text declaration or a text declaration that specifies a version pseudo-attribute with value 1.0 that is invoked by an XML 1.1 document entity MUST follow the rules of XML 1.1. Therefore, the non-whitespace control characters in the ranges #x1 through #x1F and #x7F through #x9F MUST always be escaped, regardless of the value of the version parameter.
It is a serialization error [err:SEPM0004] to specify the doctype-system parameter, or to specify the standalone parameter with a value other than omit, if the instance of the data model contains text nodes or multiple element nodes as children of the root node. The serializer MUST either signal the error, or recover by ignoring the request to output a document type declaration or standalone parameter.
version Parameter
The version parameter specifies the version of XML and the version of Namespaces in XML to be used for outputting the instance of the data model. The version output in the XML declaration (if an XML declaration is not omitted) MUST correspond to the version of XML that the serializer used for outputting the instance of the data model. The value of the version parameter MUST match the VersionNum production of the XML Recommendation [XML10] or [XML11]. A serialization error [err:SESU0013] results if the value of the version parameter specifies a version of XML that is not supported by the serializer; the serializer MUST signal the error.
This document provides the normative definition of serialization for the XML output method if the version parameter has either the value 1.0 or 1.1. For any other value of the version parameter, the behavior is implementation-defined. In that case the implementation-defined behavior MAY supersede all other requirements of this recommendation.
If the serialized result would contain an NCName (as defined in [XML Names]) that contains a character that is not permitted by the version of Namespaces in XML specified by the version parameter, a serialization error [err:SERE0005] results. The serializer MUST signal the error.
If the serialized result would contain a character that is not permitted by the version of XML specified by the version parameter, a serialization error [err:SERE0006] results. The serializer MUST signal the error.
For example, if the version parameter has the value 1.0, and the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F, a serialization error [err:SERE0006] results. If the version parameter has the value 1.1 and a comment node in the instance of the data model contains a non-whitespace control character in the range #x1 to #x1F or a control character other than NEL in the range #x7F to #x9F, a serialization error [err:SERE0006] results.
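The comment-node example above can be sketched in Python. The character ranges follow the Char productions of [XML10] and [XML11] and the RestrictedChar production of [XML11]. The names SerializationError, permitted_literal and check_comment are hypothetical and serve only as an illustration; real serializers are likely to organize these checks differently.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


def permitted_literal(ch: str, version: str) -> bool:
    """True if ch may appear literally (unescaped) in XML of this version."""
    cp = ord(ch)
    upper = 0x20 <= cp <= 0xD7FF or 0xE000 <= cp <= 0xFFFD or 0x10000 <= cp <= 0x10FFFF
    if version == "1.0":
        return cp in (0x9, 0xA, 0xD) or upper
    if version == "1.1":
        # RestrictedChar of XML 1.1: allowed only as character references.
        restricted = (0x1 <= cp <= 0x8 or cp in (0xB, 0xC) or 0xE <= cp <= 0x1F
                      or 0x7F <= cp <= 0x84 or 0x86 <= cp <= 0x9F)
        return (cp in (0x9, 0xA, 0xD) or 0x1 <= cp <= 0x1F or upper) and not restricted
    raise ValueError(f"unsupported version in this sketch: {version}")


def check_comment(text: str, version: str) -> None:
    """Comments cannot hold character references, so every character
    must be permitted literally; otherwise raise SERE0006."""
    for ch in text:
        if not permitted_literal(ch, version):
            raise SerializationError(
                "SERE0006", f"character #x{ord(ch):X} not permitted in XML {version}")
```

The asymmetry between the two versions shows up for #x80, which is accepted literally under version 1.0 but rejected in a comment under version 1.1.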
html-version Parameter
The html-version parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.
encoding Parameter
The encoding parameter specifies the encoding to be used for outputting the instance of the data model. Serializers are REQUIRED to support values of UTF-8 and UTF-16. A serialization error [err:SESU0007] occurs if an output encoding other than UTF-8 or UTF-16 is requested and the serializer does not support that encoding. The serializer MUST signal the error, or recover by using UTF-8 or UTF-16 instead. The serializer MUST NOT use an encoding whose name does not match the EncName production of the XML Recommendation [XML10].
When outputting a newline character in the instance of the data model, the serializer is free to represent it using any character sequence that will be normalized to a newline character by an XML parser, unless a specific mapping for the newline character is provided in a character map (see 11 Character Maps).
When outputting any other character that is defined in the selected encoding, the character MUST be output using the correct representation of that character in the selected encoding.
It is possible that the instance of the data model will contain a character that cannot be represented in the encoding that the serializer is using for output. In this case, if the character occurs in a context where XML recognizes character references (that is, in the value of an attribute node or text node), then the character MUST be output as a character reference. A serialization error [err:SERE0008] occurs if such a character appears in a context where character references are not allowed (for example, if the character occurs in the name of an element). The serializer MUST signal the error.
For example, if a text node contains the character LATIN SMALL LETTER E WITH ACUTE (#xE9), and the value of the encoding parameter is US-ASCII, the character MUST be serialized as a character reference. If a comment node contains the same character, a serialization error [err:SERE0008] results.
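The US-ASCII example can be reproduced with a short Python sketch that relies on the standard xmlcharrefreplace error handler of str.encode. The names SerializationError, encode_text and encode_comment are hypothetical. The sketch handles only the encoding step and assumes that markup characters have already been escaped elsewhere.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


def encode_text(text: str, encoding: str) -> bytes:
    """Encode text node content; unrepresentable characters become
    (decimal) character references."""
    return text.encode(encoding, errors="xmlcharrefreplace")


def encode_comment(text: str, encoding: str) -> bytes:
    """Encode a comment; character references are not recognized here,
    so an unrepresentable character is an error."""
    try:
        return ("<!--" + text + "-->").encode(encoding)
    except UnicodeEncodeError as exc:
        raise SerializationError(
            "SERE0008", "character cannot be represented in " + encoding) from exc
```

Python emits the decimal form of the reference, which is equivalent to the hexadecimal form used elsewhere in this section.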
indent and suppress-indentation Parameters
The indent and suppress-indentation parameters control whether the serializer MAY adjust the whitespace in the serialized result so that a person will find it easier to read. If the indent parameter has one of the values yes, true or 1, the serializer MAY output whitespace characters in addition to the whitespace characters in the instance of the data model. It MAY also elide from the output whitespace characters that occurred in the instance of the data model or replace such whitespace characters with other whitespace characters.
[Definition: The term content has the same meaning as the term content defined in Section 3.1 Start-Tags, End-Tags, and Empty-Element Tags of [XML10].] [Definition: The immediate content of an element is the part of the content of the element that is not also in the content of a child element of that element.]
If the indent parameter has the value no, false or 0, the serializer MUST NOT add, elide or replace whitespace characters. If the indent parameter has one of the values yes, true or 1, the serializer MUST use an algorithm for dealing with whitespace characters that satisfies all of the following constraints. If more than one constraint applies, the serializer MUST apply the most restrictive constraint. That is, if any applicable constraint indicates that whitespace MUST NOT be added, elided or replaced, that constraint prevails; if an applicable constraint indicates that whitespace SHOULD NOT be added, elided or replaced, while all other applicable constraints indicate that whitespace MAY be added, elided or replaced, whitespace SHOULD NOT be added, elided or replaced.
Whitespace characters MAY be added adjacent to a text node only if the text node contains only whitespace characters. Whitespace characters in such a text node MAY also be elided or replaced. For example, a tab MAY be inserted as a replacement for existing spaces.
Whitespace characters MAY be added, elided or replaced in the immediate content of an element whose type annotation is xs:untyped or xs:anyType and that has element node children, in the immediate content of an element whose content model is element only, or outside the content of any element.
Whitespace characters MUST NOT be added, elided or replaced in the immediate content of an element whose content model is known to be simple or empty.
Whitespace characters SHOULD NOT be added, elided or replaced in places where the characters would constitute significant whitespace, for example, in the immediate content of an element that is annotated with a type other than xs:untyped or xs:anyType, and whose content model is known to be mixed.
Whitespace characters MUST NOT be added, elided or replaced in the content of an element whose expanded QName is a member of the list of expanded QNames in the value of the suppress-indentation parameter.
Whitespace characters MUST NOT be added, elided or replaced in a part of the result document that is controlled by an xml:space attribute with value preserve (see [XML10] for more information about the xml:space attribute).
Note:
The effect of these rules is to ensure that whitespace is only added in places where (a) XSLT's <xsl:strip-space> declaration could cause it to be removed, and (b) it does not affect the string value of any element node with simple content. It is usually not safe to indent document types that include elements with mixed content.
Note:
The whitespace added may possibly be based on whitespace stripped from either the source document or the stylesheet (in the case of XSLT), or guided by other means that might depend on the host language, in the case of an instance of the data model created using some other process.
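The rule that the most restrictive applicable constraint prevails can be pictured as taking a maximum over an ordered scale. The Python sketch below is only an illustration of that ordering. The names Level and most_restrictive are hypothetical, and deciding which constraints apply at a given position is left to the caller.

```python
from enum import IntEnum


class Level(IntEnum):
    """Requirement levels, ordered from least to most restrictive."""
    MAY = 0
    SHOULD_NOT = 1
    MUST_NOT = 2


def most_restrictive(applicable: list) -> Level:
    """Combine the levels of every constraint that applies at one position.

    Assumption: at least one constraint applies; an empty list is an error.
    """
    return max(applicable)
```

For example, inside an element listed in suppress-indentation, a MAY from the element-only rule combined with the MUST_NOT from the suppress-indentation rule yields MUST_NOT.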
cdata-section-elements Parameter
The cdata-section-elements parameter contains a list of expanded QNames. If the expanded QName of the parent of a text node is a member of the list, then the text node MUST be output as a CDATA section, except in those circumstances described below.
If the text node contains the sequence of characters ]]>, then the currently open CDATA section MUST be closed following the ]] and a new CDATA section opened before the >.
If the text node contains characters that are not representable in the character encoding being used to output the instance of the data model, then the currently open CDATA section MUST be closed before such characters, the characters MUST be output using character references or entity references, and a new CDATA section MUST be opened for any further characters in the text node.
CDATA sections MUST NOT be used except where they have been explicitly requested by the user, either by using the cdata-section-elements parameter, or by using some other implementation-defined mechanism.
Note:
This is phrased to permit an implementor to provide an option that attempts to preserve CDATA sections present in the source document.
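The two exceptions to CDATA output described in this section, the ]]> sequence and characters that the output encoding cannot represent, can be sketched in Python as follows. The function name cdata_sections is hypothetical. The sketch assumes that the caller has already established that the parent element is named in cdata-section-elements, and it tests encodability one character at a time for clarity rather than speed.

```python
def cdata_sections(text: str, encoding: str = "utf-8") -> str:
    """Serialize text as one or more CDATA sections."""
    out = []
    buf = []

    def flush() -> None:
        # Close after "]]" and reopen before ">" wherever "]]>" occurs.
        if buf:
            body = "".join(buf).replace("]]>", "]]]]><![CDATA[>")
            out.append("<![CDATA[" + body + "]]>")
            buf.clear()

    for ch in text:
        try:
            ch.encode(encoding)
            buf.append(ch)
        except UnicodeEncodeError:
            # Unrepresentable: leave the CDATA section, emit a character
            # reference, and reopen a section for any following characters.
            flush()
            out.append(f"&#x{ord(ch):X};")
    flush()
    return "".join(out)
```

Parsing the result restores the original string, since adjacent CDATA sections and character references are merged into a single text node by an XML parser.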
omit-xml-declaration and standalone Parameters
The XML output method MUST output an XML declaration if the omit-xml-declaration parameter has the value no, false or 0. The XML declaration MUST include both version information and an encoding declaration. If the standalone parameter has one of the values yes, true, 1, no, false or 0, the XML declaration MUST include a standalone document declaration with the same value as the value of the standalone parameter. If the standalone parameter has the value omit, the XML declaration MUST NOT include a standalone document declaration; this ensures that it is both an XML declaration (allowed at the beginning of a document entity) and a text declaration (allowed at the beginning of an external general parsed entity).
A serialization error [err:SEPM0009] results if the omit-xml-declaration parameter has one of the values yes, true or 1, and
the standalone parameter has a value other than omit; or
the version parameter has a value other than 1.0 and the doctype-system parameter is specified.
The serializer MUST signal the error.
Otherwise, if the omit-xml-declaration parameter has one of the values yes, true or 1, the XML output method MUST NOT output an XML declaration.
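The interaction of omit-xml-declaration, standalone, version and doctype-system can be sketched in Python. The names SerializationError and xml_declaration are hypothetical, and parameter values are passed as the strings used in this section. The sketch assumes that the standalone values true and 1 are written as yes, and false and 0 as no, since those are the only forms the XML standalone document declaration accepts.

```python
from typing import Optional

TRUE_VALUES = {"yes", "true", "1"}
FALSE_VALUES = {"no", "false", "0"}


class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


def xml_declaration(version: str = "1.0", encoding: str = "UTF-8",
                    omit: str = "no", standalone: str = "omit",
                    doctype_system: Optional[str] = None) -> str:
    """Return the XML declaration to output, or "" if it is omitted."""
    if omit in TRUE_VALUES:
        if standalone != "omit" or (version != "1.0" and doctype_system is not None):
            raise SerializationError(
                "SEPM0009", "XML declaration cannot be omitted with these parameters")
        return ""
    if omit not in FALSE_VALUES:
        raise ValueError("unrecognized omit-xml-declaration value: " + omit)
    parts = [f'version="{version}"', f'encoding="{encoding}"']
    if standalone in TRUE_VALUES:
        parts.append('standalone="yes"')
    elif standalone in FALSE_VALUES:
        parts.append('standalone="no"')
    return "<?xml " + " ".join(parts) + "?>"
```

When standalone is omit, the result has the form of both an XML declaration and a text declaration, as required above.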
doctype-system and doctype-public Parameters
If the doctype-system parameter is specified, the XML output method MUST output a document type declaration immediately before the first element. The name following <!DOCTYPE MUST be the name of the first element, if any. If the doctype-public parameter is also specified, then the XML output method MUST output PUBLIC followed by the public identifier and then the system identifier; otherwise, it MUST output SYSTEM followed by the system identifier. The internal subset MUST be empty. The doctype-public parameter MUST be ignored unless the doctype-system parameter is specified.
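A minimal Python sketch of these rules follows. The function name doctype_declaration is hypothetical, and the sketch assumes that neither identifier contains a double quote character, so that both can be written between double quotes.

```python
from typing import Optional


def doctype_declaration(first_element_name: str,
                        doctype_system: Optional[str] = None,
                        doctype_public: Optional[str] = None) -> str:
    """Return the document type declaration, or "" if none is output."""
    if doctype_system is None:
        # doctype-public alone is ignored.
        return ""
    if doctype_public is not None:
        return (f'<!DOCTYPE {first_element_name} '
                f'PUBLIC "{doctype_public}" "{doctype_system}">')
    return f'<!DOCTYPE {first_element_name} SYSTEM "{doctype_system}">'
```

The returned string never contains an internal subset, and it is intended to be written immediately before the first element.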
undeclare-prefixes Parameter
The Data Model allows an element node that binds a non-empty prefix to have a child element node that does not bind that same prefix. In Namespaces in XML 1.1 ([XML Names 1.1]), this can be represented accurately by undeclaring prefixes. For the undeclaring prefix of the child element node, if the undeclare-prefixes parameter has one of the values yes, true or 1, the output method is XML or XHTML, and the version parameter value is greater than 1.0, the serializer MUST undeclare its namespace. If the undeclare-prefixes parameter has the value no, false or 0 and the output method is XML or XHTML, then the undeclaration of prefixes MUST NOT occur.
Consider an element x:foo with four in-scope namespaces that associate prefixes with URIs as follows:
x is associated with http://example.org/x
y is associated with http://example.org/y
z is associated with http://example.org/z
xml is associated with http://www.w3.org/XML/1998/namespace
Suppose that it has a child element x:bar with three in-scope namespaces:
x is associated with http://example.org/x
y is associated with http://example.org/y
xml is associated with http://www.w3.org/XML/1998/namespace
If namespace undeclaration is in effect, it will be serialized this way:
<x:foo xmlns:x="http://example.org/x"
       xmlns:y="http://example.org/y"
       xmlns:z="http://example.org/z">
  <x:bar xmlns:z="">...</x:bar>
</x:foo>
In Namespaces in XML 1.0 ([XML Names]), prefix undeclaration is not possible.
If the output method is XML or XHTML, the value of the undeclare-prefixes parameter is one of yes, true or 1, and the value of the version parameter is 1.0, a serialization error [err:SEPM0010] results; the serializer MUST signal the error.
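The x:foo and x:bar example can be worked through with a small Python sketch that computes the namespace attributes to write on a child element. The names SerializationError and child_namespace_attributes are hypothetical. In-scope namespaces are modelled as dictionaries from prefix to URI, and the sketch assumes that only non-empty prefixes are involved, leaving the default namespace aside.

```python
class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


def child_namespace_attributes(parent_ns: dict, child_ns: dict,
                               undeclare_prefixes: bool = False,
                               version: str = "1.0") -> dict:
    """Return the xmlns:* attributes needed on the child element."""
    if undeclare_prefixes and version == "1.0":
        raise SerializationError(
            "SEPM0010", "prefix undeclaration is not possible in XML 1.0")
    attrs = {}
    # Declare prefixes that are new or rebound relative to the parent.
    for prefix, uri in child_ns.items():
        if prefix != "xml" and parent_ns.get(prefix) != uri:
            attrs["xmlns:" + prefix] = uri
    # Undeclare prefixes bound on the parent but absent from the child.
    if undeclare_prefixes:
        for prefix in parent_ns:
            if prefix != "xml" and prefix not in child_ns:
                attrs["xmlns:" + prefix] = ""
    return attrs
```

Applied to the example, with undeclaration in effect and version 1.1, the only attribute produced for x:bar is xmlns:z with an empty value. Without undeclaration nothing is produced, and z remains in scope in the reconstructed tree.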
normalization-form Parameter
The normalization-form parameter is applicable to the XML output method. The values NFC and none MUST be supported by the serializer. A serialization error [err:SESU0011] results if the value of the normalization-form parameter specifies a normalization form that is not supported by the serializer; the serializer MUST signal the error.
The meanings associated with the possible values of the normalization-form parameter are as follows:
NFC specifies the serialized result will be in Normalization Form C, using the rules specified in [Character Model for the World Wide Web 1.0: Normalization].
NFD specifies the serialized result will be in Normalization Form D, as specified in [UAX #15: Unicode Normalization Forms].
NFKC specifies the serialized result will be in Normalization Form KC, as specified in [UAX #15: Unicode Normalization Forms].
NFKD specifies the serialized result will be in Normalization Form KD, as specified in [UAX #15: Unicode Normalization Forms].
fully-normalized specifies the serialized result will be in fully normalized text, as specified in [Character Model for the World Wide Web 1.0: Normalization].
none specifies that no Unicode Normalization will be applied.
An implementation-defined value has an implementation-defined effect.
If the value of the parameter is fully-normalized, then no relevant construct of the parsed entity created by the serializer may start with a composing character. The term relevant construct has the meaning defined in section 2.13 of [XML11]. If this condition is not satisfied, a serialization error [err:SERE0012] MUST be signaled.
Note:
Specifying fully-normalized as the value of this parameter does not guarantee that the XML document output by the serializer will in fact be fully normalized as defined in [XML11]. This is because the serializer does not check that the text is include normalized, which would involve checking all external entities that it refers to (such as an external DTD). Furthermore, the serializer does not check whether any character escape generated using character maps represents a composing character.
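As an illustration, the four forms defined by UAX #15 map directly onto Python's unicodedata.normalize. The sketch below uses hypothetical names, SerializationError and apply_normalization_form, and assumes that the standard library's NFC is an acceptable stand-in for the rules referenced for the value NFC. It deliberately does not implement fully-normalized and reports that value as unsupported, which is permitted because only NFC and none are mandatory.

```python
import unicodedata

UAX15_FORMS = {"NFC", "NFD", "NFKC", "NFKD"}


class SerializationError(Exception):
    """Hypothetical exception carrying a serialization error code."""

    def __init__(self, code: str, message: str):
        super().__init__(f"[err:{code}] {message}")
        self.code = code


def apply_normalization_form(text: str, form: str = "none") -> str:
    """Apply the requested Unicode normalization form to text."""
    if form == "none":
        return text
    if form in UAX15_FORMS:
        return unicodedata.normalize(form, text)
    # Includes fully-normalized, which this sketch does not support.
    raise SerializationError(
        "SESU0011", "unsupported normalization form: " + form)
```

For instance, the two-character sequence of e followed by COMBINING ACUTE ACCENT (#x301) becomes the single character #xE9 under NFC and is left unchanged under none.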
media-type Parameter
The media-type parameter is applicable to the XML output method. See 3 Serialization Parameters for more information.
use-character-maps Parameter
The use-character-maps parameter is applicable to the XML output method. The result of serialization using the XML output method is not guaranteed to be well-formed XML if character maps have been specified. See 11 Character Maps for more information.
byte-order-mark Parameter
The byte-order-mark parameter is applicable to the XML output method. See 3 Serialization Parameters for more information.
Note:
The byte order mark may be undesirable under certain circumstances; for example, to concatenate resulting XML fragments without additional processing to remove the byte order mark. Therefore this specification does not mandate the byte-order-mark parameter to have one of the values yes, true or 1 when the encoding is UTF-16, even though the XML 1.0 and XML 1.1 specifications state that entities encoded in UTF-16 MUST begin with a byte order mark. Consequently, this specification does not guarantee that the resulting XML fragment, without a byte order mark, will not cause an error when processed by a conforming XML processor.
escape-uri-attributes Parameter
The escape-uri-attributes parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.
include-content-type Parameter
The include-content-type parameter is not applicable to the XML output method. It is the responsibility of the host language to specify whether an error occurs if this parameter is specified in combination with the XML output method, or if the parameter is simply dropped.
item-separator Parameter
The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.