The XHTML output method serializes the instance of the data model as XML, using the HTML compatibility guidelines defined in the XHTML specification ([XHTML 1.0]) or the XHTML syntax of HTML5 (see [HTML5]).
[Definition: An element node is recognized as an HTML element by the XHTML output method if
the element node is in the XHTML namespace, regardless of the value of the html-version serialization parameter or if the html-version serialization parameter is absent; or
the value of the html-version serialization parameter is 5.0, the element has a null namespace URI, and the local part of the name is equal to the name of an element defined by HTML5 [HTML5], making the comparison without regard to case.]
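The two conditions of this definition can be illustrated with a short Python sketch. It is not normative: the function name is_recognized_html_element is hypothetical, a null namespace URI and an absent html-version are both modelled as None, and HTML5_ELEMENT_NAMES is only an illustrative subset of the element names defined by HTML5.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"

# Assumption: illustrative subset only. A real serializer would need the
# complete list of element names defined by HTML5.
HTML5_ELEMENT_NAMES = {"html", "head", "title", "body", "p", "br", "div", "span"}


def is_recognized_html_element(namespace_uri, local_name, html_version=None):
    """Return True if the element is recognized as an HTML element.

    namespace_uri is None for a null namespace URI; html_version is None
    when the html-version serialization parameter is absent.
    """
    # First condition: XHTML namespace, whatever html-version says.
    if namespace_uri == XHTML_NS:
        return True
    # Second condition: html-version 5.0, null namespace, and a local name
    # that matches an HTML5 element name without regard to case.
    if html_version == 5.0 and namespace_uri is None:
        return local_name.lower() in HTML5_ELEMENT_NAMES
    return False
```

Under these assumptions, a no-namespace BR element is recognized only when html-version is 5.0, whereas an element in the XHTML namespace is always recognized.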
It is entirely the responsibility of the person or process that creates the instance of the data model to ensure that the instance of the data model conforms to the [XHTML 1.0] or [XHTML 1.1] specification if the html-version serialization parameter is absent or has a value less than 5.0, or to the XHTML syntax of HTML5 if the value of the html-version serialization parameter is 5.0. It is not an error if the instance of the data model is invalid XHTML. Equally, it is entirely under the control of the person or process that creates the instance of the data model whether the output conforms to XHTML 1.0 Strict, XHTML 1.0 Transitional, the XHTML syntax of HTML5 (see [HTML5]), [POLYGLOT] or any other specific definition of XHTML.
The serialization of the instance of the data model follows the same rules as for the XML output method, with the general exceptions noted below and parameter-specific exceptions in 6.1 The Influence of Serialization Parameters upon the XHTML Output Method. These differences are based on the HTML compatibility guidelines published in Appendix C of [XHTML 1.0] and on [POLYGLOT], both of which are designed to ensure that as far as possible, XHTML is rendered correctly on user agents designed originally to handle HTML.
If the value of the html-version serialization parameter is 5.0, the instance of the data model that is to be serialized is first subjected to prefix normalization.
[Definition: During prefix normalization, any element node in the instance of the data model that is to be serialized that is in one of the XHTML namespace, the SVG namespace or the MathML namespace has its name replaced by the local part of its name. Such an element node is given a default namespace node whose value is the element's namespace URI. Any namespace node for any of those three namespaces that was previously present on any element node in the instance of the data model is also removed, unless the prefix that that namespace node declared is used as the prefix on the name of an attribute on that element or an ancestor of that element.]
The process of prefix normalization is equivalent to replacing the instance of the data model that is to be serialized with the result of the transformation described by this XSLT stylesheet, with the instance of the data model as the initial context item.
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0"
                xmlns:xhtml="http://www.w3.org/1999/xhtml"
                xmlns:svg="http://www.w3.org/2000/svg"
                xmlns:mathml="http://www.w3.org/1998/Math/MathML">

  <xsl:template match="xhtml:*|svg:*|mathml:*">
    <xsl:element name="{local-name()}" namespace="{namespace-uri()}">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="node()|@*|namespace::*">
    <xsl:copy copy-namespaces="no">
      <xsl:apply-templates select="@*|namespace::*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="namespace::*[. eq 'http://www.w3.org/1999/xhtml']|
                       namespace::*[. eq 'http://www.w3.org/2000/svg']|
                       namespace::*[. eq 'http://www.w3.org/1998/Math/MathML']"/>

</xsl:stylesheet>
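The prose definition of prefix normalization can also be sketched procedurally. The following Python fragment is an illustration only and operates on a deliberately simplified tree model: the Element class, its fields, and the function prefix_normalize are hypothetical names, attributes are reduced to the set of prefixes used on their names, and namespace nodes are held as a prefix-to-URI mapping.

```python
from dataclasses import dataclass, field

XHTML_NS = "http://www.w3.org/1999/xhtml"
SVG_NS = "http://www.w3.org/2000/svg"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"
NORMALIZED_NAMESPACES = {XHTML_NS, SVG_NS, MATHML_NS}


@dataclass
class Element:
    """Simplified, hypothetical stand-in for an element node."""
    prefix: str                                        # "" if the name has no prefix
    local: str
    ns: str                                            # "" if in no namespace
    attr_prefixes: set = field(default_factory=set)    # prefixes used on attribute names
    ns_nodes: dict = field(default_factory=dict)       # prefix -> namespace URI
    children: list = field(default_factory=list)


def prefix_normalize(elem, inherited_attr_prefixes=frozenset()):
    """Apply prefix normalization in place to elem and its descendants."""
    # Prefixes used on attributes of this element or of any ancestor.
    used = inherited_attr_prefixes | elem.attr_prefixes
    # Drop namespace nodes for the three namespaces unless their prefix
    # is still needed by such an attribute.
    elem.ns_nodes = {
        prefix: uri
        for prefix, uri in elem.ns_nodes.items()
        if uri not in NORMALIZED_NAMESPACES or prefix in used
    }
    # Elements in the three namespaces lose their prefix and gain a
    # default namespace node for their own namespace URI.
    if elem.ns in NORMALIZED_NAMESPACES:
        elem.prefix = ""
        elem.ns_nodes[""] = elem.ns
    for child in elem.children:
        prefix_normalize(child, used)
```

In this model, an h:html element with an svg:svg child ends up as unprefixed html and svg elements, each carrying a default namespace node for its own namespace, while namespace nodes for unrelated namespaces are left untouched.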
[Definition: The following XHTML elements have an EMPTY content model: area, base, br, col, embed, hr, img, input, link, meta, basefont, frame, isindex, and param.]
[Definition: The void elements of HTML5 are area, base, br, col, embed, hr, img, input, keygen, link, meta, param, source, track and wbr.]
[Definition: An element node is expected to be empty if it is recognized as an HTML element and if either
the html-version serialization parameter is absent or has a value less than 5.0 and the content model is EMPTY, or
the html-version serialization parameter has the value 5.0 and the element is a void element.]
If an element node that has no child nodes is not expected to be empty (because the html-version serialization parameter is absent or has a value less than 5.0 and the content model of the HTML element is not EMPTY, as for an empty title or paragraph; or because the value of the html-version serialization parameter is 5.0 and the HTML element is not a void element), the serializer MUST NOT use the minimized form. That is, it MUST output <p></p> and not <p />.
If an element that has no children is expected to be empty, the serializer MUST use the minimized tag syntax, for example <br />, as the alternative syntax <br></br> allowed by XML gives uncertain results in many legacy user agents.
If the html-version serialization parameter is absent or has a value less than 5.0, the serializer MUST include a space before the trailing />, e.g. <br />, <hr /> and <img src="karen.jpg" alt="Karen" />.
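The rules for childless elements can be summarized in a small Python sketch. It is illustrative rather than normative and rests on stated assumptions: the function names are hypothetical, the element is assumed to have been recognized as an HTML element already, names are compared exactly, attributes are omitted, and the space before the trailing /> is emitted in every case although the text requires it only when html-version is absent or less than 5.0.

```python
# Element lists taken from the two definitions above.
EMPTY_CONTENT_MODEL = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "basefont", "frame", "isindex", "param",
}
HTML5_VOID_ELEMENTS = {
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "keygen", "link", "meta", "param", "source", "track", "wbr",
}


def is_expected_to_be_empty(local_name, html_version=None):
    """Assumes the element is already recognized as an HTML element."""
    if html_version is None or html_version < 5.0:
        return local_name in EMPTY_CONTENT_MODEL
    if html_version == 5.0:
        return local_name in HTML5_VOID_ELEMENTS
    return False


def serialize_childless(local_name, html_version=None):
    """Serialize an attribute-free HTML element that has no child nodes."""
    if is_expected_to_be_empty(local_name, html_version):
        # Minimized form, with a space before the trailing "/>".
        return f"<{local_name} />"
    # Not expected to be empty: the minimized form must not be used.
    return f"<{local_name}></{local_name}>"
```

For example, serialize_childless("wbr") yields a start and end tag pair because wbr does not appear in the EMPTY content model list, whereas with html_version set to 5.0 the same element is a void element and is minimized.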
If the html-version serialization parameter is absent or has a value less than 5.0, the serializer MUST NOT use the entity reference &apos; which, although defined in XML and therefore in XHTML, is not defined in versions of HTML prior to HTML5, and is not recognized by all HTML user agents.
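One way a serializer might honor this restriction is to fall back to a numeric character reference. The Python sketch below is an assumption-laden illustration, not a prescribed technique: the function name is hypothetical, it covers only an attribute value delimited by apostrophes, and the choice of &#39; as the replacement is the sketch's own.

```python
def escape_single_quoted_attribute(value, html_version=None):
    """Escape a value for an attribute delimited by apostrophes.

    html_version is None when the html-version parameter is absent.
    """
    # The named apostrophe entity is only used when html-version is 5.0;
    # otherwise a numeric character reference is emitted instead.
    apostrophe = "&apos;" if html_version == 5.0 else "&#39;"
    return (
        value.replace("&", "&amp;")   # must come first
             .replace("<", "&lt;")
             .replace("'", apostrophe)
    )
```

A serializer that always delimits attribute values with quotation marks would not need to escape apostrophes at all, which is an equally valid way to satisfy the requirement.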
If the html-version serialization parameter is absent or has a value less than 5.0, the serializer SHOULD output namespace declarations in a way that is consistent with the requirements of the XHTML DTD if this is possible. If the value of the html-version serialization parameter is 5.0, the serializer SHOULD output namespace declarations in a way that is consistent with the requirements of [POLYGLOT].
The XHTML 1.0 DTDs require the declaration xmlns="http://www.w3.org/1999/xhtml" to appear on the html element, and only on the html element. The [POLYGLOT] specification permits namespace declarations to appear in a conforming document, but restricts the elements on which they can appear.
The serializer MUST output namespace declarations that are consistent with the namespace nodes present in the result tree, but it MUST avoid outputting redundant namespace declarations on elements where the DTD would make them invalid, for versions prior to HTML5, or where they are not permitted by [POLYGLOT], for serialization according to the syntax of HTML5.
Note:
If the html element is generated by an XSLT literal result element of the form <html xmlns="http://www.w3.org/1999/xhtml"> ... </html>, or by an XQuery direct element constructor of the same form, then the html element in the result document will have a node name whose prefix is "", which will satisfy the requirements of the DTD. In other cases the prefix assigned to the element is implementation-dependent.
Note:
[POLYGLOT] and Appendix C of [XHTML 1.0] describe a number of compatibility guidelines for users of XHTML who wish to render their XHTML documents with HTML user agents. In some cases, such as the guideline on the form empty elements take, only the serialization process itself has the ability to follow the guideline. In such cases, those guidelines are reflected in the requirements on the serializer described above.
In all other cases, the guidelines can be adhered to by the instance of the data model that is input to the serialization process. The guideline on the use of whitespace characters in attribute values is one such example. Another example is that xml:lang="..." does not serialize to both xml:lang="..." and lang="..." as required by some legacy user agents. It is the responsibility of the person or process that creates the instance of the data model that is input to the serialization process to ensure it is created in a way that is consistent with the guidelines. No serialization error results if the input instance of the data model does not adhere to the guidelines.
version Parameter
The behavior for the version parameter for the XHTML output method is described in 5.1.1 XML Output Method: the version Parameter.
html-version Parameter
The html-version parameter specifies whether the XHTML output method will produce a serialized document following rules that are tailored to the requirements of the XHTML syntax of [HTML5] or the requirements of [XHTML 1.0] and [XHTML 1.1]. The differences are described in detail throughout 6 XHTML Output Method.
encoding Parameter
The behavior for the encoding parameter for the XHTML output method is described in 5.1.3 XML Output Method: the encoding Parameter.
indent and suppress-indentation Parameters
If the indent parameter has one of the values yes, true or 1, the serializer MAY add or remove whitespace as it serializes the result tree, if it observes the following constraints.
Whitespace MUST NOT be added other than before or after an element, or adjacent to an existing whitespace character.
Whitespace MUST NOT be added or removed adjacent to an inline element. The inline elements are those elements recognized as HTML elements that are in the %inline category of any of the XHTML 1.0 DTDs, in the %inline.class category of the XHTML 1.1 DTD, those elements defined to be phrasing elements in HTML5 and elements recognized as HTML elements with local names ins and del if they are used as inline elements (i.e., if they do not contain element children).
Whitespace MUST NOT be added or removed inside a formatted element, the formatted elements being those recognized as HTML elements with local names pre, script, style, title, and textarea.
Whitespace characters MUST NOT be added in the content of an element whose expanded QName matches a member of the list of expanded QNames in the value of the suppress-indentation parameter.
The expanded QName of an element node is considered to match a member of the list of expanded QNames if:
the two expanded QNames are equal;
the expanded QNames both have null namespace URIs, and the local parts of the two QNames are equal without regard to case; or
the value of the html-version serialization parameter is 5.0, the local parts of the two QNames are equal without regard to case and one QName has a null namespace URI and the namespace URI of the other is equal to the XHTML namespace URI.
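The three matching conditions can be expressed compactly in code. The following Python sketch is illustrative only: the function name expanded_qname_matches is hypothetical, each expanded QName is modelled as a (namespace URI, local part) pair, and None stands for both a null namespace URI and an absent html-version parameter.

```python
XHTML_NS = "http://www.w3.org/1999/xhtml"


def expanded_qname_matches(element_qname, listed_qname, html_version=None):
    """Decide whether an element's expanded QName matches a listed one.

    Each QName is a (namespace_uri, local_part) pair.
    """
    (ns_a, local_a), (ns_b, local_b) = element_qname, listed_qname
    # Condition 1: the two expanded QNames are equal.
    if (ns_a, local_a) == (ns_b, local_b):
        return True
    same_local_ignoring_case = local_a.lower() == local_b.lower()
    # Condition 2: both namespace URIs are null.
    if ns_a is None and ns_b is None:
        return same_local_ignoring_case
    # Condition 3: html-version 5.0, one null namespace URI and one XHTML.
    if html_version == 5.0 and {ns_a, ns_b} == {None, XHTML_NS}:
        return same_local_ignoring_case
    return False
```

Note that two QNames that are both in the XHTML namespace match only when their local parts are identical, since neither case-insensitive condition applies to that combination.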
Note:
The effect of the above constraints is to ensure any insertion or deletion of whitespace would not affect how an HTML user agent that conforms to the specified version of HTML would render the output, assuming the serialized document does not refer to any HTML style sheets.
The HTML definition of whitespace is different from the XML definition: see section 9.1 of the [HTML] 4.01 specification.
cdata-section-elements Parameter
The behavior for the cdata-section-elements parameter for the XHTML output method is described in 5.1.5 XML Output Method: the cdata-section-elements Parameter.
omit-xml-declaration and standalone Parameters
The behavior for the omit-xml-declaration and standalone parameters for the XHTML output method is described in 5.1.6 XML Output Method: the omit-xml-declaration and standalone Parameters.
Note:
As with the XML output method, the XHTML output method specifies that an XML declaration will be output unless it is suppressed using the omit-xml-declaration parameter. Appendix C.1 of [XHTML 1.0] provides advice on the consequences of including, or omitting, the XML declaration.
doctype-system and doctype-public Parameters
If the value of the html-version serialization parameter is 5.0, the doctype-system serialization parameter is absent, the first element node child of the document node that is to be serialized is recognized as an HTML element, the local part of the QName of which is equal to the string HTML, without regard to case, and any text node preceding that element in document order contains only whitespace characters, then the XHTML output method MUST output a document type declaration immediately before the first element, with no public or system identifier. The name following <!DOCTYPE MUST be the same as the local part of the name of the element.
Otherwise, the behavior for the doctype-system and doctype-public parameters for the XHTML output method is described in 5.1.7 XML Output Method: the doctype-system and doctype-public Parameters.
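The HTML5-specific rule for the document type declaration can be sketched as a single decision function. This Python fragment is an illustration under assumptions: the function name html5_doctype and its parameters are hypothetical, the caller is assumed to have determined whether the first element is recognized as an HTML element, and a return value of None signals that the rule does not apply and the XML output method's behavior is used instead.

```python
def html5_doctype(first_element_local_name, recognized_as_html,
                  preceding_text_nodes, html_version=None,
                  doctype_system=None):
    """Return the HTML5-style DOCTYPE string, or None if the rule is not met.

    preceding_text_nodes holds the string values of the text nodes that
    precede the first element child of the document node.
    """
    if html_version != 5.0 or doctype_system is not None:
        return None
    if not recognized_as_html:
        return None
    # The local part must equal "HTML" without regard to case.
    if first_element_local_name.lower() != "html":
        return None
    # Every preceding text node must contain only whitespace characters.
    if any(text.strip(" \t\r\n") for text in preceding_text_nodes):
        return None
    # The name after "<!DOCTYPE" repeats the element's local name as written.
    return f"<!DOCTYPE {first_element_local_name}>"
```

Because the declaration reuses the local part of the element name unchanged, an element named HTML in no namespace would produce a declaration naming HTML rather than html.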
undeclare-prefixes Parameter
The behavior for the undeclare-prefixes parameter for the XHTML output method is described in 5.1.8 XML Output Method: the undeclare-prefixes Parameter.
normalization-form Parameter
The behavior for the normalization-form parameter for the XHTML output method is described in 5.1.9 XML Output Method: the normalization-form Parameter.
media-type Parameter
The behavior for the media-type parameter for the XHTML output method is described in 5.1.10 XML Output Method: the media-type Parameter.
use-character-maps Parameter
The behavior for the use-character-maps parameter for the XHTML output method is described in 5.1.11 XML Output Method: the use-character-maps Parameter.
byte-order-mark Parameter
The behavior for the byte-order-mark parameter for the XHTML output method is described in 5.1.12 XML Output Method: the byte-order-mark Parameter.
escape-uri-attributes Parameter
If the escape-uri-attributes parameter has one of the values yes, true or 1, the XHTML output method MUST apply URI escaping to URI attribute values, except that relative URIs MUST NOT be absolutized.
Note:
This escaping is deliberately confined to non-ASCII characters, because escaping of ASCII characters is not always appropriate, for example when URIs or URI fragments are interpreted locally by the HTML user agent. Even in the case of non-ASCII characters, escaping can sometimes cause problems. More precise control of URI escaping is therefore available by setting escape-uri-attributes to no, and controlling the escaping of URIs by using methods defined in Section 6.2 fn:encode-for-uri (FO31) and Section 6.3 fn:iri-to-uri (FO31).
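The effect of URI escaping on an attribute value can be illustrated as follows. This Python sketch assumes that URI escaping applies Unicode NFC normalization and then percent-encodes the UTF-8 octets of every character outside the printable ASCII range, in the manner of fn:escape-html-uri; the function name escape_uri_attribute is hypothetical, and the exact character range is an assumption of the sketch rather than a statement taken from this section.

```python
import unicodedata


def escape_uri_attribute(value):
    """Percent-encode non-ASCII characters of a URI attribute value.

    The value is never resolved against a base URI, so relative URIs
    stay relative.
    """
    # Assumption: normalize to NFC before escaping.
    normalized = unicodedata.normalize("NFC", value)
    pieces = []
    for ch in normalized:
        if 32 <= ord(ch) <= 126:
            # Printable ASCII, including space, is left untouched.
            pieces.append(ch)
        else:
            # Escape each UTF-8 octet as %HH with uppercase hex digits.
            pieces.extend(f"%{octet:02X}" for octet in ch.encode("utf-8"))
    return "".join(pieces)
```

ASCII characters such as the space and the fragment separator pass through unchanged, which reflects the rationale given in the note above.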
include-content-type Parameter
If the instance of the data model includes a head element recognized as an HTML element, and the include-content-type parameter has one of the values yes, true or 1, the XHTML output method MUST add a meta element as the first child element of the head element, specifying the character encoding actually used.
The meta element SHOULD be in no namespace if the head element is in no namespace, and in the XHTML namespace if the head element is in the XHTML namespace.
For example,
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=EUC-JP" />
  ...
The content type SHOULD be set to the value given for the media-type parameter.
Note:
It is recommended that the host language use as default value for this parameter one of the MIME types ([RFC2046]) registered for XHTML. Currently, these are text/html (registered by [RFC2854]) and application/xhtml+xml (registered by [RFC3236]). Note that some user agents fail to recognize the charset parameter if the content type is not text/html.
If a meta element has been added to the head element as described above, then any existing meta element child of the head element having an http-equiv attribute with the value "Content-Type", making the comparison without regard to case after first stripping leading and trailing spaces from the value of the attribute solely for the purposes of comparison, MUST be discarded.
Note:
This process removes possible parameters in the attribute value. For example,
<meta http-equiv="Content-Type" content="text/html;version='3.0'" />
in the data model instance would be replaced by,
<meta http-equiv="Content-Type" content="text/html;charset=utf-8" />
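The insertion and discarding steps can be sketched with Python's standard xml.etree.ElementTree module. This is an illustration under assumptions: the function names are hypothetical, the head element is assumed to have been recognized as an HTML element already, and the local name meta is compared without regard to case in every namespace, which is a simplification.

```python
import xml.etree.ElementTree as ET


def split_tag(tag):
    """Split an ElementTree tag into (namespace URI, local name)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def add_content_type_meta(head, media_type, encoding):
    """Insert a Content-Type meta element as the first child of head and
    discard any existing Content-Type meta children."""
    head_namespace, _ = split_tag(head.tag)
    for child in list(head):
        if not isinstance(child.tag, str):
            continue  # skip comments and processing instructions
        _, local = split_tag(child.tag)
        http_equiv = child.get("http-equiv")
        # Strip leading and trailing spaces, then compare ignoring case.
        if (local.lower() == "meta" and http_equiv is not None
                and http_equiv.strip(" ").lower() == "content-type"):
            head.remove(child)
    # The new meta element goes in the same namespace as head.
    tag = f"{{{head_namespace}}}meta" if head_namespace else "meta"
    meta = ET.Element(tag, {
        "http-equiv": "Content-Type",
        "content": f"{media_type}; charset={encoding}",
    })
    head.insert(0, meta)
    return head
```

Applied to a head element that already holds a Content-Type meta element with a version parameter, the function leaves a single Content-Type meta element in first position and keeps unrelated meta elements in place.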
item-separator Parameter
The effect of the item-separator serialization parameter is described in 2 Sequence Normalization.