A sequence is an ordered collection of zero or more items. An item is either a node or an atomic value. The terms sequence and item are defined formally in [XQuery 4.0: An XML Query Language] and [XML Path Language (XPath) 4.0].
The following functions are defined on sequences. These functions work on any sequence, without performing any operations that are sensitive to the individual items in the sequence.
Function | Meaning |
---|---|
fn:empty | Returns true if the argument is the empty sequence. |
fn:exists | Returns true if the argument is a non-empty sequence. |
fn:foot | Returns the last item in a sequence. |
fn:head | Returns the first item in a sequence. |
fn:identity | Returns its argument value. |
fn:insert-before | Returns a sequence constructed by inserting an item or a sequence of items at a given position within an existing sequence. |
fn:intersperse | Inserts a separator between adjacent items in a sequence. |
fn:items-at | Returns a sequence containing the items from $input at positions defined by $at, in the order specified. |
fn:remove | Returns a new sequence containing all the items of $input except the item at position $position. |
fn:replicate | Produces multiple copies of a sequence. |
fn:reverse | Reverses the order of items in a sequence. |
fn:slice | Returns a sequence containing selected items from a supplied input sequence based on their position. |
fn:subsequence | Returns the contiguous sequence of items in $input beginning at the position indicated by $start and continuing for the number of items indicated by $length. |
fn:tail | Returns all but the first item in a sequence. |
fn:trunk | Returns all but the last item in a sequence. |
fn:unordered | Returns the items of $input in an ·implementation-dependent· order. |
As in the previous section, for the illustrative examples below, assume an XQuery
or transformation operating on a non-empty Purchase Order document containing a
number of line-item elements. The variable $seq
is bound to the
sequence of line-item nodes in document order. The variables
$item1
, $item2
, etc. are bound to separate, individual
line-item nodes in the sequence.
Returns true if the argument is the empty sequence.
fn:empty($input as item()*) as xs:boolean
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $input
is the empty sequence, the function returns
true
; otherwise, the function returns false
.
The expression fn:empty((1,2,3)[10])
returns true()
.
The expression fn:empty(fn:remove(("hello", "world"), 1))
returns false()
.
The expression fn:empty([])
returns false()
.
The expression fn:empty(map{})
returns false()
.
The expression fn:empty("")
returns false()
.
With $break bound to an element that has no children:
let $break := <br/> return fn:empty($break)
The result is false()
.
Returns true if the argument is a non-empty sequence.
fn:exists($input as item()*) as xs:boolean
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $input
is a non-empty sequence, the function returns
true
; otherwise, the function returns false
.
The expression fn:exists(fn:remove(("hello"), 1))
returns false()
.
The expression fn:exists(fn:remove(("hello", "world"), 1))
returns true()
.
The expression fn:exists([])
returns true()
.
The expression fn:exists(map{})
returns true()
.
The expression fn:exists("")
returns true()
.
With $break bound to an element that has no children:
let $break := <br/> return fn:exists($break)
The result is true()
.
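The examples for fn:empty and fn:exists turn on the difference between a sequence with no items and a sequence holding one item that merely looks empty. The following Python sketch is illustrative only: it assumes a sequence is modelled as a Python list, and the helper names `empty` and `exists` are hypothetical stand-ins for the two functions.

```python
def empty(input_seq):
    """Model of fn:empty: true only when the sequence has no items."""
    return len(input_seq) == 0


def exists(input_seq):
    """Model of fn:exists: true when the sequence has at least one item."""
    return len(input_seq) > 0


# An empty array, an empty map and a zero-length string are each one item,
# so a sequence that contains one of them is not the empty sequence.
empty_array, empty_map, empty_string = [], {}, ""
results = [(empty([item]), exists([item]))
           for item in (empty_array, empty_map, empty_string)]
print(results)                   # every pair is (False, True)
print(empty([]), exists([]))     # the empty sequence itself
```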
Returns the last item in a sequence.
fn:foot($input as item()*) as item()?
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of the expression $input[position() = last()].
If $input
is the empty sequence the empty sequence is returned.
The expression fn:foot(1 to 5)
returns (5)
.
The expression fn:foot(())
returns ()
.
Proposed for 4.0; not yet reviewed.
Returns the first item in a sequence.
fn:head($input as item()*) as item()?
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of the expression $input[1].
If $input
is the empty sequence, the empty sequence is returned. Otherwise
the first item in the sequence is returned.
The expression fn:head(1 to 5)
returns 1
.
The expression fn:head(("a", "b", "c"))
returns "a"
.
The expression fn:head(())
returns ()
.
The expression fn:head([1,2,3])
returns [1,2,3]
.
Returns its argument value.
fn:identity($input as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns $input
.
The function is useful in contexts where a function must be supplied, but no processing is required.
The expression fn:identity(0)
returns (0)
.
The expression fn:identity(1 to 10)
returns (1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
.
The expression fn:identity(/) is /
returns true()
.
The expression fn:identity(())
returns ()
.
New in 4.0. Accepted 2022-09-20.
Returns a sequence constructed by inserting an item or a sequence of items at a given position within an existing sequence.
fn:insert-before($input as item()*, $position as xs:integer, $insert as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The value returned by the function consists of all items of $input whose 1-based position is less than $position, followed by all items of $insert, followed by the remaining items of $input, in that order.
If $input
is the empty sequence, $insert
is returned. If
$insert
is the empty sequence, $input
is returned.
If $position is less than one (1), the effective value of $position is one (1), the first position. If $position is greater than the number of items in $input, then the effective value of $position is equal to the number of items in $input plus 1.
The value of $input
is not affected by the sequence construction.
let $abc := ("a", "b", "c")
The expression fn:insert-before($abc, 0, "z")
returns ("z", "a", "b", "c")
.
The expression fn:insert-before($abc, 1, "z")
returns ("z", "a", "b", "c")
.
The expression fn:insert-before($abc, 2, "z")
returns ("a", "z", "b", "c")
.
The expression fn:insert-before($abc, 3, "z")
returns ("a", "b", "z", "c")
.
The expression fn:insert-before($abc, 4, "z")
returns ("a", "b", "c", "z")
.
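The clamping of $position described for fn:insert-before can be modelled in a few lines. This is a Python sketch rather than XPath, under the assumption that a sequence is represented as a list; the name `insert_before` is a hypothetical stand-in for the function.

```python
def insert_before(input_seq, position, insert):
    """Model of fn:insert-before using lists and 1-based positions."""
    # Clamp the effective position into the range 1 .. len(input_seq) + 1.
    effective = max(1, min(position, len(input_seq) + 1))
    # Items before the effective position, then the inserted items, then the rest.
    return input_seq[:effective - 1] + list(insert) + input_seq[effective - 1:]


abc = ["a", "b", "c"]
for position in (0, 1, 2, 3, 4):
    print(position, insert_before(abc, position, ["z"]))
```

The input list is left unmodified because slicing builds a new list, which mirrors the statement that the value of $input is not affected.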
Inserts a separator between adjacent items in a sequence.
fn:intersperse($input as item()*, $separator as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of head($input), tail($input) ! ($separator, .)
.
If $input contains fewer than two items then it is returned unchanged.
If $separator
is the empty sequence then $input
is returned unchanged.
For example, in XQuery, fn:intersperse(para, <hr/>)
would insert
an empty hr
element between adjacent paragraphs.
The expression fn:intersperse(1 to 5, "|")
returns (1, "|", 2, "|", 3, "|", 4, "|", 5)
.
The expression fn:intersperse((), "|")
returns ()
.
The expression fn:intersperse("A", "|")
returns "A"
.
The expression fn:intersperse(1 to 5, ())
returns (1, 2, 3, 4, 5)
.
The expression fn:intersperse(1 to 5, ("⅓", "⅔"))
returns (1, "⅓", "⅔", 2, "⅓", "⅔", 3, "⅓", "⅔", 4, "⅓", "⅔", 5)
.
New in 4.0. Accepted 2022-09-27.
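As an informal cross-check of the fn:intersperse rule, the following Python sketch inserts the whole separator sequence before every item except the first. It assumes sequences are modelled as lists; `intersperse` is a hypothetical helper name, not part of any library.

```python
def intersperse(input_seq, separator):
    """Model of fn:intersperse: head, then (separator, item) for each later item."""
    result = []
    for index, item in enumerate(input_seq):
        if index > 0:
            # The separator may itself be a sequence of several items.
            result.extend(separator)
        result.append(item)
    return result


print(intersperse([1, 2, 3], ["|"]))
print(intersperse([1, 2, 3], ["x", "y"]))
```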
Returns a sequence containing the items from $input
at positions defined by $at
, in the order specified.
fn:items-at($input as item()*, $at as xs:integer*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Returns the value of $at ! fn:subsequence($input, ., 1).
The effect of the function is to return those items from $input at the positions given by the integers in $at, in the order represented by the integers in $at.
In the simplest case where $at
is a single integer,
fn:items-at($input, 3)
returns the same result as $input[3]
.
Compared with a simple positional filter expression, the function is useful because:
It can select items at multiple positions, and unlike fn:subsequence
,
these do not need to be contiguous.
The $at
expression can depend on the focus.
The order of the returned items can differ from their order in the $input
sequence.
If any integer in $at
is outside the range 1 to count($input)
, that integer
is effectively ignored: no error occurs.
If either of the arguments is an empty sequence, the result is an empty sequence.
The expression fn:items-at(11 to 20, 4)
returns 14
.
The expression fn:items-at(11 to 20, 4 to 6)
returns 14, 15, 16
.
The expression fn:items-at(11 to 20, (7, 3))
returns 17, 13
.
The expression fn:items-at(11 to 20, fn:index-of(("a", "b", "c"), "b"))
returns 12
.
The expression fn:items-at(fn:characters("quintessential"), (4, 8, 3))
returns ("n", "s", "i")
.
The expression fn:items-at((), 832)
returns ()
.
The expression fn:items-at((), ())
returns ()
.
Proposed for 4.0 in issue 213
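The selection rule for fn:items-at, including the silent skipping of out-of-range positions, can be sketched in Python. This is a model under the assumption that sequences are lists with 1-based positions; `items_at` is a hypothetical name.

```python
def items_at(input_seq, at):
    """Model of fn:items-at: pick items by 1-based position, in the order given."""
    # Positions outside 1 .. len(input_seq) are ignored rather than raising an error.
    return [input_seq[p - 1] for p in at if 1 <= p <= len(input_seq)]


numbers = list(range(11, 21))
print(items_at(numbers, [7, 3]))        # order follows the position list
print(items_at(numbers, [0, 4, 99]))    # 0 and 99 are out of range
```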
Returns a new sequence containing all the items of $input
except the item
at position $position
.
fn:remove($input as item()*, $position as xs:integer) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns a sequence consisting of all items of $input whose 1-based position is less than $position, followed by all items of $input whose 1-based position is greater than $position.
If $position
is less than 1 or greater than the number of items in
$input
, $input
is returned.
If $input
is the empty sequence, the empty sequence is returned.
let $abc := ("a", "b", "c")
The expression fn:remove($abc, 0)
returns ("a", "b", "c")
.
The expression fn:remove($abc, 1)
returns ("b", "c")
.
The expression fn:remove($abc, 6)
returns ("a", "b", "c")
.
The expression fn:remove((), 3)
returns ()
.
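A small Python model makes the out-of-range behaviour of fn:remove explicit. It assumes list-based sequences and 1-based positions; the name `remove` is used here only as a hypothetical stand-in.

```python
def remove(input_seq, position):
    """Model of fn:remove: drop the item at a 1-based position, if there is one."""
    if position < 1 or position > len(input_seq):
        # Out-of-range positions leave the sequence unchanged.
        return list(input_seq)
    return input_seq[:position - 1] + input_seq[position:]


abc = ["a", "b", "c"]
print(remove(abc, 1))
print(remove(abc, 6))
```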
Produces multiple copies of a sequence.
fn:replicate($input as item()*, $count as xs:nonNegativeInteger) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of (1 to $count) ! $input
.
If $input
is the empty sequence, the empty sequence is returned.
The $count
argument is declared as xs:nonNegativeInteger
,
which means that a type error occurs if it is called with a negative value.
If the input sequence contains nodes, these are not copied: instead, the result sequence contains
multiple references to the same node. So, for example, fn:count(fn:replicate(/, 6)|())
returns 1, because the fn:replicate
call creates duplicates, and the
union operation eliminates them.
[TODO: the use of type xs:nonNegativeInteger
for the second argument
assumes we will accept the proposal to allow downcasting in the coercion rules for
function arguments. MHK 2022-10-04.]
The expression fn:replicate(0, 6)
returns (0, 0, 0, 0, 0, 0)
.
The expression fn:replicate(("A", "B", "C"), 3)
returns ("A", "B", "C", "A", "B", "C", "A", "B", "C")
.
The expression fn:replicate((), 5)
returns ()
.
The expression fn:replicate(("A", "B", "C"), 1)
returns ("A", "B", "C")
.
The expression fn:replicate(("A", "B", "C"), 0)
returns ()
.
New in 4.0. Accepted 2022-10-04.
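The remark that nodes are referenced rather than copied can be illustrated with object identity in Python. This is only an analogy: the `Node` class and the `replicate` helper are hypothetical, and a set of `id()` values stands in for the duplicate elimination performed by the union operator.

```python
def replicate(input_seq, count):
    """Model of fn:replicate: the input sequence repeated count times."""
    if count < 0:
        # Stands in for the type error raised for a negative $count.
        raise TypeError("count must be a non-negative integer")
    return [item for _ in range(count) for item in input_seq]


class Node:
    """Placeholder for an XDM node; instances are compared by identity."""


doc = Node()
copies = replicate([doc], 6)
distinct_by_identity = {id(node) for node in copies}
print(len(copies), len(distinct_by_identity))
```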
Reverses the order of items in a sequence.
fn:reverse($input as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns a sequence containing the items in $input
in reverse
order.
If $input
is the empty sequence, the empty sequence is returned.
let $abc := ("a", "b", "c")
The expression fn:reverse($abc)
returns ("c", "b", "a")
.
The expression fn:reverse(("hello"))
returns ("hello")
.
The expression fn:reverse(())
returns ()
.
The expression fn:reverse([1,2,3])
returns [1,2,3]
. (The input is a sequence containing a single item (the array)).
The expression fn:reverse(([1,2,3],[4,5,6]))
returns ([4,5,6],[1,2,3])
.
Returns a sequence containing selected items from a supplied input sequence based on their position.
fn:slice($input as item()*, $start as xs:integer? := (), $end as xs:integer? := (), $step as xs:integer? := ()) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $input
is the empty sequence, the function returns the empty sequence.
Let $S
be the first of the following that applies:
If $start
is absent, empty, or zero, then 1.
If $start
is negative, then fn:count($input) + $start + 1
.
Otherwise, $start
.
Let $E
be the first of the following that applies:
If $end
is absent, empty, or zero, then fn:count($input)
.
If $end
is negative, then fn:count($input) + $end + 1
.
Otherwise, $end
.
Let $STEP
be the first of the following that applies:
If $step
is absent, empty, or zero, then:
If $E ge $S
, then +1
Otherwise -1
Otherwise, $step
.
If $STEP
is negative, the function returns
$input => fn:reverse() => fn:slice(-$S, -$E, -$STEP)
.
Otherwise the function returns the result of the expression:
$input[position() ge $S and position() le $E and (position() - $S) mod $STEP eq 0]
The function is inspired by the slice operators in Javascript and Python, but it differs
in detail to accommodate the tradition of 1-based addressing in XPath. The end position is
inclusive rather than exclusive, so that in the simple case where $start
and
$end
are positive and $end > $start
,
fn:slice($in, $start, $end)
returns the same result as $in[position() = $start to $end]
.
let $in := ('a', 'b', 'c', 'd', 'e')
The expression fn:slice($in, start:2, end:4)
returns ("b", "c", "d")
.
The expression fn:slice($in, start:2)
returns ("b", "c", "d", "e")
.
The expression fn:slice($in, end:2)
returns ("a", "b")
.
The expression fn:slice($in, start:3, end:3)
returns ("c")
.
The expression fn:slice($in, start:4, end:3)
returns ("d", "c")
.
The expression fn:slice($in, start:2, end:5, step:2)
returns ("b", "d")
.
The expression fn:slice($in, start:5, end:2, step:-2)
returns ("e", "c")
.
The expression fn:slice($in, start:2, end:5, step:-2)
returns ()
.
The expression fn:slice($in, start:5, end:2, step:2)
returns ()
.
The expression fn:slice($in)
returns ("a", "b", "c", "d", "e")
.
The expression fn:slice($in, start:-1)
returns ("e")
.
The expression fn:slice($in, start:-3)
returns ("c", "d", "e")
.
The expression fn:slice($in, end:-2)
returns ("a", "b", "c", "d")
.
The expression fn:slice($in, start:2, end:-2)
returns ("b", "c", "d")
.
The expression fn:slice($in, start:-2, end:2)
returns ("d", "c", "b")
.
The expression fn:slice($in, start:-4, end:-2)
returns ("b", "c", "d")
.
The expression fn:slice($in, start:-2, end:-4)
returns ("d", "c", "b")
.
The expression fn:slice($in, start:-4, end:-2, step:2)
returns ("b", "d")
.
The expression fn:slice($in, start:-2, end:-4, step:-2)
returns ("d", "b")
.
The expression fn:slice(("a", "b", "c", "d"), 0)
returns ("a", "b", "c", "d")
. (A $start of zero is treated as 1.)
Proposed for 4.0; not yet reviewed. The design depends on having functions with keyword arguments.
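The rules for $S, $E and $STEP can be followed step by step in a Python model. This is a sketch of the rules as stated in this section, not an implementation from any processor: sequences are assumed to be lists, `None` stands for an absent or empty argument, and `xslice` is a hypothetical name chosen to avoid clashing with Python's built-in `slice`.

```python
def xslice(input_seq, start=None, end=None, step=None):
    """Model of fn:slice with 1-based, inclusive start and end positions."""
    count = len(input_seq)
    if count == 0:
        return []

    def resolve(value, default):
        # Absent, empty or zero selects the default; negative counts from the end.
        if value is None or value == 0:
            return default
        if value < 0:
            return count + value + 1
        return value

    s = resolve(start, 1)
    e = resolve(end, count)
    if step is None or step == 0:
        st = 1 if e >= s else -1
    else:
        st = step

    if st < 0:
        # A negative step is handled by reversing the input and negating all three values.
        return xslice(list(reversed(input_seq)), -s, -e, -st)
    return [item for pos, item in enumerate(input_seq, start=1)
            if s <= pos <= e and (pos - s) % st == 0]


letters = ["a", "b", "c", "d", "e"]
print(xslice(letters, start=2, end=4))
print(xslice(letters, start=4, end=3))
print(xslice(letters, start=-2, end=-4, step=-2))
```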
Returns the contiguous sequence of items in $input
beginning at the position indicated by $start
and
continuing for the number of items indicated by $length
.
fn:subsequence($input as item()*, $start as xs:double) as item()*
fn:subsequence($input as item()*, $start as xs:double, $length as xs:double) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
In the two-argument case, returns:
$input[fn:round($start) le position()]
In the three-argument case, returns:
$input[fn:round($start) le position() and position() lt fn:round($start) + fn:round($length)]
The first item of a sequence is located at position 1, not position 0.
If $input
is the empty sequence, the empty sequence is returned.
In the two-argument case, the function returns a sequence comprising those items of
$input
whose 1-based position
is greater than or equal to $start
(rounded to an integer).
No error occurs if $start
is zero or negative.
In the three-argument case, the function returns a sequence comprising those items of
$input
whose 1-based position
is greater than or equal to $start
(rounded to an integer), and
less than the sum of $start
and $length
(both rounded to integers).
No error occurs if $start
is zero or negative, or if $start
plus $length
exceeds the number of items in the sequence, or if
$length
is negative.
As a consequence of the general rules, if $start
is
-INF
and $length
is +INF
, then
fn:round($start) + fn:round($length)
is NaN
; since
position() lt NaN
is always false, the result is an empty sequence.
The reason the function accepts arguments of type xs:double
is that many
computations on untyped data return an xs:double
result; and the reason for
the rounding rules is to compensate for any imprecision in these floating-point
computations.
let $seq := ("item1", "item2", "item3", "item4", "item5")
The expression fn:subsequence($seq, 4)
returns ("item4", "item5")
.
The expression fn:subsequence($seq, 3, 2)
returns ("item3", "item4")
.
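The rounding and the -INF/+INF edge case can be checked with a Python model of the two formulas above. The sketch assumes list-based sequences; `xpath_round` and `subsequence` are hypothetical helpers. Python's built-in `round` is deliberately not used because it rounds halves to the nearest even integer, whereas fn:round rounds halves toward positive infinity.

```python
import math


def xpath_round(value):
    """Round to the nearest integer, with halves going toward positive infinity."""
    if math.isnan(value) or math.isinf(value):
        return value
    lower = math.floor(value)
    return lower + 1 if value - lower >= 0.5 else lower


def subsequence(input_seq, start, length=None):
    """Model of fn:subsequence using 1-based positions and double arguments."""
    first = xpath_round(float(start))
    if length is None:
        return [item for pos, item in enumerate(input_seq, start=1) if first <= pos]
    # -inf + inf is NaN, and every comparison with NaN is false.
    limit = first + xpath_round(float(length))
    return [item for pos, item in enumerate(input_seq, start=1)
            if first <= pos and pos < limit]


seq = ["item1", "item2", "item3", "item4", "item5"]
print(subsequence(seq, 4))
print(subsequence(seq, 3, 2))
print(subsequence(seq, float("-inf"), float("inf")))
```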
Returns all but the first item in a sequence.
fn:tail($input as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of the expression subsequence($input, 2).
If $input
is the empty sequence, or a sequence containing a single item, then
the empty sequence is returned.
The expression fn:tail(1 to 5)
returns (2, 3, 4, 5)
.
The expression fn:tail(("a", "b", "c"))
returns ("b", "c")
.
The expression fn:tail("a")
returns ()
.
The expression fn:tail(())
returns ()
.
The expression fn:tail([1,2,3])
returns ()
.
Returns all but the last item in a sequence.
fn:trunk($input as item()*) as item()*
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the value of the expression fn:remove($input, count($input)).
If $input
is the empty sequence, or a sequence containing a single item, then
the empty sequence is returned.
The expression fn:trunk(1 to 5)
returns (1, 2, 3, 4)
.
The expression fn:trunk(("a", "b", "c"))
returns ("a", "b")
.
The expression fn:trunk("a")
returns ()
.
The expression fn:trunk(())
returns ()
.
The expression fn:trunk([1,2,3])
returns ()
.
Proposed for 4.0.
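The four positional accessors fn:head, fn:tail, fn:foot and fn:trunk map naturally onto list slicing. The following Python sketch assumes a sequence is a list and returns a list of zero or one items where the XPath function returns item()?; the helper names are hypothetical.

```python
def head(input_seq):
    """First item, or the empty sequence."""
    return input_seq[:1]


def tail(input_seq):
    """All but the first item."""
    return input_seq[1:]


def foot(input_seq):
    """Last item, or the empty sequence."""
    return input_seq[-1:]


def trunk(input_seq):
    """All but the last item."""
    return input_seq[:-1]


one_to_five = [1, 2, 3, 4, 5]
print(head(one_to_five), tail(one_to_five), foot(one_to_five), trunk(one_to_five))

# An array is a single item, so a sequence holding one array has an empty tail and trunk.
array_item = [1, 2, 3]
print(head([array_item]), tail([array_item]), trunk([array_item]))
```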
Returns the items of $input
in an ·implementation-dependent· order.
fn:unordered($input as item()*) as item()*
This function is ·nondeterministic-wrt-ordering·, ·context-independent·, and ·focus-independent·.
The function returns the items of $input
in an ·implementation-dependent· order.
Query optimizers may be able to do a better job if the order of the output sequence is not specified. For example, when retrieving prices from a purchase order, if an index exists on prices, it may be more efficient to return the prices in index order rather than in document order.
The expression fn:unordered((1, 2, 3, 4, 5))
returns some permutation of (1, 2, 3, 4, 5)
.
The functions in this section rely on comparisons between the items in one or more sequences.
Function | Meaning |
---|---|
fn:starts-with-sequence | Determines whether one sequence starts with another, using a supplied callback function to compare items. |
fn:ends-with-sequence | Determines whether one sequence ends with another, using a supplied callback function to compare items. |
fn:contains-sequence | Determines whether one sequence contains another as a contiguous subsequence, using a supplied callback function to compare items. |
fn:distinct-values | Returns the values that appear in a sequence, with duplicates eliminated. |
fn:index-of | Returns a sequence of positive integers giving the positions within the sequence $input of items that are equal to $search. |
fn:deep-equal | This function assesses whether two sequences are deep-equal to each other. To be deep-equal, they must contain items that are pairwise deep-equal; and for two items to be deep-equal, they must either be atomic values that compare equal, or nodes of the same kind, with the same name, whose children are deep-equal, or maps with matching entries, or arrays with matching members. |
fn:differences | This function compares two sequences and returns information about their differences. |
Determines whether one sequence starts with another, using a supplied callback function to compare items.
fn:starts-with-sequence($input as item()*, $subsequence as item()*, $compare as function(item(), item()) as xs:boolean := fn:deep-equal#2) as xs:boolean
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Informally, the function returns true if $input
starts with $subsequence
,
when items are compared using the supplied (or default) $compare
function.
More formally, the function returns the value of the expression:
fn:count($input) ge fn:count($subsequence) and fn:all(fn:for-each-pair($input, $subsequence, $compare))
There is no requirement that the $compare
function should have the traditional qualities
of equality comparison. The result is well-defined, for example, even if $compare
is not transitive
or not symmetric.
The expression fn:starts-with-sequence((), ())
returns true()
.
The expression fn:starts-with-sequence(1 to 10, 1 to 5)
returns true()
.
The expression fn:starts-with-sequence(1 to 10, ())
returns true()
.
The expression fn:starts-with-sequence(1 to 10, 1 to 10)
returns true()
.
The expression fn:starts-with-sequence(1 to 10, 1)
returns true()
.
The expression fn:starts-with-sequence(1 to 10, 101 to 105, ->($x, $y){$x mod 100 = $y mod 100})
returns true()
.
The expression fn:starts-with-sequence(("A", "B", "C"), ("a", "b"), ->($x, $y){fn:compare($x, $y, "http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive") eq 0})
returns true()
.
The expression let $p := parse-xml("<doc><chap><p/><p/></chap></doc>")//p[2] return fn:starts-with-sequence($p!ancestor::node(), $p!root(), op("is"))
returns true()
.
The expression fn:starts-with-sequence(10 to 20, 1 to 5, op("gt"))
returns true()
.
The expression fn:starts-with-sequence(("Alpha", "Beta", "Gamma"), ("A", "B"), fn:starts-with#2)
returns true()
.
The expression fn:starts-with-sequence(("Alpha", "Beta", "Gamma", "Delta"), 1 to 3, ->($x, $y){fn:ends-with($x, 'a')})
returns true()
. (True because the first three items in the input sequence end with "a".)
Accepted 2022-11-01
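The formal definition of fn:starts-with-sequence translates almost directly into Python, which helps when reasoning about asymmetric comparison callbacks. In this sketch sequences are lists, `starts_with_sequence` is a hypothetical name, and Python's `==` is assumed as a rough stand-in for the default fn:deep-equal#2.

```python
import operator


def starts_with_sequence(input_seq, subsequence, compare=operator.eq):
    """Model of fn:starts-with-sequence with a pluggable comparison callback."""
    if len(input_seq) < len(subsequence):
        return False
    # zip stops at the shorter sequence, as fn:for-each-pair does.
    return all(compare(x, y) for x, y in zip(input_seq, subsequence))


one_to_ten = list(range(1, 11))
print(starts_with_sequence(one_to_ten, [1, 2, 3, 4, 5]))
# The callback need not be an equality test: here each input item must exceed its partner.
print(starts_with_sequence(list(range(10, 21)), [1, 2, 3, 4, 5], operator.gt))
# The callback may also ignore its second argument entirely.
print(starts_with_sequence(["Alpha", "Beta", "Gamma", "Delta"], [1, 2, 3],
                           lambda x, y: x.endswith("a")))
```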
Determines whether one sequence ends with another, using a supplied callback function to compare items.
fn:ends-with-sequence($input as item()*, $subsequence as item()*, $compare as function(item(), item()) as xs:boolean := fn:deep-equal#2) as xs:boolean
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Informally, the function returns true if $input
ends with $subsequence
,
when items are compared using the supplied (or default) $compare
function.
More formally, the function returns the value of the expression:
fn:starts-with-sequence(fn:reverse($input), fn:reverse($subsequence), $compare)
There is no requirement that the $compare
function should have the traditional qualities
of equality comparison. The result is well-defined, for example, even if $compare
is not transitive
or not symmetric.
The expression fn:ends-with-sequence((), ())
returns true()
.
The expression fn:ends-with-sequence(1 to 10, 5 to 10)
returns true()
.
The expression fn:ends-with-sequence(1 to 10, ())
returns true()
.
The expression fn:ends-with-sequence(1 to 10, 1 to 10)
returns true()
.
The expression fn:ends-with-sequence(1 to 10, 10)
returns true()
.
The expression fn:ends-with-sequence(1 to 10, 108 to 110, ->($x, $y){$x mod 100 = $y mod 100})
returns true()
.
The expression fn:ends-with-sequence(("A", "B", "C"), ("b", "c"), ->($x, $y){fn:compare($x, $y, "http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive") eq 0})
returns true()
.
The expression let $p := parse-xml("<doc><chap><p/><p/></chap></doc>")//p[2] return fn:ends-with-sequence($p!ancestor::*, $p!parent::*, op("is"))
returns true()
.
The expression fn:ends-with-sequence(10 to 20, 1 to 5, op("gt"))
returns true()
.
The expression fn:ends-with-sequence(("Alpha", "Beta", "Gamma"), ("B", "G"), fn:starts-with#2)
returns true()
.
The expression fn:ends-with-sequence(("Alpha", "Beta", "Gamma", "Delta"), 1 to 2, ->($x, $y){fn:string-length($x) eq 5})
returns true()
. (True because the last two items in the input sequence have a string length of 5.)
Accepted 2022-11-01
Determines whether one sequence contains another as a contiguous subsequence, using a supplied callback function to compare items.
fn:contains-sequence($input as item()*, $subsequence as item()*, $compare as function(item(), item()) as xs:boolean := fn:deep-equal#2) as xs:boolean
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Informally, the function returns true if $input
contains a consecutive subsequence matching $subsequence
,
when items are compared using the supplied (or default) $compare
function.
More formally, the function returns the value of the expression:
if (fn:starts-with-sequence($input, $subsequence, $compare)) then true() else if (fn:empty($input)) then false() else fn:contains-sequence(fn:tail($input), $subsequence, $compare)
There is no requirement that the $compare
function should have the traditional qualities
of equality comparison. The result is well-defined, for example, even if $compare
is not transitive
or not symmetric.
The expression fn:contains-sequence((), ())
returns true()
.
The expression fn:contains-sequence(1 to 10, 3 to 6)
returns true()
.
The expression fn:contains-sequence(1 to 10, (2, 4, 6))
returns false()
.
The expression fn:contains-sequence(1 to 10, ())
returns true()
.
The expression fn:contains-sequence(1 to 10, 1 to 10)
returns true()
.
The expression fn:contains-sequence(1 to 10, 5)
returns true()
.
The expression fn:contains-sequence(1 to 10, 103 to 105, ->($x, $y){$x mod 100 = $y mod 100})
returns true()
.
The expression fn:contains-sequence(("A", "B", "C", "D"), ("b", "c"), ->($x, $y){fn:compare($x, $y, "http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive") eq 0})
returns true()
.
The expression let $chap := parse-xml("<doc><chap><h1/><p/><p/><footnote/></chap></doc>")//chap return fn:contains-sequence($chap!child::*, $chap!child::p, op("is"))
returns true()
. (True because the p
children of the chap
element form a contiguous subsequence.)
The expression fn:contains-sequence(10 to 20, (5, 3, 1), op("gt"))
returns true()
.
The expression fn:contains-sequence(("Alpha", "Beta", "Gamma", "Delta"), ("B", "G"), fn:starts-with#2)
returns true()
.
The expression fn:contains-sequence(("Zero", "Alpha", "Beta", "Gamma", "Delta", "Epsilon"), 1 to 4, ->($x, $y){fn:ends-with($x, 'a')})
returns true()
. (True because there is a run of 4 consecutive items ending in "a".)
Accepted 2022-11-01
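The recursive definition of fn:contains-sequence amounts to testing for a prefix match at each successive tail of the input. The Python sketch below models that as a loop; it assumes list-based sequences, uses Python's `==` as a rough stand-in for fn:deep-equal#2, and the helper names are hypothetical.

```python
import operator


def starts_with_sequence(input_seq, subsequence, compare=operator.eq):
    """Prefix test: every paired item must satisfy the comparison callback."""
    if len(input_seq) < len(subsequence):
        return False
    return all(compare(x, y) for x, y in zip(input_seq, subsequence))


def contains_sequence(input_seq, subsequence, compare=operator.eq):
    """Model of fn:contains-sequence: try a prefix match at each tail of the input."""
    remaining = list(input_seq)
    while True:
        if starts_with_sequence(remaining, subsequence, compare):
            return True
        if not remaining:
            return False
        remaining = remaining[1:]


one_to_ten = list(range(1, 11))
print(contains_sequence(one_to_ten, [3, 4, 5, 6]))   # contiguous run
print(contains_sequence(one_to_ten, [2, 4, 6]))      # present but not contiguous
```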
Returns the values that appear in a sequence, with duplicates eliminated.
fn:distinct-values($values as xs:anyAtomicType*) as xs:anyAtomicType*
fn:distinct-values($values as xs:anyAtomicType*, $collation as xs:string) as xs:anyAtomicType*
The one-argument form of this function is ·nondeterministic-wrt-ordering·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The two-argument form of this function is ·nondeterministic-wrt-ordering·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The function returns the sequence that results from removing from $values
all
but one of a set of values that are considered equal to one another.
Two items $J and $K in the input sequence
(after atomization, as required by the function signature)
are considered equal if fn:deep-equal($J, $K, $coll)
is true,
where $coll
is the collation selected according to the rules in 5.3.5 Choosing a collation. This collation is used when string comparison is
required.
The order in which the sequence of values is returned is ·implementation-dependent·.
Which value of a set of values that compare equal is returned is ·implementation-dependent·.
If $values
is the empty sequence, the function returns the empty sequence.
Values of type xs:untypedAtomic
are compared as if they were of type
xs:string
.
Values that cannot be compared, because the eq
operator is not defined for
their types, are considered to be distinct.
For xs:float
and xs:double
values, positive zero is equal to
negative zero and, although NaN
does not equal itself, if $values
contains multiple NaN
values a single NaN
is returned.
If xs:dateTime
, xs:date
or xs:time
values do not
have a timezone, they are considered to have the implicit timezone provided by the
dynamic context for the purpose of comparison. Note that xs:dateTime
,
xs:date
or xs:time
values can compare equal even if their
timezones are different.
In previous versions of this specification, problems could arise when the input
sequence contained a mix of different numeric types, due to non-transitivity of the eq
operator in edge cases. This problem has been fixed by changes to the behavior of op:numeric-equal
:
see 4.3 Comparison operators on numeric values.
The expression fn:distinct-values((1, 2.0, 3, 2))
returns some permutation of (1, 3, 2.0)
. (The result may include either the xs:integer
2 or the xs:decimal
2.0).
The expression fn:distinct-values((xs:untypedAtomic("cherry"),
xs:untypedAtomic("plum"), xs:untypedAtomic("plum")))
returns some permutation of (xs:untypedAtomic("cherry"),
xs:untypedAtomic("plum"))
.
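The special cases for NaN, signed zero and non-comparable values can be explored with a rough Python model. It is a sketch under several assumptions: Python's `==` stands in for the comparison of atomic values (collations and timezones are ignored), and the helper keeps the first value of each group in input order, which is only one of the results the specification permits. The names `values_equal` and `distinct_values` are hypothetical.

```python
import math


def values_equal(a, b):
    """Equality for the purpose of duplicate elimination."""
    # NaN does not equal itself under ==, but multiple NaN values collapse to one.
    if isinstance(a, float) and isinstance(b, float) and math.isnan(a) and math.isnan(b):
        return True
    # Python's == is false for unrelated types, so such values are treated as distinct.
    return a == b


def distinct_values(values):
    """Model of fn:distinct-values; the choice and order of survivors is one permitted outcome."""
    result = []
    for value in values:
        if not any(values_equal(value, kept) for kept in result):
            result.append(value)
    return result


print(distinct_values([1, 2.0, 3, 2]))
print(distinct_values([float("nan"), float("nan"), 0.0, -0.0]))
```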
Returns a sequence of positive integers giving the positions within the sequence
$input
of items that are equal to $search
.
fn:index-of($input as xs:anyAtomicType*, $search as xs:anyAtomicType) as xs:integer*
fn:index-of($input as xs:anyAtomicType*, $search as xs:anyAtomicType, $collation as xs:string) as xs:integer*
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The function returns a sequence of positive integers giving the positions within the
sequence $input
of items that are equal to $search
.
The collation used by this function is determined according to the rules in 5.3.5 Choosing a collation. This collation is used when string comparison is required.
The items in the sequence $input
are compared with $search
under
the rules for the eq
operator. Values of type xs:untypedAtomic
are compared as if they were of type xs:string
. Values that cannot be
compared, because the eq
operator is not defined for their types, are
considered to be distinct. If an item compares equal, then the position of that item in
the sequence $input
is included in the result.
The first item in a sequence is at position 1, not position 0.
The result sequence is in ascending numeric order.
If $input
is the empty sequence, or if no item in
$input
matches $search
, then the function returns the empty
sequence.
No error occurs if non-comparable values are encountered. So when comparing two atomic
values, the effective boolean value of fn:index-of($a, $b)
is true if
$a
and $b
are equal, false if they are not equal or not
comparable.
The expression fn:index-of((10, 20, 30, 40), 35)
returns ()
.
The expression fn:index-of((10, 20, 30, 30, 20, 10), 20)
returns (2, 5)
.
The expression fn:index-of(("a", "sport", "and", "a", "pastime"),
"a")
returns (1, 4)
.
The expression fn:index-of(current-date(), 23)
returns ()
.
The expression fn:index-of([1, [5, 6], [6, 7]], 6)
returns (3, 4)
. (The array is atomized to a sequence of five integers.)
If @a
is an attribute of type xs:NMTOKENS
whose string
value is "red green blue"
, and whose typed value is therefore
("red", "green", "blue")
, then fn:index-of(@a, "blue")
returns 3
. This is because the function calling mechanism atomizes the
attribute node to produce a sequence of three xs:NMTOKEN
values.
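The matching rule above can be modelled informally outside XPath. The following Python sketch is not part of this specification: the names `index_of` and `_kind` are hypothetical, Python lists stand in for sequences, and a coarse "family" check stands in for the rule that values the eq operator cannot compare are treated as distinct. Collations and atomization are not modelled.

```python
def _kind(value):
    """Classify a value into a rough comparison family (an assumption of this model)."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    if isinstance(value, str):
        return "string"
    return type(value).__name__


def index_of(input_seq, search):
    """Return the 1-based positions of items in input_seq that equal search."""
    positions = []
    for position, item in enumerate(input_seq, start=1):
        # Items from different families are treated as distinct, never as an error.
        if _kind(item) == _kind(search) and item == search:
            positions.append(position)
    return positions


print(index_of([10, 20, 30, 30, 20, 10], 20))                # [2, 5]
print(index_of(["a", "sport", "and", "a", "pastime"], "a"))  # [1, 4]
print(index_of([10, "20", 30], 20))                          # []
```

Positions are collected in a single left-to-right pass, which is why the result is always in ascending order.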
This function assesses whether two sequences are deep-equal to each other. To be deep-equal, they must contain items that are pairwise deep-equal; and for two items to be deep-equal, they must either be atomic values that compare equal, or nodes of the same kind, with the same name, whose children are deep-equal, or maps with matching entries, or arrays with matching members.
fn:deep-equal($input1 as item()*, $input2 as item()*) as xs:boolean
fn:deep-equal($input1 as item()*, $input2 as item()*, $collation as xs:string) as xs:boolean
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The $collation
argument identifies a collation which is used at all levels
of recursion when strings are compared (but not when names are compared), according to
the rules in 5.3.5 Choosing a collation.
If the two sequences are both empty, the function returns true
.
If the two sequences are of different lengths, the function returns
false
.
If the two sequences are of the same length, the function returns true
if
and only if every item in the sequence $input1
is deep-equal to the
item at the same position in the sequence $input2
. The rules for
deciding whether two items are deep-equal follow.
Call the two items $i1
and $i2
respectively.
If $i1
and $i2
are both atomic values, they are deep-equal if
and only if ($i1 eq $i2)
is true
, or if both values are
NaN
. If the eq
operator is not defined for $i1
and $i2
, the function returns false
.
If $i1
and $i2
are both ·maps·, the result is true
if and only if all the
following conditions apply:
Both maps have the same number of entries.
For every entry in the first map, there is an entry in the second map that:
has the ·same key· (note that the collation is not used when comparing keys), and
has the same associated value (compared using the fn:deep-equal
function, under the collation supplied in the original call to
fn:deep-equal
).
If $i1
and $i2
are both arrays,
the result is true
if and only if all
the following conditions apply:
Both arrays have the same number of members (array:size($i1) eq
array:size($i2)
).
Members in the same position of both arrays are deep-equal to each other, under
the collation supplied in the original call to fn:deep-equal
: that is,
every $p in 1 to array:size($i1) satisfies deep-equal($i1($p), $i2($p),
$collation)
If $i1
and $i2
are both nodes, they are compared as described
below:
If the two nodes are of different kinds, the result is false
.
If the two nodes are both document nodes then they are deep-equal if and only if
the sequence $i1/(*|text())
is deep-equal to the sequence
$i2/(*|text())
.
Note:
This rule was designed to ensure that comments and
processing instructions are ignored in the comparison. Unfortunately, however,
it fails to merge text nodes that are separated by a comment or processing
instruction. This oversight has been corrected in the new fn:differences
function.
If the two nodes are both element nodes then they are deep-equal if and only if all of the following conditions are satisfied:
The two nodes have the same name, that is (node-name($i1) eq
node-name($i2))
.
Either both nodes are annotated as having simple content or both nodes are annotated as having complex content. For this purpose "simple content" means either a simple type or a complex type with simple content; "complex content" means a complex type whose variety is mixed, element-only, or empty.
Note:
It is a consequence of this rule that validating a document D against a schema will usually (but not necessarily) result in a document that is not deep-equal to D. The exception is when the schema allows all elements to have mixed content.
The two nodes have the same number of attributes, and for every attribute
$a1
in $i1/@*
there exists an attribute
$a2
in $i2/@*
such that $a1
and
$a2
are deep-equal.
One of the following conditions holds:
Both element nodes are annotated as having simple content (as defined
in 3(b) above), and the typed value of $i1
is deep-equal
to the typed value of $i2
.
Both element nodes have a type annotation that is a complex type with
variety element-only, and the sequence $i1/*
is
deep-equal to the sequence $i2/*
.
Both element nodes have a type annotation that is a complex type with
variety mixed, and the sequence $i1/(*|text())
is
deep-equal to the sequence $i2/(*|text())
.
Note:
This rule was designed to ensure that comments and
processing instructions are ignored in the comparison. Unfortunately, however,
it fails to merge text nodes that are separated by a comment or processing
instruction. This oversight has been corrected in the new fn:differences
function.
Both element nodes have a type annotation that is a complex type with variety empty.
If the two nodes are both attribute nodes then they are deep-equal if and only if both the following conditions are satisfied:
The two nodes have the same name, that is (node-name($i1) eq
node-name($i2))
.
The typed value of $i1
is deep-equal to the typed value of
$i2
.
If the two nodes are both processing instruction nodes, then they are deep-equal if and only if both the following conditions are satisfied:
The two nodes have the same name, that is (node-name($i1) eq
node-name($i2))
.
The string value of $i1
is equal to the string value of
$i2
.
If the two nodes are both namespace nodes, then they are deep-equal if and only if both the following conditions are satisfied:
The two nodes either have the same name or are both nameless, that is
fn:deep-equal(node-name($i1), node-name($i2))
.
The string value of $i1
is equal to the string value of
$i2
when compared using the Unicode codepoint collation.
If the two nodes are both text nodes or comment nodes, then they are deep-equal if and only if their string-values are equal.
In all other cases the result is false.
A type error is raised [err:FOTY0015] if either input sequence contains a function item that is not a map or array.
The two nodes are not required to have the same type annotation, and they are not
required to have the same in-scope namespaces. They may also differ in their parent,
their base URI, and the values returned by the is-id
and
is-idrefs
accessors (see Section 5.5 is-id AccessorDM40 and
Section 5.6 is-idrefs AccessorDM40). The order of children is significant,
but the order of attributes is insignificant.
The contents of comments and processing instructions are significant only if these nodes appear directly as items in the two sequences being compared. The content of a comment or processing instruction that appears as a descendant of an item in one of the sequences being compared does not affect the result. However, the presence of a comment or processing instruction, if it causes a text node to be split into two text nodes, may affect the result.
Comparing items of different kind (for example, comparing an atomic
value to a node, or a map to an array, or an integer to an xs:date
) returns false; it does not raise an error. So
the result of fn:deep-equal(1, current-dateTime())
is false
.
Comparing a function (other than a map or array) to any other value raises a type error.
let $at := <attendees> <name last='Parker' first='Peter'/> <name last='Barker' first='Bob'/> <name last='Parker' first='Peter'/> </attendees>
The expression fn:deep-equal($at, $at/*)
returns false()
.
The expression fn:deep-equal($at/name[1], $at/name[2])
returns false()
.
The expression fn:deep-equal($at/name[1], $at/name[3])
returns true()
.
The expression fn:deep-equal($at/name[1], 'Peter Parker')
returns false()
.
The expression fn:deep-equal(map{1:'a', 2:'b'}, map{2:'b', 1:'a'})
returns true()
.
The expression fn:deep-equal([1, 2, 3], [1, 2, 3])
returns true()
.
The expression fn:deep-equal((1, 2, 3), [1, 2, 3])
returns false()
.
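The rules for atomic values, maps, and arrays can be sketched in Python as follows. This is an informal model under stated assumptions, not a definition: sequences are represented as Python lists, arrays as tuples whose members are lists, and maps as dicts whose values are lists. The names `deep_equal`, `_items_deep_equal` and `_family` are hypothetical. Nodes, function items, and collations are deliberately omitted.

```python
import math


def _family(value):
    """Rough atomic type family; None means 'not atomic' in this model."""
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "numeric"
    if isinstance(value, str):
        return "string"
    return None


def _items_deep_equal(i1, i2):
    # Maps: same number of entries, and a deep-equal value for every key.
    if isinstance(i1, dict) and isinstance(i2, dict):
        return len(i1) == len(i2) and all(
            key in i2 and deep_equal(i1[key], i2[key]) for key in i1
        )
    # Arrays: same size, and members deep-equal position by position.
    if isinstance(i1, tuple) and isinstance(i2, tuple):
        return len(i1) == len(i2) and all(
            deep_equal(m1, m2) for m1, m2 in zip(i1, i2)
        )
    # Atomic values: equal, or both NaN; non-comparable families give False.
    f1, f2 = _family(i1), _family(i2)
    if f1 is not None and f1 == f2:
        if f1 == "numeric" and math.isnan(i1) and math.isnan(i2):
            return True
        return i1 == i2
    # Items of different kinds are never deep-equal.
    return False


def deep_equal(seq1, seq2):
    """Sequences are deep-equal if they have equal length and pairwise deep-equal items."""
    if len(seq1) != len(seq2):
        return False
    return all(_items_deep_equal(a, b) for a, b in zip(seq1, seq2))


print(deep_equal([{1: ["a"], 2: ["b"]}], [{2: ["b"], 1: ["a"]}]))  # True
print(deep_equal([1, 2, 3], [([1], [2], [3])]))                    # False
print(deep_equal([float("nan")], [float("nan")]))                  # True
```

The second call mirrors the comparison of the sequence (1, 2, 3) with the array [1, 2, 3]: a sequence of three items is compared with a sequence holding one array, so the lengths already differ.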
This function compares two sequences and returns information about their differences.
fn:differences($input1 as item(), $input2 as item()) as map(*)*
fn:differences($input1 as item(), $input2 as item(), $options as map(*)) as map(*)*
fn:differences($input1 as item(), $input2 as item(), $options as map(*), $collation as xs:string) as map(*)*
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The three-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The four-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
Calling the 2-argument version of the function has the same effect as calling the 3-argument version with an empty map as the third argument.
Calling the 3-argument version of the function has the same effect as calling the 4-argument version with the default collation as the fourth argument.
The $collation
argument identifies a collation which is used at all levels
of recursion when general strings are compared (but not when node names or map keys are compared), according to
the rules in 5.3.5 Choosing a collation.
The behavior of the function is described in terms of a sequence of rules, which are
applied in order. Each rule takes two values as input, and produces either a difference
or nothing as its result. The final result of the fn:differences
function
is a sequence of differences, represented as maps, and is empty if no differences
were found.
The specification is recursive: recursion is used both when comparing node trees, and when
comparing trees comprising maps and arrays. When differences are found at any level, the
information that is returned identifying the difference includes a path, in the form of
a path expression, indicating how the relevant item was reached from the value passed as an argument.
A single path is sufficient to identify the data in both the input sequences. The path is a string,
in the general form of a path expression, and the specification indicates for each level of recursion
how this path is built up. For example: the path $input
identifies the input sequence supplied
in the call to the expression; $input[3]
identifies the third item in the input sequence;
if this is an array, $input[3](2)[1]
identifies the first item in the second member of that
array; if this is a node, then $input[3](2)[1]/node()[3]
identifies its third child node,
and $input[3](2)[1]/node()[3]/@Q{}name
identifies a named attribute of that node.
Each rule has the following properties:
Name: the name of the rule. This is used in two ways: in the result of the
function, it identifies which rule was not satisfied. In the $options
argument to the
function, it can be used to suppress checking of a particular rule by setting the corresponding
option to false. For example, setting map{"ATTRIBUTES":false()}
means that
the test named "ATTRIBUTES"
is not applied, which means that attributes
are not considered when comparing two elements.
Condition: indicates what kind of values the rule applies to. The condition
applies to both values, and the rule is applied only if the condition is satisfied for both values.
In most cases the condition is expressed simply as a SequenceType
, and the rule
is applicable to items that are instances of that SequenceType
.
Test: the test that is applied to the two values. In many cases the
test is expressed as an XPath expression, in which the two values are denoted as $A
and $B
. The test is satisfied if the expression returns true; it fails if the expression
returns false or fails with a dynamic error.
If the test is satisfied, no output is generated. If the test fails, a difference record is appended to the output of the function.
Some tests invoke recursive application of the rules. The recursive call appends a string to the current path information, so that the location of differences can be determined. The result of the recursive call is a sequence of difference records, possibly empty, which is appended to the function result.
If a rule for comparing two values fails, no further rules for comparing those two values are evaluated. This includes rules that invoke recursion. Comparison of other pairs of values continues, until either all values have been processed, or a limit is reached on the number of differences found.
The result of the function is a sequence of maps, each map holding information about one difference (that is, a failure to satisfy a rule). The map contains the following entries:
key | type | value |
A | item()* | The first value being compared |
B | item()* | The second value being compared |
rule | xs:string | The name of the rule that was not satisfied |
description | xs:string | A description of the mismatch, intended for the human reader. The content is ·implementation-dependent·. |
path | xs:string |
Path to the items from the root. This captures how the failing values were reached from the original input to the function, as a sequence of selection steps. The steps recorded are as follows:
Note: If the two arguments to |
The option limit
in the $options
argument may be set to an integer indicating
the maximum number of differences that should be reported. This is advisory only. The default is 100.
The rules for comparing sequences (at any level of recursion) are as follows:
Name | Condition | Test |
---|---|---|
COUNT |
|
|
ITEMS |
|
For each pair of items in corresponding positions in the two sequences,
apply the rules for comparing items recursively, appending |
The rules for comparing items are given in the next table. Most of the rules are checked by default;
those that are not are marked using the symbol † after the name. A rule that is
normally checked can be suppressed by including an entry in $options
whose key matches the rule name, and whose value is false()
. Conversely,
a rule that is not checked by default can be activated by means of an entry whose value is
true()
.
Name | Condition | Test | |
---|---|---|---|
KIND |
|
Either both items are atomic, or both items are nodes, or both items are functions. |
|
VALUE |
|
|
|
ATOMIC-TYPE† |
|
|
|
NODE-KIND |
|
|
|
NODE-NAME |
|
|
|
PREFIX† |
|
|
|
NODE-TYPE-ANNOTATION† |
|
|
|
BASE-URI† |
|
|
|
CONTENT |
|
Apply the rules for comparing sequences recursively to the sequence of child nodes,
appending |
|
ATTRIBUTES |
element(*) |
Construct maps representing the attributes of the two elements by applying
the function |
|
NAMESPACES† |
element(*) |
Construct maps representing the namespaces of the two elements by applying
the function |
|
STRING-VALUE |
|
|
|
TYPED-VALUE† |
|
Compare the typed values of the two nodes by recursively invoking the
rules for comparing sequences, appending Note: The typed value of a node is, in general, a sequence. |
|
FUNCTION-KIND |
function(*) |
|
|
ARRAY-SIZE |
|
|
|
ARRAY-CONTENT |
array(*) |
for every |
|
3 |
MAP-SIZE |
|
|
MAP-KEYS |
|
Note: That is, the two maps have the same set of key values |
|
5 |
MAP-ENTRIES |
|
For every key
fn:differences
to the two sequences, using the same options, and retaining $k in the path. |
2 |
FUNCTION-NAME |
|
|
2 |
FUNCTION-ARITY |
|
|
2 |
FUNCTION-SIGNATURE† |
|
The signatures of the two functions are identical (that is, the types of the arguments and the type of the result, but ignoring the names of arguments). |
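The way rules, options, paths, and the limit interact can be illustrated with a small Python sketch restricted to sequences of atomic values. It is an informal model only. The function name `differences` is reused for readability. The concrete tests assumed here (COUNT compares lengths, VALUE compares items for equality) and the wording of the description entry are assumptions made for this illustration, since they are not reproduced in the tables above.

```python
def differences(input1, input2, options=None, path="$input"):
    """Compare two lists of atomic values and return a list of difference records."""
    options = options or {}
    limit = options.get("limit", 100)  # advisory maximum, default 100
    found = []

    def report(a, b, rule, where):
        found.append({
            "A": a,
            "B": b,
            "rule": rule,
            "description": f"{rule} mismatch at {where}",  # wording is illustrative
            "path": where,
        })

    # COUNT: once this rule fails, no further rules apply to this pair of sequences.
    if options.get("COUNT", True) and len(input1) != len(input2):
        report(input1, input2, "COUNT", path)
        return found

    # ITEMS: compare items in corresponding positions, extending the path with [N].
    for position, (a, b) in enumerate(zip(input1, input2), start=1):
        if len(found) >= limit:
            break
        item_path = f"{path}[{position}]"
        # VALUE can be suppressed by setting the option of the same name to False.
        if options.get("VALUE", True) and a != b:
            report([a], [b], "VALUE", item_path)
    return found


for record in differences([1, 2, 3], [1, 5, 3]):
    print(record["rule"], record["path"])  # VALUE $input[2]
```

Each record carries the same path for both inputs, which is sufficient because items are always compared position by position.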
Prior to comparison, the supplied sequences may be normalized. By default, no normalization is performed.
The following normalizations are defined. Each is performed only if there is an entry in $options
whose key matches the name of the normalization rule, and whose corresponding value is true()
.
Normalization Rule | Action |
---|---|
ignore-comments |
Comment nodes are removed within a tree (but not if they appear as top-level items). Following removal of a comment node, adjacent text nodes are merged. |
ignore-processing-instructions |
Processing instruction nodes are removed within a tree (but not if they appear as top-level items). Following removal of a processing instruction node, adjacent text nodes are merged. |
ignore-whitespace-nodes |
Whitespace text nodes are removed within a tree (but not if they appear as top-level items). |
normalize-space |
Any string values that are compared using a collation are first processed
using the |
normalize-unicode |
Any string values that are compared using a collation are first processed
using the |
The function is primarily designed to enable testing of the results of queries and stylesheets by comparing actual results with expected results. In this scenario, it is useful to know not only whether the actual results match the expected results, but also what the differences are, if any. It is also useful to be able to control which properties of the results are compared, for example, whether namespace prefixes, in-scope namespaces, and whitespace text nodes are considered significant.
Broadly speaking, the function returns an empty sequence in situations where fn:deep-equal
returns true. However, the two functions differ slightly in what properties of the supplied input values
are considered significant. A reasonably close (but not exact) approximation to the rules for
fn:deep-equal
is achieved by setting the normalization options ignore-comments
and
ignore-processing-instructions
to true, and by suppressing the tests
ATOMIC-TYPE
, PREFIX
, NODE-TYPE-ANNOTATION
, and
NAMESPACES
.
The function is specified to achieve a high level of interoperability between implementations, but it is to be expected that some differences in results will arise because different implementations perform the same tests in a different order.
Proposed for 4.0; not yet reviewed.
The following functions test the cardinality of their sequence arguments.
Function | Meaning |
---|---|
fn:zero-or-one |
Returns $input if it contains zero or one items. Otherwise, raises an
error. |
fn:one-or-more |
Returns $input if it contains one or more items. Otherwise, raises an error.
|
fn:exactly-one |
Returns $input if it contains exactly one item. Otherwise, raises an error.
|
The functions fn:zero-or-one
, fn:one-or-more
, and
fn:exactly-one
defined in this section check that the cardinality
of a sequence is in the expected range. They are particularly useful with regard
to static typing. For example, the function call fn:remove($seq, fn:index-of($seq2, 'abc'))
requires the result of the call on fn:index-of
to be a singleton integer,
but the static type system cannot infer this; writing the expression as
fn:remove($seq, fn:exactly-one(fn:index-of($seq2, 'abc')))
provides a suitable static type at query analysis time, and ensures through a dynamic check at query
execution time that the sequence has the correct length.
The type signatures for these functions deliberately declare the argument type as
item()*
, permitting a sequence of any length. A more restrictive
signature would defeat the purpose of the function, which is to defer
cardinality checking until query execution time.
Returns $input
if it contains zero or one items. Otherwise, raises an
error.
fn:zero-or-one($input as item()*) as item()?
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Except in error cases, the function returns $input
unchanged.
A dynamic error is raised [err:FORG0003] if $input
contains more than one item.
Returns $input
if it contains one or more items. Otherwise, raises an error.
fn:one-or-more($input as item()*) as item()+
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Except in error cases, the function returns $input
unchanged.
A dynamic error is raised [err:FORG0004] if $input
is an
empty sequence.
Returns $input
if it contains exactly one item. Otherwise, raises an error.
fn:exactly-one($input as item()*) as item()
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Except in error cases, the function returns $input
unchanged.
A dynamic error is raised [err:FORG0005] if $input
is an
empty sequence or a sequence containing more than one item.
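The behavior of the three cardinality functions can be summarized with the following Python sketch, in which a list stands in for a sequence. The exception class `XPathDynamicError` and the function names are hypothetical; only the error codes are taken from the text above.

```python
class XPathDynamicError(Exception):
    """Hypothetical stand-in for a dynamic error carrying an error code."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code


def zero_or_one(input_seq):
    if len(input_seq) > 1:
        raise XPathDynamicError("FORG0003", "more than one item supplied")
    return input_seq  # returned unchanged


def one_or_more(input_seq):
    if len(input_seq) == 0:
        raise XPathDynamicError("FORG0004", "empty sequence supplied")
    return input_seq  # returned unchanged


def exactly_one(input_seq):
    if len(input_seq) != 1:
        raise XPathDynamicError("FORG0005", "sequence does not contain exactly one item")
    return input_seq  # returned unchanged


print(exactly_one(["abc"]))  # ['abc']
try:
    exactly_one([])
except XPathDynamicError as err:
    print(err.code)  # FORG0005
```

In each case the check happens at execution time, which corresponds to deferring the cardinality check to a dynamic check as described above.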
Aggregate functions take a sequence as argument and return a single value
computed from values in the sequence. Except for fn:count
, the
sequence must consist of values of a single type or one of its subtypes, or they
must be numeric. xs:untypedAtomic
values are permitted in the
input sequence and handled by special conversion rules. The type of the items in
the sequence must also support certain operations.
Function | Meaning |
---|---|
fn:count |
Returns the number of items in a sequence. |
fn:avg |
Returns the average of the values in the input sequence $values , that is, the
sum of the values divided by the number of values. |
fn:max |
Returns a value that is equal to the highest value appearing in the input sequence. |
fn:min |
Returns a value that is equal to the lowest value appearing in the input sequence. |
fn:sum |
Returns a value obtained by adding together the values in $values . |
fn:all-equal |
Returns true if all items in a supplied sequence (after atomization) are equal. |
fn:all-different |
Returns true if no two items in a supplied sequence are equal. |
Returns the number of items in a sequence.
fn:count($input as item()*) as xs:integer
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns the number of items in $input
.
Returns 0 if $input
is the empty sequence.
let $seq1 := ($item1, $item2)
let $seq2 := (98.5, 98.3, 98.9)
let $seq3 := ()
The expression fn:count($seq1)
returns 2
.
The expression fn:count($seq3)
returns 0
.
The expression fn:count($seq2)
returns 3
.
The expression fn:count($seq2[. > 100])
returns 0
.
The expression fn:count([])
returns 1
.
The expression fn:count([1,2,3])
returns 1
.
Returns the average of the values in the input sequence $values
, that is, the
sum of the values divided by the number of values.
fn:avg($values as xs:anyAtomicType*) as xs:anyAtomicType?
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $values
is the empty sequence, the empty sequence is returned.
If $values
contains values of type xs:untypedAtomic
they are cast
to xs:double
.
Duration values must either all be xs:yearMonthDuration
values or must all
be xs:dayTimeDuration
values. For numeric values, the numeric promotion
rules defined in 4.2 Arithmetic operators on numeric values are used to promote all values to a single
common type. After these operations, $values
must satisfy the following condition:
There must be a type T such that:
every item in $values
is an instance of T.
T is one of xs:double
, xs:float
,
xs:decimal
, xs:yearMonthDuration
, or
xs:dayTimeDuration
.
The function returns the average of the values as sum($values) div
count($values)
; but the implementation may use an otherwise equivalent algorithm
that avoids arithmetic overflow.
A type error is raised [err:FORG0006] if the input sequence contains items of incompatible types, as described above.
let $d1 := xs:yearMonthDuration("P20Y")
let $d2 := xs:yearMonthDuration("P10M")
let $seq3 := (3, 4, 5)
The expression fn:avg($seq3)
returns 4.0
. (The result is of type xs:decimal
.)
The expression fn:avg(($d1, $d2))
returns xs:yearMonthDuration("P10Y5M")
.
fn:avg(($d1, $seq3))
raises a type error [err:FORG0006].
The expression fn:avg(())
returns ()
.
The expression fn:avg((xs:float('INF'), xs:float('-INF')))
returns xs:float('NaN')
.
The expression fn:avg(($seq3, xs:float('NaN')))
returns xs:float('NaN')
.
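The type check and the computation of sum($values) div count($values) can be modelled roughly in Python. This sketch rests on several assumptions: the class `YearMonthDuration` is a hypothetical stand-in that stores whole months, Python int and Decimal values stand in for xs:decimal, Python float stands in for xs:double, and the rounding of an averaged duration to the nearest month (halves towards positive infinity) is assumed rather than taken from the text above. xs:untypedAtomic and xs:dayTimeDuration are not modelled.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class YearMonthDuration:
    """Hypothetical stand-in for xs:yearMonthDuration, stored as whole months."""
    months: int


def avg(values):
    if not values:
        return None  # models the empty sequence
    count = len(values)
    # All durations of the same kind: average the month counts.
    if all(isinstance(v, YearMonthDuration) for v in values):
        total = sum(v.months for v in values)
        # Nearest month, halves towards positive infinity (an assumption).
        return YearMonthDuration((2 * total + count) // (2 * count))
    # All numeric: promote to a single common type before dividing.
    if all(isinstance(v, (int, float, Decimal)) and not isinstance(v, bool)
           for v in values):
        if any(isinstance(v, float) for v in values):
            return sum(float(v) for v in values) / count
        return sum(Decimal(v) for v in values) / count
    raise TypeError("FORG0006: incompatible types in input sequence")


print(avg([3, 4, 5]))                                          # 4
print(avg([YearMonthDuration(240), YearMonthDuration(10)]))    # YearMonthDuration(months=125)
print(avg([float("inf"), float("-inf")]))                      # nan
```

The value of 125 months corresponds to the duration P10Y5M shown in the examples above.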
Returns a value that is equal to the highest value appearing in the input sequence.
fn:max($values as xs:anyAtomicType*) as xs:anyAtomicType?
fn:max($values as xs:anyAtomicType*, $collation as xs:string) as xs:anyAtomicType?
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The following conversions are applied to the input sequence $values
, in order:
Values of type xs:untypedAtomic
in $values
are cast to
xs:double
.
If the resulting sequence contains values that are instances of more than one primitive type (meaning the 19 primitive types defined in [Schema 1.1 Part 2]), then:
If each value is an instance of one of the types xs:string
or xs:anyURI
,
then all the values are cast to type xs:string
.
If each value is an instance of one of the types xs:decimal
or xs:float
,
then all the values are cast to type xs:float
.
If each value is an instance of one of the types xs:decimal
, xs:float
,
or xs:double
, then all the values are cast to type xs:double
.
Otherwise, a type error is raised [err:FORG0006].
Note:
The primitive type of an xs:integer
value for this purpose is xs:decimal
.
The items in the resulting sequence may be reordered in an arbitrary order. The resulting sequence is referred to below as the converted sequence. The function returns an item from the converted sequence rather than the input sequence.
If the converted sequence is empty, the function returns the empty sequence.
All items in the converted sequence must be derived from a single base type for which
the le
operator is defined. In addition, the values in the sequence must
have a total order. If date/time values do not have a timezone, they are considered to
have the implicit timezone provided by the dynamic context for the purpose of
comparison. Duration values must either all be xs:yearMonthDuration
values
or must all be xs:dayTimeDuration
values.
If the converted sequence contains the value NaN
, the value
NaN
is returned
(as an xs:float
or xs:double
as appropriate).
If the items in the converted sequence are of type xs:string
or types
derived by restriction from xs:string
, then the determination of the item
with the largest value is made according to the collation that is used. If the type of
the items in the converted sequence is not xs:string
and
$collation
is specified, the collation is ignored.
The collation used by this function is determined according to the rules in 5.3.5 Choosing a collation.
The function returns the result of the expression:
if (every $v in $c satisfies $c[1] ge $v) then $c[1] else fn:max(fn:tail($c))
evaluated with $collation
as the default collation if specified, and with
$c
as the converted sequence.
A type error is raised [err:FORG0006] if the input sequence contains items of incompatible types, as described above.
Because the rules allow the sequence to be reordered, if there are two or more items that are
"equal highest", the specific item whose value is returned is ·implementation-dependent·. This can arise for example if two different strings
compare equal under the selected collation, or if two different xs:dateTime
values compare equal despite being in different timezones.
If the converted sequence contains exactly one value then that value is returned.
The default type when the fn:max
function is applied to
xs:untypedAtomic
values is xs:double
. This differs from the
default type for operators such as gt
, and for sorting in XQuery and XSLT,
which is xs:string
.
The rules for the dynamic type of the result are stricter in version 3.1 of the specification than
in earlier versions. For example, if all the values in the input sequence belong to types derived from
xs:integer
, version 3.0 required only that the result be an instance
of the least common supertype of the types present in the input sequence; Version 3.1
requires that the returned value retains its original type. This does not apply, however, where type promotion
is needed to convert all the values to a common primitive type.
The expression fn:max((3,4,5))
returns 5
.
The expression fn:max([3,4,5])
returns 5
. (Arrays are atomized).
The expression fn:max((xs:integer(5), xs:float(5.0), xs:double(0)))
returns xs:double(5.0e0)
.
fn:max((3,4,"Zero"))
raises a type error [err:FORG0006].
The expression fn:max((fn:current-date(), xs:date("2100-01-01")))
returns xs:date("2100-01-01")
. (Assuming that the current date is during the 21st
century.)
The expression fn:max(("a", "b", "c"))
returns "c"
. (Assuming a typical default collation.)
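The conversion rules and the selection rule can be sketched separately in Python. This is an informal model, not a definition: `common_primitive_type` works on type names supplied by the caller (with xs:integer already reported as xs:decimal, as the note above requires), and `select_max` transcribes the recursive expression given above for values that Python can compare with `>=`. Both function names are hypothetical, and collations are not modelled.

```python
import math


def common_primitive_type(type_names):
    """Return the type to which all values are converted, or raise TypeError."""
    # xs:untypedAtomic values are cast to xs:double first.
    kinds = {"xs:double" if name == "xs:untypedAtomic" else name
             for name in type_names}
    if len(kinds) <= 1:
        return next(iter(kinds), None)  # nothing to convert
    if kinds <= {"xs:string", "xs:anyURI"}:
        return "xs:string"
    if kinds <= {"xs:decimal", "xs:float"}:
        return "xs:float"
    if kinds <= {"xs:decimal", "xs:float", "xs:double"}:
        return "xs:double"
    raise TypeError("FORG0006: incompatible types in input sequence")


def select_max(converted):
    """Pick the maximum from an already converted sequence."""
    if not converted:
        return None  # models the empty sequence
    if any(isinstance(v, float) and math.isnan(v) for v in converted):
        return math.nan  # NaN in the input yields NaN
    head = converted[0]
    # if (every $v in $c satisfies $c[1] ge $v) then $c[1] else fn:max(fn:tail($c))
    if all(head >= v for v in converted):
        return head
    return select_max(converted[1:])


print(common_primitive_type(["xs:decimal", "xs:float", "xs:double"]))  # xs:double
print(select_max([3, 4, 5]))                                           # 5
```

The first call corresponds to the example with an xs:integer, an xs:float, and an xs:double, where the result is an xs:double.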
Returns a value that is equal to the lowest value appearing in the input sequence.
fn:min($values as xs:anyAtomicType*) as xs:anyAtomicType?
fn:min($values as xs:anyAtomicType*, $collation as xs:string) as xs:anyAtomicType?
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and implicit timezone.
The two-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations, and static base URI, and implicit timezone.
The following rules are applied to the input sequence:
Values of type xs:untypedAtomic
in $values
are cast to
xs:double
.
If the resulting sequence contains values that are instances of more than one primitive type (meaning the 19 primitive types defined in [Schema 1.1 Part 2]), then:
If each value is an instance of one of the types xs:string
or xs:anyURI
,
then all the values are cast to type xs:string
.
If each value is an instance of one of the types xs:decimal
or xs:float
,
then all the values are cast to type xs:float
.
If each value is an instance of one of the types xs:decimal
, xs:float
,
or xs:double
, then all the values are cast to type xs:double
.
Otherwise, a type error is raised [err:FORG0006].
Note:
The primitive type of an xs:integer
value for this purpose is xs:decimal
.
The items in the resulting sequence may be reordered in an arbitrary order. The resulting sequence is referred to below as the converted sequence. The function returns an item from the converted sequence rather than the input sequence.
If the converted sequence is empty, the empty sequence is returned.
All items in the converted sequence must be derived from a single base type for which
the le
operator is defined. In addition, the values in the sequence must
have a total order. If date/time values do not have a timezone, they are considered to
have the implicit timezone provided by the dynamic context for the purpose of
comparison. Duration values must either all be xs:yearMonthDuration
values
or must all be xs:dayTimeDuration
values.
If the converted sequence contains the value NaN
, the value
NaN
is returned
(as an xs:float
or xs:double
as appropriate).
If the items in the converted sequence are of type xs:string
or types
derived by restriction from xs:string
, then the determination of the item
with the smallest value is made according to the collation that is used. If the type of
the items in the converted sequence is not xs:string
and
$collation
is specified, the collation is ignored.
The collation used by this function is determined according to the rules in 5.3.5 Choosing a collation.
The function returns the result of the expression:
if (every $v in $c satisfies $c[1] le $v) then $c[1] else fn:min(fn:tail($c))
evaluated with $collation
as the default collation if specified, and with
$c
as the converted sequence.
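The recursive definition above can be transcribed directly. The following Python sketch is a non-normative model: the parameter `le` is a hypothetical stand-in for the XPath `le` operator under the selected collation, and `None` models the empty sequence. The separate rule for NaN is not modelled.

```python
# Illustrative model of the defining expression for fn:min; not normative.
# `le` stands in for the XPath `le` operator under the chosen collation.

def xpath_min(c, le=lambda a, b: a <= b):
    """Return an item of c that is `le` every item of c, or None if c is empty."""
    if not c:
        return None  # models the empty sequence
    if all(le(c[0], v) for v in c):
        return c[0]
    return xpath_min(c[1:], le)

print(xpath_min([4, 3, 5]))        # 3
print(xpath_min(["b", "a", "c"]))  # a
```

Because the first item that is less than or equal to every other item is returned, the result for "equal lowest" items depends on the order of the converted sequence, which is why the choice is ·implementation-dependent·.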
A type error is raised [err:FORG0006] if the input sequence contains items of incompatible types, as described above.
Because the rules allow the sequence to be reordered, if there are two or more items that are
"equal lowest", the specific item whose value is returned is ·implementation-dependent·. This can arise for example if two different strings
compare equal under the selected collation, or if two different xs:dateTime
values compare equal despite being in different timezones.
If the converted sequence contains exactly one value then that value is returned.
The default type when the fn:min
function is applied to
xs:untypedAtomic
values is xs:double
. This differs from the
default type for operators such as lt
, and for sorting in XQuery and XSLT,
which is xs:string
.
The rules for the dynamic type of the result are stricter in version 3.1 of the specification than
in earlier versions. For example, if all the values in the input sequence belong to types derived from
xs:integer
, version 3.0 required only that the result be an instance
of the least common supertype of the types present in the input sequence; version 3.1
requires that the returned value retains its original type. This does not apply, however, where type promotion
is needed to convert all the values to a common primitive type.
The expression fn:min((3,4,5))
returns 3
.
The expression fn:min([3,4,5])
returns 3
. (Arrays are atomized).
The expression fn:min((xs:integer(5), xs:float(5), xs:double(10)))
returns xs:double(5.0e0)
.
fn:min((3,4,"Zero"))
raises a type error [err:FORG0006].
fn:min((xs:float(0.0E0), xs:float(-0.0E0)))
can return either positive
or negative zero. The two items are equal, so it is ·implementation-dependent· which is returned.
The expression fn:min((fn:current-date(), xs:date("1900-01-01")))
returns xs:date("1900-01-01")
. (Assuming that the current date is set to a reasonable
value.)
The expression fn:min(("a", "b", "c"))
returns "a"
. (Assuming a typical default collation.)
Returns a value obtained by adding together the values in $values
.
fn:sum ( |
||
$values |
as xs:anyAtomicType* |
|
) as xs:anyAtomicType |
fn:sum ( |
||
$values |
as xs:anyAtomicType* , |
|
$zero |
as xs:anyAtomicType? |
|
) as xs:anyAtomicType? |
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Any values of type xs:untypedAtomic
in $values
are cast to
xs:double
. The items in the resulting sequence may be reordered in an
arbitrary order. The resulting sequence is referred to below as the converted
sequence.
If the converted sequence is empty, then the single-argument form of the function
returns the xs:integer
value 0
; the two-argument form returns
the value of the argument $zero
.
If the converted sequence contains the value NaN
, NaN
is
returned.
All items in $values
must be numeric or derived from a single base type. In
addition, the type must support addition. Duration values must either all be
xs:yearMonthDuration
values or must all be
xs:dayTimeDuration
values. For numeric values, the numeric promotion
rules defined in 4.2 Arithmetic operators on numeric values are used to promote all values to a single
common type. The sum of a sequence of integers will therefore be an integer, while the
sum of a numeric sequence that includes at least one xs:double
will be an
xs:double
.
The result of the function, using the second signature, is the result of the expression:
if (fn:count($c) eq 0) then $zero else if (fn:count($c) eq 1) then $c[1] else $c[1] + fn:sum(subsequence($c, 2))
where $c
is the converted sequence.
The result of the function, using the first signature, is the result of the expression:
fn:sum($values, 0)
.
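The two defining expressions above can be modelled as follows. This Python sketch is non-normative: Python's `+` stands in for XPath addition, and `timedelta` is used only as a loose analogue of xs:dayTimeDuration.

```python
# Illustrative model of the defining expression for fn:sum; not normative.
from datetime import timedelta

def xpath_sum(c, zero=0):
    """Model of fn:sum($c, $zero) for a converted sequence c."""
    if len(c) == 0:
        return zero
    if len(c) == 1:
        return c[0]
    return c[0] + xpath_sum(c[1:], zero)

print(xpath_sum([3, 4, 5]))                 # 12
print(xpath_sum([], timedelta(0)))          # 0:00:00
print(xpath_sum([1, 2], "ein Augenblick"))  # 3
```

The last call shows that the zero value is never added to anything: it is returned only when the sequence is empty, so its type need not match the type of the items.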
A type error is raised [err:FORG0006] if the input sequence contains items of incompatible types, as described above.
The second argument allows an appropriate value to be defined to represent the sum of an empty sequence. For example, when summing a sequence of durations it would be appropriate to return a zero-length duration of the appropriate type. This argument is necessary because a system that does dynamic typing cannot distinguish "an empty sequence of integers", for example, from "an empty sequence of durations".
If the converted sequence contains exactly one value then that value is returned.
let $d1 := xs:yearMonthDuration("P20Y")
let $d2 := xs:yearMonthDuration("P10M")
let $seq1 := ($d1, $d2)
let $seq3 := (3, 4, 5)
The expression fn:sum(($d1, $d2))
returns xs:yearMonthDuration("P20Y10M")
.
The expression fn:sum($seq1[. lt xs:yearMonthDuration('P3M')],
xs:yearMonthDuration('P0M'))
returns xs:yearMonthDuration("P0M")
.
The expression fn:sum($seq3)
returns 12
.
The expression fn:sum(())
returns 0
.
The expression fn:sum((),())
returns ()
.
The expression fn:sum((1 to 100)[. lt 0], 0)
returns 0
.
fn:sum(($d1, 9E1))
raises a type error [err:FORG0006].
The expression fn:sum(($d1, $d2), "ein Augenblick")
returns xs:yearMonthDuration("P20Y10M")
. (There is no requirement that the $zero
value should be
the same type as the items in $values
, or even that it should belong to
a type that supports addition.)
The expression fn:sum([1, 2, 3])
returns 6
. (Atomizing an array returns the sequence obtained by atomizing its members.)
The expression fn:sum([[1, 2], [3, 4]])
returns 10
. (Atomizing an array returns the sequence obtained by atomizing its members.)
Returns true if all items in a supplied sequence (after atomization) are equal.
fn:all-equal ( |
||
$values |
as xs:anyAtomicType* |
|
) as xs:boolean |
fn:all-equal ( |
||
$values |
as xs:anyAtomicType* , |
|
$collation |
as xs:string |
|
) as xs:boolean |
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations.
Omitting the second argument, $collation
, is equivalent to supplying
fn:default-collation()
. For more
information on collations see 5.3.5 Choosing a collation.
The result of the function fn:all-equal($values, $collation)
is true if and only if the result
of fn:count(fn:distinct-values($values, $collation)) le 1
is true (that is, if the sequence
is empty, or if all the items in the sequence are equal under the rules of the
fn:distinct-values
function).
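The definition in terms of fn:distinct-values can be modelled as follows. In this non-normative Python sketch a collation is represented by a hypothetical key function: two strings are treated as equal under the collation when their keys are equal. Which item represents each group of equal items is a choice made by this sketch.

```python
# Illustrative model of fn:all-equal; not normative.
# A collation is modelled as a key function: two strings are equal under
# the collation when their keys are equal.

def distinct_values(values, key=lambda v: v):
    """Keep the first item of each group of equal items."""
    seen, result = set(), []
    for v in values:
        k = key(v)
        if k not in seen:
            seen.add(k)
            result.append(v)
    return result

def all_equal(values, key=lambda v: v):
    return len(distinct_values(values, key)) <= 1

print(all_equal([1, 2, 3]))                  # False
print(all_equal([1, 1.0]))                   # True
print(all_equal([]))                         # True
print(all_equal(["ABC", "abc"], str.lower))  # True
```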
The expression fn:all-equal((1,2,3))
returns false()
.
The expression fn:all-equal((1, 1.0, 1.0e0))
returns true()
.
The expression fn:all-equal("one")
returns true()
.
The expression fn:all-equal(())
returns true()
.
The expression fn:all-equal(("ABC", "abc"), "http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive")
returns true()
.
The expression fn:all-equal(//p/@class)
returns true if all
p
elements have the same value for @class
.
The expression fn:all-equal(*!fn:node-name())
returns true if all
element children of the context node have the same name.
Originally proposed for 4.0 under the name fn:uniform
.
Accepted 2022-09-20 with a change of name.
Returns true if no two items in a supplied sequence are equal.
fn:all-different ( |
||
$values |
as xs:anyAtomicType* |
|
) as xs:boolean |
fn:all-different ( |
||
$values |
as xs:anyAtomicType* , |
|
$collation |
as xs:string |
|
) as xs:boolean |
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on collations.
Omitting the second argument, $collation
, is equivalent to supplying
fn:default-collation()
. For more
information on collations see 5.3.5 Choosing a collation.
The result of the function fn:all-different($values, $collation)
is true if and only if the result
of fn:count(fn:distinct-values($values, $collation)) eq fn:count($values)
is true
(that is, if the sequence
is empty, or if all the items in the sequence are distinct under the rules of the
fn:distinct-values
function).
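The counting rule above amounts to checking that no item is discarded when duplicates are removed. A non-normative Python sketch, in which the hypothetical `key` parameter again stands in for the collation:

```python
# Illustrative model of fn:all-different; not normative.

def all_different(values, key=lambda v: v):
    """True when removing duplicates (under `key`) discards nothing."""
    keys = [key(v) for v in values]
    return len(set(keys)) == len(keys)

print(all_different([1, 2, 3]))                  # True
print(all_different([1, 1.0]))                   # False
print(all_different(["ABC", "abc"], str.lower))  # False
```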
The expression fn:all-different((1,2,3))
returns true()
.
The expression fn:all-different((1, 1.0, 1.0e0))
returns false()
.
The expression fn:all-different("one")
returns true()
.
The expression fn:all-different(())
returns true()
.
The expression fn:all-different(("ABC", "abc"), "http://www.w3.org/2005/xpath-functions/collation/html-ascii-case-insensitive")
returns false()
.
The expression fn:all-different(//employee/@ssn)
is true if no two employees have the same value for their
@ssn
attribute.
The expression fn:all-different(*!fn:node-name())
returns true if all
element children of the context node have distinct names.
Originally proposed for 4.0 under the name fn:unique
.
Accepted 2022-09-20 with a change of name and with clarifications to the description.
This section defines a number of functions used to find elements by ID
or IDREF
value,
or to generate IDs.
Function | Meaning |
---|---|
fn:id |
Returns the sequence of element nodes that have an ID value matching the
value of one or more of the IDREF values supplied in $values . |
fn:element-with-id |
Returns the sequence of element nodes that have an ID value matching the
value of one or more of the IDREF values supplied in $values . |
fn:idref |
Returns the sequence of element or attribute nodes with an IDREF value
matching the value of one or more of the ID values supplied in
$values . |
fn:generate-id |
This function returns a string that uniquely identifies a given node. |
Returns the sequence of element nodes that have an ID
value matching the
value of one or more of the IDREF
values supplied in $values
.
fn:id ( |
||
$values |
as xs:string* |
|
) as element()* |
fn:id ( |
||
$values |
as xs:string* , |
|
$node |
as node() |
:= . |
) as element()* |
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-dependent·.
The two-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns a sequence, in document order with duplicates eliminated,
containing every element node E
that satisfies all the following
conditions:
E
is in the target document. The target document is the document
containing $node
, or the document containing the context item
(.
) if the second argument is omitted. The behavior of the
function if $node
is omitted is exactly the same as if the context
item had been passed as $node
.
E
has an ID
value equal to one of the candidate
IDREF
values, where:
An element has an ID
value equal to V
if either
or both of the following conditions are true:
The is-id
property (See Section 5.5 is-id AccessorDM40.) of the element node is true, and the typed value
of the element node is equal to V
under the rules of the
eq
operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint
).
The element has an attribute node whose is-id
property
(See Section 5.5 is-id AccessorDM40.) is true and whose typed
value is equal to V
under the rules of the
eq
operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint
).
Each xs:string
in $values
is parsed as if it were of
type IDREFS
, that is, each xs:string
in
$values
is treated as a whitespace-separated sequence of
tokens, each acting as an IDREF
. These tokens are then included
in the list of candidate IDREF
s. If any of the tokens is not a
lexically valid IDREF
(that is, if it is not lexically an
xs:NCName
), it is ignored. Formally, the candidate
IDREF
values are the strings in the sequence given by the
expression:
for $s in $values return fn:tokenize(fn:normalize-space($s), ' ')[. castable as xs:IDREF]
If several elements have the same ID
value, then E
is
the one that is first in document order.
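The computation of the candidate IDREF values described above can be sketched as follows. This Python model is non-normative and deliberately simplified: the regular expression accepts only ASCII NCNames, whereas the real xs:NCName production admits many more Unicode characters.

```python
# Illustrative model of the candidate IDREF computation; not normative.
import re

# Simplified NCName test (an assumption of this sketch): ASCII letters,
# digits, '_', '-' and '.' only, not starting with a digit, '-' or '.'.
_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")

def candidate_idrefs(values):
    """Split each string on XML whitespace and keep tokens that look like NCNames."""
    candidates = []
    for s in values:
        for token in re.split(r"[ \t\r\n]+", s):
            if token and _NCNAME.fullmatch(token):
                candidates.append(token)
    return candidates

print(candidate_idrefs(["  ID21256   E21256 ", "9lives", "a:b ok"]))
# ['ID21256', 'E21256', 'ok']
```

Tokens such as `9lives` and `a:b` are dropped silently, which corresponds to the rule that lexically invalid tokens are ignored rather than reported as errors.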
A dynamic error is raised [err:FODC0001] if
$node
, or the context item if the second argument is absent, is a node
in a tree whose root is not a document node.
The following errors may be raised when $node
is omitted:
If the context item is absentDM40, dynamic error [err:XPDY0002]XP
If the context item is not a node, type error [err:XPTY0004]XP.
The effect of this function is anomalous in respect of element nodes with the
is-id
property. For legacy reasons, this function returns the element
that has the is-id
property, whereas it would be more appropriate to return
its parent, that being the element that is uniquely identified by the ID. A new function
fn:element-with-id
has been introduced with the desired
behavior.
If the data model is constructed from an Infoset, an attribute will have the
is-id
property if the corresponding attribute in the Infoset had an
attribute type of ID
: typically this means the attribute was declared as an
ID
in a DTD.
If the data model is constructed from a PSVI, an element or attribute will have the
is-id
property if its typed value is a single atomic value of type
xs:ID
or a type derived by restriction from xs:ID
.
No error is raised in respect of a candidate IDREF
value that does not
match the ID
of any element in the document. If no candidate
IDREF
value matches the ID
value of any element, the
function returns the empty sequence.
It is not necessary that the supplied argument should have type xs:IDREF
or xs:IDREFS
, or that it should be derived from a node with the
is-idrefs
property.
An element may have more than one ID
value. This can occur with synthetic
data models or with data models constructed from a PSVI where the element and one of its
attributes are both typed as xs:ID
.
If the source document is well-formed but not valid, it is possible for two or more
elements to have the same ID
value. In this situation, the function will
select the first such element.
It is also possible in a well-formed but invalid document to have an element or
attribute that has the is-id
property but whose value does not conform to
the lexical rules for the xs:ID
type. Such a node will never be selected by
this function.
let $emp := validate lax {
  document {
    <employee xml:id="ID21256"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <empnr xsi:type="xs:ID">E21256</empnr>
      <first>John</first>
      <last>Brown</last>
    </employee>
  }
}
The expression $emp/id('ID21256')/name()
returns "employee"
. (The xml:id
attribute has the is-id
property,
so the employee element is selected.)
The expression $emp/id('E21256')/name()
returns "empnr"
. (Assuming the empnr
element is given the type
xs:ID
as a result of schema validation, the element will have the
is-id
property and is therefore selected. Note the difference from
the behavior of fn:element-with-id
.)
Returns the sequence of element nodes that have an ID
value matching the
value of one or more of the IDREF
values supplied in $values
.
fn:element-with-id ( |
||
$values |
as xs:string* |
|
) as element()* |
fn:element-with-id ( |
||
$values |
as xs:string* , |
|
$node |
as node() |
:= . |
) as element()* |
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-dependent·.
The two-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
Note:
The effect of this function is identical to fn:id
in respect of
elements that have an attribute with the is-id
property. However, it
behaves differently in respect of element nodes with the is-id
property.
Whereas the fn:id
function, for legacy reasons, returns the element that has the
is-id
property, this function returns the element identified by the ID,
which is the parent of the element having the is-id
property.
The function returns a sequence, in document order with duplicates eliminated,
containing every element node E
that satisfies all the following
conditions:
E
is in the target document. The target document is the document
containing $node
, or the document containing the context item
(.
) if the second argument is omitted. The behavior of the
function if $node
is omitted is exactly the same as if the context
item had been passed as $node
.
E
has an ID
value equal to one of the candidate
IDREF
values, where:
An element has an ID
value equal to V
if either
or both of the following conditions are true:
The element has a child element node whose is-id
property (See Section 5.5 is-id AccessorDM40.) is true and
whose typed value is equal to V
under the rules of the
eq
operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint
).
The element has an attribute node whose is-id
property
(See Section 5.5 is-id AccessorDM40.) is true and whose typed
value is equal to V
under the rules of the
eq
operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint
).
Each xs:string
in $values
is parsed as if it were of
type IDREFS
, that is, each xs:string
in
$values
is treated as a whitespace-separated sequence of
tokens, each acting as an IDREF
. These tokens are then included
in the list of candidate IDREF
s. If any of the tokens is not a
lexically valid IDREF
(that is, if it is not lexically an
xs:NCName
), it is ignored. Formally, the candidate
IDREF
values are the strings in the sequence given by the
expression:
for $s in $values return fn:tokenize(fn:normalize-space($s), ' ')[. castable as xs:IDREF]
If several elements have the same ID
value, then E
is
the one that is first in document order.
A dynamic error is raised [err:FODC0001] if $node
, or the context item if the second argument is omitted, is a node
in a tree whose root is not a document node.
The following errors may be raised when $node
is omitted:
If the context item is absentDM40, dynamic error [err:XPDY0002]XP
If the context item is not a node, type error [err:XPTY0004]XP.
This function is equivalent to the fn:id
function except when dealing with
ID-valued element nodes. Whereas the fn:id
function selects the element
containing the identifier, this function selects its parent.
If the data model is constructed from an Infoset, an attribute will have the
is-id
property if the corresponding attribute in the Infoset had an
attribute type of ID
: typically this means the attribute was declared as an
ID
in a DTD.
If the data model is constructed from a PSVI, an element or attribute will have the
is-id
property if its typed value is a single atomic value of type
xs:ID
or a type derived by restriction from xs:ID
.
No error is raised in respect of a candidate IDREF
value that does not
match the ID
of any element in the document. If no candidate
IDREF
value matches the ID
value of any element, the
function returns the empty sequence.
It is not necessary that the supplied argument should have type xs:IDREF
or xs:IDREFS
, or that it should be derived from a node with the
is-idrefs
property.
An element may have more than one ID
value. This can occur with synthetic
data models or with data models constructed from a PSVI where the element and one of its
attributes are both typed as xs:ID
.
If the source document is well-formed but not valid, it is possible for two or more
elements to have the same ID
value. In this situation, the function will
select the first such element.
It is also possible in a well-formed but invalid document to have an element or
attribute that has the is-id
property but whose value does not conform to
the lexical rules for the xs:ID
type. Such a node will never be selected by
this function.
let $emp := validate lax {
  document {
    <employee xml:id="ID21256"
              xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <empnr xsi:type="xs:ID">E21256</empnr>
      <first>John</first>
      <last>Brown</last>
    </employee>
  }
}
The expression $emp/fn:element-with-id('ID21256')/name()
returns "employee"
. (The xml:id
attribute has the is-id
property,
so the employee element is selected.)
The expression $emp/fn:element-with-id('E21256')/name()
returns "employee"
. (Assuming the empnr
element is given the type
xs:ID
as a result of schema validation, the element will have the
is-id
property, and therefore its parent is selected. Note the
difference from the behavior of fn:id
.)
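The difference between fn:id and fn:element-with-id can be modelled on a toy tree. The following Python sketch is non-normative: the `Element` class and its fields are hypothetical stand-ins for element nodes and their is-id information, and the rule selecting the first of several elements with the same ID is not modelled.

```python
# Illustrative model contrasting fn:id and fn:element-with-id; not normative.
from dataclasses import dataclass, field

@dataclass
class Element:
    name: str
    text: str = ""
    is_id: bool = False                            # is-id property of the element
    id_attrs: list = field(default_factory=list)   # values of attributes with is-id
    children: list = field(default_factory=list)

def descendants_or_self(e):
    """Yield e and all its descendants in document order."""
    yield e
    for c in e.children:
        yield from descendants_or_self(c)

def fn_id(root, value):
    """Legacy rule: an element with the is-id property is itself selected."""
    return [e.name for e in descendants_or_self(root)
            if value in e.id_attrs or (e.is_id and e.text == value)]

def fn_element_with_id(root, value):
    """A child element with the is-id property identifies its parent."""
    return [e.name for e in descendants_or_self(root)
            if value in e.id_attrs
            or any(c.is_id and c.text == value for c in e.children)]

employee = Element("employee", id_attrs=["ID21256"],
                   children=[Element("empnr", text="E21256", is_id=True)])

print(fn_id(employee, "E21256"))               # ['empnr']
print(fn_element_with_id(employee, "E21256"))  # ['employee']
```

For the attribute-based identifier `ID21256` both functions in the sketch select the `employee` element, matching the first example of each function.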
Returns the sequence of element or attribute nodes with an IDREF
value
matching the value of one or more of the ID
values supplied in
$values
.
fn:idref ( |
||
$values |
as xs:string* |
|
) as node()* |
fn:idref ( |
||
$values |
as xs:string* , |
|
$node |
as node() |
:= . |
) as node()* |
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-dependent·.
The two-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function returns a sequence, in document order with duplicates eliminated,
containing every element or attribute node $N
that satisfies all the
following conditions:
$N
is in the target document. The target document is the document
containing $node
, or the document containing the context item
(.
) if the second argument is omitted. The behavior of the
function if $node
is omitted is exactly the same as if the context
item had been passed as $node
.
$N
has an IDREF
value equal to one of the candidate
ID
values, where:
A node $N
has an IDREF
value equal to
V
if both of the following conditions are true:
The is-idrefs
property (see Section 5.6 is-idrefs AccessorDM40) of $N
is true
.
The sequence
fn:tokenize(fn:normalize-space(fn:string($N)), ' ')
contains a string that is
equal to V
under the rules of the eq
operator using the Unicode code point collation
(http://www.w3.org/2005/xpath-functions/collation/codepoint
).
Each xs:string
in $values
is parsed as if it were
lexically of type xs:ID
. These xs:string
s are then
included in the list of candidate xs:ID
s. If any of the strings
in $values
is not a lexically valid xs:ID
(that is,
if it is not lexically an xs:NCName
), it is ignored. More
formally, the candidate ID
values are the strings in the
sequence:
$values[. castable as xs:NCName]
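The matching conditions above can be sketched for a single node. This Python model is non-normative: the node is represented only by its string value and a flag for the is-idrefs property, and the NCName test is simplified to ASCII characters.

```python
# Illustrative model of the fn:idref matching rule for one node; not normative.
import re

_NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.\-]*")  # simplified, ASCII only

def idref_matches(node_string, is_idrefs, values):
    """True if a node with this string value would be selected by fn:idref."""
    if not is_idrefs:
        return False
    # Candidate IDs: the supplied strings that are lexically NCNames.
    candidates = {v for v in values if _NCNAME.fullmatch(v)}
    # IDREF values of the node: its whitespace-separated tokens.
    tokens = [t for t in re.split(r"[ \t\r\n]+", node_string) if t]
    return any(t in candidates for t in tokens)

print(idref_matches("E30561  E99999", True, ["E30561"]))  # True
print(idref_matches("E30561", False, ["E30561"]))         # False
print(idref_matches("E30561", True, ["E30561 E1"]))       # False
```

The third call illustrates a difference from fn:id: the supplied strings are not tokenized, so a string containing a space is not a valid candidate ID and is ignored.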
A dynamic error is raised [err:FODC0001] if
$node
, or the context item if the second argument is omitted, is a node
in a tree whose root is not a document node.
The following errors may be raised when $node
is omitted:
If the context item is absentDM40, dynamic error [err:XPDY0002]XP
If the context item is not a node, type error [err:XPTY0004]XP.
An element or attribute typically acquires the is-idrefs
property by being
validated against the schema type xs:IDREF
or xs:IDREFS
, or
(for attributes only) by being described as of type IDREF
or
IDREFS
in a DTD.
Because the function is sensitive to the way in which the data model is constructed, calls on this function are not always interoperable.
No error is raised in respect of a candidate ID
value that does not match
the IDREF
value of any element or attribute in the document. If no
candidate ID
value matches the IDREF
value of any element or
attribute, the function returns the empty sequence.
It is possible for two or more nodes to have an IDREF
value that matches a
given candidate ID
value. In this situation, the function will return all
such nodes. However, each matching node will be returned at most once, regardless of how
many candidate ID
values it matches.
It is possible in a well-formed but invalid document to have a node whose
is-idrefs
property is true but that does not conform to the lexical
rules for the xs:IDREF
type. The effect of the above rules is that
ill-formed candidate ID
values and ill-formed IDREF
values are
ignored.
If the data model is constructed from a PSVI, the typed value of a node that has the
is-idrefs
property will contain at least one atomic value of type
xs:IDREF
(or a type derived by restriction from xs:IDREF
).
It may also contain atomic values of other types. These atomic values are treated as
candidate ID
values if two conditions are met: their lexical form must be valid as an
xs:NCName
, and there must be at least one instance of xs:IDREF
in the typed value of the node. If these conditions are not satisfied, such values are ignored.
let $emp := validate lax {
  document {
    <employees xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
               xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <employee xml:id="ID21256">
        <empnr xsi:type="xs:ID">E21256</empnr>
        <first>Anil</first>
        <last>Singh</last>
        <deputy xsi:type="xs:IDREF">E30561</deputy>
      </employee>
      <employee xml:id="ID30561">
        <empnr xsi:type="xs:ID">E30561</empnr>
        <first>John</first>
        <last>Brown</last>
        <manager xsi:type="xs:IDREF">ID21256</manager>
      </employee>
    </employees>
  }
}
The expression $emp/(element-with-id('ID21256')/@xml:id => fn:idref())/ancestor::employee/last => string()
returns "Brown"
. (Assuming that manager
has the is-idrefs property, the call on fn:idref
selects
the manager
element. If, instead, the manager
had a ref
attribute with the is-idrefs property, the call on fn:idref
would select the attribute node.)
The expression $emp/(element-with-id('E30561')/empnr => fn:idref())/ancestor::employee/last => string()
returns "Singh"
. (Assuming that employee/deputy
has the is-idrefs property, the call on fn:idref
selects
the deputy
element.)
This function returns a string that uniquely identifies a given node.
fn:generate-id () as xs:string |
fn:generate-id ( |
||
$node |
as node()? |
:= . |
) as xs:string |
The zero-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-dependent·.
The one-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If the argument is omitted, it defaults to the context item (.
). The
behavior of the function if the argument is omitted is exactly the same as if the
context item had been passed as the argument.
If the argument is the empty sequence, the result is the zero-length string.
In other cases, the function returns a string that uniquely identifies a given node.
More formally, it is guaranteed that within a single
·execution scope·,
fn:codepoint-equal(fn:generate-id($N), fn:generate-id($M))
returns true
if and only if ($M is $N)
returns true.
The returned identifier must consist of ASCII alphanumeric characters and must start with an alphabetic character. Thus, the string is syntactically an XML name.
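One way to satisfy these guarantees is to allocate identifiers from a counter and remember them per node. The following Python sketch is a non-normative illustration: the class name, the `d` prefix and the use of object identity to model node identity are all assumptions of this example, and the lifetime of the generator object plays the role of the ·execution scope·.

```python
# Illustrative model of the fn:generate-id guarantees; not normative.
import itertools

class IdGenerator:
    """Hands out one identifier per distinct object for the life of the generator."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._ids = {}    # maps Python object identity to identifier
        self._keep = []   # keeps nodes alive so id() values are not reused

    def generate_id(self, node=None):
        if node is None:
            return ""     # models the empty sequence argument
        key = id(node)
        if key not in self._ids:
            self._ids[key] = f"d{next(self._counter)}"
            self._keep.append(node)
        return self._ids[key]

gen = IdGenerator()
a, b = object(), object()
print(gen.generate_id(a) == gen.generate_id(a))  # True
print(gen.generate_id(a) == gen.generate_id(b))  # False
```

Identifiers of the form `d1`, `d2`, and so on consist of ASCII alphanumeric characters and start with a letter, as required. A second generator would be free to assign different identifiers to the same nodes.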
The following errors may be raised when $node
is omitted:
If the context item is absentDM40, dynamic error [err:XPDY0002]XP
If the context item is not a node, type error [err:XPTY0004]XP.
An implementation is free to generate an identifier in any convenient way provided that it always generates the same identifier for the same node and that different identifiers are always generated from different nodes. An implementation is under no obligation to generate the same identifiers each time a document is transformed or queried.
There is no guarantee that a generated unique identifier will be distinct from any unique IDs specified in the source document.
There is no inverse to this function; it is not directly possible to find the node with
a given generated ID. Of course, it is possible to search a given sequence of nodes
using an expression such as $nodes[generate-id()=$id]
.
It is advisable, but not required, for implementations to generate IDs that are distinct even when compared using a case-blind collation.
The primary use case for this function is to generate hyperlinks. For example, when
generating HTML, an anchor for a given section $sect
can be generated by
writing (in either XSLT or XQuery):
<a name="{fn:generate-id($sect)}"/>
and a link to that section can then be produced with code such as:
see <a href="#{fn:generate-id($sect)}">here</a>
Note that anchors generated in this way will not necessarily be the same each time a document is republished.
Since the keys in a map must be atomic values, it is possible to use generated IDs
as surrogates for nodes when constructing a map. For example, in some implementations,
testing whether a node $N
is a member of a large node-set $S
using the expression fn:exists($N intersect $S)
may be expensive; there
may then be performance benefits in creating a map:
let $SMap := map:merge($S!map{fn:generate-id(.) : .})
and then testing for membership of the node-set using:
map:contains($SMap, fn:generate-id($N))
The functions in this section provide access to resources (such as files) in the external environment.
Function | Meaning |
---|---|
fn:doc |
Retrieves a document using a URI supplied as an xs:string , and returns the
corresponding document node. |
fn:doc-available |
The function returns true if and only if the function call fn:doc($href)
would return a document node. |
fn:collection |
Returns a sequence of items identified by a collection URI; or a default collection if no URI is supplied. |
fn:uri-collection |
Returns a sequence of xs:anyURI values representing the URIs in a URI
collection. |
fn:unparsed-text |
The fn:unparsed-text function reads an external resource (for example, a
file) and returns a string representation of the resource. |
fn:unparsed-text-lines |
The fn:unparsed-text-lines function reads an external resource (for
example, a file) and returns its contents as a sequence of strings, one for each line
of text in the string representation of the resource. |
fn:unparsed-text-available |
Because errors in evaluating the fn:unparsed-text function are
non-recoverable, these two functions are provided to allow an application to determine
whether a call with particular arguments would succeed. |
fn:environment-variable |
Returns the value of a system environment variable, if it exists. |
fn:available-environment-variables |
Returns a list of environment variable names that are suitable for passing to
fn:environment-variable , as a (possibly empty) sequence of strings. |
Retrieves a document using a URI supplied as an xs:string
, and returns the
corresponding document node.
fn:doc ( |
||
$href |
as xs:string? |
|
) as document-node()? |
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on available documents and static base URI.
If $href
is the empty sequence, the result is an empty sequence.
If $href
is a relative URI reference, it is resolved relative to the value
of the static base URI property from the static context. The resulting absolute URI is
promoted to an xs:string
.
If the available documents described in Section 2.1.2 Dynamic Context XP31 provides a mapping from this string to a document node, the function returns that document node.
The URI may include a fragment identifier.
By default, this function is ·deterministic·. Two calls on this function return the same document node if the same URI Reference (after resolution to an absolute URI Reference) is supplied to both calls. Thus, the following expression (if it does not raise an error) will always be true:
doc("foo.xml") is doc("foo.xml")
However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is implementation-defined. If the user has not selected such an option, a call of the function must either return a deterministic result or must raise a dynamic error [err:FODC0003].
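The deterministic behavior described above can be obtained by remembering each document under its resolved absolute URI. The following Python sketch is a non-normative illustration: the `DocumentPool` class and its `loader` callable are hypothetical, the example base URI is invented, and `None` models the empty sequence.

```python
# Illustrative model of deterministic fn:doc behavior; not normative.
from urllib.parse import urljoin

class DocumentPool:
    """Maps absolute URIs to document objects, loading each at most once."""

    def __init__(self, static_base_uri, loader):
        self._base = static_base_uri
        self._loader = loader    # hypothetical callable: absolute URI -> document
        self._documents = {}

    def doc(self, href):
        if href is None:
            return None          # models the empty sequence
        absolute = urljoin(self._base, href)
        if absolute not in self._documents:
            self._documents[absolute] = self._loader(absolute)
        return self._documents[absolute]

pool = DocumentPool("http://example.com/data/", loader=lambda uri: {"uri": uri})
print(pool.doc("foo.xml") is pool.doc("foo.xml"))  # True
print(pool.doc("foo.xml")["uri"])                  # http://example.com/data/foo.xml
```

A relative reference and the equivalent absolute URI resolve to the same key, so both return the same object.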
Note:
If $href
is read from a source document, it is generally appropriate to
resolve it relative to the base URI property of the relevant node in the source
document. This can be achieved by calling the fn:resolve-uri
function,
and passing the resulting absolute URI as an argument to the fn:doc
function.
If two calls to this function supply different absolute URI References as arguments, the same document node may be returned if the implementation can determine that the two arguments refer to the same resource.
By defining the semantics of this function in terms of a string-to-document-node mapping in the dynamic context, the specification is acknowledging that the results of this function are outside the purview of the language specification itself, and depend entirely on the run-time environment in which the expression is evaluated. This run-time environment includes not only an unpredictable collection of resources ("the web"), but configurable machinery for locating resources and turning their contents into document nodes within the XPath data model. Both the set of resources that are reachable, and the mechanisms by which those resources are parsed and validated, are ·implementation-dependent·.
One possible processing model for this function is as follows. The resource identified by the URI Reference is retrieved. If the resource cannot be retrieved, a dynamic error is raised [err:FODC0002]. The data resulting from the retrieval action is then parsed as an XML document and a tree is constructed in accordance with the [XQuery and XPath Data Model (XDM) 3.0]. If the top-level media type is known and is "text", the content is parsed in the same way as if the media type were text/xml; otherwise, it is parsed in the same way as if the media type were application/xml. If the contents cannot be parsed successfully, a dynamic error is raised [err:FODC0002]. Otherwise, the result of the function is the document node at the root of the resulting tree. This tree is then optionally validated against a schema.
Various aspects of this processing are ·implementation-defined·. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:
The set of URI schemes that the implementation recognizes is implementation-defined. Implementations may allow the mapping of URIs to resources to be configured by the user, using mechanisms such as catalogs or user-written URI handlers.
The handling of non-XML media types is implementation-defined. Implementations may allow instances of the data model to be constructed from non-XML resources, under user control.
It is ·implementation-defined· whether DTD validation and/or schema validation is applied to the source document.
Implementations may provide user-defined error handling options that allow processing to continue following an error in retrieving a resource, or in parsing and validating its content. When errors have been handled in this way, the function may return either an empty sequence, or a fallback document provided by the error handler.
Implementations may provide user options that relax the requirement for the function to return deterministic results.
The effect of a fragment identifier in the supplied URI is ·implementation-defined·. One possible interpretation is to treat the fragment identifier as an ID attribute value, and to return a document node having the element with the selected ID value as its only child.
A dynamic error may be raised [err:FODC0005] if $href is not a valid URI reference.
A dynamic error is raised [err:FODC0002] if a relative URI reference is supplied, and the base-URI property in the static context is absent.
A dynamic error is raised [err:FODC0002] if the available documents provides no mapping for the absolutized URI.
A dynamic error is raised [err:FODC0002] if the resource cannot be retrieved or cannot be parsed successfully as XML.
A dynamic error is raised [err:FODC0003] if the implementation is not able to guarantee that the result of the function will be deterministic, and the user has not indicated that an unstable result is acceptable.
The function returns true if and only if the function call fn:doc($href) would return a document node.
fn:doc-available($href as xs:string?) as xs:boolean
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on available documents, and static base URI.
If $href is an empty sequence, this function returns false.
If a call on fn:doc($href) would return a document node, this function returns true.
In all other cases this function returns false. This includes the case where an invalid URI is supplied, and also the case where a valid relative URI reference is supplied, and cannot be resolved, for example because the static base URI is absent.
If this function returns true, then calling fn:doc($href) within the same ·execution scope· must return a document node. However, if nondeterministic processing has been selected for the fn:doc function, this guarantee is lost.
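A minimal XQuery sketch of the guard that this guarantee supports, assuming the default (deterministic) processing. The URI and the line-item element name are placeholders chosen for illustration.

```xquery
(: Hypothetical URI; substitute a document known to the implementation. :)
let $href := "orders/po-2024.xml"
return
  if (fn:doc-available($href))
  then fn:count(fn:doc($href)//line-item)   (: fn:doc must succeed here :)
  else 0                                    (: no document node available :)
```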
Returns a sequence of items identified by a collection URI; or a default collection if no URI is supplied.
fn:collection() as item()*
fn:collection($uri as xs:string?) as item()*
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on available collections, and static base URI.
This function takes an xs:string as argument and returns a sequence of items obtained by interpreting $uri as an xs:anyURI and resolving it according to the mapping specified in available collections described in Section C.2 Dynamic Context Components XP31.
If available collections provides a mapping from this string to a sequence of items, the function returns that sequence. If available collections maps the string to an empty sequence, then the function returns an empty sequence.
If $uri is not specified, the function returns the sequence of items in the default collection in the dynamic context. See Section C.2 Dynamic Context Components XP31.
If $uri is a relative xs:anyURI, it is resolved against the value of the base-URI property from the static context.
If $uri is the empty sequence, the function behaves as if it had been called without an argument. See above.
By default, this function is ·deterministic·. This means that repeated calls on the function with the same argument will return the same result. However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is ·implementation-defined·. If the user has not selected such an option, a call to this function must either return a deterministic result or must raise a dynamic error [err:FODC0003].
There is no requirement that any nodes in the result should be in document order, nor is there a requirement that the result should contain no duplicates.
A dynamic error is raised [err:FODC0002] if no URI is supplied and the value of the default collection is absent DM40.
A dynamic error is raised [err:FODC0002] if a relative URI reference is supplied, and the base-URI property in the static context is absent.
A dynamic error is raised [err:FODC0002] if available collections provides no mapping for the absolutized URI.
A dynamic error may be raised [err:FODC0004] if $uri is not a valid xs:anyURI.
In earlier versions of this specification, the primary use for the fn:collection function was to retrieve a collection of XML documents, perhaps held as lexical XML in operating system filestore, or perhaps held in an XML database. In this release the concept has been generalised to allow other resources to be retrieved: for example JSON documents might be returned as arrays or maps, non-XML text files might be returned as strings, and binary files might be returned as instances of xs:base64Binary.
The abstract concept of a collection might be realized in different ways by different implementations, and the ways in which URIs map to collections can be equally variable. Specifying resources using URIs is useful because URIs are dynamic, can be parameterized, and do not rely on an external environment.
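Because a collection may now contain items of several kinds, a caller may wish to branch on the item type. The following XQuery 3.1 sketch shows one way of doing so; the collection URI is hypothetical, and which item types actually appear depends entirely on the implementation.

```xquery
(: Hypothetical collection URI; what it maps to is implementation-dependent. :)
for $item in fn:collection("http://example.com/invoices")
return
  typeswitch ($item)
    case document-node() return "XML document: " || fn:name(($item/*)[1])
    case map(*)          return "JSON object"
    case array(*)        return "JSON array"
    case xs:base64Binary return "binary resource"
    case xs:string       return "text resource"
    default              return "other item"
```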
Returns a sequence of xs:anyURI values representing the URIs in a URI collection.
fn:uri-collection() as xs:anyURI*
fn:uri-collection($uri as xs:string?) as xs:anyURI*
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on available URI collections, and static base URI.
The zero-argument form of the function returns the URIs in the default URI collection described in Section C.2 Dynamic Context Components XP31.
If $uri is a relative xs:anyURI, it is resolved against the value of the base-URI property from the static context.
If $uri is the empty sequence, the function behaves as if it had been called without an argument. See above.
The single-argument form of the function returns the sequence of URIs corresponding to the supplied URI in the available URI collections described in Section C.2 Dynamic Context Components XP31.
By default, this function is ·deterministic·. This means that repeated calls on the function with the same argument will return the same result. However, for performance reasons, implementations may provide a user option to evaluate the function without a guarantee of determinism. The manner in which any such option is provided is ·implementation-defined·. If the user has not selected such an option, a call to this function must either return a deterministic result or must raise a dynamic error [err:FODC0003].
There is no requirement that the URIs returned by this function should all be distinct, and no assumptions can be made about the order of URIs in the sequence, unless the implementation defines otherwise.
A dynamic error is raised [err:FODC0002] if no URI is supplied (that is, if the function is called with no arguments, or with a single argument that evaluates to an empty sequence), and the value of the default URI collection is absent DM40.
A dynamic error is raised [err:FODC0002] if a relative URI reference is supplied, and the base-URI property in the static context is absent.
A dynamic error is raised [err:FODC0002] if available URI collections provides no mapping for the absolutized URI.
A dynamic error may be raised [err:FODC0004] if $uri is not a valid xs:anyURI.
In some implementations, there might be a close relationship between collections (as retrieved by the fn:collection function), and URI collections (as retrieved by this function). For example, a collection might return XML documents, and the corresponding URI collection might return the URIs of those documents. However, this specification does not impose such a close relationship. For example, there may be collection URIs accepted by one of the two functions and not by the other; a collection might contain items that do not have any URI; or a URI collection might contain URIs that cannot be dereferenced to return any resource.
Thus, some implementations might ensure that calling fn:uri-collection and then applying fn:doc to each of the returned URIs delivers the same result as calling fn:collection with the same argument; however, this is not guaranteed.
In the case where fn:uri-collection returns the URIs of resources that could also be retrieved directly using fn:collection, there are several reasons why it might be appropriate to use this function in preference to the fn:collection function. For example:
It allows different URIs for different kinds of resource to be dereferenced in different ways: for example, the returned URIs might be referenced using the fn:unparsed-text function rather than the fn:doc function.
In XSLT 3.0 it allows the documents in a collection to be processed in streaming mode using the xsl:source-document instruction.
It allows recovery from failures to read, parse, or validate individual documents, by calling the fn:doc (or other dereferencing) function within the scope of try/catch.
It allows selection of which documents to read based on their URI, for example they can be filtered to select those whose URIs end in .xml, or those that use the https scheme.
An application might choose to limit the number of URIs processed in a single run, for example it might process only the first 50 URIs in the collection; or it might present the URIs to the user and allow the user to select which of them need to be further processed.
It allows the URIs to be modified before they are dereferenced, for example by adding or removing query parameters, or by redirecting the request to a local cache or to a mirror site.
For some of these use cases, this assumes that the cost of calling fn:collection might be significant (for example, it might involve retrieving all the documents in the collection over the network and parsing them). This will not necessarily be true of all implementations.
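Several of the use cases listed above can be combined in a few lines. The following XQuery sketch filters the URIs by suffix, limits the run to the first 50, and recovers from failures on individual documents. The collection URI is hypothetical, and the sketch assumes that the returned URIs can be dereferenced with fn:doc, which this specification does not guarantee.

```xquery
(: Hypothetical collection URI. :)
let $uris := fn:uri-collection("http://example.com/invoices")
(: Select by URI, then cap the number of documents read in this run. :)
for $uri in fn:subsequence($uris[fn:ends-with(., ".xml")], 1, 50)
return
  try {
    fn:doc($uri)
  } catch * {
    ()   (: skip documents that cannot be read or parsed :)
  }
```

Catching every error with `catch *` is a deliberate simplification; a production query would more likely catch specific codes such as err:FODC0002 and report the failing URI.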
The fn:unparsed-text function reads an external resource (for example, a file) and returns a string representation of the resource.
fn:unparsed-text($href as xs:string?) as xs:string?
fn:unparsed-text($href as xs:string?, $encoding as xs:string) as xs:string?
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on static base URI.
The $href argument must be a string in the form of a URI reference, which must contain no fragment identifier, and must identify a resource for which a string representation is available. If the URI is a relative URI reference, then it is resolved relative to the static base URI property from the static context.
The mapping of URIs to the string representation of a resource is the mapping defined in the available text resources XP31 component of the dynamic context.
If the value of the $href argument is an empty sequence, the function returns an empty sequence.
The $encoding argument, if present, is the name of an encoding. The values for this argument follow the same rules as for the encoding attribute in an XML declaration. The only values which every implementation is required to recognize are utf-8 and utf-16.
The encoding of the external resource is determined as follows:
external encoding information is used if available, otherwise
if the media type of the resource is text/xml or application/xml (see [RFC 2376]), or if it matches the conventions text/*+xml or application/*+xml (see [RFC 7303] and/or its successors), then the encoding is recognized as specified in [Extensible Markup Language (XML) 1.0 (Fifth Edition)], otherwise
the value of the $encoding argument is used if present, otherwise
the processor may use ·implementation-defined· heuristics to determine the likely encoding, otherwise
UTF-8 is assumed.
The result of the function is a string containing the string representation of the resource retrieved using the URI.
A dynamic error is raised [err:FOUT1170] if $href contains a fragment identifier, or if it cannot be resolved to an absolute URI (for example, because the base-URI property in the static context is absent), or if it cannot be used to retrieve the string representation of a resource.
A dynamic error is raised [err:FOUT1190] if the value of the $encoding argument is not a valid encoding name, if the processor does not support the specified encoding, if the string representation of the retrieved resource contains octets that cannot be decoded into Unicode ·characters· using the specified encoding, or if the resulting characters are not permitted XML characters.
A dynamic error is raised [err:FOUT1200] if $encoding is absent and the processor cannot infer the encoding using external information and the encoding is not UTF-8.
If it is appropriate to use a base URI other than the static base URI (for example, when resolving a relative URI reference read from a source document) then it is advisable to resolve the relative URI reference using the fn:resolve-uri function before passing it to the fn:unparsed-text function.
There is no essential relationship between the sets of URIs accepted by the two functions fn:unparsed-text and fn:doc (a URI accepted by one may or may not be accepted by the other), and if a URI is accepted by both there is no essential relationship between the results (different resource representations are permitted by the architecture of the web).
There are no constraints on the MIME type of the resource.
The fact that the resolution of URIs is defined by a mapping in the dynamic context means that in effect, various aspects of the behavior of this function are ·implementation-defined·. Implementations may provide external configuration options that allow any aspect of the processing to be controlled by the user. In particular:
The set of URI schemes that the implementation recognizes is implementation-defined. Implementations may allow the mapping of URIs to resources to be configured by the user, using mechanisms such as catalogs or user-written URI handlers.
The handling of media types is implementation-defined.
Implementations may provide user-defined error handling options that allow processing to continue following an error in retrieving a resource, or in reading its content. When errors have been handled in this way, the function may return a fallback document provided by the error handler.
Implementations may provide user options that relax the requirement for the function to return deterministic results.
The rules for determining the encoding are chosen for consistency with [XML Inclusions (XInclude) Version 1.0 (Second Edition)]. Files with an XML media type are treated specially because there are use cases for this function where the retrieved text is to be included as unparsed XML within a CDATA section of a containing document, and because processors are likely to be able to reuse the code that performs encoding detection for XML external entities.
If the text file contains characters such as < and &, these will typically be output as &lt; and &amp; if the string is serialized as XML or HTML. If these characters actually represent markup (for example, if the text file contains HTML), then an XSLT stylesheet can attempt to write them as markup to the output file using the disable-output-escaping attribute of the xsl:value-of instruction. Note, however, that XSLT implementations are not required to support this feature.
This XSLT example attempts to read a file containing 'boilerplate' HTML and copy it directly to the serialized output file:
<xsl:output method="html"/>

<xsl:template match="/">
  <xsl:value-of select="unparsed-text('header.html', 'iso-8859-1')"
                disable-output-escaping="yes"/>
  <xsl:apply-templates/>
  <xsl:value-of select="unparsed-text('footer.html', 'iso-8859-1')"
                disable-output-escaping="yes"/>
</xsl:template>
The fn:unparsed-text-lines function reads an external resource (for example, a file) and returns its contents as a sequence of strings, one for each line of text in the string representation of the resource.
fn:unparsed-text-lines($href as xs:string?) as xs:string*
fn:unparsed-text-lines($href as xs:string?, $encoding as xs:string) as xs:string*
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on static base URI.
The fn:unparsed-text-lines function reads an external resource (for example, a file) and returns its string representation as a sequence of strings, separated at newline boundaries.
The result of the single-argument function is the same as the result of the expression fn:tokenize(fn:unparsed-text($href), '\r\n|\r|\n')[not(position()=last() and .='')]. The result of the two-argument function is the same as the result of the expression fn:tokenize(fn:unparsed-text($href, $encoding), '\r\n|\r|\n')[not(position()=last() and .='')].
The result is thus a sequence of strings containing the text of the resource retrieved using the URI, each string representing one line of text. Lines are separated by one of the sequences x0A, x0D, or x0Dx0A. The characters representing the newline are not included in the returned strings. If there are two adjacent newline sequences, a zero-length string will be returned to represent the empty line; but if the external resource ends with the sequence x0A, x0D, or x0Dx0A, the result will be as if this final line ending were not present.
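The line-splitting rule can be tried without any external resource by applying the defining expression to a string literal. In the XQuery sketch below, the sample text is invented for the illustration: it contains an empty line and ends with the sequence x0Dx0A.

```xquery
(: Sample text: "alpha", an empty line, then "beta" followed by CR LF. :)
let $text := "alpha&#xA;&#xA;beta&#xD;&#xA;"
return
  fn:tokenize($text, '\r\n|\r|\n')[not(position() = last() and . = '')]
(: Result: ("alpha", "", "beta"). The empty line is kept,
   while the empty token after the final line ending is dropped. :)
```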
Error conditions are the same as for the fn:unparsed-text function.
See the notes for fn:unparsed-text.
Because errors in evaluating the fn:unparsed-text function are non-recoverable, these two functions are provided to allow an application to determine whether a call with particular arguments would succeed.
fn:unparsed-text-available($href as xs:string?) as xs:boolean
fn:unparsed-text-available($href as xs:string?, $encoding as xs:string) as xs:boolean
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on static base URI.
The fn:unparsed-text-available function determines whether a call on the fn:unparsed-text function with identical arguments would return a string.
If the first argument is an empty sequence, the function returns false.
In other cases, the function returns true if a call on fn:unparsed-text with the same arguments would succeed, and false if a call on fn:unparsed-text with the same arguments would fail with a non-recoverable dynamic error.
The functions fn:unparsed-text and fn:unparsed-text-available have the same requirement for ·determinism· as the functions fn:doc and fn:doc-available. This means that unless the user has explicitly stated a requirement for a reduced level of determinism, either of these functions if called twice with the same arguments during the course of a transformation must return the same results each time; moreover, the results of a call on fn:unparsed-text-available must be consistent with the results of a subsequent call on fn:unparsed-text with the same arguments.
This requires that the fn:unparsed-text-available function should actually attempt to read the resource identified by the URI, and check that it is correctly encoded and contains no characters that are invalid in XML. Implementations may avoid the cost of repeating these checks for example by caching the validated contents of the resource, to anticipate a subsequent call on the fn:unparsed-text or fn:unparsed-text-lines function. Alternatively, implementations may be able to rewrite an expression such as if (unparsed-text-available(A)) then unparsed-text(A) else ... to generate a single call internally.
Since the function fn:unparsed-text-lines succeeds or fails under exactly the same circumstances as fn:unparsed-text, the fn:unparsed-text-available function may equally be used to test whether a call on fn:unparsed-text-lines would succeed.
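A short XQuery sketch of this pattern, using fn:unparsed-text-available as the guard for a call on fn:unparsed-text-lines with the same arguments. The file name is a placeholder.

```xquery
(: Hypothetical resource; resolved against the static base URI. :)
let $href := "notes.txt"
return
  if (fn:unparsed-text-available($href, "utf-8"))
  then fn:count(fn:unparsed-text-lines($href, "utf-8"))   (: number of lines :)
  else 0                                                  (: unreadable or badly encoded :)
```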
Returns the value of a system environment variable, if it exists.
fn:environment-variable($name as xs:string) as xs:string?
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on environment variables.
The set of available environment variables XP31 is a set of (name, value) pairs forming part of the dynamic context, in which the name is unique within the set of pairs. The name and value are arbitrary strings.
If the $name argument matches the name of one of these pairs, the function returns the corresponding value.
If there is no environment variable with a matching name, the function returns the empty sequence.
The collation used for matching names is ·implementation-defined·, but must be the same as the collation used to ensure that the names of all environment variables are unique.
The function is ·deterministic·, which means that if it is called several times within the same ·execution scope·, with the same arguments, it must return the same result.
On many platforms, the term "environment variable" has a natural meaning in terms of facilities provided by the operating system. This interpretation of the concept does not exclude other interpretations, such as a mapping to a set of configuration parameters in a database system.
Environment variable names are usually case sensitive. Names are usually of the form (letter|_) (letter|_|digit)*, but this varies by platform.
On some platforms, there may sometimes be multiple environment variables with the same name; in this case, it is implementation-dependent as to which is returned; see for example [POSIX.1-2008] (Chapter 8, Environment Variables). Implementations may use prefixes or other naming conventions to disambiguate the names.
The requirement to ensure that the function is deterministic means in practice that the implementation must make a snapshot of the environment variables at some time during execution, and return values obtained from this snapshot, rather than using live values that are subject to change at any time.
Operating system environment variables may be associated with a particular process, while queries and stylesheets may execute across multiple processes (or multiple machines). In such circumstances implementations may choose to provide access to the environment variables associated with the process in which the query or stylesheet processing was initiated.
Security advice: Queries from untrusted sources should not be permitted unrestricted access to environment variables. For example, the name of the account under which the query is running may be useful information to a would-be intruder. An implementation may therefore choose to restrict access to the environment, or may provide a facility to make fn:environment-variable always return the empty sequence.
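Since the function returns the empty sequence both when the variable does not exist and when access has been restricted, callers generally need a fallback. A minimal XQuery sketch follows; the variable name HOME is only an example and may not exist on a given platform.

```xquery
(: "HOME" is an example name; availability varies by platform and configuration. :)
let $home := fn:environment-variable("HOME")
return
  if (fn:exists($home))
  then $home
  else "(not available)"
```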
Returns a list of environment variable names that are suitable for passing to fn:environment-variable, as a (possibly empty) sequence of strings.
fn:available-environment-variables() as xs:string*
This function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on environment variables.
The function returns a sequence of strings, being the names of the environment variables in the dynamic context in some ·implementation-dependent· order.
The function is ·deterministic·: that is, the set of available environment variables does not vary during evaluation.
The function returns a list of strings, containing no duplicates.
It is intended that the strings in this list should be suitable for passing to fn:environment-variable.
See also the note on security under the definition of the fn:environment-variable function. If access to environment variables has been disabled, fn:available-environment-variables always returns the empty sequence.
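The two functions are designed to be used together. The following XQuery sketch lists every visible variable as a name=value string; sorting by name is a choice made for the example, since the order returned by the function is implementation-dependent.

```xquery
(: Yields an empty sequence if access to the environment has been disabled. :)
for $name in fn:available-environment-variables()
order by $name
return $name || "=" || fn:environment-variable($name)
```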
These functions convert between the lexical representation of XML and the tree representation.
Function | Meaning |
---|---|
fn:parse-xml | This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document. |
fn:parse-xml-fragment | This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment. |
fn:serialize | This function serializes the supplied input sequence $input as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string. |
This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document.
fn:parse-xml($value as xs:string?) as document-node(element(*))?
This function is ·nondeterministic·, ·context-dependent·, and ·focus-independent·. It depends on static base URI.
If $value is the empty sequence, the function returns the empty sequence.
The precise process used to construct the XDM instance is ·implementation-defined·. In particular, it is implementation-defined whether DTD and/or schema validation is invoked, and it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.
The static base URI property from the static context of the fn:parse-xml function call is used both as the base URI used by the XML parser to resolve relative entity references within the document, and as the base URI of the document node that is returned.
The document URI of the returned node is absent DM40.
The function is not ·deterministic·: that is, if the function is called twice with the same arguments, it is ·implementation-dependent· whether the same node is returned on both occasions.
A dynamic error is raised [err:FODC0006] if the content of $value is not a well-formed and namespace-well-formed XML document.
A dynamic error is raised [err:FODC0006] if DTD-based validation is carried out and the content of $value is not valid against its DTD.
Since the XML document is presented to the parser as a string, rather than as a sequence of octets, the encoding specified within the XML declaration has no meaning. If the XML parser accepts input only in the form of a sequence of octets, then the processor must ensure that the string is encoded as octets in a way that is consistent with rules used by the XML parser to detect the encoding.
The primary use case for this function is to handle input documents that contain nested XML documents embedded within CDATA sections. Since the contents of the CDATA section are exposed as text, the receiving query or stylesheet may pass this text to the fn:parse-xml function to create a tree representation of the nested document.
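The CDATA use case can be sketched in XQuery as follows. The message and payload element names, and the nested order document, are invented for the illustration.

```xquery
(: Hypothetical input: an order document carried as text inside a CDATA section. :)
let $outer :=
  <message>
    <payload><![CDATA[<order id="1"><item>pen</item></order>]]></payload>
  </message>
(: The CDATA content is exposed as the string value of the payload element. :)
let $nested := fn:parse-xml(fn:string($outer/payload))
return fn:string($nested/order/item)
```

Here the nested text becomes a separate tree whose document node has the order element as its only child, so the final path expression selects the item element and returns its string value, pen.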
Similarly, nested XML within comments is sometimes encountered, and lexical XML is sometimes returned by extension functions, for example, functions that access web services or read from databases.
A use case arises in XSLT where there is a need to preprocess an input document before parsing. For example, an application might wish to edit the document to remove its DOCTYPE declaration. This can be done by reading the raw text using the fn:unparsed-text function, editing the resulting string, and then passing it to the fn:parse-xml function.
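A rough sketch of this preprocessing step, written as an XQuery expression (the same function calls can be used from XSLT). The file name is hypothetical, and the regular expression is a deliberate simplification: it removes only a DOCTYPE declaration that has no internal subset, and leaves any other input unchanged.

```xquery
(: Hypothetical input file, read as raw text rather than parsed. :)
let $raw := fn:unparsed-text("input.xml")
(: Remove a simple DOCTYPE declaration such as <!DOCTYPE doc SYSTEM "doc.dtd">.
   Declarations containing an internal subset ([...]) are not matched. :)
let $stripped := fn:replace($raw, "<!DOCTYPE[^>\[]*>", "")
return fn:parse-xml($stripped)
```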
The expression fn:parse-xml("<alpha>abcd</alpha>") returns a newly created document node, having an alpha element as its only child; the alpha element in turn is the parent of a text node whose string value is "abcd".
This function takes as input an XML external entity represented as a string, and returns the document node at the root of an XDM tree representing the parsed document fragment.
fn:parse-xml-fragment($value as xs:string?) as document-node()?
This function is ·nondeterministic·, ·context-dependent·, and ·focus-independent·. It depends on static base URI.
If $value is the empty sequence, the function returns the empty sequence.
The input must be a namespace-well-formed external general parsed entity. More specifically, it must be a string conforming to the production rule extParsedEnt XML in [Extensible Markup Language (XML) 1.0 (Fifth Edition)], it must contain no entity references other than references to predefined entities, and it must satisfy all the rules of [Namespaces in XML] for namespace-well-formed documents with the exception that the rule requiring it to be a well-formed document is replaced by the rule requiring it to be a well-formed external general parsed entity.
The string is parsed to form a sequence of nodes which become children of the new document node, in the same way as the content of any element is converted into a sequence of children for the resulting element node.
Schema validation is not invoked, which means that the nodes in the returned document will all be untyped.
The precise process used to construct the XDM instance is ·implementation-defined·. In particular, it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.
The static base URI from the static context of the fn:parse-xml-fragment function call is used as the base URI of the document node that is returned.
The document URI of the returned node is absent DM40.
The function is not ·deterministic·: that is, if the function is called twice with the same arguments, it is ·implementation-dependent· whether the same node is returned on both occasions.
A dynamic error is raised [err:FODC0006] if the content of $value is not a well-formed external general parsed entity, if it contains entity references other than references to predefined entities, or if a document that incorporates this well-formed parsed entity would not be namespace-well-formed.
See also the notes for the fn:parse-xml function.
The main differences between fn:parse-xml and fn:parse-xml-fragment are that for fn:parse-xml, the children of the resulting document node must contain exactly one element node and no text nodes, whereas for fn:parse-xml-fragment, the resulting document node can have any number (including zero) of element and text nodes among its children. An additional difference is that the text declaration at the start of an external entity has slightly different syntax from the XML declaration at the start of a well-formed document.
Note that all whitespace outside the text declaration is significant, including whitespace that precedes the first element node.
One use case for this function is to handle XML fragments stored in databases, which frequently allow zero-or-more top level element nodes. Another use case is to parse the contents of a CDATA section embedded within another XML document.
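As a rough illustration of the second use case, the sketch below takes markup held in a CDATA section and parses it as a fragment. The element name msg and its content are invented for the example and are not part of this specification.

```xquery
(: Hypothetical outer document: the msg element and its content are
   illustrative only. The CDATA section holds two sibling elements,
   so the text is a valid fragment but not a well-formed document. :)
let $outer := <msg><![CDATA[<p>Hello</p><p>World</p>]]></msg>

(: The string value of msg is the raw markup text from the CDATA section. :)
let $fragment := fn:parse-xml-fragment(fn:string($outer))

(: The new document node has two p element children. :)
return $fragment/p ! fn:string(.)
```

Under these assumptions the query should yield the two strings "Hello" and "World".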
The expression fn:parse-xml-fragment("<alpha>abcd</alpha><beta>abcd</beta>") returns a newly created document node, having two elements named alpha and beta as its children; each of these elements in turn is the parent of a text node.
The expression fn:parse-xml-fragment("He was <i>so</i> kind") returns a newly created document node having three children: a text node whose string value is "He was ", an element node named i having a child text node with string value "so", and a text node whose string value is " kind".
The expression fn:parse-xml-fragment("") returns a document node having no children.
The expression fn:parse-xml-fragment(" ") returns a document node whose children comprise a single text node whose string value is a single space.
The expression fn:parse-xml-fragment('<?xml version="1.0" encoding="utf8" standalone="yes"?><a/>') results in a dynamic error [err:FODC0006] because the "standalone" keyword is not permitted in the text declaration that appears at the start of an external general parsed entity. (Thus, it is not the case that any input accepted by the fn:parse-xml function will also be accepted by fn:parse-xml-fragment.)
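The contrast between the two parsing functions can be sketched as follows. This is only an illustration: the variable names are arbitrary, and the err prefix is declared explicitly in case the processor does not predeclare it.

```xquery
declare namespace err = "http://www.w3.org/2005/xqt-errors";

let $text := "He was <i>so</i> kind"

(: Accepted as a fragment: a text node, an element node, a text node. :)
let $fragment := fn:parse-xml-fragment($text)

(: Rejected as a document: the content is not a single root element. :)
let $as-document :=
  try { fn:parse-xml($text) }
  catch err:FODC0006 { "not a well-formed document" }

return (
  fn:count($fragment/node()),   (: number of children of the document node :)
  fn:string($fragment/i),       (: string value of the i element :)
  $as-document
)
```

The expected result is the sequence 3, "so", "not a well-formed document": the same text that fn:parse-xml rejects is accepted by fn:parse-xml-fragment.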
This function serializes the supplied input sequence $input as described in [XSLT and XQuery Serialization 3.1], returning the serialized representation of the sequence as a string.
fn:serialize($input as item()*) as xs:string
fn:serialize($input as item()*, $options as item()?) as xs:string
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The value of the first argument $input acts as the input sequence to the serialization process, which starts with sequence normalization.
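The effect of sequence normalization on atomic values can be sketched as below, using the map form of $options and the item-separator parameter, both described later in this section. The input values are arbitrary.

```xquery
(: Three atomic values; sequence normalization converts them to strings
   before the chosen output method is applied. :)
let $values := (1, 2, 3)
return (
  (: item-separator absent: adjacent atomic values are separated by a space. :)
  fn:serialize($values, map { }),
  (: item-separator supplied: the separator is placed between adjacent items. :)
  fn:serialize($values, map { "item-separator": "," })
)
```

Assuming the rules of [XSLT and XQuery Serialization 3.1] are applied as described, the two results should be "1 2 3" and "1,2,3".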
The second argument $options, if present, provides serialization parameters. These may be supplied in either of two forms:
As an output:serialization-parameters element, having the format described in Section 3.1 Setting Serialization Parameters by Means of a Data Model Instance SER31. In this case the type of the supplied argument must match the required type element(output:serialization-parameters).
As a map. In this case the type of the supplied argument must match the required type map(*).
The single-argument version of this function has the same effect as the two-argument version called with $options set to an empty sequence. This in turn is the same as the effect of passing an output:serialization-parameters element with no child elements.
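The equivalence stated above can be written out as three calls. This is a sketch only; the sample element is arbitrary, and no output is shown because the default parameter values are implementation-defined in all three cases.

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

let $data := <a b="3"/>
return (
  (: Single-argument form. :)
  fn:serialize($data),
  (: Two-argument form with $options set to the empty sequence. :)
  fn:serialize($data, ()),
  (: Two-argument form with a parameters element that has no children. :)
  fn:serialize($data, <output:serialization-parameters/>)
)
```

Within a single processor and context, the three strings are expected to be identical.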
The final stage of serialization, that is, encoding, is skipped. If the serializer does not allow this phase to be skipped, then the sequence of octets returned by the serializer is decoded into a string by reversing the character encoding performed in the final stage.
If the second argument is omitted, or is supplied in the form of an output:serialization-parameters element, then the values of any serialization parameters that are not explicitly specified are ·implementation-defined·, and may depend on the context.
If the second argument is supplied as a map, then the ·option parameter conventions· apply. In this case:
Each entry in the map defines one serialization parameter.
The key of the entry is an xs:string value in the cases of parameter names defined in these specifications, or an xs:QName (with non-absent namespace) in the case of implementation-defined serialization parameters.
The required type of each parameter, and its default value, are defined by the following table. The default value is used when the map contains no entry for the parameter in question, and also when an entry is present, with the empty sequence as its value. The table also indicates how the value of the map entry is to be interpreted in cases where further explanation is needed.
Parameter | Required type | Interpretation | Default Value |
---|---|---|---|
allow-duplicate-names | xs:boolean? | true() means "yes", false() means "no" | no |
byte-order-mark | xs:boolean? | true() means "yes", false() means "no" | no |
cdata-section-elements | xs:QName* | | () |
doctype-public | xs:string? | Zero-length string and () both represent "absent" | absent |
doctype-system | xs:string? | Zero-length string and () both represent "absent" | absent |
encoding | xs:string? | | utf-8 |
escape-uri-attributes | xs:boolean? | true() means "yes", false() means "no" | yes |
html-version | xs:decimal? | | 5 |
include-content-type | xs:boolean? | true() means "yes", false() means "no" | yes |
indent | xs:boolean? | true() means "yes", false() means "no" | no |
item-separator | xs:string? | | absent |
json-node-output-method | union(xs:string, xs:QName)? | See Notes 1, 2 | xml |
media-type | xs:string? | | (a media type suitable for the chosen method) |
method | union(xs:string, xs:QName)? | See Notes 1, 2 | xml |
normalization-form | xs:string? | | none |
omit-xml-declaration | xs:boolean? | true() means "yes", false() means "no" | yes |
standalone | xs:boolean? | true() means "yes", false() means "no", () means "omit" | omit |
suppress-indentation | xs:QName* | | () |
undeclare-prefixes | xs:boolean? | true() means "yes", false() means "no" | no |
use-character-maps | map(xs:string, xs:string)? | See Note 3 | map{} |
version | xs:string? | | 1.0 |
Notes to the table:
1. The notation union(A, B) is used to represent a union type whose member types are A and B.
2. If an xs:QName is supplied for the method or json-node-output-method options, then it must have a non-absent namespace URI. This means that system-defined serialization methods such as xml and json are defined as strings, not as xs:QName values.
3. For the use-character-maps option, the value is a map, whose keys are the characters to be mapped (as xs:string instances), and whose corresponding values are the strings to be substituted for these characters.
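A call that supplies several of these parameters as a map might look like the sketch below. The input element and the choice of character mapping are illustrative only.

```xquery
(: Sample input containing a no-break space (U+00A0). :)
let $data := <p>a&#xA0;b</p>
return fn:serialize($data, map {
  (: System-defined methods are given as strings, not xs:QName values. :)
  "method": "xml",
  (: An empty sequence selects the default from the table, here "no". :)
  "indent": (),
  (: Each key is a single character; each value is its replacement string. :)
  "use-character-maps": map { "&#xA0;": "&amp;nbsp;" }
})
```

With the defaults fixed by the table for the remaining parameters, the result should be the string '<p>a&nbsp;b</p>', since the replacement string from a character map is written to the output without further escaping.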
A type error [err:XPTY0004]XP occurs if the $options argument is present and does not match either of the types element(output:serialization-parameters)? or map(*).
Note:
This is defined as a type error so that it can be enforced via the function signature by implementations that generalize the type system in a suitable way.
If the host language makes serialization an optional feature and the implementation does not support serialization, then a dynamic error [err:FODC0010] is raised.
The serialization process will raise an error if $input is an attribute or namespace node.
When the second argument is supplied as a map, and the supplied value is of the wrong type for the particular parameter, for example if the value of indent is a string rather than a boolean, then as defined by the ·option parameter conventions·, a type error [err:XPTY0004]XP is raised. If the value is of the correct type, but does not satisfy the rules for that parameter defined in [XSLT and XQuery Serialization 3.1], then a dynamic error [err:SEPM0016]SER31 is raised. (For example, this occurs if the map supplied to use-character-maps includes a key that is a string whose length is not one (1).)
If any serialization error occurs, including the detection of an invalid value for a serialization parameter as described above, this results in the fn:serialize call failing with a dynamic error.
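The following sketch shows how a caller might trap two of these failures with try/catch. The messages are invented for the example, and the err prefix is declared explicitly in case the processor does not predeclare it. A type error such as a string supplied for indent is not shown, because a processor may report it statically, before any catch clause can apply.

```xquery
declare namespace err = "http://www.w3.org/2005/xqt-errors";

(
  (: A character-map key must be a single character; "ab" has length two. :)
  try {
    fn:serialize(<a/>, map { "use-character-maps": map { "ab": "x" } })
  } catch err:SEPM0016 {
    "invalid serialization parameter"
  },

  (: A free-standing attribute node as input; the specific error code
     is not assumed here, so any error is caught and its code reported. :)
  try {
    fn:serialize(attribute b { 3 }, map { })
  } catch * {
    "serialization failed: " || fn:local-name-from-QName($err:code)
  }
)
```

Both branches are expected to return their fallback strings rather than serialized output.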
One use case for this function arises when there is a need to construct an XML document containing nested XML documents within a CDATA section (or on occasions within a comment). See fn:parse-xml for further details.
Another use case arises when there is a need to call an extension function that expects a lexical XML document as input.
There are also use cases where the application wants to post-process the output of a query or transformation, for example by adding an internal DTD subset, or by inserting proprietary markup delimiters such as the <% ... %> used by some templating languages.
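The first use case, nesting one XML document inside another as CDATA, can be sketched in two steps. The order, envelope and body element names are hypothetical and chosen only for illustration.

```xquery
(: Hypothetical data: the element names here are invented. :)
let $inner := <order id="1"><item>widget</item></order>

(: Step 1: turn the inner document into a string of lexical XML,
   using the defaults that apply when a map is supplied. :)
let $payload := fn:serialize($inner, map { })

(: Step 2: place the string in a body element and ask the serializer
   to write the text content of body as a CDATA section. :)
return fn:serialize(
  <envelope><body>{ $payload }</body></envelope>,
  map { "cdata-section-elements": xs:QName("body") }
)
```

Under these assumptions the result should be along the lines of '<envelope><body><![CDATA[<order id="1"><item>widget</item></order>]]></body></envelope>'. A consumer could later recover the inner document by passing the string value of body to fn:parse-xml.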
The ability to specify the serialization parameters in an output:serialization-parameters element provides backwards compatibility with the 3.0 version of this specification; the ability to use a map takes advantage of new features in the 3.1 version. The default parameter values are implementation-defined when an output:serialization-parameters element is used (or when the argument is omitted), but are fixed by this specification in the case where a map (including an empty map) is supplied for the argument.
Given the variables:
let $params := <output:serialization-parameters xmlns:output="http://www.w3.org/2010/xslt-xquery-serialization"> <output:omit-xml-declaration value="yes"/> </output:serialization-parameters>
let $data := <a b="3"/>
The following call might produce the output shown:
The expression fn:serialize($data, $params) returns '<a b="3"/>'.
The following call would also produce the output shown (though the second argument could equally well be supplied as an empty map (map{}), since both parameters are given their default values):
The expression fn:serialize($data, map{"method":"xml", "omit-xml-declaration":true()}) returns '<a b="3"/>'.