This section specifies functions that manipulate URI values, either as instances of xs:anyURI or as strings.
Function | Meaning |
---|---|
fn:resolve-uri | Resolves a relative IRI reference against an absolute IRI. |
fn:encode-for-uri | Encodes reserved characters in a string that is intended to be used in the path segment of a URI. |
fn:iri-to-uri | Converts a string containing an IRI into a URI according to the rules of [RFC 3987]. |
fn:escape-html-uri | Escapes a URI in the same way that HTML user agents handle attribute values expected to contain URIs. |
Resolves a relative IRI reference against an absolute IRI.
fn:resolve-uri($relative as xs:string?) as xs:anyURI?
fn:resolve-uri($relative as xs:string?, $base as xs:string) as xs:anyURI?
The one-argument form of this function is ·deterministic·, ·context-dependent·, and ·focus-independent·. It depends on the static base URI.
The two-argument form of this function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function is defined to operate on IRI references as defined in [RFC 3987], and the implementation must permit all arguments that are valid according to that specification. In addition, the implementation may accept some or all strings that conform to the rules for (absolute or relative) Legacy Extended IRI references as defined in [Legacy extended IRIs for XML resource identification]. For the purposes of this section, the terms IRI and IRI reference include these extensions, insofar as the implementation chooses to support them.
The following rules apply in order:
If $relative is the empty sequence, the function returns the empty sequence.
If $relative is an absolute IRI (as defined above), then it is returned unchanged.
If the $base argument is not supplied, then:
If the static base URI in the static context is not absent, it is used as the effective value of $base.
Otherwise, a dynamic error is raised: [err:FONS0005].
The function resolves the relative IRI reference $relative against the base IRI $base using the algorithm defined in [RFC 3986], adapted by treating any ·character· that would not be valid in an RFC3986 URI or relative reference in the same way that RFC3986 treats unreserved characters. No percent-encoding takes place.
The first form of this function resolves $relative against the value of the base-uri property from the static context. A dynamic error is raised [err:FONS0005] if the base-uri property is not initialized in the static context.
A dynamic error is raised [err:FORG0002] if $relative is not a valid IRI according to the rules of RFC3987, extended with an implementation-defined subset of the extensions permitted in LEIRI, or if it is not a suitable relative reference to use as input to the RFC3986 resolution algorithm extended to handle additional unreserved characters.
A dynamic error is raised [err:FORG0002] if $base is not a valid IRI according to the rules of RFC3987, extended with an implementation-defined subset of the extensions permitted in LEIRI, or if it is not a suitable IRI to use as input to the chosen resolution algorithm (for example, if it is a relative IRI reference, if it is a non-hierarchic URI, or if it contains a fragment identifier).
A dynamic error is raised [err:FORG0009] if the chosen resolution algorithm fails for any other reason.
Resolving a URI does not dereference it. This is merely a syntactic operation on two ·strings·.
The algorithms in the cited RFCs include some variations that are optional or recommended rather than mandatory; they also describe some common practices that are not recommended, but which are permitted for backwards compatibility. Where the cited RFCs permit variations in behavior, so does this specification.
Throughout this family of specifications, the phrase "resolving a relative URI (or IRI) reference" should be understood as using the rules of this function, unless otherwise stated.
RFC3986 defines an algorithm for resolving relative references in the context of the URI syntax defined in that RFC. RFC3987 describes a modification to that algorithm to make it applicable to IRIs (specifically: additional characters permitted in an IRI are handled the same way that RFC3986 handles unreserved characters). The LEIRI specification does not explicitly define a resolution algorithm, but suggests that it should not be done by converting the LEIRI to a URI, and should not involve percent-encoding. This specification fills this gap by defining resolution for LEIRIs in the same way that RFC3987 defines resolution for IRIs, that is by specifying that additional characters are handled as unreserved characters.
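The rule order above can be illustrated with a minimal Python sketch. It is not an implementation of fn:resolve-uri: the name resolve_uri, the use of None for the empty sequence, and the Python exceptions standing in for [err:FONS0005] and [err:FORG0002] are all assumptions made for illustration. The sketch delegates to urllib.parse.urljoin, whose behavior for well-known hierarchical schemes such as http follows RFC 3986 but is not guaranteed to match this specification for every scheme.

```python
from urllib.parse import urljoin, urlsplit

def resolve_uri(relative, base=None):
    """Syntactically resolve `relative` against `base`; nothing is dereferenced."""
    if relative is None:                 # empty sequence in, empty sequence out
        return None
    if urlsplit(relative).scheme:        # already absolute: returned unchanged
        return relative
    if base is None:
        # Stand-in for "the static base URI is absent" (err:FONS0005).
        raise LookupError("FONS0005: no base URI available")
    parts = urlsplit(base)
    if not parts.scheme or parts.fragment:
        # Stand-in for err:FORG0002: base is relative or has a fragment.
        raise ValueError("FORG0002: unsuitable base URI")
    return urljoin(base, relative)       # no percent-encoding takes place

print(resolve_uri("../d", "http://example.com/a/b/c"))
print(resolve_uri("Dürst", "http://example.com/a/b"))
```

Non-ASCII characters pass through untouched, which mirrors the requirement that additional IRI characters are treated like unreserved characters.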
Encodes reserved characters in a string that is intended to be used in the path segment of a URI.
fn:encode-for-uri($value as xs:string?) as xs:string
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $value is the empty sequence, the function returns the zero-length string.
This function applies the URI escaping rules defined in section 2 of [RFC 3986] to the xs:string supplied as $value. The effect of the function is to escape reserved characters. Each such character in the string is replaced with its percent-encoded form as described in [RFC 3986].
Since [RFC 3986] recommends that, for consistency, URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings, this function must always generate hexadecimal values using the upper-case letters A-F.
All characters are escaped except those identified as "unreserved" by [RFC 3986], that is the upper- and lower-case letters A-Z, the digits 0-9, HYPHEN-MINUS ("-"), LOW LINE ("_"), FULL STOP ("."), and TILDE ("~").
This function escapes URI delimiters and therefore cannot be used indiscriminately to encode "invalid" characters in a path segment.
This function is invertible but not idempotent. This is because a string containing a percent character will be modified by applying the function: for example, 100% becomes 100%25, while 100%25 becomes 100%2525.
The expression fn:encode-for-uri("http://www.example.com/00/Weather/CA/Los%20Angeles#ocean") returns "http%3A%2F%2Fwww.example.com%2F00%2FWeather%2FCA%2FLos%2520Angeles%23ocean". (This is probably not what the user intended because all of the delimiters have been encoded.)
The expression concat("http://www.example.com/", encode-for-uri("~bébé")) returns "http://www.example.com/~b%C3%A9b%C3%A9".
The expression concat("http://www.example.com/", encode-for-uri("100% organic")) returns "http://www.example.com/100%25%20organic".
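For readers who want to experiment outside an XPath processor, the same escaping can be approximated in Python. This is a sketch under the assumption that urllib.parse.quote with an empty safe set matches the rules above: it leaves only letters, digits, "-", "_", "." and "~" unescaped, encodes other characters as UTF-8 octets, and emits upper-case hexadecimal digits. The function name encode_for_uri is illustrative.

```python
from urllib.parse import quote, unquote

def encode_for_uri(value):
    """Percent-encode everything except RFC 3986 unreserved characters."""
    if value is None:            # the empty sequence yields the zero-length string
        return ""
    return quote(value, safe="")

once = encode_for_uri("100%")
twice = encode_for_uri(once)
print(once, twice)               # not idempotent: the "%" is escaped again
print(unquote(once))             # invertible: decoding restores the input
```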
Converts a string containing an IRI into a URI according to the rules of [RFC 3987].
fn:iri-to-uri($value as xs:string?) as xs:string
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $value is the empty sequence, the function returns the zero-length string.
Otherwise, the function converts $value into a URI according to the rules given in Section 3.1 of [RFC 3987] by percent-encoding characters that are allowed in an IRI but not in a URI. If $value contains a character that is invalid in an IRI, such as the space character (see note below), the invalid character is replaced by its percent-encoded form as described in [RFC 3986] before the conversion is performed.
Since [RFC 3986] recommends that, for consistency, URI producers and normalizers should use uppercase hexadecimal digits for all percent-encodings, this function must always generate hexadecimal values using the upper-case letters A-F.
The function is idempotent but not invertible. Both the inputs My Documents and My%20Documents will be converted to the output My%20Documents.
This function does not check whether $value is a valid IRI. It treats it as a ·string· and operates on the ·characters· in the string.
The following printable ASCII characters are invalid in an IRI: "<", ">", """ (double quote), space, "{", "}", "|", "\", "^", and "`". Since these characters should not appear in an IRI, if they do appear in $value they will be percent-encoded. In addition, characters outside the range x20-x7E will be percent-encoded because they are invalid in a URI.
Since this function does not escape the PERCENT SIGN "%" and this character is not allowed in data within a URI, users wishing to convert character strings (such as file names) that include "%" to a URI should manually escape "%" by replacing it with "%25".
The expression fn:iri-to-uri("http://www.example.com/00/Weather/CA/Los%20Angeles#ocean") returns "http://www.example.com/00/Weather/CA/Los%20Angeles#ocean".
The expression fn:iri-to-uri("http://www.example.com/~bébé") returns "http://www.example.com/~b%C3%A9b%C3%A9".
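The character classes described in the notes above can be made concrete with a short Python sketch. It is an illustration only: the names iri_to_uri and INVALID_IN_IRI are hypothetical, None stands in for the empty sequence, and the sketch assumes that the set of characters to encode is exactly the ten printable ASCII characters listed above plus everything outside x20-x7E.

```python
# Printable ASCII characters that the notes list as invalid in an IRI.
INVALID_IN_IRI = set('<>" {}|\\^`')

def iri_to_uri(value):
    """Percent-encode characters that may not appear in a URI; "%" is left alone."""
    if value is None:
        return ""
    out = []
    for ch in value:
        if ch in INVALID_IN_IRI or not 0x20 <= ord(ch) <= 0x7E:
            # Encode the character as UTF-8 octets, upper-case hexadecimal.
            out.append("".join(f"%{b:02X}" for b in ch.encode("utf-8")))
        else:
            out.append(ch)
    return "".join(out)

print(iri_to_uri("My Documents"))
print(iri_to_uri("My%20Documents"))   # same output: idempotent, not invertible
```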
Escapes a URI in the same way that HTML user agents handle attribute values expected to contain URIs.
fn:escape-html-uri($value as xs:string?) as xs:string
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
If $value is the empty sequence, the function returns the zero-length string.
Otherwise, the function escapes all ·characters· except printable characters of the US-ASCII coded character set, specifically the ·codepoints· between 32 and 126 (decimal) inclusive. Each character in $value to be escaped is replaced by an escape sequence, which is formed by encoding the character as a sequence of octets in UTF-8, and then representing each of these octets in the form %HH, where HH is the hexadecimal representation of the octet. This function must always generate hexadecimal values using the upper-case letters A-F.
The behavior of this function corresponds to the recommended handling of non-ASCII characters in URI attribute values as described in [HTML 4.0] Appendix B.2.1.
The expression fn:escape-html-uri("http://www.example.com/00/Weather/CA/Los Angeles#ocean") returns "http://www.example.com/00/Weather/CA/Los Angeles#ocean".
The expression fn:escape-html-uri("javascript:if (navigator.browserLanguage == 'fr') window.open('http://www.example.com/~bébé');") returns "javascript:if (navigator.browserLanguage == 'fr') window.open('http://www.example.com/~b%C3%A9b%C3%A9');".
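The rule is narrow enough to state in a few lines of Python. The following sketch (the name escape_html_uri is illustrative, and None stands in for the empty sequence) escapes only codepoints outside 32 to 126, so spaces and URI delimiters are left as they are.

```python
def escape_html_uri(value):
    """Escape only characters outside printable US-ASCII (codepoints 32..126)."""
    if value is None:
        return ""
    return "".join(
        ch if 32 <= ord(ch) <= 126
        else "".join(f"%{b:02X}" for b in ch.encode("utf-8"))
        for ch in value
    )

print(escape_html_uri("http://www.example.com/~bébé"))
```

Because the space character (codepoint 32) is inside the preserved range, the first example above comes back unchanged.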
This section specifies functions that parse strings as URIs, to identify their structure, and construct URI strings from their structured representation.
Function | Meaning |
---|---|
fn:parse-uri | Parses the URI provided and returns a map of its parts. |
fn:build-uri | Constructs a URI from the parts provided. |
The structured representation of a URI is described by the uri-structure-record:
†uri-structure-record:
record(
  uri? as xs:string,
  scheme? as xs:string,
  authority? as xs:string,
  userinfo? as xs:string,
  host? as xs:string,
  port? as xs:string,
  path? as xs:string,
  query? as xs:string,
  fragment? as xs:string,
  path-segments? as array(xs:string),
  query-segments? as array(record(key? as xs:string, value? as xs:string, *)),
  *
)
The parts of this structure are:
uri | The original URI. This element is returned by fn:parse-uri, but ignored by fn:build-uri. |
scheme | The URI scheme (e.g., “https” or “file”). |
authority | The authority portion of the URI (e.g., “example.com:8080”). |
userinfo | Any userinfo that was passed as part of the authority. |
host | The host passed as part of the authority (e.g., “example.com”). |
port | The port passed as part of the authority (e.g., “8080”). |
path | The path portion of the URI. |
query | Any query string. |
fragment | Any fragment identifier. |
path-segments | Parsed and unescaped path segments. |
query-segments | Parsed and unescaped query terms. |
* | Additional, implementation-defined structures are allowed. |
The segmented forms of the path and query parameters provide convenient access to commonly used information. They’re represented in the map as arrays, instead of sequences, just for the convenience of serializing the structure.
The path, if there is one, is tokenized on “/” characters and each segment is unescaped. Consider the URI http://example.com/path/to/a%2fb. The path portion has to be returned as /path/to/a%2fb because decoding the %2f would change the nature of the path. The unescaped form is easily accessible from the path-segments array:
[ "", "path", "to", "a/b" ]
Note that the presence or absence of a leading slash on the path will affect whether or not the array begins with an empty string.
The query parameters are similarly decoded. Consider the URI: http://example.com/path?a=1&b=2%264&a=3.
Here the decoded form in the query-segments gives quick access to the parameter values:
[ { "key": "a", "value": "1" }, { "key": "b", "value": "2&4" }, { "key": "a", "value": "3" } ]
Note that both keys and values are unescaped and that it’s an array of maps because keys can be repeated, as seen for a in this example.
Parses the URI provided and returns a map of its parts.
fn:parse-uri($uri as xs:string, $options as map(*) := map{}) as †uri-structure-record
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
The function parses the $uri provided, returning a map containing its constituent parts: scheme, authority components, path, etc.
In addition to parsing URIs as defined by [RFC 3986] (and [RFC 3987]), this function also attempts to account for strings that are not valid URIs but that often appear in URI-adjacent spaces, such as file names.
This function is described as a series of transformations over the input string to identify the parts of a URI that are present. Some portions of the URI are identified by matching with a regular expression. This approach is designed to make the description clear and unambiguous; it is not implementation advice.
Begin with a string that is equal to the $uri.
If the string contains any backslashes (“\”), replace them with forward slashes (“/”).
If the string matches ^(.*)#([^#]*)$, the string is the first match group and the fragment is the second match group. Otherwise, the string is unchanged and the fragment is the empty sequence.
If the string matches ^(.*)\?([^\?]*)$, the string is the first match group and the query is the second match group. Otherwise, the string is unchanged and the query is the empty sequence.
If the string matches ^[a-zA-Z]:, the scheme is file and the string is unchanged. Otherwise, if the string matches ^([a-zA-Z][A-Za-z0-9\+\-\.]*):(.*)$, the scheme is the first match group and the string is the second match group. If the string does not match either expression, the scheme is the empty sequence and the string is unchanged.
If the string matches ^//*([a-zA-Z]:.*)$, the authority is empty and the string is the first match group. Otherwise, if the string matches ^///*([^/]+)(/.*)?$ then the authority is the first match group and the string is the second match group. If the string does not match either regular expression, the authority is the empty sequence and the string is unchanged.
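The transformations described so far (backslash replacement, then the fragment, query, scheme and authority steps) can be traced with a small Python sketch that applies the regular expressions exactly as written. The function name split_uri and the use of None for the empty sequence are assumptions made for illustration; the sketch stops before the userinfo, host and port steps.

```python
import re

def split_uri(uri):
    """Return (scheme, authority, rest, query, fragment); None means 'empty sequence'."""
    s = uri.replace("\\", "/")
    scheme = authority = query = fragment = None

    m = re.match(r"^(.*)#([^#]*)$", s, re.DOTALL)
    if m:
        s, fragment = m.group(1), m.group(2)

    m = re.match(r"^(.*)\?([^\?]*)$", s, re.DOTALL)
    if m:
        s, query = m.group(1), m.group(2)

    if re.match(r"^[a-zA-Z]:", s):
        scheme = "file"                      # drive letter: string is unchanged
    else:
        m = re.match(r"^([a-zA-Z][A-Za-z0-9\+\-\.]*):(.*)$", s, re.DOTALL)
        if m:
            scheme, s = m.group(1), m.group(2)

    m = re.match(r"^//*([a-zA-Z]:.*)$", s, re.DOTALL)
    if m:
        s = m.group(1)                       # drive letter after slashes: no authority
    else:
        m = re.match(r"^///*([^/]+)(/.*)?$", s, re.DOTALL)
        if m:
            authority, s = m.group(1), m.group(2) or ""
    return scheme, authority, s, query, fragment

print(split_uri("file:///c:/path/to/file"))
print(split_uri("c:\\path\\to\\file"))
```

Both calls yield the scheme file, no authority, and the remaining string c:/path/to/file, which is why the Windows-style inputs in the examples later in this section all produce the same path.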
If the authority matches (([^@]*)@)(.*)(:([^:]*))?$, then the userinfo is match group 2, otherwise userinfo is the empty sequence.
If the authority matches (([^@]*)@)?(.+)(:([^:]*))?$, then the host is match group 3, otherwise host is the empty sequence.
If the authority matches (([^@]*)@)?(.*)(:([^:]*))$, then the port is match group 5, otherwise port is the empty sequence.
If the string is the empty string, then path is the empty sequence, otherwise the path is the whole string.
If the $options map contains a key named “path-separator”, the value of that key is the path separator; otherwise the separator is a single slash (“/”). It is a dynamic error XXXX if the key is present and its value is not a string of length one.
A path-segments array is constructed as follows: tokenize the string on the path separator, apply uri decoding on each token, and convert the result to an array.
Applying uri decoding replaces all occurrences of plus (“+”) with spaces and all occurrences of %[a-fA-F0-9][a-fA-F0-9] with a single character with the codepoint represented by the two digit hexadecimal number that follows the “%”. In other words, “A%42C” becomes “ABC”. If there are any occurrences of % followed by up to two characters that are not hexadecimal digits, they are replaced by the single character with the codepoint 0xfffd. In other words “A%XYC%Z” becomes “A�C�”.
If the $options map contains a key named “query-separator”, the value of that key is the query separator; otherwise the separator is a single ampersand (“&”). It is a dynamic error XXXX if the key is present and its value is not a string of length one.
A query-segments array is constructed as follows: tokenize the query on the query separator. For each token, construct a map. If the token contains an equal sign (“=”), the map contains a key named key with a value equal to the string preceding the first equal sign and a key named value with a value equal to the string following the first equal sign. If the token does not contain an equal sign, the map contains a single key named value with a value equal to the token. In every case, uri decoding is applied to each value added to the map. The resulting sequence of maps is converted into an array.
The following map is returned:
{ "uri": $uri, "scheme": scheme, "authority": authority, "userinfo": userinfo, "host": host, "port": port, "path": path, "query": query, "fragment": fragment, "path-segments": path-segments, "query-segments": query-segments }
The map should only be populated with keys that have a non-empty value (keys whose value is the empty sequence or an empty array should be omitted).
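The uri decoding rule and the construction of path-segments and query-segments can be sketched in Python as follows. The names uri_decode, path_segments and query_segments are illustrative, and ValueError stands in for the unnamed dynamic error. The sketch follows the text literally, so each %HH becomes the single character with that codepoint; it does not reassemble multi-byte UTF-8 sequences. How a “%” followed by a mix of hexadecimal and non-hexadecimal characters is handled is an assumption, since the text does not spell that case out.

```python
import re

def uri_decode(text):
    """Apply the decoding rule above: '+' to space, %HH to a character, junk to U+FFFD."""
    text = text.replace("+", " ")            # done first, so %2B still yields "+"
    def repl(m):
        return chr(int(m.group(1), 16)) if m.group(1) else "\ufffd"
    # Second alternative: "%" followed by up to two non-hexadecimal characters.
    return re.sub(r"%([0-9A-Fa-f]{2})|%[^0-9A-Fa-f%]{0,2}", repl, text)

def path_segments(path, separator="/"):
    if len(separator) != 1:
        raise ValueError("path separator must be a single character")
    return [uri_decode(token) for token in path.split(separator)]

def query_segments(query, separator="&"):
    if len(separator) != 1:
        raise ValueError("query separator must be a single character")
    segments = []
    for token in query.split(separator):
        key, eq, value = token.partition("=")    # split on the first "=" only
        if eq:
            segments.append({"key": uri_decode(key), "value": uri_decode(value)})
        else:
            segments.append({"value": uri_decode(token)})
    return segments

print(path_segments("/path/to/a%2fb"))
print(query_segments("a=1&b=2%264&a=3"))
```

Returning a list of single-purpose dictionaries rather than one dictionary keeps repeated keys, such as the two values for a, in their original order.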
Implementations may implement additional or different rules for URIs that have a scheme or pattern that they recognize. An implementation might choose to parse jar: URIs with special rules, for example, since they extend the syntax in ways not defined by [RFC 3986]. Implementations may add additional keys to the map. The meaning of those keys is implementation-defined.
TODO: In order to better support implementation extensibility, should the keys in the map be QNames with the requirement that implementation-defined keys be in a non-empty namespace?
An error is raised XXXX if the supplied path separator is not a single character.
An error is raised XXXX if the supplied query separator is not a single character.
Like fn:resolve-uri, this function handles the additional characters allowed in [RFC 3987] IRIs in the same way that other unreserved characters are handled.
Unlike fn:resolve-uri, this function is not attempting to resolve one URI against another and consequently, the errors that can arise under those circumstances do not apply here. The fn:parse-uri function will accept strings that would raise errors if resolution were attempted; see fn:build-uri.
In the examples that follow, keys whose values are the empty sequence or an empty array are elided for editorial clarity.
The expression fn:parse-uri("http://qt4cg.org/specifications/xpath-functions-40/Overview.html#parse-uri")
returns
map { "uri": "http://qt4cg.org/specifications/xpath-functions-40/Overview.html#parse-uri", "scheme": "http", "authority": "qt4cg.org", "host": "qt4cg.org", "path": "/specifications/xpath-functions-40/Overview.html", "fragment": "parse-uri", "path-segments": array { "", "specifications", "xpath-functions-40", "Overview.html" } }
The expression fn:parse-uri("http://www.ietf.org/rfc/rfc2396.txt")
returns
map { "uri": "http://www.ietf.org/rfc/rfc2396.txt", "scheme": "http", "authority": "www.ietf.org", "host": "www.ietf.org", "path": "/rfc/rfc2396.txt", "path-segments": array { "", "rfc", "rfc2396.txt" } }
The expression fn:parse-uri("https://example.com/path/to/file")
returns
map { "uri": "https://example.com/path/to/file", "scheme": "https", "authority": "example.com", "host": "example.com", "path": "/path/to/file", "path-segments": array { "", "path", "to", "file" } }
The expression fn:parse-uri("https://example.com:8080/path?s=%22hello world%22&sort=relevance")
returns
map { "uri": "https://example.com:8080/path?s=%22hello world%22&sort=relevance", "scheme": "https", "authority": "example.com:8080", "host": "example.com", "port": "8080", "path": "/path", "query": "s=%22hello world%22&sort=relevance", "query-segments": array { map { "key": "s", "value": """hello world""" }, map { "key": "sort", "value": "relevance" } }, "path-segments": array { "", "path" } }
The expression fn:parse-uri("https://user@example.com/path/to/file")
returns
map { "uri": "https://user@example.com/path/to/file", "scheme": "https", "authority": "user@example.com", "userinfo": "user", "host": "example.com", "path": "/path/to/file", "path-segments": array { "", "path", "to", "file" } }
The expression fn:parse-uri("ftp://ftp.is.co.za/rfc/rfc1808.txt")
returns
map { "uri": "ftp://ftp.is.co.za/rfc/rfc1808.txt", "scheme": "ftp", "authority": "ftp.is.co.za", "host": "ftp.is.co.za", "path": "/rfc/rfc1808.txt", "path-segments": array { "", "rfc", "rfc1808.txt" } }
The expression fn:parse-uri("file:////uncname/path/to/file")
returns
map { "uri": "file:////uncname/path/to/file", "scheme": "file", "authority": "uncname", "host": "uncname", "path": "/path/to/file", "path-segments": array { "", "path", "to", "file" } }
The expression fn:parse-uri("file:///c:/path/to/file")
returns
map { "uri": "file:///c:/path/to/file", "scheme": "file", "path": "c:/path/to/file", "path-segments": array { "c:", "path", "to", "file" } }
The expression fn:parse-uri("file:/C:/Program%20Files/test.jar")
returns
map { "uri": "file:/C:/Program%20Files/test.jar", "scheme": "file", "path": "C:/Program%20Files/test.jar", "path-segments": array { "C:", "Program Files", "test.jar" } }
The expression fn:parse-uri("file:\\c:\path\to\file")
returns
map { "uri": "file:\\c:\path\to\file", "scheme": "file", "path": "c:/path/to/file", "path-segments": array { "c:", "path", "to", "file" } }
The expression fn:parse-uri("file:\c:\path\to\file")
returns
map { "uri": "file:\c:\path\to\file", "scheme": "file", "path": "c:/path/to/file", "path-segments": array { "c:", "path", "to", "file" } }
The expression fn:parse-uri("c:\path\to\file")
returns
map { "uri": "c:\path\to\file", "scheme": "file", "path": "c:/path/to/file", "path-segments": array { "c:", "path", "to", "file" } }
The expression fn:parse-uri("/path/to/file")
returns
map { "uri": "/path/to/file", "path": "/path/to/file", "path-segments": array { "", "path", "to", "file" } }
The expression fn:parse-uri("#testing")
returns
map { "uri": "#testing", "path": "", "fragment": "testing" }
The expression fn:parse-uri("?q=1")
returns
map { "uri": "?q=1", "path": "", "query": "q=1", "query-segments": array { map { "key": "q", "value": "1" } } }
The expression fn:parse-uri("ldap://[2001:db8::7]/c=GB?objectClass?one")
returns
map { "uri": "ldap://[2001:db8::7]/c=GB?objectClass?one", "scheme": "ldap", "authority": "[2001:db8::7]", "host": "[2001:db8::7]", "path": "/c=GB", "query": "objectClass?one", "query-segments": array { map { "value": "objectClass?one" } }, "path-segments": array { "", "c=GB" } }
The expression fn:parse-uri("mailto:John.Doe@example.com")
returns
map { "uri": "mailto:John.Doe@example.com", "scheme": "mailto", "path": "John.Doe@example.com", "path-segments": array { "John.Doe@example.com" } }
The expression fn:parse-uri("news:comp.infosystems.www.servers.unix")
returns
map { "uri": "news:comp.infosystems.www.servers.unix", "scheme": "news", "path": "comp.infosystems.www.servers.unix", "path-segments": array { "comp.infosystems.www.servers.unix" } }
The expression fn:parse-uri("tel:+1-816-555-1212")
returns
map { "uri": "tel:+1-816-555-1212", "scheme": "tel", "path": "+1-816-555-1212", "path-segments": array { " 1-816-555-1212" } }
The expression fn:parse-uri("telnet://192.0.2.16:80/")
returns
map { "uri": "telnet://192.0.2.16:80/", "scheme": "telnet", "authority": "192.0.2.16:80", "host": "192.0.2.16", "port": "80", "path": "/", "path-segments": array { "", "" } }
The expression fn:parse-uri("urn:oasis:names:specification:docbook:dtd:xml:4.1.2")
returns
map { "uri": "urn:oasis:names:specification:docbook:dtd:xml:4.1.2", "scheme": "urn", "path": "oasis:names:specification:docbook:dtd:xml:4.1.2", "path-segments": array { "oasis:names:specification:docbook:dtd:xml:4.1.2" } }
The expression fn:parse-uri("tag:textalign.net,2015:ns")
returns
map { "uri": "tag:textalign.net,2015:ns", "scheme": "tag", "path": "textalign.net,2015:ns", "path-segments": [ "textalign.net,2015:ns" ] }
The expression fn:parse-uri("tag:jan@example.com,1999-01-31:my-uri")
returns
map { "uri": "tag:jan@example.com,1999-01-31:my-uri", "scheme": "tag", "path": "jan@example.com,1999-01-31:my-uri", "path-segments": [ "jan@example.com,1999-01-31:my-uri" ] }
This example uses the algorithm described above, not an algorithm that is specifically aware of the jar: scheme.
The expression fn:parse-uri("jar:file:/C:/Program%20Files/test.jar!/foo/bar")
returns
map { "uri": "jar:file:/C:/Program%20Files/test.jar!/foo/bar", "scheme": "jar", "path": "file:/C:/Program%20Files/test.jar!/foo/bar", "path-segments": array { "file:", "C:", "Program Files", "test.jar!", "foo", "bar" } }
This example demonstrates that parsing the URI treats non-URI characters in lexical IRIs as “unreserved characters”. The rationale for this is given in the description of fn:resolve-uri.
The expression fn:parse-uri("http://www.example.org/Dürst")
returns
map { "uri": "http://www.example.org/Dürst", "scheme": "http", "authority": "www.example.org", "host": "www.example.org", "path": "/Dürst", "path-segments": [ "","Dürst" ] }
This example demonstrates a non-standard query separator.
The expression
fn:parse-uri("https://example.com:8080/path?s=%22hello world%22;sort=relevance",
map { "query-separator": ";" })
returns
map { "uri": "https://example.com:8080/path?s=%22hello world%22;sort=relevance", "scheme": "https", "authority": "example.com:8080", "host": "example.com", "port": "8080", "path": "/path", "query": "s=%22hello world%22;sort=relevance", "query-segments": array { map { "key": "s", "value": """hello world""" }, map { "key": "sort", "value": "relevance" } }, "path-segments": array { "", "path" } }
This example uses an invalid query separator so raises an error.
The expression fn:parse-uri("https://example.com:8080/path?s=%22hello world%22;;sort=relevance",
map { "query-separator": ";;" })
raises error FOXX0000.
Proposed on 17 Oct 2022 to resolve issue #72. Accepted in principle on 15 Nov 2022, with some details still to be resolved.
Constructs a URI from the parts provided.
fn:build-uri($parts as †uri-structure-record, $options as map(*) := map{}) as xs:string
This function is ·deterministic·, ·context-independent·, and ·focus-independent·.
A URI is composed from a scheme, authority, path, query, and fragment. These components are derived from the contents of the $parts map in the following way:
If the scheme key is present in the map, the URI begins with the value of that key concatenated with ://, otherwise it begins //.
If any of userinfo, host, or port are present in the map, the following authority is added to the URI under construction:
concat((if (exists($parts?userinfo)) then $parts?userinfo || "@" else ""), $parts?host, (if (exists($parts?port)) then ":" || $parts?port else ""))
If none of userinfo, host, or port is present, and authority is present, the value of the authority key is added to the URI.
If the path-segments key exists in the map, then the path is constructed with string-join($parts?path-segments?* ! encode-for-uri(.), "/"), otherwise the value of the path key is used. If the path value is the empty sequence, the empty string is used for the path. The path is added to the URI.
If the query-segments key exists in the map, then a sequence of strings is constructed from each segment in turn. If the segment contains both a key and a value, the string is the concatenation of the value of the key, an equal sign (“=”), and the value of the value. If it contains only one of those keys, then it is the value of that key. If it contains neither, it is ignored. The query is constructed by joining the resulting strings into a single string, separated by ampersands (“&”).
If the query-segments key does not exist in the map, but the query key does, then the query is the value of the query key. If there’s a query, it is added to the URI with a preceding question mark (“?”).
If the fragment key exists in the map, then the value of that key is added to the URI with a preceding hash mark (“#”).
The resulting URI is returned.
The expression fn:build-uri(map {
"scheme": "https",
"host": "qt4cg.org",
"port": (),
"path": "/specifications/index.html"
})
returns "https://qt4cg.org/specifications/index.html".
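The construction rules can be followed step by step in a Python sketch. Everything here is illustrative rather than normative: build_uri takes a plain dictionary, None (or a missing key) stands in for the empty sequence, and urllib.parse.quote with an empty safe set is assumed to behave like encode-for-uri. The scheme is joined with "://" so that the result agrees with the example above, and query keys and values are not escaped because the rules do not call for it.

```python
from urllib.parse import quote

def build_uri(parts):
    """Assemble a URI string from a dict shaped like uri-structure-record."""
    scheme = parts.get("scheme")
    uri = scheme + "://" if scheme is not None else "//"

    userinfo, host, port = parts.get("userinfo"), parts.get("host"), parts.get("port")
    if any(v is not None for v in (userinfo, host, port)):
        uri += (userinfo + "@" if userinfo is not None else "")
        uri += host or ""
        uri += (":" + port if port is not None else "")
    elif parts.get("authority") is not None:
        uri += parts["authority"]

    if parts.get("path-segments") is not None:
        uri += "/".join(quote(seg, safe="") for seg in parts["path-segments"])
    else:
        uri += parts.get("path") or ""

    query = parts.get("query")
    if parts.get("query-segments") is not None:
        terms = []
        for seg in parts["query-segments"]:
            if "key" in seg and "value" in seg:
                terms.append(seg["key"] + "=" + seg["value"])
            elif "key" in seg or "value" in seg:
                terms.append(seg.get("key", seg.get("value")))
            # segments with neither key nor value are ignored
        query = "&".join(terms)
    if query is not None:
        uri += "?" + query

    if parts.get("fragment") is not None:
        uri += "#" + parts["fragment"]
    return uri

print(build_uri({"scheme": "https", "host": "qt4cg.org", "port": None,
                 "path": "/specifications/index.html"}))
```

The segmented forms take precedence over the path and query strings, so a caller can edit individual segments and let the function re-escape the path.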
Proposed on 17 Oct 2022 to resolve issue #72. Accepted in principle on 15 Nov 2022, with some details still to be resolved.