21 Maps

Maps are defined in the XDM Data Model.

21.2 Map Instructions

Two instructions are added to XSLT to facilitate the construction of maps.

<!-- Category: instruction -->
<xsl:map
  on-duplicates? = expression >
  <!-- Content: sequence-constructor -->
</xsl:map>

The instruction xsl:map constructs and returns a new map.

The contained sequence constructor must evaluate to a sequence of maps: call this $maps.

In the absence of duplicate keys, the result of the instruction is then given by the XPath 3.1 expression:

map:merge($maps)

Note:

Informally: in the absence of duplicate keys the resulting map contains the union of the map entries from the supplied sequence of maps.

The handling of duplicate keys is described in 21.2.1 Handling of duplicate keys below.

There is no requirement that the supplied input maps should have the same or compatible types. The type of a map (for example map(xs:integer, xs:string)) is descriptive of the entries it currently contains, but is not a constraint on how the map may be combined with other maps.

[ERR XTTE3375] A type error occurs if the result of evaluating the sequence constructor is not an instance of the required type map(*)*.

Note:

In practice, the effect of this rule is that the sequence constructor contained in the xsl:map instruction is severely constrained: it doesn’t make sense, for example, for it to contain instructions such as xsl:element that create new nodes. As with other type errors, processors are free to signal the error statically if they are able to determine that the sequence constructor would always fail when evaluated.
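
Example: Combining existing maps

The following sketch is illustrative only; the variable names $defaults and $settings, and the keys used, are invented for this example. The sequence constructor delivers a sequence of two maps. The keys are distinct, so no duplicate-key handling is involved, and the result is the same as that of map:merge applied to the two maps.

<xsl:variable name="defaults" as="map(xs:string, xs:string)"
    select="map{'indent': 'yes', 'encoding': 'UTF-8'}"/>

<xsl:variable name="settings" as="map(xs:string, xs:string)">
  <xsl:map>
    <!-- An existing map: contributes the entries 'indent' and 'encoding' -->
    <xsl:sequence select="$defaults"/>
    <!-- A second map: contributes the entry 'method' -->
    <xsl:sequence select="map{'method': 'xml'}"/>
  </xsl:map>
</xsl:variable>

The value of $settings is a map with the three keys indent, encoding, and method.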

<!-- Category: instruction -->
<xsl:map-entry
  key = expression
  select? = expression >
  <!-- Content: sequence-constructor -->
</xsl:map-entry>

The instruction xsl:map-entry constructs and returns a singleton map: that is, a map which contains one key and one value. Such a map is primarily used as a building block when constructing maps using the xsl:map instruction.

The select attribute and the contained sequence constructor are mutually exclusive: if a select attribute is present, then the content must be empty except optionally for xsl:fallback instructions.

[ERR XTSE3280] It is a static error if the select attribute of the xsl:map-entry element is present unless the element has no children other than xsl:fallback elements.

The key of the entry in the new map is the value obtained by evaluating the expression in the key attribute, converted to the required type xs:anyAtomicType by applying the coercion rules. If the supplied key (after conversion) is of type xs:untypedAtomic, it is cast to xs:string.

The associated value is the value obtained by evaluating the expression in the select attribute, or the contained sequence constructor, with no conversion. If there is no select attribute and the sequence constructor is empty, the associated value is the empty sequence.

Example: Using XSLT instructions to create a fixed map

The following example binds a variable to a map whose content is statically known:

<xsl:variable name="week" as="map(xs:string, xs:string)">
  <xsl:map>
    <xsl:map-entry key="'Mo'" select="'Monday'"/>
    <xsl:map-entry key="'Tu'" select="'Tuesday'"/>
    <xsl:map-entry key="'We'" select="'Wednesday'"/>
    <xsl:map-entry key="'Th'" select="'Thursday'"/>
    <xsl:map-entry key="'Fr'" select="'Friday'"/>
    <xsl:map-entry key="'Sa'" select="'Saturday'"/>
    <xsl:map-entry key="'Su'" select="'Sunday'"/>
  </xsl:map>
</xsl:variable>

Example: Using XSLT instructions to create a computed map

The following example binds a variable to a map acting as an index into a source document:

<xsl:variable name="index" as="map(xs:string, element(employee))">
  <xsl:map>
    <xsl:for-each select="//employee">
      <xsl:map-entry key="@empNr" select="."/>
    </xsl:for-each>
  </xsl:map>
</xsl:variable>  
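
Example: Supplying the value of an entry using a sequence constructor

The following sketch (the variable name, keys, and values are invented for illustration) shows the three ways of supplying the value of an entry: the select attribute, a contained sequence constructor, and neither, in which case the value is the empty sequence.

<xsl:variable name="report" as="map(xs:string, item()*)">
  <xsl:map>
    <!-- Value supplied by the select attribute -->
    <xsl:map-entry key="'title'" select="'Quarterly Report'"/>
    <!-- Value supplied by the sequence constructor: a sequence of two strings -->
    <xsl:map-entry key="'tags'">
      <xsl:sequence select="'draft', 'internal'"/>
    </xsl:map-entry>
    <!-- No select attribute and no content: the value is the empty sequence -->
    <xsl:map-entry key="'notes'"/>
  </xsl:map>
</xsl:variable>

Here map:size($report) is 3, count($report('tags')) is 2, and empty($report('notes')) is true, although map:contains($report, 'notes') is also true.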

21.2.1 Handling of duplicate keys

This section describes what happens when two or more maps returned by the sequence constructor within an xsl:map instruction contain duplicate keys: that is, when one of these maps contains an entry with key K, and another contains an entry with key L, and op:same-key(K, L) is true.

[ERR XTDE3365] In the absence of the on-duplicates attribute, a dynamic error occurs if the set of keys in the maps resulting from evaluating the sequence constructor contains duplicates.

The result of evaluating the on-duplicates attribute, if present, must be a function with arity 2. When the xsl:map instruction encounters two map entries having the same key, the two values associated with this key are passed as arguments to this function, and the function returns the value that should be associated with this key in the final map.

The order of the arguments passed to the function reflects the order of the maps in which the duplicate entries appear: if map M and map N contain values VM and VN for the same key, and M precedes N in the sequence of maps returned by the sequence constructor, then the callback function is called with arguments VM and VN in that order.

If more than two maps contain values for the same key, then the callback function is invoked repeatedly. Let F be the callback function. Then if (for example) four maps supply the values A, B, C, and D for a given key K, in that order, the evaluation is as follows:

  1. F(A, B) is called; let its return value be X.

  2. F(X, C) is called; let its return value be Y.

  3. F(Y, D) is called; let its return value be Z.

  4. The value that is associated with key K in the final map will be Z.

Thus, if the values are all singleton items (which is not necessarily the case), and if the sequence of values is S, then the final result is fold-left(tail(S), head(S), F).
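
Example: Repeated invocation of the callback function

The following sketch (illustrative only; the variable name, key, and values are invented) supplies four entries for the same key. The callback joins its two arguments with a hyphen, which makes both the order of the arguments and the repeated invocation visible in the result.

<xsl:variable name="joined" as="map(xs:string, xs:string)">
  <xsl:map on-duplicates="function($a, $b){$a || '-' || $b}">
    <!-- Four singleton maps, all with the key 'K' -->
    <xsl:map-entry key="'K'" select="'A'"/>
    <xsl:map-entry key="'K'" select="'B'"/>
    <xsl:map-entry key="'K'" select="'C'"/>
    <xsl:map-entry key="'K'" select="'D'"/>
  </xsl:map>
</xsl:variable>

The callback is called as F('A', 'B'), then F('A-B', 'C'), then F('A-B-C', 'D'), so the value of $joined('K') is the string A-B-C-D. The same value is returned by the XPath expression fold-left(('B', 'C', 'D'), 'A', function($a, $b){$a || '-' || $b}).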

For example, the following table shows some useful callback functions that might be supplied, and explains their effect:

Function                                       Effect
function($a, $b){$a}                           The first of the duplicate values is used.
function($a, $b){$b}                           The last of the duplicate values is used.
function($a, $b){$a, $b}                       The sequence-concatenation of the duplicate values is used.
function($a, $b){max(($a, $b))}                The highest of the duplicate values is used.
function($a, $b){min(($a, $b))}                The lowest of the duplicate values is used.
function($a, $b){string-join(($a, $b), ', ')}  The comma-separated string concatenation of the duplicate values is used.
function($a, $b){error()}                      Duplicates are rejected as an error (this is the default in the absence of a callback function).

Example: Combining Duplicates into an Array

This example takes as input an XML document such as:

<data>
   <event id="A23" value="12"/>
   <event id="A24" value="5"/>
   <event id="A25" value="9"/>
   <event id="A23" value="2"/>
</data>

and constructs a map whose JSON representation is:

{"A23": [12, 2], "A24": [5], "A25": [9]}

The logic is:

<xsl:template match="data">
   <xsl:map on-duplicates="function($a, $b){array:join(($a, $b))}">
     <xsl:for-each select="event">
        <xsl:map-entry key="@id" select="[xs:integer(@value)]"/>
     </xsl:for-each>
   </xsl:map>
</xsl:template>

21.4 Maps and Streaming

Maps have many uses, but their introduction to XSLT 3.0 was strongly motivated by streaming use cases. In essence, when a source document is processed in streaming mode, data that is encountered in the course of processing may need to be retained in variables for subsequent use, because the nodes cannot be revisited. This creates a need for a flexible data structure to accommodate such temporary data, and maps were designed to fulfil this need.

The entries in a map are not allowed to contain references to streamed nodes. This is achieved by ensuring that for all constructs that supply content to be included in a map (for example the third argument of map:put, and the select attribute of xsl:map-entry), the relevant operand is defined to have operand usage navigation. Because maps cannot contain references to streamed nodes, they are effectively grounded, and can therefore be used freely in contexts (such as parameters to functions or templates) where only grounded operands are permitted.

The xsl:map instruction, and the XPath MapConstructor construct, are exceptions to the general rule that during streaming, only one downward selection (one consuming subexpression) is permitted. They share this characteristic with xsl:fork. As with xsl:fork, a streaming processor is expected to be able to construct the map during a single pass of the streamed input document, which may require multiple expressions to be evaluated in parallel.

In the case of the xsl:map instruction, this exemption applies only in the case where the instruction consists exclusively of xsl:map-entry (and xsl:fallback) children, and not in more complex cases where the map entries are constructed dynamically (for example using a control flow implemented using xsl:choose, xsl:for-each, or xsl:call-template). Such cases may, of course, be streamable if they only have a single consuming subexpression.

For example, the following XPath expression is streamable, despite making two downward selections:

let $m := map{'price':xs:decimal(price), 'discount':xs:decimal(discount)} 
return ($m?price - $m?discount)

Analysis:

  1. Because the return clause is motionless, the sweep of the let expression is the sweep of the map expression (the expression in curly brackets).

  2. The sweep of a map expression is the maximum sweep of its key/value pairs.

  3. For both key/value pairs, the key is motionless and the value is consuming.

  4. The expression carefully atomizes both values, because retaining references to streamed nodes in a map is not permitted.

  5. Therefore the map expression, and hence the expression as a whole, is grounded and consuming.
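
Example: A streamable xsl:map instruction

The following sketch shows an xsl:map counterpart of the expression above. It assumes a hypothetical streamable mode named s and input elements named item, each with price and discount children; the output element net is also invented. None of these names is taken from this specification. Because the xsl:map instruction has only xsl:map-entry children, the exemption described above is expected to apply, even though two downward selections are made.

<xsl:mode name="s" streamable="yes"/>

<xsl:template match="item" mode="s">
  <xsl:variable name="m" as="map(xs:string, xs:decimal)">
    <xsl:map>
      <!-- Each value is atomized, so the map holds no reference to a streamed node -->
      <xsl:map-entry key="'price'" select="xs:decimal(price)"/>
      <xsl:map-entry key="'discount'" select="xs:decimal(discount)"/>
    </xsl:map>
  </xsl:variable>
  <!-- The map is grounded, so the variable reference makes no downward selection -->
  <net><xsl:value-of select="$m?price - $m?discount"/></net>
</xsl:template>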

See also: 19.8.8.17 Streamability of Map Constructors, 19.8.4.23 Streamability of xsl:map, 19.8.4.24 Streamability of xsl:map-entry

21.5 Examples using Maps

This section gives some examples of where maps can be useful.

Example: Using Maps with xsl:iterate

This example uses maps in conjunction with the xsl:iterate instruction to find the highest-earning employee in each department, in a single streaming pass of an input document containing employee records.

<xsl:source-document streamable="yes" href="employees.xml">
  <xsl:iterate select="*/employee">
    <xsl:param name="highest-earners" 
               as="map(xs:string, element(employee))" 
               select="map{}"/>
    <xsl:on-completion>
      <xsl:for-each select="map:keys($highest-earners)">
        <department name="{.}">
          <xsl:copy-of select="$highest-earners(.)"/>
        </department>
      </xsl:for-each>
    </xsl:on-completion>           
    <xsl:variable name="this" select="copy-of(.)" as="element(employee)"/> 
    <xsl:next-iteration>
      <xsl:with-param name="highest-earners"
          select="let $existing := $highest-earners($this/department)
                  return if ($existing/salary gt $this/salary)
                         then $highest-earners
                         else map:put($highest-earners, $this/department, $this)"/>
    </xsl:next-iteration>
  </xsl:iterate>
</xsl:source-document>

Example: Using Maps to Implement Complex Numbers

A complex number might be represented as a map with two entries, the keys being the xs:int value 0 for the real part, and the xs:int value 1 for the imaginary part; these keys are held in the static variables $REAL and $IMAG. A library for manipulation of complex numbers might include functions such as the following:

<xsl:variable name="REAL" static="yes" as="xs:int" select="0"/> 
<xsl:variable name="IMAG" static="yes" as="xs:int" select="1"/> 
                     
<xsl:function name="i:complex" as="map(xs:int, xs:double)">
  <xsl:param name="real" as="xs:double"/>
  <xsl:param name="imaginary" as="xs:double"/>
  <xsl:sequence select="map{ $REAL : $real, $IMAG : $imaginary }"/>
</xsl:function>

<xsl:function name="i:real" as="xs:double">
  <xsl:param name="complex" as="map(xs:int, xs:double)"/>
  <xsl:sequence select="$complex($REAL)"/>
</xsl:function>

<xsl:function name="i:imaginary" as="xs:double">
  <xsl:param name="complex" as="map(xs:int, xs:double)"/>
  <xsl:sequence select="$complex($IMAG)"/>
</xsl:function>

<xsl:function name="i:add" as="map(xs:int, xs:double)">
  <xsl:param name="arg1" as="map(xs:int, xs:double)"/>
  <xsl:param name="arg2" as="map(xs:int, xs:double)"/>
  <xsl:sequence select="i:complex(i:real($arg1)+i:real($arg2), 
                                  i:imaginary($arg1)+i:imaginary($arg2))"/>
</xsl:function>

<xsl:function name="i:multiply" as="map(xs:int, xs:double)">
  <xsl:param name="arg1" as="map(xs:int, xs:double)"/>
  <xsl:param name="arg2" as="map(xs:int, xs:double)"/>
  <xsl:sequence select="i:complex(
      i:real($arg1)*i:real($arg2) - i:imaginary($arg1)*i:imaginary($arg2),
      i:real($arg1)*i:imaginary($arg2) + i:imaginary($arg1)*i:real($arg2))"/>
</xsl:function>
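
The library might be extended in the same style. The following function, i:conjugate, is a hypothetical addition given only as a sketch; it relies solely on i:complex, i:real, and i:imaginary:

<xsl:function name="i:conjugate" as="map(xs:int, xs:double)">
  <xsl:param name="arg" as="map(xs:int, xs:double)"/>
  <!-- Keep the real part and negate the imaginary part -->
  <xsl:sequence select="i:complex(i:real($arg), -i:imaginary($arg))"/>
</xsl:function>

For example, i:imaginary(i:conjugate(i:complex(3, 4))) returns the xs:double value -4.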

 

Example: Using a Map as an Index

Given a set of book elements, it is possible to construct an index in the form of a map allowing the books to be retrieved by ISBN number.

Assume the book elements have the form:

<book>
  <isbn>0470192747</isbn>
  <author>Michael H. Kay</author>
  <publisher>Wiley</publisher>
  <title>XSLT 2.0 and XPath 2.0 Programmer's Reference</title>
</book>

An index may be constructed as follows:

<xsl:variable name="isbn-index" as="map(xs:string, element(book))"
    select="map:merge(for $b in //book return map{$b/isbn : $b})"/>

This index may then be used to retrieve the book for a given ISBN using either of the expressions map:get($isbn-index, "0470192747") or $isbn-index("0470192747").

In this simple form, this replicates the functionality available using xsl:key and the key function. However, it also provides capabilities not directly available using the key function: for example, the index can include book elements in multiple source documents. It also allows processing of all the books using a construct such as <xsl:for-each select="map:keys($isbn-index)">.
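
A minimal sketch of such processing follows. It assumes the $isbn-index variable defined above; the output element names catalog and entry are invented for illustration. Because the order of the keys returned by map:keys is implementation-dependent, the keys are sorted explicitly.

<catalog>
  <xsl:for-each select="map:keys($isbn-index)">
    <xsl:sort select="."/>
    <!-- The context item is the key; the map lookup returns the book element -->
    <entry isbn="{.}">
      <xsl:value-of select="$isbn-index(.)/title"/>
    </entry>
  </xsl:for-each>
</catalog>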

 

Example: A Map containing Named Functions

As in JavaScript, a map whose keys are strings and whose associated values are function items can be used in a similar way to a class in object-oriented programming languages.

Suppose an application needs to handle customer order information that may arrive in three different formats, with different hierarchic arrangements:

  1. Flat structure:

    <customer id="c123">...</customer>
    <product id="p789">...</product>
    <order customer="c123" product="p789">...</order>

  2. Orders within customer elements:

    <customer id="c123">
       <order product="p789">...</order>
    </customer>
    <product id="p789">...</product>

  3. Orders within product elements:

    <customer id="c123">...</customer>
    <product id="p789">
      <order customer="c123">...</order>
    </product>

An application can isolate itself from these differences by defining a set of functions to navigate the relationships between customers, orders, and products: orders-for-customer, orders-for-product, customer-for-order, product-for-order. These functions can be implemented in different ways for the three different input formats. For example, with the first format the implementation might be:

<xsl:variable name="flat-input-functions" as="map(xs:string, function(*))*"
  select="map{
            'orders-for-customer' : 
                 function($c as element(customer)) as element(order)* 
                    {$c/../order[@customer=$c/@id]},
            'orders-for-product' : 
                 function($p as element(product)) as element(order)* 
                    {$p/../order[@product=$p/@id]},
            'customer-for-order' : 
                 function($o as element(order)) as element(customer) 
                    {$o/../customer[@id=$o/@customer]},
            'product-for-order' : 
                 function($o as element(order)) as element(product) 
                    {$o/../product[@id=$o/@product]} }                    
         "/>

Having established which input format is in use, the application can bind the appropriate implementation of these functions to a variable such as $input-navigator, and can then process the input using XPath expressions such as the following, which selects all products for which there is no order: //product[empty($input-navigator("orders-for-product")(.))]
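
For illustration, an implementation for the second format (orders within customer elements) might be sketched as follows. The variable name $nested-input-functions, and the test used to choose between the two implementations, are assumptions made for this sketch: the test distinguishes only the first two formats, and it assumes that the global context item is a node in the input document.

<xsl:variable name="nested-input-functions" as="map(xs:string, function(*))"
  select="map{
            'orders-for-customer' : 
                 function($c as element(customer)) as element(order)* 
                    {$c/order},
            'orders-for-product' : 
                 function($p as element(product)) as element(order)* 
                    {$p/../customer/order[@product=$p/@id]},
            'customer-for-order' : 
                 function($o as element(order)) as element(customer) 
                    {$o/..},
            'product-for-order' : 
                 function($o as element(order)) as element(product) 
                    {$o/../../product[@id=$o/@product]} }
         "/>

<xsl:variable name="input-navigator" as="map(xs:string, function(*))"
  select="if (exists(//customer/order))
          then $nested-input-functions
          else $flat-input-functions"/>

The expression shown above for selecting products with no orders then gives the same result for either input format, because it refers to the functions only through $input-navigator.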