Chapter 2. Language reference

XProc is a data flow programming language described with an XML vocabulary. This chapter provides an overview of the features of the language. At a high level, XProc allows you to combine steps, units of computation, in a variety of ways to achieve your goal. The p:for-each step, for example, will iterate over a set of documents and the p:xslt step will perform XSLT transformations.

Broadly speaking, the features are:

Structures for declaring pipelines,
structures for connecting steps to inputs,
compound steps,
atomic steps,
options,
variables,
and extra information

The following sections give a brief overview of the elements in the XProc vocabulary.

ⓘ

Note from the author

This section is something of a work-in-progress. At the moment, it’s neither a comprehensive description of every aspect of the vocabulary, nor is it tutorial in nature. But it’s useful to have every element in the vocabulary present in the reference. Suggestions for improvements are welcome.

In the summaries that follow, {any-name}* generally means any number of additional namespace qualified names. These are roughly extension attributes and are ignored unless the processor uses them for some implementation-defined purpose.

2.1. Declaring pipelines

The most common pipeline declaration specifies the inputs, outputs, and options that the step accepts, followed by the steps that implement the pipeline. Pipelines may also import libraries and functions, and may declare steps.

<p:declare-step
name? = `NCName`	The step name
type? = `EQName`	The step type (for reuse)
psvi-required? = `boolean`	Is XML Schema validated input required?
xpath-version? = `decimal`	The XPath version required
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
version? = `decimal`	The XProc version (3.0 or 3.1)
visibility? = `private\|public`	Visible outside the library?
{any-name}* = `string`	Additional attributes
>
(import \| import-functions)*, (input \| output \| option)*, declare-step*, `subpipeline`?
</p:declare-step>

The example pipeline in Example 2.1, “A compound step declaration” shows a typical compound step declaration. In brief: it iterates over the files in a directory replacing selected copyright elements. We’ll look at several of its features in more detail below.

Notes on the attributes of p:declare-step

name: The step name is only used by the subpipeline in the declaration. It’s how a step in the subpipeline can refer, for example with p:pipe, to one of the step’s inputs.
  |<p:declare-step name="main">
  |  <p:input port="source"/>
  |  …
  |  <p:identity>
  |    <p:with-input>
  |      <p:pipe step="main" port="source"/>
  |    </p:with-input>
  |  </p:identity>
  |  …
  |</p:declare-step>
type: The step type is how you reuse a step. If you declare a step with the type ex:my-step, then you can subsequently use it as an atomic step: <ex:my-step> in other steps, even recursively.
  |<p:declare-step name="main" xmlns:ex="http://example.com/ns">
  |  <p:input port="source"/>
  |
  |  <p:declare-step type="ex:my-step">
  |    <p:input port="source"/>
  |    …
  |  </p:declare-step>
  |  …
  |  <ex:my-step>
  |    <p:with-input pipe="source@main"/>
  |  </ex:my-step>
  |  …
  |</p:declare-step>
psvi-required: If this is true, you’re telling the processor that XML Schema validated inputs are required. This will require Saxon EE.
xpath-version: This specifies the XPath version. The only version that you can use today is “3.1”, but in the future, it might be possible to specify other versions.
exclude-inline-prefixes: When you put XML in a p:inline element, all of the in-scope namespaces will apply to those elements. You can use exclude-inline-prefixes to exclude some of them. The value of the attribute must be a space-separated list of in-scope namespace prefixes. It’s an error to refer to a prefix that doesn’t have an in-scope declaration. The tokens #default and #all may also be used to exclude the default namespace and all namespaces, respectively.
version: The XProc version of the step. Only 3.0 or 3.1 are accepted and they’re equivalent.
visibility: The visibility of a step only makes sense when it occurs in a p:library. Inside a p:library a “private” step is not visible to pipelines that import the library. It can only be used by other steps declared in the library.

ⓘ

Note

The example pipelines and some input documents to demonstrate how they work are available from an examples directory in the repository.

 1 |<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
   |                xmlns:c="http://www.w3.org/ns/xproc-step"
   |                xmlns:cx="http://xmlcalabash.com/ns/extensions"
   |                xmlns:xs="http://www.w3.org/2001/XMLSchema" 
 5 |                name="main" version="3.1">
   |  <p:documentation>
   |    <div xmlns="http://www.w3.org/1999/xhtml">
   |      <p>This pipeline reads all of the files in a directory and
   |      updates the copyright element.</p>
10 |    </div>
   |  </p:documentation>
   | 
   |  <p:input port="copyright" content-types="xml"/>
   |  <p:output port="result" content-types="text"/>
15 |  <p:option name="path" required="true" as="xs:string"/>
   |  <p:option name="output-path" required="true" as="xs:anyURI"/>
   |  <p:option name="recurse" select="false()" as="xs:boolean"/>
   | 
   |  <p:directory-list name="listing" path="{$path}"
20 |                    include-filter=".*\.xml$">
   |    <p:with-option name="max-depth"
   |                   select="if ($recurse) then 'unbounded' else '1'"/>
   |  </p:directory-list>
   | 
25 |  <p:for-each name="loop">
   |    <p:with-input select="//c:file"/>
   |    <p:variable name="filename" select="/*/@name"/>
   | 
   |    <p:load href="{resolve-uri(/*/@name, base-uri(/*))}"/>
30 | 
   |    <p:viewport match="copyright[. = 'Someone Random']">
   |      <p:identity>
   |        <p:with-input pipe="copyright@main"/>
   |      </p:identity>
35 |    </p:viewport>
   | 
   |    <p:store href="{resolve-uri($filename, resolve-uri($output-path, static-base-uri()))}"/>
   |  </p:for-each>
   | 
40 |  <p:variable name="total" select="count(//c:file)">
   |    <p:pipe step="listing"/>
   |  </p:variable>
   |    
   |  <p:identity>
45 |    <p:with-input xmlns:f="http://example.com/ns/functions">
   |      <p:inline content-type="text/plain">Processed {$total} files; {f:is-leap-day()}&#10;</p:inline>
   |    </p:with-input>
   |  </p:identity>
   |</p:declare-step>

Example 2.1. A compound step declaration

A simpler form of declaration specifies the inputs, outputs, and options that the step accepts, but relies on the implementation having been provided through some other means. The XML Calabash extension steps, or extension steps that you write in a JVM language yourself, follow this pattern.

<p:declare-step
name? = `NCName`	The step name
type? = `EQName`	The step type (for reuse)
psvi-required? = `boolean`	Is XML Schema validated input required?
xpath-version? = `decimal`	The XPath version required
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
version? = `decimal`	The XProc version (3.0 or 3.1)
visibility? = `private\|public`	Visible outside the library?
{any-name}* = `string`	Additional attributes
>
(input \| output \| option)*
</p:declare-step>

An example atomic declaration is shown in Example 2.2, “An atomic step declaration”. This is the declaration for the extension step cx:fileset.

 1 |<p:declare-step type="cx:fileset" version="3.1"
   |                xmlns:cx="http://xmlcalabash.com/ns/extensions"
   |                xmlns:p="http://www.w3.org/ns/xproc"
   |                xmlns:xs="http://www.w3.org/2001/XMLSchema">
 5 |  <p:input port="source" content-types="xml" sequence="true">
   |    <p:empty/>
   |  </p:input>
   |  <p:output port="result" content-types="xml" sequence="true"/>
   |  <p:option name="path" as="xs:string" required="true"/>
10 |  <p:option name="default-excludes" as="xs:boolean" select="true()"/>
   |  <p:option name="case-sensitive" as="xs:boolean" select="true()"/>
   |  <p:option name="error-on-missing-dir" as="xs:boolean" select="true()"/>
   |  <p:option name="follow-symlinks" as="xs:boolean" select="true()"/>
   |  <p:option name="includes" as="xs:string?"/>
15 |  <p:option name="excludes" as="xs:string?"/>
   |  <p:option name="detailed" as="xs:boolean" select="false()"/>
   |</p:declare-step>

Example 2.2. An atomic step declaration

2.1.1. Pipeline inputs

The p:input element describes a step input.

<p:input
port = `NCName`	The port name
sequence? = `boolean`	Accept a (possibly empty) sequence of documents?
primary? = `boolean`	This is the primary port?
select? = `XPathExpression`	XPath selection from the inputs
content-types? = `ContentTypes`	Acceptable content types
href? = { `anyURI` }	A document binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
>
((empty \| (document \| inline)) \| `anyElement`)
</p:input>

The input in Example 2.3, “A pipeline input” has the port name “copyright” and accepts only XML documents. If it’s the only input, it will be primary. The input doesn’t specify sequence="true", so a single document is required.

Notes on the attributes of p:input (and p:output)

port: The port name is how you refer to an input or output port. Its name must be unique.
sequence: If a sequence is allowed, any number of documents can be used on that port. If not, exactly one document must be used.
primary: If a port is primary, that’s where implicit connections are made. If there’s only one input or output port, it will be primary by default. If there’s more than one, none are primary unless indicated explicitly. At most one input port and one output port may be declared primary.
select: A select expression on an input matches the selected nodes and creates an input document for each. Matching, for example, //chapter will make the input a sequence of documents, one for each chapter element that appears on the original input.
content-types: If a list of content types is provided, only documents that are of those types are allowed. Note that XProc allows both positive and negative content types.
href: Shortcut for a single p:document binding.
exclude-inline-prefixes: You can use exclude-inline-prefixes to exclude some namespaces from a p:inline, as noted about the attributes of p:declare-step. If the exclude-inline-prefixes attribute occurs multiple times among the ancestors of an inline, the effect is the union of all such prefixes.

  |<p:input port="copyright" content-types="xml"/>

Example 2.3. A pipeline input
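As a further illustration of the attributes described above, here is a sketch of two input declarations on one step. The port names and content types are invented for this example and are not part of the pipeline in Example 2.1.

```xml
<!-- Hypothetical declarations; the port names are illustrative only. -->
<!-- With two inputs, neither is primary unless one says so explicitly. -->
<p:input port="source" primary="true" content-types="xml html"/>

<!-- A sequence port with a default binding: if the caller connects
     nothing, the port receives an empty sequence of documents. -->
<p:input port="settings" sequence="true" content-types="json">
  <p:empty/>
</p:input>
```

Declaring the second port with sequence="true" is what makes the empty default legal; a port that requires exactly one document could not default to p:empty.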

2.1.2. Pipeline outputs

The p:output element describes a step output. There are three slightly different forms, depending on where the output is being declared. The pipeline and compound step forms are the same except that on a pipeline, serialization parameters may also be declared.

<p:output
port? = `NCName`	The port name
sequence? = `boolean`	Accept a (possibly empty) sequence of documents?
primary? = `boolean`	This is the primary port?
content-types? = `ContentTypes`	Acceptable content types
href? = { `anyURI` }	A document binding
pipe? = `string`	A pipe binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
serialization? = `map(xs:QName,item()*)`	Serialization options
>
((empty \| (document \| pipe \| inline)) \| `anyElement`)
</p:output>

The output in Example 2.4, “A pipeline output” has the port name “result”. The pipeline result must be a single text document.

The notes about the attributes on p:input apply to p:output. In addition, p:output has a pipe attribute that is a shortcut for p:pipe bindings. On a p:declare-step it may also have a map of serialization parameters.

  |<p:output port="result" content-types="text"/>

Example 2.4. A pipeline output

On a compound step, no serialization can occur, so it would be pointless to specify serialization parameters.

<p:output
port? = `NCName`	The port name
sequence? = `boolean`	Accept a (possibly empty) sequence of documents?
primary? = `boolean`	This is the primary port?
content-types? = `ContentTypes`	Acceptable content types
href? = { `anyURI` }	A document binding
pipe? = `string`	A pipe binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
>
((empty \| (document \| pipe \| inline)) \| `anyElement`)
</p:output>

On the declaration of an atomic step, you also cannot provide any connections.

<p:output
port? = `NCName`	The port name
sequence? = `boolean`	Accept a (possibly empty) sequence of documents?
primary? = `boolean`	This is the primary port?
content-types? = `ContentTypes`	Acceptable content types
{any-name}* = `string`	Additional attributes
/>

The output on the cx:fileset declaration indicates that the step can produce a sequence of XML documents.

  |<p:output port="result" content-types="xml" sequence="true"/>

Example 2.5. An atomic step output

2.1.3. Pipeline options

Options are declared with the p:option element.

<p:option
name = `EQName`	The option name
as? = `XPathSequenceType`	The required value type
values? = `string`	Allowed values
static? = `boolean`	Static value?
required? = `boolean`	Option value is required?
select? = `XPathExpression`	Default value
{any-name}* = `string`	Additional attributes
visibility? = `private\|public`	Visible outside the library?
/>

Options can be required or have a default value and they can be typed.

Notes on the attributes of p:option

name: The option name. Options cannot shadow earlier options; they must have unique names. Non-static options can be shadowed in the subpipeline by p:variables.
as: The type of the option, for example “xs:integer” or “map(xs:string, xs:dateTime)”.
values: A list of atomic values. This forms an enumeration and the option must be one of these values.
static: Is the option evaluated at compile time? Static options can be used in use-when expressions for conditional element exclusion.
required: If an option is required, the caller must provide a value for it. If it isn’t required, and no value is provided, the default value is taken from the select attribute. (If there’s no select attribute, the default value is the empty sequence.)
select: Provides a default value for the option. Option default values can refer to preceding options, but not to the step inputs.
visibility: The visibility of an option only makes sense when it occurs in a p:library. Inside a p:library a “private” option is not visible to pipelines that import the library. It can only be used by other options and steps declared in the library.

The compound step declaration in Example 2.1, “A compound step declaration” declares three options:

  |<p:option name="path" required="true" as="xs:string"/>
  |<p:option name="output-path" required="true" as="xs:anyURI"/>
  |<p:option name="recurse" select="false()" as="xs:boolean"/>

Example 2.6. Pipeline options

The path and output-path options are required and must be a string and a URI, respectively. (There’s no practical reason to make them different types in this case; it’s just to make the example more interesting.) The recurse option is not required and will default to “false”.
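None of the options in Example 2.6 uses the values attribute. A minimal sketch of an enumerated option, assuming a hypothetical status option that is not part of the example pipeline, might look like this:

```xml
<!-- Hypothetical option: the name and the allowed values are illustrative. -->
<!-- The caller may pass only 'draft' or 'final'; 'draft' is the default. -->
<p:option name="status" as="xs:string"
          values="('draft', 'final')"
          select="'draft'"/>
```

The list is written as a parenthesized sequence of atomic values, and the default given in select should itself be one of them.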

Options can be declared static:

<p:option
name = `EQName`	The option name
as? = `XPathSequenceType`	The required value type
values? = `string`	Allowed values
static = "true"	Static value?
select = `XPathExpression`	Default value
{any-name}* = `string`	Additional attributes
visibility? = `private\|public`	Visible outside the library?
/>

Options inside a p:library must be declared static.

It must be possible to evaluate a static option without reference to any pipeline input documents. It is evaluated “at compile time”. You may not shadow a static option with another option or p:variable.
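To make the “compile time” behavior concrete, here is a small sketch of a static option controlling conditional element exclusion with use-when. The option name debug and the attribute it adds are assumptions made for this illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                version="3.1">
  <p:output port="result"/>

  <!-- Hypothetical static option; it is evaluated before the pipeline runs. -->
  <p:option name="debug" static="true" select="false()" as="xs:boolean"/>

  <p:identity>
    <p:with-input><doc/></p:with-input>
  </p:identity>

  <!-- This step is part of the pipeline only when $debug is true.
       Otherwise it is excluded and the p:identity above is the last step. -->
  <p:add-attribute use-when="$debug"
                   attribute-name="debug" attribute-value="true"/>
</p:declare-step>
```

Because the decision is made statically, the excluded step never runs and never has to be connected; this is why the expression cannot depend on any input document.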

2.1.4. Declaring libraries of steps

Several steps (and static options) can be bundled together in a library.

<p:library
psvi-required? = `boolean`	Is XML Schema validated input required?
xpath-version? = `decimal`	The XPath version required
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
version? = `decimal`	The XProc version (3.0 or 3.1)
{any-name}* = `string`	Additional attributes
>
(import \| import-functions)*, option*, declare-step*
</p:library>

A collection of step declarations and (static) options can be put inside a p:library so that they can all be imported together.

The notes about the attributes on p:declare-step apply to p:library.
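A short sketch of a library may help tie these pieces together. Everything named here (the ex: namespace, the option, and both step types) is hypothetical, and the sketch assumes that a step without a visibility attribute is public.

```xml
<p:library xmlns:p="http://www.w3.org/ns/xproc"
           xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ex="http://example.com/ns"
           version="3.1">

  <!-- Options in a library must be static. -->
  <p:option name="default-status" static="true"
            select="'draft'" as="xs:string"/>

  <!-- Private: usable only by other steps declared in this library. -->
  <p:declare-step type="ex:helper" visibility="private">
    <p:input port="source"/>
    <p:output port="result"/>
    <p:add-attribute attribute-name="status"
                     attribute-value="{$default-status}"/>
  </p:declare-step>

  <!-- No visibility attribute: assumed public, so importers can call it. -->
  <p:declare-step type="ex:mark-status">
    <p:input port="source"/>
    <p:output port="result"/>
    <ex:helper/>
  </p:declare-step>
</p:library>
```

A pipeline that imports this library could use ex:mark-status as an atomic step, but an attempt to use ex:helper directly would fail because that step is private.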

2.1.5. Importing libraries and function libraries

A library (or a single step) can be imported.

<p:import
{any-name}* = `string`	Additional attributes
href = `anyURI`	Document URI
/>

The document URI must identify a pipeline or library document.

XML Calabash provides URIs for importing its extension steps. For example, the URI https://xmlcalabash.com/ext/library/fileset.xpl imports the cx:fileset extension step. Generally speaking, the specified URI is retrieved and parsed for its declarations. The library URIs that XML Calabash provides for its extension steps are resolved internally, without accessing the internet.

XML Calabash can also import functions defined in XSLT or XQuery.

<p:import-functions
{any-name}* = `string`	Additional attributes
href = `anyURI`	Document URI
content-type? = `ContentType`	The content type
namespace? = `string`	The namespace(s) to import
/>

Importing functions allows them to be used in expressions in the pipeline.

Notes on the attributes of p:import-functions

href: The document URI must identify a library of functions.
content-type: If a content-type is provided, it informs the processor what is expected from the imported library.
namespace: A whitespace-separated list of namespace URIs. If provided, only functions declared in one of those namespaces will be imported.

The stylesheet in Example 2.7, “A function library in XSLT” defines two functions, f:is-leap-day with no arguments and f:is-leap-day with a single date argument.

 1 |<?xml version="1.0" encoding="utf-8"?>
   |<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
   |                xmlns:xs="http://www.w3.org/2001/XMLSchema"
   |                xmlns:f="http://example.com/ns/functions"
 5 |                exclude-result-prefixes="xs"
   |                version="3.0">
   | 
   |<xsl:function name="f:is-leap-day">
   |  <xsl:sequence select="f:is-leap-day(current-date())"/>
10 |</xsl:function>
   | 
   |<xsl:function name="f:is-leap-day">
   |  <xsl:param name="date"/>
   |  <xsl:choose>
15 |    <xsl:when test="$date instance of xs:date">
   |      <xsl:sequence select="month-from-date($date) = 2
   |                            and day-from-date($date) = 29"/>
   |    </xsl:when>
   |    <xsl:when test="$date instance of xs:dateTime">
20 |      <xsl:sequence select="month-from-dateTime($date) = 2
   |                            and day-from-dateTime($date) = 29"/>
   |    </xsl:when>
   |    <xsl:when test="$date castable as xs:date">
   |      <xsl:variable name="dt" select="$date cast as xs:date"/>
25 |      <xsl:sequence select="month-from-date($dt) = 2
   |                            and day-from-date($dt) = 29"/>
   |    </xsl:when>
   |    <xsl:when test="$date castable as xs:dateTime">
   |      <xsl:variable name="dt" select="$date cast as xs:dateTime"/>
30 |      <xsl:sequence select="month-from-dateTime($dt) = 2
   |                            and day-from-dateTime($dt) = 29"/>
   |    </xsl:when>
   |    <xsl:otherwise>
   |      <xsl:sequence select="false()"/>
35 |    </xsl:otherwise>
   |  </xsl:choose>
   |</xsl:function>
   | 
   |</xsl:stylesheet>

Example 2.7. A function library in XSLT

After the library has been imported, the functions that it defines can be used in XPath expressions in the pipeline. The pipeline in Example 2.8, “Using the function library” outputs a single document that answers the questions, “is today a leap day and is 29 February 2028 a leap day?”

 1 |<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
   |                xmlns:f="http://example.com/ns/functions"
   |                name="main" version="3.1">
   |  <p:documentation>
 5 |    <div xmlns="http://www.w3.org/1999/xhtml">
   |      <p>Example of importing functions. This requires Saxon EE.</p>
   |    </div>
   |  </p:documentation>
   | 
10 |  <p:import-functions href="is-leap-day.xsl"/>
   | 
   |  <p:output port="result" serialization="map{'indent':true()}"/>
   |    
   |  <p:identity>
15 |    <p:with-input exclude-inline-prefixes="#all">
   |      <leap-days>
   |        <today date="{substring(string(current-date()), 1, 10)}"
   |               >{f:is-leap-day()}</today>
   |        <other date="2028-02-29"
20 |               >{f:is-leap-day('2028-02-29')}</other>
   |      </leap-days>
   |    </p:with-input>
   |  </p:identity>
   |</p:declare-step>

Example 2.8. Using the function library

If you have Saxon EE and you run the pipeline, the output will be something like:

<leap-days>
   <today date="2025-07-26">false</today>
   <other date="2028-02-29">true</other>
</leap-days>

Note that the p:output element in the pipeline uses the serialization options to pretty-print the output and the p:with-input uses exclude-inline-prefixes to avoid having the namespace declaration for “f” on the output.
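The import in Example 2.8 uses only the href attribute. As a sketch of the optional attributes, the same import could be written more explicitly. The content type shown is the registered media type for XSLT; that it is needed here is an assumption, since the processor may be able to work it out without help.

```xml
<!-- Import only the functions in the f: namespace from the stylesheet. -->
<!-- The content-type value is an assumption made for this sketch. -->
<p:import-functions href="is-leap-day.xsl"
                    content-type="application/xslt+xml"
                    namespace="http://example.com/ns/functions"/>
```

Restricting the import by namespace is mostly useful when a function library declares helper functions in other namespaces that the pipeline should not see.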

2.2. Connecting steps to inputs

Atomic and compound steps use p:with-input to describe how their inputs are connected.

<p:with-input
port? = `NCName`	The port name
select? = `XPathExpression`	XPath selection from the inputs
href? = { `anyURI` }	Document URI
pipe? = `string`	A pipe binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
>
((empty \| (document \| pipe \| inline)) \| `anyElement`)
</p:with-input>

The notes about the attributes on p:input apply to p:with-input.

The input on the p:for-each step in the example above does not have any explicit bindings:

  |<p:with-input select="//c:file"/>

That means it connects to the “default readable port”, usually the primary output of the preceding step.

2.2.1. Document inputs

The p:document element (or the href attribute on p:with-input) connects an input to a document identified by a URI.

<p:document
href = { `anyURI` }	Document URI
content-type? = `string`	The required content type
document-properties? = `map(xs:QName,item()*)`	Document properties map
parameters? = `map(xs:QName,item()*)`	Parameters map
{any-name}* = `string`	Additional attributes
/>

The document properties will be applied to the result. The parameters may be used during document access.

Notes on the attributes of p:document

href: The document will be retrieved from this URI.
content-type: Identifies the (required) content type of the document, for example application/json or text/plain. If not specified, the content type is inferred from the URI.
document-properties: Document properties are name/value pairs associated with the document. Unqualified property names, like base-uri, are defined by the XProc specification. You can add arbitrary namespace-qualified properties.
parameters: The p:document instruction is defined in terms of the p:load step. The parameters passed to it can be used by the load step to aid in retrieving the document. For example, a username and password might be passed as parameters.

2.2.2. Inline inputs

The p:inline element connects an input to a document placed into the pipeline directly.

<p:inline
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
content-type? = `string`	The content type of the inline
document-properties? = `map(xs:QName,item()*)`	Document properties map
encoding? = `string`	Encoded content (base64)
>
`anyNode`*
</p:inline>

The p:inline element can be omitted in the simple case of a single XML document.

Notes on the attributes of p:inline

content-type: Identifies the (required) content type of the content, for example application/json or text/plain. If not specified, the content type is assumed to be XML.
document-properties: Document properties are name/value pairs associated with the document. Unqualified property names, like base-uri, are defined by the XProc specification. You can add arbitrary namespace-qualified properties.
encoding: The only supported value for encoding is base64. This encoding allows an inline to be in a different character set than the XML or to contain non-XML characters.

This p:inline element specifies that its content is text. Text value templates (expressions in curly braces) in the body are evaluated and replaced by their values.

  |<p:inline content-type="text/plain">Processed {$total} files; {f:is-leap-day()}&#10;</p:inline>
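For comparison, here is a minimal sketch of a step that receives two inline documents, one XML and one plain text. The element name and the text content are invented for the example.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:output port="result" sequence="true"/>

  <p:identity>
    <p:with-input>
      <!-- No content-type: the inline is assumed to be XML. -->
      <p:inline><doc>An XML document</doc></p:inline>
      <!-- An explicit content type makes the body a text document. -->
      <p:inline content-type="text/plain">A text document</p:inline>
    </p:with-input>
  </p:identity>
</p:declare-step>
```

If the step needed only the single XML document, the p:inline wrapper around it could be dropped and the doc element placed directly inside p:with-input.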

2.2.3. Pipe inputs

The p:pipe element (or the pipe attribute on p:with-input) connects an input to the output from some other step. If step is omitted, the step associated with the default readable port is assumed. If port is omitted, the primary output port of the step is assumed. (Consequently, <p:pipe/> is a connection to the default readable port.) It’s an error to attempt to refer to the default readable port if there isn’t one.

<p:pipe
step? = `NCName`	The step name
port? = `NCName`	The port name
{any-name}* = `string`	Additional attributes
/>

The p:identity step in the viewport uses a pipe attribute to make a pipe binding:

  |<p:with-input pipe="copyright@main"/>

The variable declaration for $total uses the p:pipe element to make a pipe binding. In the context of a variable, this establishes the context item used when evaluating the expression.

  |<p:pipe step="listing"/>
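A connection can also combine several pipes. The following sketch, with hypothetical step and port names, sends the output of an earlier step followed by a second pipeline input to one p:identity step:

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                name="main" version="3.1">
  <p:input port="source" primary="true"/>
  <p:input port="extra"/>
  <p:output port="result" sequence="true"/>

  <p:identity name="first"/>

  <!-- The pipe attribute holds a space-separated list of port@step tokens.
       It is shorthand for two p:pipe children:
         <p:pipe step="first" port="result"/>
         <p:pipe step="main" port="extra"/>                               -->
  <p:identity>
    <p:with-input pipe="result@first extra@main"/>
  </p:identity>
</p:declare-step>
```

The second p:identity receives a sequence of two documents, in the order the pipes are listed.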

2.2.4. Empty inputs

The p:empty element explicitly binds the input to an empty sequence of documents.

<p:empty
{any-name}* = `string`	Additional attributes
/>

2.3. Connecting option values to steps

In much the same way as inputs are provided using p:with-input, option values are provided using p:with-option.

<p:with-option
name = `EQName`	The option name
as? = `XPathSequenceType`	The required value type
select = `XPathExpression`	Option value
collection? = `boolean`	Inputs as default collection?
href? = { `anyURI` }	Document URI
pipe? = `string`	A pipe binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
>
((empty \| (document \| pipe \| inline)) \| `anyElement`)
</p:with-option>

Options can also be specified as attributes on the step itself, in which case the attribute name is the option name and its value is interpreted as an attribute value template. For example, in Example 2.1, “A compound step declaration”, the path and include-filter options are set this way:

  |<p:directory-list name="listing" path="{$path}"
  |                  include-filter=".*\.xml$">

Notes on the attributes of p:with-option

name: The option name. This must be the name of an option declared for the step on which it is used.
as: The type of the option, for example “xs:integer” or “map(xs:string, xs:dateTime)”.
select: The select expression is evaluated to provide a value for the option.
collection: If the collection attribute is true, all of the documents that appear on the context binding are placed in the default collection for the expression. An expression can only refer to the context item if it is a single value, but by using the default collection, an option can handle a sequence of values.
href: Shortcut for a single p:document binding.
pipe: Shortcut for one or more p:pipe bindings.
exclude-inline-prefixes: When you put XML in a p:inline element, all of the in-scope namespaces will apply to those elements. You can use exclude-inline-prefixes to exclude some of them. The value of the attribute must be a space-separated list of in-scope namespace prefixes. It’s an error to refer to a prefix that doesn’t have an in-scope declaration. The tokens #default and #all may also be used to exclude the default namespace and all namespaces, respectively.

In Example 2.1, “A compound step declaration”, the max-depth option is set on the p:directory-list using p:with-option:

  |<p:with-option name="max-depth"
  |               select="if ($recurse) then 'unbounded' else '1'"/>

2.4. Compound steps

The XProc specification defines several compound steps, steps that contain subpipelines. XML Calabash also implements a couple of additional compound steps. (Pipelines that use extension compound steps are probably not, strictly speaking, conformant with the specification.)

2.4.1. Looping over inputs (p:for-each)

The p:for-each step loops over a sequence of documents, processing each with its subpipeline.

<p:for-each
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(with-input?, output*, `subpipeline`)
</p:for-each>

☞

Tip

Unlike XPath, if a select expression is used on the p:with-input the nodes selected from the original document(s) are not what the loop iterates over. Instead, a whole new document is constructed for each selection and that document is processed.

This means, for example, that “@name” can’t be used to test an attribute on the loop input. You need to use “/*/@name” or something similar instead.

The p:for-each in Example 2.9, “Looping with for-each” loops over the c:file elements from the p:directory-list step, but each one will be in its own document.

  |<p:for-each name="loop">
  |  <p:with-input select="//c:file"/>
  |…
  |</p:for-each>

Example 2.9. Looping with for-each

In principle, a p:for-each can process its inputs in parallel. XML Calabash does not do so at this time.
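The following self-contained sketch shows the point made in the tip above: inside the loop, each selected element is the root of its own document, so expressions address it with /*. The chapter element, its id attribute, and the label attribute are assumptions made for this example.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc" version="3.1">
  <p:input port="source"/>
  <p:output port="result" sequence="true"/>

  <p:for-each>
    <!-- Each chapter becomes a separate document for one iteration. -->
    <p:with-input select="//chapter"/>

    <!-- "/*/@id" reads the id of the current chapter; a bare "@id" would
         not, because the context item is the document, not the element. -->
    <p:add-attribute attribute-name="label"
                     attribute-value="{/*/@id}-{p:iteration-position()}"/>
  </p:for-each>
</p:declare-step>
```

The result is a sequence of chapter documents, one per iteration, each with a label attribute built from its id and its position in the loop.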

2.4.2. Changing internal structures (p:viewport)

The p:viewport step replaces sections of a document with the result of processing (a document containing) those sections using the subpipeline.

<p:viewport
name? = `NCName`	The step name
match = `XSLTSelectionPattern`	Match pattern for content to replace
{any-name}* = `string`	Additional attributes
>
(with-input?, output?, `subpipeline`)
</p:viewport>

The p:viewport in Example 2.10, “Matching with viewport” matches each copyright element where the holder is “Someone Random” and replaces that element with the result of its subpipeline, in this case, with the contents of the document on the copyright input port.

1 |<p:viewport match="copyright[. = 'Someone Random']">
  |  <p:identity>
  |    <p:with-input pipe="copyright@main"/>
  |  </p:identity>
5 |</p:viewport>

Example 2.10. Matching with viewport

☞

Tip

Like select on p:for-each, the elements matched by p:viewport are available to the subpipeline as new documents.

2.4.3. Choosing among alternatives (p:choose)

The p:choose step allows you to select processing among a number of alternatives. At most one alternative will be used: either the first p:when (in document order) for which the test expression is true or the p:otherwise.

<p:choose
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(with-input?, ((when+, otherwise?) \| (when*, otherwise)))
</p:choose>

The example pipeline in Example 2.11, “An example choice” uses p:choose to select which stylesheet to use when formatting a document based on the document status.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="http://example.com/ns"
                name="main" version="3.1" type="ex:format">
  <p:input port="source"/>
  <p:output port="result"/>

  <p:choose>
    <p:when test="/*/@status = 'draft'">
      <p:xslt>
        <p:with-input port="stylesheet" href="draft.xsl"/>
      </p:xslt>
    </p:when>
    <p:when test="/*/@status = 'final'">
      <p:with-input pipe="source"/>
      <p:xslt>
        <p:with-input port="stylesheet" href="final.xsl"/>
      </p:xslt>
    </p:when>
    <p:otherwise>
      <p:error code="ex:bad-status">
        <p:with-input>
          <p:inline content-type="text/plain">Unexpected status</p:inline>
        </p:with-input>
      </p:error>
    </p:otherwise>
  </p:choose>
</p:declare-step>

Example 2.11. An example choice

Each alternative is a p:when which contains the subpipeline to run if this alternative is selected.

<p:when
name? = `NCName`	The step name
test = `XPathExpression`	The test expression
collection? = `boolean`	Inputs as default collection?
{any-name}* = `string`	Additional attributes
>
(with-input?, output*, `subpipeline`)
</p:when>

The first (and only the first) p:when where the test expression evaluates to true is used.

Notes on the attributes of p:when

name: The step name is only used by the subpipeline in the p:when. It’s how a step in the subpipeline can refer, for example with p:pipe, to the context input.
test: The test expression. This p:when is selected if it is the first p:when, in document order, where the expression evaluates to true. It’s an error for the expression to refer to the context item unless the p:with-input provides exactly one context item.
collection: If the collection attribute is true, all of the documents that appear on the context binding are placed in the default collection for the expression. An expression can only refer to the context item if it is a single value, but by using the default collection, a test expression can handle a sequence of documents.
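
The collection attribute is easiest to see in a pipeline that accepts a sequence. The following is a sketch under stated assumptions, not a recommended pattern: the batch wrapper element is a hypothetical name, and the test reads the documents through the XPath collection() function rather than through the context item.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                name="main" version="3.1">
  <p:input port="source" sequence="true"/>
  <p:output port="result" sequence="true"/>

  <p:choose>
    <!-- collection="true" puts every document from the context binding
         into the default collection, so the test can count them. -->
    <p:when test="count(collection()) gt 1" collection="true">
      <!-- More than one document: wrap them all in one (hypothetical) batch element. -->
      <p:wrap-sequence wrapper="batch"/>
    </p:when>
    <p:otherwise>
      <!-- Zero or one document: pass the input through unchanged. -->
      <p:identity/>
    </p:otherwise>
  </p:choose>
</p:declare-step>
```

Without collection="true", a test that referred to the context item would fail whenever more than one document arrived on the source port.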

The first p:when selects documents with a status of “draft”:

<p:when test="/*/@status = 'draft'">
  <p:xslt>
    <p:with-input port="stylesheet" href="draft.xsl"/>
  </p:xslt>
</p:when>

The second p:when selects documents with a status of “final”. This example includes an explicit binding for the input that will be used to set the context item for the test expression. It’s unnecessary as it’s the same as the default readable port in this case.

<p:when test="/*/@status = 'final'">
  <p:with-input pipe="source"/>
  <p:xslt>
    <p:with-input port="stylesheet" href="final.xsl"/>
  </p:xslt>
</p:when>

The p:otherwise contains the subpipeline to run if no other alternative is selected.

<p:otherwise
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(output*, `subpipeline`)
</p:otherwise>

The p:otherwise in this example raises an error if the status is neither draft nor final:

<p:otherwise>
  <p:error code="ex:bad-status">
    <p:with-input>
      <p:inline content-type="text/plain">Unexpected status</p:inline>
    </p:with-input>
  </p:error>
</p:otherwise>

2.4.4. Simple conditionals (p:if)

The p:if is a simplified form of p:choose. If the test expression is true, then the subpipeline is run and that determines the output from the step. If the expression is false, p:if operates as an identity step, passing its input through unchanged.

<p:if
name? = `NCName`	The step name
test = `XPathExpression`	The test expression
collection? = `boolean`	Inputs as default collection?
{any-name}* = `string`	Additional attributes
>
(with-input?, output*, `subpipeline`)
</p:if>

The notes about the attributes on p:when apply to p:if.

The p:if in Example 2.12, “Using an if instruction” deletes the count attribute if it has the value “0”.

<p:if test="xs:integer(/doc/@count) = 0">
  <p:delete match="/doc/@count"/>
</p:if>

Example 2.12. Using an if instruction

If the attribute has any other value (or isn’t present), the document passes through as if the p:if were a p:identity step.
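
To make the relationship to p:choose concrete, here is a sketch of the same logic written out in long form. It assumes the xs prefix is bound to the XML Schema namespace, as in the other examples in this chapter.

```xml
<p:choose>
  <p:when test="xs:integer(/doc/@count) = 0">
    <!-- Same subpipeline as the p:if. -->
    <p:delete match="/doc/@count"/>
  </p:when>
  <p:otherwise>
    <!-- The branch that p:if supplies implicitly: pass the input through. -->
    <p:identity/>
  </p:otherwise>
</p:choose>
```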

2.4.5. Grouping (p:group)

The p:group step is just a wrapper around a subpipeline.

<p:group
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(output*, `subpipeline`)
</p:group>
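
As a small illustrative sketch (the step name tidy is hypothetical), a group can bundle related steps under a single name. The result of the group is the result of its subpipeline, here the output of the second p:delete.

```xml
<p:group name="tidy">
  <!-- Remove all comments from the document... -->
  <p:delete match="comment()"/>
  <!-- ...then remove all processing instructions from what is left. -->
  <p:delete match="processing-instruction()"/>
</p:group>
```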

2.4.6. Exception handling (p:try)

The p:try step allows a pipeline author to catch errors and recover from them. The subpipeline is run. If no errors occur, the result of the step is the result of that subpipeline. If an error does occur, each p:catch is tested in turn and the result of the step is the result of the first matching p:catch.

<p:try
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(output*, `subpipeline`, ((catch+, finally?) | (catch*, finally)))
</p:try>

The example for p:choose raises an error if the status is neither “draft” nor “final”. We can use p:try to recover from that error and treat any such document as if it had draft status, as shown in Example 2.13, “An example try/catch”.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:ex="http://example.com/ns"
                name="main" version="3.1">
  <p:import href="choose.xpl"/>

  <p:input port="source"/>
  <p:output port="result"/>

  <p:try>
    <ex:format/>
    <p:catch code="ex:bad-status">
      <p:xslt>
        <p:with-input port="source" pipe="source@main"/>
        <p:with-input port="stylesheet" href="draft.xsl"/>
      </p:xslt>
    </p:catch>
  </p:try>
</p:declare-step>

Example 2.13. An example try/catch

A p:catch matches an error if the error code is in its code list, or if it does not have a code attribute at all.

If there are no matching catches, the p:try step fails with the error (which may be caught and handled by some p:try among its ancestors, if it has any).
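
The matching rules can be sketched with a variation on the p:try from Example 2.13. This fragment is illustrative only: the error code ex:missing-status and the unformatted element are hypothetical names that do not appear in the earlier examples. The catch without a code attribute is placed last so that it acts as a fallback for any error the first catch does not list.

```xml
<p:try>
  <ex:format/>
  <!-- Tested first: matches an error with either of the listed codes. -->
  <p:catch code="ex:bad-status ex:missing-status">
    <p:xslt>
      <p:with-input port="source" pipe="source@main"/>
      <p:with-input port="stylesheet" href="draft.xsl"/>
    </p:xslt>
  </p:catch>
  <!-- No code attribute: matches any error not caught above. -->
  <p:catch>
    <p:identity>
      <p:with-input>
        <unformatted/>
      </p:with-input>
    </p:identity>
  </p:catch>
</p:try>
```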

<p:catch
name? = `NCName`	The step name
code? = `EQNameList`	The error codes to catch
{any-name}* = `string`	Additional attributes
>
(output*, `subpipeline`)
</p:catch>

Note that the default readable port inside the p:catch is the error document produced by the failed pipeline. You have to make an explicit binding if you want something else.
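
A minimal sketch of a catch that uses that default binding follows. The failure wrapper element is a hypothetical name. The p:wrap-sequence step has no explicit p:with-input, so it reads the error document described above.

```xml
<p:try>
  <ex:format/>
  <p:catch>
    <!-- No explicit binding: the step reads the default readable port,
         which inside a p:catch is the error document. -->
    <p:wrap-sequence wrapper="failure"/>
  </p:catch>
</p:try>
```

Compare this with Example 2.13, where the p:xslt step binds its source port explicitly because it needs the original input rather than the error.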

Irrespective of whether the subpipeline succeeds or fails and whether or not a catch is invoked (and whether or not it succeeds or fails), the p:finally subpipeline will be run.

<p:finally
name? = `NCName`	The step name
{any-name}* = `string`	Additional attributes
>
(output*, `subpipeline`)
</p:finally>

It is very uncommon for this to be useful. One plausible use case is for the finally step to clean up any side effects that might have been introduced by the subpipeline or the catch expressions, for example, deleting a temporary file or closing a database connection.

2.4.7. Loop until a condition is true (cx:until)

This is an extension compound step that processes single documents, applying its subpipeline until the test expression is true.

<cx:until
name? = `NCName`	The step name
test = `XPathExpression`	The test expression
{any-name}* = `string`	Additional attributes
>
(with-input?, output?, `subpipeline`)
</cx:until>

The test attribute specifies an XPath expression. The subpipeline is always run at least once and the condition is only tested at the end of the loop. The result of the subpipeline is provided as the context item. The previous result is provided in the variable cx:previous.

The pipeline in Example 2.14, “Looping until a condition is true” demonstrates a horribly inefficient way to add explicit numbers to a list.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-inline-prefixes="#all"
                name="main" version="3.1">
  <p:output port="result" serialization="map{'indent':true()}"/>

  <p:identity name="identity">
    <p:with-input>
      <list>
        <item/>
        <item/>
        <item/>
      </list>
    </p:with-input>
  </p:identity>

  <cx:until test="deep-equal(., $cx:previous)">
    <p:replace match="/list/item[1]">
      <p:with-input port="replacement">
        <li number="{p:iteration-position()}"/>
      </p:with-input>
    </p:replace>
  </cx:until>

</p:declare-step>

Example 2.14. Looping until a condition is true

The first time through the loop, the first item is replaced. The next time through, the next item is replaced, etc. Looping stops when nothing changes in the document (that is, when all the items have been replaced).

The result of the cx:until step is the first document for which the test expression is true. In this case, the result is:

<list>
   <li number="1"/>
   <li number="2"/>
   <li number="3"/>
</list>

It is a dynamic error (err:XD0001) if the source is not a single document.

2.4.8. Loop while a condition is true (cx:while)

This is an extension compound step that processes single documents, applying its subpipeline while the test expression is true.

<cx:while
name? = `NCName`	The step name
test = `XPathExpression`	The test expression
{any-name}* = `string`	Additional attributes
>
(with-input?, output?, `subpipeline`)
</cx:while>

The somewhat contrived example in Example 2.15, “Looping while a condition is true” loops over the document adding a new first child until the count reaches zero.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:c="http://www.w3.org/ns/xproc-step"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                exclude-inline-prefixes="#all"
                name="main" version="3.1">
  <p:output port="result" serialization="map{'indent':true()}"/>

  <p:identity name="identity">
    <p:with-input>
      <doc count="3"/>
    </p:with-input>
  </p:identity>

  <cx:while test="/doc/@count and xs:integer(/doc/@count) gt 0">
    <p:insert position="first-child">
      <p:with-input port="insertion">
        <insertion for="{/doc/@count}"/>
      </p:with-input>
    </p:insert>

    <p:add-attribute attribute-name="count" attribute-value="{xs:integer(/doc/@count) - 1}"/>

    <p:if test="xs:integer(/doc/@count) = 0">
      <p:delete match="/doc/@count"/>
    </p:if>
  </cx:while>

</p:declare-step>

Example 2.15. Looping while a condition is true

The result of the cx:while step is the first document for which the test expression did not have an effective boolean value of true. In this case, the result is:

<doc>
   <insertion for="1"/>
   <insertion for="2"/>
   <insertion for="3"/>
</doc>

It is a dynamic error (err:XD0001) if the source is not a single document.

The test attribute specifies an XPath expression. The document is provided as the context item. If the expression is false, the subpipeline is not run (or not run again).

2.5. Atomic steps

A great many pipelines that you write will be like shell scripts or “main” functions in other programming languages: they run, they do a thing, and they end. But in fact, with a little extra markup, every pipeline that you declare can also be reused as an atomic step elsewhere. In this way, there is an unbounded number of atomic steps: there are all of the standard ones (summarized in Part I, “Standard steps”), there are all of the extension steps that ship with XML Calabash (summarized in Part III, “XML Calabash extension steps”), and then there are all the steps that you write.
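
The “little extra markup” is chiefly the type attribute. The sketch below is one way this might look, with hypothetical names throughout (the ex prefix, its namespace, and the ex:strip-comments step type are assumptions made for illustration). A nested p:declare-step gives a small pipeline a type, and the enclosing pipeline then calls it exactly like an atomic step.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:ex="http://example.com/ns"
                name="main" version="3.1">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- A pipeline with a type: this is what makes it callable by name. -->
  <p:declare-step type="ex:strip-comments">
    <p:input port="source"/>
    <p:output port="result"/>
    <p:delete match="comment()"/>
  </p:declare-step>

  <!-- Used here just like a built-in atomic step. -->
  <ex:strip-comments/>
</p:declare-step>
```

A typed step kept in its own file can be brought in with p:import instead, which is how Example 2.13 reuses the ex:format step.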

2.6. Variables

A subpipeline may use p:variable to hold the result of a computation.

<p:variable
name = `EQName`	The variable name
as? = `XPathSequenceType`	The required value type
select = `XPathExpression`	Variable value
collection? = `boolean`	Inputs as default collection?
href? = { `anyURI` }	Document URI
pipe? = `string`	A pipe binding
{any-name}* = `string`	Additional attributes
exclude-inline-prefixes? = `string`	A space-separated list of namespace prefixes
>
((empty | (document | pipe | inline)*) | `anyElement`*)
</p:variable>

The notes about the attributes on p:with-option apply to p:variable.

Expressions which occur later in the subpipeline may refer to the variable.
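
As a brief sketch, the pipeline below computes a value once and refers to it in a later step. The list and item element names, the count attribute, and the variable name item-count are all hypothetical.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                name="main" version="3.1">
  <p:input port="source"/>
  <p:output port="result"/>

  <!-- Evaluated against the document on the default readable port. -->
  <p:variable name="item-count" as="xs:integer" select="count(/list/item)"/>

  <!-- A later step refers to the variable in an attribute value template. -->
  <p:add-attribute match="/list" attribute-name="count"
                   attribute-value="{$item-count}"/>
</p:declare-step>
```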

2.7. Extra information

Documentation can be placed anywhere in the pipeline with the p:documentation element. It’s ignored by the processor.

<p:documentation
{any-name}* = `string`	Additional attributes
>
`any-well-formed-content`*
</p:documentation>
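
For illustration, the sketch below documents both a pipeline and one of its steps. Using XHTML inside p:documentation is an assumption made for the example, not a requirement; any well-formed content is allowed.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                name="main" version="3.1">
  <p:documentation>
    <p xmlns="http://www.w3.org/1999/xhtml">Copies the source document
    to the result port unchanged.</p>
  </p:documentation>

  <p:input port="source"/>
  <p:output port="result"/>

  <p:identity>
    <!-- Documentation may also be attached to an individual step. -->
    <p:documentation>Placeholder for real processing.</p:documentation>
  </p:identity>
</p:declare-step>
```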

The p:pipeinfo element is intended for additional information that a particular processor might use. XML Calabash uses it for assertions, for example.

<p:pipeinfo
{any-name}* = `string`	Additional attributes
>
`any-well-formed-content`*
</p:pipeinfo>

ⓘ

Note

The p:documentation and p:pipeinfo elements have no special significance inside a p:inline; in that context, they’re just inline elements.
