Appendix A. Configuration

XML Calabash can read a configuration file to establish some default settings. The configuration file is an XML document. All of the elements in the configuration file must be in the https://xmlcalabash.com/ns/configuration namespace. The conventional prefix for this namespace in the documentation is cc:.

cc:xml-calabash

The document element of the configuration file is cc:xml-calabash:

<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration"
  version? = 1.0
  saxon-configuration? = string
  licensed? = boolean
  verbosity? = trace|debug|progress|info|warn|error>
    (cc:graphviz |
     cc:inline |
     cc:mimetype |
     cc:paged-media |
     cc:proxy |
     cc:saxon-configuration-property |
     cc:send-mail |
     cc:serialization |
     cc:system-property |
     cc:message-reporter |
     cc:visualizer |
     cc:threading)*
</cc:xml-calabash>

version (string)

The configuration file version. The value must be 1.0.

saxon-configuration (filename)

The filename of a Saxon configuration file. This file will be loaded to initialize the Saxon configuration.

licensed (boolean)

If true, a licensed Saxon configuration will be requested. In practice, a licensed processor is used by default if one is available. Setting this property to false, however, explicitly requests an unlicensed processor even if Saxon PE or Saxon EE is on the classpath.

This can also be specified on the command line. The command-line setting takes precedence.

Schema-aware processing requires Saxon EE and a valid Saxon license.

verbosity

The default “verbosity” setting. This can also be specified on the command line. The command-line setting takes precedence.

For simplicity, the content model of cc:xml-calabash allows every element to occur an arbitrary number of times. Where an element defines a single, global setting, the last value in document order applies.
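As a minimal sketch of how these pieces fit together, a complete configuration file might look like the following. The filename saxon-config.xml and the particular attribute values are assumptions chosen for illustration; the child elements are described in the sections below.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration"
                 version="1.0"
                 saxon-configuration="saxon-config.xml"
                 licensed="false"
                 verbosity="info">
  <!-- saxon-config.xml is a hypothetical filename. -->
  <!-- Child elements may appear in any order, any number of times. -->
  <cc:inline trim-whitespace="true"/>
  <cc:mimetype content-type="application/xml" extensions="xpl xproc"/>
</cc:xml-calabash>
```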

cc:graphviz

Identifies the location of the Graphviz executable. Making SVG diagrams of pipelines or graphs requires Graphviz.

<cc:graphviz
  dot = string />

dot (filename)

Location of the Graphviz “dot” executable.
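For illustration, an entry along these lines would point XML Calabash at a Graphviz installation. The path /usr/bin/dot is only an assumption about where the executable lives; substitute the location on your system.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Assumed path; adjust to wherever "dot" is installed. -->
  <cc:graphviz dot="/usr/bin/dot"/>
</cc:xml-calabash>
```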

cc:inline

Properties related to p:inline elements.

<cc:inline
  trim-whitespace = boolean />

trim-whitespace (boolean)

It’s often convenient to use indentation in a pipeline document:

  |  <p:with-input port="source">
  |    <p:inline>
  |      <document/>
  |    </p:inline>
  |  </p:with-input>

But that introduces whitespace at the beginning and end of the inline document. As written, the document provided on the source port consists of a newline, six spaces, the <document/> element, a newline, and four spaces. Sometimes that’s annoying. It’s possible to rewrite the example so that there’s no insignificant whitespace, but that makes the pipeline harder to read.

If trim-whitespace is true, leading and trailing whitespace in p:inline elements is removed. This setting does not apply to implicit inlines because they never have leading or trailing whitespace.
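A configuration that turns trimming on could be sketched like this. With it in place, an indented p:inline such as the one shown earlier would be expected to deliver just the <document/> element.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Remove leading and trailing whitespace from p:inline content. -->
  <cc:inline trim-whitespace="true"/>
</cc:xml-calabash>
```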

cc:mimetype

Define additional mappings from filename extensions to content types.

<cc:mimetype
  content-type = string
  extensions = string />

XML Calabash uses javax.activation to look up MIME types. You can define new types by creating an appropriately formatted .mime.types file in your home directory. This will work for all applications that read the .mime.types file.

Alternatively, you can define them in the configuration file.

content-type (MIME type)

The content type.

extensions (extension+)

A space-separated list of filename extensions to associate with the content type.

For example, this entry:

  |<cc:mimetype content-type="application/xml" extensions="xpl xproc"/>

Will tell XML Calabash that filenames (or URIs, generally, in the absence of server metadata) that end with .xpl or .xproc should be interpreted as files with the application/xml content type.

cc:paged-media

Select and configure paged media providers.

<cc:paged-media
  css-formatter? = string
  xsl-formatter? = string
  {any-name} = string />

At least one of css-formatter or xsl-formatter must be provided. The value of the attribute should be the URI that identifies the processor that you want to select.

When searching for a CSS or XSL FO formatter, XML Calabash will try to instantiate the processors in the order you specify them, selecting the first one that’s successfully instantiated. To indicate that any acceptable processor can be used, specify https://xmlcalabash.com/paged-media/css-formatter for a CSS processor, or https://xmlcalabash.com/paged-media/xsl-formatter for an XSL FO processor.

Any additional attribute/value pairs on the element are passed to the processor as configuration data. The accepted attributes and their valid values vary depending on the processor. No configuration properties are supported for the generic processors.

See p:css-formatter and p:xsl-formatter in the Reference Guide for details.
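As a hedged sketch, the following entries request any acceptable CSS formatter and any acceptable XSL FO formatter, using the two generic URIs given above. No extra attributes appear because the generic processors accept no configuration properties; a processor-specific entry would use that processor's own URI and attributes, which are not shown here.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Accept whichever CSS formatter can be instantiated. -->
  <cc:paged-media
      css-formatter="https://xmlcalabash.com/paged-media/css-formatter"/>
  <!-- Accept whichever XSL FO formatter can be instantiated. -->
  <cc:paged-media
      xsl-formatter="https://xmlcalabash.com/paged-media/xsl-formatter"/>
</cc:xml-calabash>
```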

cc:proxy

Define proxy URIs for internet protocol requests.

<cc:proxy
  scheme = string
  uri = anyURI />

scheme (protocol scheme)

The protocol scheme.

uri (anyURI)

The proxy URI.

If your network configuration requires the use of a proxy, you can define it with cc:proxy. For example, this establishes that requests for http: URIs should use the proxy at http://localhost:8888.

  |<cc:proxy scheme="http" uri="http://localhost:8888"/>

cc:saxon-configuration-property

Sets a Saxon configuration property.

<cc:saxon-configuration-property
  name = string
  value = string />

name (property name)

The Saxon configuration property name.

value

The property value.

XML Calabash does not maintain a list of valid properties; they are defined by Saxon. Attempting to set a property that doesn’t exist will throw an exception. Boolean-valued properties must have the value true or false.
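The sketch below sets one Boolean-valued Saxon property. It assumes that the name attribute takes the feature URI that Saxon uses to identify the property; that assumption, and the choice of this particular feature, are illustrative rather than taken from the XML Calabash documentation.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Assumption: properties are named by their Saxon feature URI. -->
  <cc:saxon-configuration-property
      name="http://saxon.sf.net/feature/allow-external-functions"
      value="false"/>
</cc:xml-calabash>
```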

cc:send-mail

Define properties for the p:send-mail step.

<cc:send-mail
  host = string
  port? = integer
  username? = string
  password? = string />

host (string)

The SMTP server host.

port (integer)

The server port.

username (string)

The user name, if login is required.

password (string)

The password, if login is required.

In order to send mail, the p:send-mail step needs to know the location of the SMTP server and login credentials, if they are required.
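For example, a server that requires login might be configured roughly as follows. The host, port, and credentials are placeholders, not real values.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- All values below are hypothetical placeholders. -->
  <cc:send-mail host="smtp.example.com"
                port="587"
                username="pipeline-user"
                password="change-me"/>
</cc:xml-calabash>
```

Because the password is stored in plain text, it is probably wise to restrict read access to the configuration file.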

cc:serialization

Default serialization properties for particular content types.

<cc:serialization
  content-type = string
  {any-name}* = string />

content-type (MIME type)

The content type.

any-name

Any attributes on the cc:serialization element other than content-type define the default serialization properties for documents with the corresponding content type.

For example, adding this to your configuration file:

  |<cc:serialization content-type="text/html"
  |                  method="html" html-version="5"/>

Will serialize text/html documents using HTML 5 serialization by default. The serialization properties on a document take precedence over these defaults.

cc:system-property

Set Java system properties before running a pipeline.

<cc:system-property
  name = string
  value = string />

name (property name)

The Java system property name.

value

The property value.

Any properties specified in the configuration file will be set before the pipeline runs.
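A sketch of one such entry follows. The property name com.example.cache-dir and its value are hypothetical; any name that your pipeline or its extensions read with System.getProperty could be used instead.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Hypothetical property name and value, for illustration only. -->
  <cc:system-property name="com.example.cache-dir"
                      value="/tmp/calabash-cache"/>
</cc:xml-calabash>
```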

cc:message-reporter

Configure the message reporter. The only option is buffer-size.

<cc:message-reporter
  buffer-size? = integer />

buffer-size (integer)

Sets the number of messages buffered. These can be retrieved in a pipeline with cx:pipeline-messages. If the value is negative, there is no limit on the number of messages buffered. The default value is 32.
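To keep more than the default 32 messages, the buffer can be enlarged. The value 256 here is an arbitrary choice for illustration.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- Buffer up to 256 messages; a negative value removes the limit. -->
  <cc:message-reporter buffer-size="256"/>
</cc:xml-calabash>
```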

cc:visualizer

Control which visualizer is used and its options.

<cc:visualizer
  name = silent|plain|detail
  {any-name} = string />

The name must be specified. Additional attributes provide options for the visualizer.

There are three options for the name:

silent

Silent, no progress is reported.

plain

Plain, the name of each step is reported when it begins running. Most steps manufactured automatically during graph construction are omitted. There is one option, indent, which determines whether, and to what extent, reports are indented when they are nested inside compound steps.

detail

Detailed, the start and end of each step are identified, and the documents that the steps produce can also be identified.

If the steps option is true, the progress of steps is recorded. (Defaults to true.)

If the documents option is true, the documents produced during execution are recorded. (Defaults to false.)
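Putting the detail options together, a configuration that records both step progress and the documents produced might be sketched as follows. It assumes the options are written as true/false attribute values, as the descriptions above suggest.

```xml
<cc:xml-calabash xmlns:cc="https://xmlcalabash.com/ns/configuration">
  <!-- steps defaults to true; documents must be enabled explicitly. -->
  <cc:visualizer name="detail" steps="true" documents="true"/>
</cc:xml-calabash>
```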

cc:threading

Control aspects of XML Calabash threading.

<cc:threading
  count = integer />

count (integer)

The size of the thread pool.

This setting has no effect at the moment because XML Calabash is currently single-threaded.