Name

cx:polyglot — Evaluate steps implemented in other programming languages.

Synopsis

Input port    Primary  Sequence  Content types  Default binding
source        ✔        ✔                        p:empty
program                          text

Output port   Primary  Sequence  Content types  Default binding
result        ✔

Option name          Type                    Default value
args                 xs:string*              ()
language             xs:string               ()
parameters           map(xs:QName,item()?)?  ()
result-content-type  xs:string?              ()

Declaration

  |<p:declare-step xmlns:p="http://www.w3.org/ns/xproc">
  |   <p:input port="source" primary="true" sequence="true">
  |      <p:empty/>
  |   </p:input>
  |   <p:input port="program" content-types="text"/>
  |   <p:output port="result"/>
  |   <p:option name="language" as="xs:string"/>
  |   <p:option name="args" as="xs:string*"/>
  |   <p:option name="result-content-type" as="xs:string?"/>
  |   <p:option name="parameters" as="map(xs:QName,item()?)?"/>
  |</p:declare-step>

Installation

The cx:polyglot step is not included in the standard XML Calabash release. You must obtain it separately and install it on your classpath before you can use this step. (More detailed instructions, T.B.D.)

Description

The cx:polyglot step uses the GraalVM Polyglot Programming library to run step implementations written in other programming languages.

The polyglot extension includes cx:javascript and cx:python for running JavaScript and Python scripts, respectively. Other languages, including Ruby, R, and Java, are possible if their dependencies are installed and GraalVM is configured appropriately.

If a document appears on the source port, it will be serialized and made available to the program on standard input. It is a dynamic error (cxerr:XI0047) if more than one document appears on the source port.
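As a minimal sketch of the script side, a cx:python program might read and parse its source as shown below. It assumes that the serialized document is XML and that standard input is reachable through Python's ordinary sys.stdin. The name read_source is hypothetical, and the sample document exists only so the sketch can run outside the processor.

```python
import io
import sys
import xml.etree.ElementTree as ET


def read_source(stream=None):
    """Parse the serialized source document from a text stream."""
    stream = sys.stdin if stream is None else stream
    text = stream.read()
    # Assumption: an empty source port shows up as empty standard input.
    if not text.strip():
        return None
    return ET.fromstring(text)


# Stand-in for standard input so the sketch runs on its own.
sample = io.StringIO("<doc><item>a</item><item>b</item></doc>")
root = read_source(sample)
item_count = len(root.findall("item"))
```

Inside the step, the call would simply be read_source() with no argument, so that the function falls back to sys.stdin.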

The language option must be a GraalVM language identifier. For the cx:javascript and cx:python steps, this is set automatically.

The program text must appear on the program port. The step will fail if the input is not syntactically correct for the specified language.

The value of the args option will be passed as the arguments to the program. The first argument passed to scripts is conventionally the name of the script executable. The polyglot step sets that to the base URI of the invoking step.
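The following sketch shows how a Python script might separate the base URI from the remaining arguments. It assumes that the arguments surface as sys.argv, as they would for an ordinary Python script. The helper split_arguments and the sample values are hypothetical.

```python
import sys


def split_arguments(argv=None):
    """Separate the script-name slot from the caller-supplied arguments."""
    argv = sys.argv if argv is None else argv
    # By convention argv[0] names the script; the polyglot step is
    # described as putting the invoking step's base URI there.
    base_uri = argv[0] if argv else None
    return base_uri, list(argv[1:])


# Made-up values standing in for what the step would pass.
base_uri, options = split_arguments(["file:/pipelines/demo.xpl", "--verbose", "3"])
```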

The parameters option is used to initialize the in-scope variables for the script. The names of the parameters must not be in a namespace. The values are converted to language-appropriate values where possible. XML values are serialized and passed as strings.
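To illustrate, the sketch below uses two hypothetical parameters, threshold and label, and assumes that in-scope variables appear as ordinary global names in Python. The globals().get() fallbacks are there only so the sketch runs outside the processor; a script run by the step could refer to the names directly.

```python
# Hypothetical parameter names; the pipeline author chooses the real keys.
# Assumption: each parameter is visible as a global variable of that name.
threshold = globals().get("threshold", 10)
label = globals().get("label", "demo")

values = [4, 12, 25]
selected = [v for v in values if v > threshold]
summary = f"{label}: {len(selected)} of {len(values)} above {threshold}"
```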

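The result-content-type option controls how the script returns a value. If it is absent, the value of the last expression in the script is the result. If it is present, the text the script writes to standard output is interpreted according to that content type.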

If the script ends with a non-zero exit code, the step will fail.
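A script can use this to signal failure deliberately. The sketch below assumes that a status passed to sys.exit is what the step sees as the exit code. The helper check_input and the status value 2 are illustrative choices, not part of the step.

```python
import sys


def check_input(text):
    """Return 0 when the input is acceptable, a non-zero code otherwise."""
    if not text.strip():
        print("no input received", file=sys.stderr)
        return 2
    return 0


# Demo values; a real script would check what it read from standard
# input and call sys.exit(code) when the code is non-zero.
ok_code = check_input("<doc/>")
bad_code = check_input("   ")
```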

Returning values

There are two ways to return a value: directly as the last expression in the script, or by writing the result to standard output.

Direct results

If there is no result-content-type, the step assumes that the result will be returned directly by the script.

Returning results from a script is a bit unorthodox; scripts don’t usually return anything except an exit code. The GraalVM library treats the last expression in the script as the return value. For example, this Python script “returns” the number 42.

  |print("It doesn’t matter what you do here")
  | 
  |42

The advantage of using direct results is that they don’t have to be serialized and reparsed. Direct results can only be atomic values, maps, and arrays.

🛑
Warning

The GraalVM library always returns something from your script. If you don’t provide a final expression, the resulting value is likely to be uninterpretable. (It will probably appear to be a map that contains non-XML characters.)

Always return something.
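As a sketch of a direct result that is more than a single number, the script below ends with a dictionary. It assumes that a Python dict is converted to a map and a Python list to an array. That mapping and the key names are assumptions for illustration.

```python
words = ["alpha", "beta", "gamma"]

# Build the value to hand back. Assumption: dict becomes a map and
# list becomes an array when the step converts the result.
result = {
    "count": len(words),
    "longest": max(words, key=len),
    "words": words,
}

# The last expression is the value of the script, so end with it.
result
```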

On standard output

If a result-content-type is specified, the step assumes that the result will be written to standard output. The result of the step will be the text that appears on standard output interpreted according to the result-content-type.

This is the only way to return XML.
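A minimal sketch of a script that returns XML this way follows. It assumes the step was invoked with an XML result-content-type such as application/xml. The element names and the helper build_report are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def build_report(items):
    """Build a small XML document; the element names are illustrative."""
    root = ET.Element("report")
    for name in items:
        ET.SubElement(root, "item").text = name
    # encoding="unicode" yields a str with no XML declaration.
    return ET.tostring(root, encoding="unicode")


xml_text = build_report(["a", "b"])
# The text written to standard output is what the step will parse.
sys.stdout.write(xml_text)
```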