Name

cx:pdf-form — Programmatically fill in PDF forms.

Synopsis

This step attempts to programmatically fill in PDF forms.

Input port   Primary  Sequence  Content types
source                          application/pdf
data                            application/xml

Output port  Primary  Sequence  Content types
result       ✔                  application/pdf

Option name  Type        Values               Default value
compression  xs:string   ('none', 'default')  'default'
password     xs:string?                       ()

This is an extension step; to use it, your pipeline must include its declaration. For example, by including the extension library with an import at the top of your pipeline:
<p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>
Declaration
<p:declare-step xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                type="cx:pdf-form">
   <p:input port="source" content-types="application/pdf"/>
   <p:input port="data" content-types="application/xml"/>
   <p:output port="result" content-types="application/pdf"/>
   <p:option name="password" as="xs:string?"/>
   <p:option name="compression" values="('none', 'default')" select="'default'"/>
</p:declare-step>
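
As a rough sketch of how the declaration maps onto a call, the pipeline below imports the library, binds both input ports to files, and returns the updated PDF on its own result port. The file names form.pdf and form-data.xml are placeholders, not files that ship with the step.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="3.0">
  <!-- Makes the cx:pdf-form declaration available to this pipeline. -->
  <p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>

  <!-- Implicitly connected to the primary output of the last step. -->
  <p:output port="result" content-types="application/pdf"/>

  <!-- compression accepts 'none' or 'default'; 'default' applies if omitted. -->
  <cx:pdf-form compression="none">
    <!-- Placeholder file names; substitute your own documents. -->
    <p:with-input port="source" href="form.pdf"/>
    <p:with-input port="data" href="form-data.xml"/>
  </cx:pdf-form>
</p:declare-step>
```

Neither input port is primary, so both are bound explicitly by name.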

Description

This step uses the form data provided on the data port to update the form in the PDF. The XML document on the data port should conform to the RELAX NG grammar for the http://xmlcalabash.com/ns/acro-form namespace.

The easiest way to construct an example of this grammar (for a given form) is to use the cx:pdf-info step with the form-details option set to true.
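
A minimal sketch of that approach follows. It assumes that cx:pdf-info is made available by the same library import, that it has a single (and therefore primary) input port, and that the form details appear on its primary output. None of these are confirmed here, so check the cx:pdf-info reference before relying on them.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="3.0">
  <!-- Assumption: cx:pdf-info is declared in the same extension library. -->
  <p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>

  <p:output port="result"/>

  <!-- form-details="true" asks for the form fields to be reported. -->
  <cx:pdf-info form-details="true">
    <!-- No port name is given, so this binds the step's primary input
         (assumed to be its only input port). -->
    <p:with-input href="libreoffice-form.pdf"/>
  </cx:pdf-info>
</p:declare-step>
```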

For example, the test suite contains a PDF document with a form, libreoffice-form.pdf. The form details for that PDF are:

<f:acro-form xmlns:f="http://xmlcalabash.com/ns/acro-form">
  <f:text name="First Name" type="Tx">Alice</f:text>
  <f:text name="Last Name" type="Tx"/>
  <f:radiobutton name="female" type="Btn">Off</f:radiobutton>
  <f:text name="Birthday" type="Tx"/>
  <f:checkbox name="gdpr" type="Btn">Off</f:checkbox>
  <f:checkbox name="other" type="Btn">Off</f:checkbox>
  <f:text name="First Name_2" type="Tx">Bob</f:text>
  <f:combobox name="Nationality" type="Ch">
    <f:choice>Unknown</f:choice>
    <f:choice>German</f:choice>
    <f:choice>Indonesian</f:choice>
    <f:choice>US-American</f:choice>
    <f:choice>French</f:choice>
    <f:choice>Spanish</f:choice>
    <f:choice>Italian</f:choice>
  </f:combobox>
</f:acro-form>

To update that form, you might provide a document like this on the data port:

<f:acro-form xmlns:f="http://xmlcalabash.com/ns/acro-form">
  <f:text name="Last Name" type="Tx">Wonderland</f:text>
  <f:radiobutton name="female" type="Btn">1</f:radiobutton>
  <f:checkbox name="gdpr" type="Btn">1</f:checkbox>
  <f:checkbox name="other" type="Btn">Off</f:checkbox>
  <f:combobox name="Nationality" type="Ch">
    <f:value>Unknown</f:value>
  </f:combobox>
</f:acro-form>

Fields that are not included are left unchanged. It is an error to include a field that does not exist in the form. The PDFBox libraries also perform some validation of the values, and cx:pdf-form can sometimes fail for reasons beyond XML Calabash’s control, such as missing fonts.

Do not include an f:signature field; updating signatures is not supported.
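
To illustrate a partial update, the sketch below supplies the form data inline and writes the result to disk with p:store. Only two fields from the example form are listed, so the remaining fields should be left as they are. The output file name filled-form.pdf is a placeholder, and libreoffice-form.pdf is assumed to be reachable relative to the pipeline.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="3.0">
  <p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>

  <cx:pdf-form>
    <p:with-input port="source" href="libreoffice-form.pdf"/>
    <!-- Implicit inline document: only the fields to be changed are listed. -->
    <p:with-input port="data">
      <f:acro-form xmlns:f="http://xmlcalabash.com/ns/acro-form">
        <f:text name="Last Name" type="Tx">Wonderland</f:text>
        <f:checkbox name="gdpr" type="Btn">1</f:checkbox>
      </f:acro-form>
    </p:with-input>
  </cx:pdf-form>

  <!-- Reads the updated PDF from the preceding step's primary output. -->
  <p:store href="filled-form.pdf"/>
</p:declare-step>
```

Keeping the data document small also limits the places where a misspelled field name could cause the step to fail.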

AcroForm RELAX NG grammar

This is the grammar for form data.

default namespace = "https://xmlcalabash.com/ns/acro-form"
namespace f = "https://xmlcalabash.com/ns/acro-form"

start = f.acro-form

anyAttribute = attribute * { text }*

fieldAttributes =
    attribute name { text },
    attribute (* - name) { text }*

field = f.field | f.checkbox | f.radiobutton | f.button | f.text | f.combobox | f.listbox
        | f.signature

buttonValues = "On" | "Off" | "Yes" | "No" | "true" | "false" | "1" | "0"

f.acro-form =
    element f:acro-form {
        field+
    }

f.field =
    element f:field {
        fieldAttributes,
        (text | f.text-value)
    }

f.checkbox =
    element f:checkbox {
        fieldAttributes,
        (buttonValues | f.button-value)
    }

f.radiobutton =
    element f:radiobutton {
        fieldAttributes,
        (buttonValues | f.button-value)
    }

f.button =
    element f:button {
        fieldAttributes,
        (buttonValues | f.button-value)
    }

f.button-value =
    element f:value {
        anyAttribute,
        (buttonValues)
    }

f.text =
    element f:text {
        fieldAttributes,
        (text | f.text-value)
    }

f.text-value =
    element f:value {
        anyAttribute,
        text
    }

f.combobox =
    element f:combobox {
        fieldAttributes,
        f.choice*,
        f.value
    }

f.listbox =
    element f:listbox {
        fieldAttributes,
        f.choice*,
        f.value
    }

f.signature =
    element f:signature {
        fieldAttributes,
        empty
    }

f.choice =
    element f:choice {
        anyAttribute,
        text
    }

f.value =
    element f:value {
        anyAttribute,
        text
    }

Document properties

No document properties are preserved.

Additional examples

The XML Calabash test suite contains examples of the cx:pdf-form step.