Name

cx:pdf-to-images — Convert PDF pages to images.

Synopsis

This step converts PDF pages into raster images.

Input port   Primary   Sequence   Content types
source       ✔                    application/pdf

Output port  Primary   Sequence   Content types
result       ✔         ✔          image/png image/jpeg image/bmp

Option name  Type        Values                  Default value
dpi          xs:integer                          300
format       xs:string   ('png', 'jpeg', 'bmp')  'png'
password     xs:string?                          ()

This is an extension step; to use it, your pipeline must include its declaration. One way to do so is to import the extension library at the top of your pipeline:
<p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>
Declaration
<p:declare-step xmlns:cx="http://xmlcalabash.com/ns/extensions"
                xmlns:p="http://www.w3.org/ns/xproc"
                type="cx:pdf-to-images">
   <p:input port="source" content-types="application/pdf"/>
   <p:output port="result"
             sequence="true"
             content-types="image/png image/jpeg image/bmp"/>
   <p:option name="password" as="xs:string?"/>
   <p:option name="dpi" as="xs:integer" select="300"/>
   <p:option name="format" values="('png','jpeg','bmp')" select="'png'"/>
</p:declare-step>

Description

This step converts each page in the PDF file to an image. The result is a sequence of page images.
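
Because the result is a sequence, a pipeline usually has to iterate over it to handle pages one at a time. The following is a minimal sketch, not a definitive recipe: it relies on the step's default options and writes each page image to its own file. The output file names (page-1.png, page-2.png, and so on) are an assumption chosen for illustration.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="3.0">
  <!-- Makes the cx:pdf-to-images declaration available. -->
  <p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>

  <p:input port="source" content-types="application/pdf"/>

  <!-- Defaults apply: 300 dpi, PNG output. -->
  <cx:pdf-to-images/>

  <!-- One iteration per page image in the result sequence. -->
  <p:for-each>
    <!-- Hypothetical file names: page-1.png, page-2.png, ... -->
    <p:store href="page-{p:iteration-position()}.png"/>
  </p:for-each>
</p:declare-step>
```

Here p:iteration-position() supplies the 1-based position of the current image in the sequence, so the files are numbered in page order.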

The image resolution is specified with the dpi option; the image format with the format option.
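
As an illustration of these two options, the sketch below requests 150 dpi JPEG images instead of the defaults. The values 150 and 'jpeg' are arbitrary choices for the example; the pipeline's output port is narrowed to image/jpeg on the assumption that the step emits only the requested format.

```xml
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                version="3.0">
  <p:import href="https://xmlcalabash.com/ext/library/pdf-steps.xpl"/>

  <p:input port="source" content-types="application/pdf"/>
  <!-- sequence="true" because the step returns one image per page. -->
  <p:output port="result" sequence="true" content-types="image/jpeg"/>

  <!-- Options given as attributes; dpi is cast to xs:integer. -->
  <cx:pdf-to-images dpi="150" format="jpeg"/>
</p:declare-step>
```

A lower dpi value generally yields smaller images, which may be preferable for previews; the default of 300 is more suitable when page detail matters.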

Document properties

No document properties are preserved.

Additional examples

The XML Calabash test suite contains examples of the cx:pdf-to-images step.