Chapter 1. Installation
XML Calabash version 3.0.0-alpha14 is still very much an alpha release. While it remains in alpha or beta, it is published only on the releases page. That release is the command-line application. In the future, library releases will be published as well.
Download the latest release and unzip it on your filesystem. That will create an xmlcalabash-3.0.0-alpha14/ directory containing README.md, xmlcalabash-3.0.0-alpha14.jar, and a lib/ directory containing lots of jar files. The application ships with all of the dependencies necessary to run all of the steps, including the extension steps.
System configuration
All JVM applications are sensitive to some aspects of your system configuration: for example, which version of the Java Virtual Machine you’re using and whether you need a proxy to connect to the internet.
MIME Types
An XProc processor is especially sensitive to the way MIME types are configured. An XProc pipeline can process resources of any type: XML, JSON, ZIP, etc. The content type of a resource accessed through HTTP (or HTTPS) is always identified by the server. But the content type of a resource loaded from the filesystem depends on its filename extension and how your JVM is configured.
The terms “MIME type”, “media type”, and “content type” all refer to the same thing. To quote from the Wikipedia article on media types:
The IANA and IETF use the term “media type”, and consider the term “MIME type” to be obsolete, since media types have become used in contexts unrelated to email, such as HTTP. By contrast, the WHATWG continues to use the term “MIME type” and discourages use of the term “media type” as ambiguous, since it is used with a different meaning in connection with the CSS @media feature. The HTTP response header for providing the media type is Content-Type. The W3C has used ContentType as an XML data-type name for a media type. XDG specifications implemented by Linux desktop environments continue to use the term “MIME type”.
Following the lead of W3C with respect to XML, XProc consistently uses the term “content type”. In this user guide the term “MIME type” is used where that’s consistent with what the JVM documentation uses.
When the JVM starts, it looks for a file named “.mime.types” in the user’s home directory*. If it finds one, it uses it to build a mapping from filename extensions to MIME types. The .mime.types file is a text file where each line consists of a MIME type followed by a space-separated list of filename extensions. For example:
    # MIME type mapping (Lines that start with “#” are comments.)
    application/json json
    application/nvdl+xml nvdl
    application/relax-ng-compact-syntax rnc
    application/relax-ng+xml rng
    application/schematron+xml sch
    text/plain text txt css
    application/xml xml xpl fo
    application/xquery xq xqy
    application/xsd+xml xsd
    application/xslt+xml xsl xslt
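The file format is simple enough to read in a few lines of code. The following is a minimal sketch in Python, for illustration only: it is not how the JVM or XML Calabash parses the file, and the function name parse_mime_types is hypothetical. It also assumes that when an extension appears more than once, the first mapping is kept; this guide does not say how duplicates are actually resolved.

```python
def parse_mime_types(text: str) -> dict[str, str]:
    """Build an extension -> MIME type mapping from mime.types-style text."""
    mapping: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines, which start with "#".
        if not line or line.startswith("#"):
            continue
        # The first field is the MIME type; the remaining fields are extensions.
        mime_type, *extensions = line.split()
        for extension in extensions:
            # Assumption: keep the first mapping seen for an extension.
            mapping.setdefault(extension, mime_type)
    return mapping


SAMPLE = """\
# MIME type mapping
application/json json
text/plain text txt css
application/xml xml xpl fo
"""

mapping = parse_mime_types(SAMPLE)
print(mapping["xpl"])  # application/xml
print(mapping["css"])  # text/plain
```

Each extension becomes a key, so a line that lists several extensions produces several entries that share one MIME type.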
For XML Calabash, you can also define these mappings in the configuration file.
If there is no MIME type defined for a particular extension, it will be identified as an “application/octet-stream” resource. That’s binary. It is very explicitly not XML or HTML or JSON or text. Most steps will reject binary resources. This will result in errors that might be very confusing if you’re not aware of the problem. To avoid that, XML Calabash takes a heavy-handed approach.
After the MIME types have been configured from the system, from the user’s .mime.types file, and from the configuration file, if any of the extensions listed in Table 1.1, “Default MIME type mappings” are still identified as “application/octet-stream” resources, XML Calabash defines a pragmatically more useful default.
Extension | Default MIME type |
---|---|
7z | application/x-7z-compressed |
a | application/x-archive |
arj | application/x-arj |
bmp | image/bmp |
bz2 | application/bzip2 |
cpio | application/x-cpio |
css | text/plain |
eps | image/eps |
epub | application/epub+zip |
fo | application/xml |
gif | image/gif |
gz | application/gzip |
gzip | application/gzip |
jar | application/java-archive |
jpg, jpeg | image/jpeg |
json | application/json |
lzma | application/lzma |
nvdl | application/nvdl+xml |
pdf | application/pdf |
rnc | application/relax-ng-compact-syntax |
rng | application/relax-ng+xml |
sch | application/schematron+xml |
svg | image/svg+xml |
tar | application/x-tar |
text | text/plain |
txt | text/plain |
xml | application/xml |
xpl | application/xml |
xq | application/xquery |
xqy | application/xquery |
xsd | application/xsd+xml |
xsl | application/xslt+xml |
xslt | application/xslt+xml |
xz | application/xz |
zip | application/zip |
If these defaults are problematic, use one of the existing configuration mechanisms to define the mapping you prefer.
Also, if you have a different convention, perhaps using the extension “.schematron” for Schematron files or “.xs” for XML Schema documents, you will have to provide that mapping yourself. You will also want to provide mappings for other extensions you use.
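The lookup order described in this section can be sketched as follows. This is an illustration in Python under stated assumptions, not XML Calabash’s implementation: the names resolve_content_type, DEFAULTS, and configured are hypothetical, the configured dictionary stands in for the merged system, .mime.types, and configuration-file mappings, and DEFAULTS copies only a few rows from Table 1.1.

```python
OCTET_STREAM = "application/octet-stream"

# A few rows copied from Table 1.1; the full table has more entries.
DEFAULTS = {
    "css": "text/plain",
    "json": "application/json",
    "xml": "application/xml",
    "xpl": "application/xml",
    "zip": "application/zip",
}


def resolve_content_type(filename: str, configured: dict[str, str]) -> str:
    """Return a content type for filename based on its extension.

    `configured` maps extensions to MIME types and stands in for whatever
    the system, the .mime.types file, and the configuration file defined.
    """
    # Treat the text after the last dot as the extension, if there is a dot.
    _, dot, extension = filename.rpartition(".")
    if not dot:
        return OCTET_STREAM
    content_type = configured.get(extension, OCTET_STREAM)
    # Consult the default table only when the result is still binary.
    if content_type == OCTET_STREAM:
        content_type = DEFAULTS.get(extension, OCTET_STREAM)
    return content_type


# Hypothetical user configuration, including a custom ".schematron" mapping.
configured = {"xml": "text/xml", "schematron": "application/schematron+xml"}

print(resolve_content_type("pipeline.xpl", configured))      # application/xml
print(resolve_content_type("doc.xml", configured))           # text/xml
print(resolve_content_type("rules.schematron", configured))  # application/schematron+xml
print(resolve_content_type("data.bin", configured))          # application/octet-stream
```

In this sketch an explicitly configured mapping always wins, the table applies only to extensions that would otherwise be binary, and an extension that appears in neither place stays “application/octet-stream”.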