Name

cx:selenium — Drive a web browser with Selenium.

Synopsis

The cx:selenium step uses Selenium to automate a web browser. The step can drive a web browser and extract all or part of rendered pages.

Input port    Primary  Sequence  Content types
source        ✔                  text

Output port   Primary  Sequence  Content types
result        ✔        ✔

Option name   Type                    Default value
arguments     xs:string*              ()
browser       xs:string?              ()
capabilities  map(xs:QName, item())?  ()

This is an extension step; to use it, your pipeline must include its declaration. For example, by including the extension library with an import at the top of your pipeline:
<p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
Declaration
<p:declare-step xmlns:p="http://www.w3.org/ns/xproc">
   <p:input port="source" content-types="text"/>
   <p:output port="result" sequence="true"/>
   <p:option name="browser" as="xs:string?"/>
   <p:option name="capabilities" as="map(xs:QName, item())?"/>
   <p:option name="arguments" as="xs:string*"/>
</p:declare-step>
Errors
Code          Description
cxerr:XC0023  It is a dynamic error (cxerr:XC0023) if the page URI does not match at least one of the whitelist expressions.

Description

Selenium automates browsers. Steps like p:http-request can interact with the web, making selected, individual requests. The cx:selenium step fires up an actual web browser and interacts with it. In practice, what this means is that JavaScript is executed and the result is available to the step.

Selenium is widely used for testing web applications. There are lots of programming APIs that can drive it. What the cx:selenium step does is expose that functionality through a small scripting language.

The goal here is to make a language that’s easy to use for common sorts of tasks, not one that can do everything that Selenium can do. It’s also been invented in a somewhat ad hoc manner by someone with relatively little Selenium programming experience. Suggestions for improvements are welcome.

Whitelisting

The cx:selenium step runs an actual web browser. In principle, if you can do something with a web browser, you can do it with this step: log in to your bank, order pizza, etc. Care is advised.

It is possible to whitelist the URIs that cx:selenium will load. Add a selenium element to your configuration:

<x:selenium xmlns:x="https://xmlcalabash.com/ext/ns/selenium"
            whitelist="http://localhost.*
                       https://testdata.xmlcalabash.com/.*"/>

The whitelist attribute is a space-separated list of regular expressions. If the page URI matches one of those regular expressions, the step will run. It is a dynamic error (cxerr:XC0023) if the page URI does not match at least one of the whitelist expressions.

The scripting language

A cx:selenium script begins with a version declaration, identifies the page to open in the browser, and has one or more statements.

script version 0.2 . page string . statement
Figure 1. Overall structure of a cx:selenium script

The scripting language is described in part with “railroad diagrams”. They indicate how a script is constructed from various constructs. In the diagrams, an oval containing bold text represents something you literally type. Words in rectangles are references to other parts of the grammar and what’s expected there is some example of that construct. Generally speaking, whitespace is expected between the ovals and boxes, except that whitespace around punctuation is often optional.

The summary in Figure 1, “Overall structure of a cx:selenium script” indicates that a script begins with the literal text “script version 0.2” followed by (optional) whitespace, the literal text “.”, whitespace, the literal text “page”, whitespace, any string, the literal text “.”, and one or more statements. For example:

script version 0.2. page "http://example.com/" .
output "Hello, world." .

Currently, the only version supported is “0.2”.

Statement

There are four kinds of blocks and about 20 different kinds of statements.

simpleStatement . block perform
Figure 2. A statement

A “simple” statement stands alone. A subset of the simple statements, the “compound” statements, can be joined together and performed at once. (This is an analog for the Selenium concept of building a sequence of actions and then performing them.)

compoundStatement call close cookie find message navigate output refresh reset set waitReady window
Figure 3. A simple statement

click drag key move pause release scroll send
Figure 4. A compound statement

compoundStatement then compoundStatement .
Figure 5. A perform statement
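
As a rough illustration of a perform statement, the sketch below joins three compound statements with then so that they are performed as one sequence of actions. The variable names and element ids are hypothetical, and the keyword order is inferred from the click, move, and release diagrams rather than taken from a tested script.

```text
# $handle and $target are hypothetical; the ids are placeholders
find $handle by id = "drag-handle" .
find $target by id = "drop-zone" .

# Press the mouse button on one element, move to another, then let go
click and hold $handle then move to $target then release .
```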

Blocks

There are four kinds of blocks: three conditionals (if, while, until) and subroutines.

Conditional blocks

if expression then statement endif
Figure 6. An if block

The statements in an “if” block are evaluated if (and only if) the effective boolean value of the test expression is true. An expression is a quoted string containing an XPath expression.

while expression do statement done
Figure 7. A while block

The statements in a “while” block are evaluated repeatedly as long as the effective boolean value of the test expression is true. If the test expression is initially false, the statements in the block are not executed at all.

until expression do statement done
Figure 8. An until block

The statements in an “until” block are evaluated repeatedly until the effective boolean value of the test expression is true; that is, the loop continues for as long as the expression is false. The statements are always evaluated at least once, because the expression is tested at the end of each loop.
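
The difference between the two loops can be sketched as follows. The “loading” id is a placeholder for some element that disappears when the page is ready; the statement forms are the ones used in the example at the end of this page.

```text
# while: the test comes first, so the body may never run
find $spinner by id = "loading" .
while "not(empty($spinner))" do
  pause PT0.25S .
  find $spinner by id = "loading" .
done

# until: the body runs once before the test is first evaluated
until "not(empty($row))" do
  find $row by selector = "table tbody tr" .
  pause PT0.25S .
done
```

In the second loop, $row is only referenced by the test after the body has set it.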

Subroutines

sub subroutine name [#xa] statement end
Figure 9. A subroutine

Subroutines are a way to group statements that you can evaluate with the call statement. Subroutines are collected before script evaluation begins, so they can appear anywhere a statement can occur, even after the call statements that refer to them. All subroutine names must be unique.

The name and the first statement must be separated by at least one newline.
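
A small sketch of a subroutine that is called before it is defined. The name loadMore and the “button” selector are illustrative only.

```text
# The calls may appear before the subroutine they name
call loadMore .
call loadMore .

subroutine loadMore
  find $button by selector = "button" .
  click $button .
  pause PT0.25S .
end
```

Because variables share one global scope, $button is still set after the subroutine returns.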

find statement

A find statement locates an element on the page and stores its (HTML) content in a variable. With the all keyword, it finds all of the elements that match the locator. If wait is added, the processor will wait as long as the specified duration for the locator to find at least one match. A pause specifies the duration to wait between each attempt; the default is 0.25s.

In Selenium, it’s an error if the locator doesn’t match anything. In the cx:selenium step, it’s not an error; the variable will simply hold the empty sequence. If, however, a further attempt is made to perform a Selenium action with the variable (click on it or send text to it, for example), an error will occur. You can avoid this by first testing if the variable is empty.

find all varname by name selector id link-text partial-link-text tag class xpath = string wait duration pause duration
Figure 10. The find statement

The token that follows “by” identifies the kind of match to be performed and consequently the form that the following string must have:

Token              Find by …               Example string
name               Name attribute          button-name
selector           CSS selector            .someClass
id                 ID                      someId
link-text          Exact text of a link    click here
partial-link-text  Partial text of a link  click
tag                Element name            form
class              Class name              someClass
xpath              XPath (1.0) expression  /html/body/h1[2]
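
The following sketch shows a few forms of the find statement, including a test for an empty result before clicking. The selectors and variable names are illustrative, and the placement of all, wait, and pause is inferred from the find diagram, so treat it as an assumption rather than tested syntax.

```text
# First match only; $heading holds the empty sequence if nothing matches
find $heading by tag = "h1" .

# All matches, retrying for up to five seconds, half a second between attempts
find all $rows by selector = "table tbody tr" wait PT5S pause PT0.5S .

# Guard before acting on something that may be absent
find $more by link-text = "more" .
if "not(empty($more))" then
  click $more .
endif
```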

There is one, global scope for variable names and they are mutable. Whether they are set with find or set, whether they are set in the main body of the script or in a subroutine, they always have the last value set.

set statement

A set statement sets a variable to some value. This can be some property of the window or page, a cookie, a string, the result of evaluating an XPath expression, or a property of some element on the page.

set varname to window width height x y page url title cookie string name string string xpath expression element varname property
Figure 11. The set statement

In the diagram, property is a synonym for name. The token that follows “to” identifies the kind of query to be performed.

window

The size or location of the browser window.

set $width to window width.

page

The URL or title of the page.

set $title to page title.

cookie

The value of the named cookie. If the cookie name doesn’t conform to the constraints of a name, put the name in a quoted string.

set $login to cookie login-id.

string

The string provided.

set $hello to string "Hello, world.".

xpath

The result of evaluating the XPath expression. Unlike the XPath expression in a find statement, which is evaluated by Selenium and must be an XPath 1.0 expression, this expression is evaluated by the step and is an XPath 3.0 expression.

set $narrow to xpath "$width lt 600".

element

Some property of the element in $varname. For example, if you used a find statement to locate an input element on the page ($input), you could use a set statement to obtain its value:

set $value to element $input value .

This differs from find element … which returns the actual element.
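
Put together, a script might combine several kinds of set statement as in the sketch below. The field name “q” is a placeholder, and the message text is only for illustration.

```text
set $width to window width .
set $title to page title .
set $narrow to xpath "$width lt 600" .

# Read a property of an element located with find
find $input by name = "q" .
set $value to element $input value .

message "concat($title, ': narrow=', $narrow)" .
```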


send statement

The send statement sends text to the input on the page identified by $varname. Strings cannot contain newlines, so if you want to send a longer fragment, delimit it with “¶”, “⁋”, “§”, or formfeed characters.

send string to varname [^¶] [^⁋] § [^§] § [#xc] [^#xc] [#xc] to varname
Figure 12. The send statement
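
A sketch of both forms of send. The locators are placeholders, and the multi-line form assumes that the text is enclosed between a pair of “¶” characters, which is how the diagram appears to read.

```text
find $search by name = "q" .
send "Appleton" to $search .

# Longer text containing a newline, delimited by ¶ instead of quotes
find $comment by id = "comment" .
send ¶First line
Second line¶ to $comment .
```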

click statement

The click statement simulates clicking on the element identified by $varname.

click and hold doubleclick varname
Figure 13. The click statement

wait statement

There are two forms of the wait statement: “wait until ready”, and “wait for a duration” as used in the find statement.

The “wait until ready” statement waits until the page is ready. That is, it waits until the page indicates that document.readyState is “complete”.

wait until ready
Figure 14. The wait until ready statement

The “wait for a duration” statement waits for a specified duration.

wait duration
Figure 15. The wait statement

pause statement

The pause statement waits for a specified duration.

pause duration
Figure 16. The pause statement

message statement

The message statement computes the value of the expression and sends it to the message handler at the “info” level.

message expression
Figure 17. The message statement

output statement

The output statement sends output from the step. The element on the page identified by $varname, arbitrary text, or the result of evaluating an expression can be sent to the result port.

output xpath string varname [^¶] [^⁋] § [^§] § [#xc] [^#xc] [#xc] to result
Figure 18. The output statement

Each output statement creates a new document on the result port.
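
For instance, the following sketch would be expected to produce three documents on the result port, one per output statement. The heading lookup is illustrative, and the form that outputs a variable directly is inferred from the description above.

```text
find $heading by tag = "h1" .

# The element itself
output $heading to result .

# Literal text
output "Finished." to result .

# The result of an XPath expression
output xpath "normalize-space($heading)" to result .
```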

window statement

The window statement updates aspects of the browser window.

window minimize maximize fullscreen size integer x position integer , integer
Figure 19. The window statement

cookie statement

The cookie statement sets a cookie. If the name of the cookie satisfies the constraints of a name, then you can just use the name. For arbitrary names, use a string.

cookie name string = string path = string duration duration
Figure 20. The cookie statement

scroll statement

The scroll statement attempts to scroll the browser window. This statement seems to be somewhat inconsistently implemented by browsers. Firefox, for example, won’t scroll to an element not already visible in the viewport.

To support scrolling arbitrarily, the cx:selenium step implements “scroll to $varname” by evaluating the JavaScript expression varname.scrollIntoView(true).

scroll to varname by from varname by integer , integer
Figure 21. The scroll statement

move statement

The move statement moves to the element identified by $varname.

move to varname
Figure 22. The move statement

release statement

The release statement releases the mouse after a “click and hold” statement.

release
Figure 23. The release statement

drag statement

The drag statement drags one element to another.

drag and drop varname to varname
Figure 24. The drag statement

navigate statement

The navigate statement changes the page in the browser.

navigate forward back backwards to string
Figure 25. The navigate statement
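
A brief sketch of the navigate forms, assuming each keyword is used on its own as the diagram suggests. The URL is the test page used elsewhere on this page.

```text
navigate to "https://testdata.xmlcalabash.com/cities/" .
wait until ready .

# Step through the browser history
navigate back .
navigate forward .
```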

refresh statement

The refresh statement refreshes the page.

refresh
Figure 26. The refresh statement

reset statement

The reset statement resets Selenium.

reset
Figure 27. The reset statement

close statement

The close statement closes the browser.

close
Figure 28. The close statement

This ends the script.

key statement

The key statement presses or releases a key.

key up down keyname char
Figure 29. The key statement

Where a keyname is one of the names in Figure 30, “The key names” and a char is any string containing a single character.

ADD ALT ARROW_DOWN ARROW_LEFT ARROW_RIGHT ARROW_UP BACK_SPACE CANCEL CLEAR COMMAND CONTROL DECIMAL DELETE DIVIDE DOWN END ENTER EQUALS ESCAPE F1 F2 F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 HELP HOME INSERT LEFT LEFT_ALT LEFT_CONTROL LEFT_SHIFT META MULTIPLY NULL NUMPAD0 NUMPAD1 NUMPAD2 NUMPAD3 NUMPAD4 NUMPAD5 NUMPAD6 NUMPAD7 NUMPAD8 NUMPAD9 PAGE_DOWN PAGE_UP PAUSE RETURN RIGHT SEPARATOR SHIFT SPACE SUBTRACT TAB UP
Figure 30. The key names
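
Because key is a compound statement, it can be joined with other actions using then. The sketch below assumes the direction keyword comes before the key, as the diagram suggests; the selector is a placeholder.

```text
find $item by selector = "li.item" .

# Shift-click: hold SHIFT, click, then release SHIFT
key down SHIFT then click $item then key up SHIFT .

# A single character given as a string
key down "a" then key up "a" .
```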

call statement

The call statement calls a defined subroutine.

call gosub name
Figure 31. The call statement

Names

A name is a letter or an underscore followed by letters, numbers, and a variety of punctuation characters.

namestart namefollower
Figure 32. Names

_ UnicodeL
Figure 33. Name start characters

Where “UnicodeL” is any Unicode character in the “L” category (letters).

namestart - . · UnicodeNd UnicodeMn
Figure 34. Name following characters

Where “UnicodeNd” is any Unicode character in the “Nd” category (decimal numbers) and “UnicodeMn” is any Unicode character in the “Mn” category (nonspacing marks).

Variable names

As in XPath, variable names begin with a $.

$ name
Figure 35. Variable names

Strings

Strings begin and end with quote delimiters and must not break across lines.

" [^ "#xa] " ' [^ '#xa] ' [^ “”#xa]
Figure 36. Strings

Durations

A duration is a number of milliseconds or an xs:dayTimeDuration.

number P integer D integer D T integer H integer M number S
Figure 37. Durations
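
The two notations can be sketched with the pause statement; the values are arbitrary.

```text
# 250 milliseconds, written as a plain number
pause 250 .

# The same delay as an xs:dayTimeDuration
pause PT0.25S .

# One minute and thirty seconds
pause PT1M30S .
```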

Integers and numbers

Positive or negative integers or decimal numbers.

+ - [0-9]
Figure 38. Integers

Note that negative integers are forbidden in some contexts (for example, window sizes).

'0' - '9' . '0' - '9'
Figure 39. Numbers

There are no use cases for negative decimal numbers, so signs are not allowed.

Example

The following pipeline uses the cx:selenium step to load the “cities” example page. This page displays a table of cities in the United Kingdom with the country they’re in and their latitude and longitude. A “more” button loads more cities.

The Selenium script clicks the “more” button until the city of Appleton is in the table, then returns the latitude and longitude in two text documents.

<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
                xmlns:cx="http://xmlcalabash.com/ns/extensions"
                name="main" version="3.0">
  <p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>

  <p:output port="result" serialization="map{'method':'text'}" sequence="true"/>

  <cx:selenium xmlns:h="http://www.w3.org/1999/xhtml">
    <p:with-option name="arguments" select="('--headless')"/>
    <p:with-input>
      <p:inline content-type="text/plain">
script version 0.2 .
page "https://testdata.xmlcalabash.com/cities/" .

# Wait until the table has been populated
until "not(empty($row))" do
  find $row by selector = "table tbody tr" .
  pause PT0.25S .
done

# Search for Appleton, hit more until we find it
find $city by xpath = "//td[. = 'Appleton']".
while "empty($city)" do
  call clickNext .
  find $city by xpath = "//td[. = 'Appleton']".
done

find $row by xpath = "//tr[td[. = 'Appleton']]" .

output xpath "normalize-space($row/*:td[3])" to result .
output xpath "normalize-space($row/h:td[4])" to result .

close .

subroutine clickNext
  find $button by selector = "button" .
  scroll to $button .
  click $button .
  pause PT0.25S .
end
      </p:inline>
    </p:with-input>
  </cx:selenium>
</p:declare-step>

It’s not written in an especially efficient way. It’s written to demonstrate a variety of statements and features.

Dependencies

This step is included in the XML Calabash application. If you are getting XML Calabash from Maven, you will also need to include the extension dependency:

  • com.xmlcalabash:selenium:3.0.0-alpha23

The following third-party dependencies will also be included transitively:

  • org.seleniumhq.selenium:selenium-java:4.28.1