Name
cx:selenium — Drive a web browser with Selenium.
Synopsis
The cx:selenium step uses Selenium to automate a web browser. The step can drive a web browser and extract all or part of rendered pages.
Input port | Primary | Sequence | Content types |
---|---|---|---|
source | ✔ | | text |
Output port | Primary | Sequence | Content types |
---|---|---|---|
result | ✔ | ✔ | |
Option name | Type | Default value |
---|---|---|
arguments | xs:string* | () |
browser | xs:string? | () |
capabilities | map(xs:QName, item())? | () |
<p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
Declaration
|<p:declare-step xmlns:p="http://www.w3.org/ns/xproc">
| <p:input port="source" content-types="text"/>
| <p:output port="result" sequence="true"/>
| <p:option name="browser" as="xs:string?"/>
| <p:option name="capabilities" as="map(xs:QName, item())?"/>
| <p:option name="arguments" as="xs:string*"/>
|</p:declare-step>
Errors
Code | Description |
---|---|
cxerr:XC0023 | It is a dynamic error (cxerr:XC0023) if the page URI does not match at least one of the whitelist expressions. |
Description
Selenium automates browsers. Steps like p:http-request can interact with the web by making selected, individual requests. The cx:selenium step starts an actual web browser and interacts with it. In practice, this means that JavaScript is executed and the result is available to the step.
Selenium is widely used for testing web applications, and many programming APIs can drive it. The cx:selenium step exposes that functionality through a small scripting language.
The goal here is to make a language that’s easy to use for common sorts of tasks, not one that can do everything that Selenium can do. It’s also been invented in a somewhat ad hoc manner by someone with relatively little Selenium programming experience. Suggestions for improvements are welcome.
Whitelisting
The cx:selenium step runs an actual web browser. In principle, if you can do something with a web browser, you can do it with this step: log in to your bank, order pizza, etc. Care is advised.
It is possible to whitelist the URIs that cx:selenium will load. Add a selenium element to your configuration:
|<x:selenium xmlns:x="https://xmlcalabash.com/ext/ns/selenium"
| whitelist="http://localhost.*
| https://testdata.xmlcalabash.com/.*"/>
The whitelist attribute is a space-separated list of regular expressions. If the page URI matches one of those regular expressions, the step will run.
It is a dynamic error (cxerr:XC0023) if the page URI does not match at least one of the whitelist expressions.
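The whitelist check can be pictured with a short Python sketch. This is not the step's implementation: the function name is hypothetical, Python's regular expression dialect differs from the one the step uses, and requiring each expression to match the whole URI is an assumption suggested by the trailing ".*" in the configuration example.

```python
import re


def is_whitelisted(page_uri: str, whitelist: str) -> bool:
    """Return True if page_uri matches at least one whitelist expression."""
    # The attribute value is a whitespace-separated list of regular expressions.
    patterns = whitelist.split()
    # Assumption: an expression must match the entire URI, not just a prefix.
    return any(re.fullmatch(pattern, page_uri) is not None for pattern in patterns)


# The same two expressions as in the configuration example above.
WHITELIST = """http://localhost.*
               https://testdata.xmlcalabash.com/.*"""

allowed = is_whitelisted("http://localhost:8080/app", WHITELIST)
blocked = is_whitelisted("https://example.com/", WHITELIST)
```

Under these assumptions, the localhost URI is allowed and the example.com URI is rejected, which is the situation in which the step would raise cxerr:XC0023.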
The scripting language
A cx:selenium script begins with a version declaration, identifies the page to open in the browser, and has one or more statements.
The scripting language is described in part with “railroad diagrams”. They show how a script is built from its constituent parts. In the diagrams, an oval containing bold text represents something you type literally. Words in rectangles are references to other parts of the grammar; an instance of that construct is expected there. Generally speaking, whitespace is expected between the ovals and boxes, except that whitespace around punctuation is often optional.
The summary in Figure 1, “Overall structure of a cx:selenium script” indicates that a script begins with the literal text “script version 0.2” followed by (optional) whitespace, the literal text “.”, whitespace, the literal text “page”, whitespace, any string, the literal text “.”, and one or more statements. For example:
script version 0.2. page "http://example.com/" .
output "Hello, world." .
Currently, the only version supported is “0.2”.
Statement
There are four kinds of blocks and about 20 different kinds of statements.
A “simple” statement stands alone. A subset of the simple statements, the “compound” statements, can be joined together and performed at once. (This is analogous to the Selenium concept of building a sequence of actions and then performing them.)
Blocks
There are four kinds of blocks: three conditionals (if, while, until) and subroutines.
Conditional blocks
The statements in an “if” block are evaluated if (and only if) the effective boolean value of the test expression is true. An expression is a quoted string containing an XPath expression.
The statements in a “while” block are evaluated repeatedly as long as the effective boolean value of the test expression is true. If the test expression is initially false, the statements in the block are not executed at all.
The statements in an “until” block are evaluated repeatedly until the effective boolean value of the test expression is true. The statements are always evaluated at least once; the expression is tested at the end of each loop.
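The difference between the two loops can be sketched in Python. The function names are hypothetical, and the sketch assumes that an “until” loop stops once its test becomes true, which is how the example script at the end of this page uses it (looping until a table row has been found).

```python
from typing import Callable


def run_while(test: Callable[[], bool], body: Callable[[], None]) -> None:
    # The test comes first, so the body may run zero times.
    while test():
        body()


def run_until(test: Callable[[], bool], body: Callable[[], None]) -> None:
    # The body comes first, so it always runs at least once.
    # Assumption: the loop ends as soon as the test is true.
    while True:
        body()
        if test():
            break


rows = []
# Keep "finding" until at least one row is present.
run_until(lambda: len(rows) > 0, lambda: rows.append("tr"))
```

Here the body runs exactly once, because the test is already satisfied after the first pass.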
Subroutines
Subroutines are a way to group statements that you can evaluate with the call statement. Subroutines are collected before script evaluation begins, so they can appear anywhere a statement can occur, even after the call statements that refer to them. All subroutine names must be unique.
The name and the first statement must be separated by at least one newline.
find statement
A find statement locates an element on the page and stores its (HTML) content in a variable. With the all keyword, it finds all of the elements that match the locator. If wait is added, the processor will wait up to the specified duration for the locator to find at least one match. A pause specifies the duration to wait between attempts; the default is 0.25s.
In Selenium, it’s an error if the locator doesn’t match anything. In the cx:selenium step, it’s not an error; the variable will simply hold the empty sequence. If, however, a further attempt is made to perform a Selenium action with the variable (click on it or send text to it, for example), an error will occur. You can avoid this by first testing whether the variable is empty.
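The wait and pause behaviour amounts to a polling loop, sketched here in Python. The names are hypothetical and the locator is a stand-in function rather than a real Selenium lookup; durations are given in seconds.

```python
import time
from typing import Callable, List


def find_with_wait(locate: Callable[[], List[str]],
                   wait: float = 0.0,
                   pause: float = 0.25) -> List[str]:
    """Poll locate() until it returns a match or `wait` seconds have elapsed."""
    deadline = time.monotonic() + wait
    while True:
        matches = locate()
        if matches:
            return matches
        if time.monotonic() >= deadline:
            # No match is not an error: the result is the empty sequence.
            return []
        time.sleep(pause)


attempts = {"n": 0}


def fake_locator() -> List[str]:
    # Hypothetical stand-in for a page lookup: succeeds on the third try.
    attempts["n"] += 1
    return ["<tr/>"] if attempts["n"] >= 3 else []


row = find_with_wait(fake_locator, wait=1.0, pause=0.01)
if row:
    # Test for emptiness before acting on the result (clicking, sending text).
    print("found after", attempts["n"], "attempts")
```

The final test mirrors the advice above: check that the variable is not empty before using it in an action.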
The token that follows “by” identifies the kind of match to be performed and consequently the form that the following string must have:
Token | Find by … | Example string |
---|---|---|
name | Name attribute | button-name |
selector | CSS selector | .someClass |
id | ID | someId |
link-text | Exact text of a link | click here |
partial-link-text | Partial text of a link | click |
tag | Element name | form |
class | Class name | someClass |
xpath | XPath (1.0) expression | /html/body/h1[2] |
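For readers who know Selenium's own APIs, the tokens in the table appear to line up with WebDriver's standard locator strategies. The Python sketch below records that correspondence. The strategy strings are the values of the By constants in Selenium's Python bindings; the dictionary and function names are hypothetical, and how the step maps tokens internally is an assumption.

```python
# Token used after "by" -> WebDriver locator strategy name.
# For example, By.CSS_SELECTOR == "css selector" in Selenium's Python bindings.
LOCATOR_STRATEGIES = {
    "name": "name",
    "selector": "css selector",
    "id": "id",
    "link-text": "link text",
    "partial-link-text": "partial link text",
    "tag": "tag name",
    "class": "class name",
    "xpath": "xpath",
}


def to_locator(token: str, value: str) -> tuple:
    """Return a (strategy, value) pair for the given script token."""
    if token not in LOCATOR_STRATEGIES:
        raise ValueError(f"unknown locator token: {token}")
    return (LOCATOR_STRATEGIES[token], value)


locator = to_locator("selector", ".someClass")
```

In the Python bindings, such a pair could be passed on as driver.find_element(*locator).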
There is one global scope for variable names, and variables are mutable. Whether a variable is set with find or set, in the main body of the script or in a subroutine, it always holds the last value assigned.
set statement
A set statement sets a variable to some value. This can be some property of the window or page, a cookie, a string, the result of evaluating an XPath expression, or a property of some element on the page.
Where property is a synonym for name. The token that follows “to” identifies the kind of query to be performed.
window
The size or location of the browser window.
set $width to window width.
page
The URL or title of the page.
set $title to page title.
cookie
The value of the named cookie. If the cookie name doesn’t conform to the constraints of a name, put the name in a quoted string.
set $login to cookie login-id.
string
The string provided.
set $hello to string "Hello, world.".
xpath
The result of evaluating the XPath expression. Unlike the XPath expression in a find statement, which is evaluated by Selenium and must be an XPath 1.0 expression, this expression is evaluated by the step and is an XPath 3.0 expression.
set $narrow to xpath "$width lt 600".
element
Some property of the element in $varname. For example, if you used a find statement to locate an input element on the page ($input), you could use a set statement to obtain its value:
set $value to element $input value .
This differs from find element …, which returns the actual element.
There is one global scope for variable names, and variables are mutable. Whether a variable is set with find or set, in the main body of the script or in a subroutine, it always holds the last value assigned.
send statement
The send statement sends text to the input on the page identified by $varname. Strings cannot contain newlines, so if you want to send a longer fragment, delimit it with “¶”, “⁋”, “§”, or formfeed characters.
click statement
The click statement simulates clicking on the element identified by $varname.
wait statement
There are two forms of the wait statement, “wait until ready” and “wait for a duration” in the find statement.
The “wait until ready” statement waits until the page is ready. That is, it waits until the page indicates that document.readyState is “complete”.
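A readiness check of this kind can be sketched as a polling loop in Python. The execute_script call is the real method on Selenium's Python WebDriver; the function name, the timeout, and the polling interval are assumptions, and FakeDriver is a hypothetical stand-in so that the sketch runs without a browser.

```python
import time


def wait_until_ready(driver, timeout: float = 30.0, pause: float = 0.25) -> bool:
    """Poll document.readyState until it is "complete" or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        if driver.execute_script("return document.readyState") == "complete":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(pause)


class FakeDriver:
    """Hypothetical stand-in for a WebDriver: "complete" on the third poll."""

    def __init__(self) -> None:
        self.polls = 0

    def execute_script(self, script: str, *args):
        self.polls += 1
        return "complete" if self.polls >= 3 else "loading"


driver = FakeDriver()
ready = wait_until_ready(driver, timeout=1.0, pause=0.01)
```

Whether the step itself applies a timeout is not stated here; the sketch includes one only so that the loop cannot run forever.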
The “wait for a duration” statement waits for a specified duration.
pause statement
The pause statement waits for a specified duration.
message statement
The message statement computes the value of the expression and sends it to the message handler at the “info” level.
output statement
The output statement sends output from the step. The element on the page identified by $varname, arbitrary text, or the result of evaluating an expression can be sent to the result port.
Each output statement creates a new document on the result port.
window statement
The window statement updates aspects of the browser window.
cookie statement
The cookie statement sets a cookie. If the name of the cookie satisfies the constraints of a name, then you can just use the name. For arbitrary names, use a string.
scroll statement
The scroll statement attempts to scroll the browser window. This statement seems to be somewhat inconsistently implemented by browsers. Firefox, for example, won’t scroll to an element not already visible in the viewport.
To support scrolling arbitrarily, the cx:selenium step implements “scroll to $varname” by evaluating the JavaScript expression varname.scrollIntoView(true).
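In Selenium's Python bindings, the same technique would look roughly like the sketch below. It illustrates the approach rather than the step's own code: scroll_to is a hypothetical name, and RecordingDriver is a stand-in that only records what it is asked to run. The execute_script method is real; it exposes extra arguments to the script as arguments[0], arguments[1], and so on.

```python
def scroll_to(driver, element) -> None:
    """Scroll `element` into view by running JavaScript in the page."""
    # The element is passed to the script as arguments[0].
    driver.execute_script("arguments[0].scrollIntoView(true);", element)


class RecordingDriver:
    """Hypothetical stand-in that records the scripts it is asked to run."""

    def __init__(self) -> None:
        self.calls = []

    def execute_script(self, script: str, *args):
        self.calls.append((script, args))


driver = RecordingDriver()
scroll_to(driver, "button-element")
```

Running the scroll in JavaScript sidesteps the browser differences mentioned above, because the page itself performs the scroll.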
move statement
The move statement moves to the element identified by $varname.
release statement
The release statement releases the mouse after a “click and hold” statement.
drag statement
The drag statement drags one element to another.
navigate statement
The navigate statement changes the page in the browser.
refresh statement
The refresh statement refreshes the page.
reset statement
The reset statement resets Selenium.
close statement
The close statement closes the browser.
This ends the script.
key statement
The key statement presses or releases a key.
Where a keyname is one of the names in Figure 30, “The key names” and a char is any string containing a single character.
call statement
The call statement calls a defined subroutine.
Names
A name is a letter or an underscore followed by letters, numbers, and a variety of punctuation characters.
Where “UnicodeL” is any Unicode character in the “L” category (letters).
Where “UnicodeNd” is any Unicode character in the “Nd” category (decimal numbers) and “UnicodeMn” is any Unicode character in the “Mn” category (nonspacing marks).
Variable names
As in XPath, variable names begin with a $.
Strings
Strings begin and end with quote delimiters and must not break across lines.
Durations
A duration is a number of milliseconds or an xs:dayTimeDuration.
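The two forms can be normalized to seconds with a small Python sketch. It is an illustration only: the function name is hypothetical, accepting decimal millisecond counts is an assumption, and negative durations (which xs:dayTimeDuration permits) are rejected because they make no sense for waiting.

```python
import re

# Lexical form of an unsigned xs:dayTimeDuration, for example PT0.25S or P1DT2H.
_DAY_TIME = re.compile(
    r"P(?:(?P<d>\d+)D)?"
    r"(?:T(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+(?:\.\d+)?)S)?)?"
)


def duration_seconds(text: str) -> float:
    """Convert a duration token to a number of seconds."""
    text = text.strip()
    # Form 1: a plain number is a count of milliseconds.
    if re.fullmatch(r"\d+(?:\.\d+)?", text):
        return float(text) / 1000.0
    # Form 2: an xs:dayTimeDuration.
    match = _DAY_TIME.fullmatch(text)
    # Reject "P", "PT", and a trailing "T" with no time components.
    if not match or text.endswith("T") or not any(match.groupdict().values()):
        raise ValueError(f"not a duration: {text!r}")
    parts = {k: float(v) if v else 0.0 for k, v in match.groupdict().items()}
    return parts["d"] * 86400 + parts["h"] * 3600 + parts["m"] * 60 + parts["s"]


quarter_second_ms = duration_seconds("250")
quarter_second_iso = duration_seconds("PT0.25S")
```

Under these assumptions, both values are 0.25 seconds, so the two spellings are interchangeable.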
Integers and numbers
Integers may be positive or negative; decimal numbers are unsigned.
Note that negative integers are forbidden in some contexts (for example, window sizes).
There are no use cases for negative decimal numbers, so a sign is not allowed on them.
Example
The following pipeline uses the cx:selenium step to load the “cities” example page.
This page displays a table of cities in the United Kingdom with the country they’re in and their latitude and longitude. A “more” button loads more cities.
The Selenium script clicks the “more” button until the city of Appleton is in the table, then it returns the latitude and longitude in two text documents.
|<p:declare-step xmlns:p="http://www.w3.org/ns/xproc"
| xmlns:cx="http://xmlcalabash.com/ns/extensions"
| name="main" version="3.0">
| <p:import href="https://xmlcalabash.com/ext/library/selenium.xpl"/>
|
| <p:output port="result" serialization="map{'method':'text'}" sequence="true"/>
|
| <cx:selenium xmlns:h="http://www.w3.org/1999/xhtml">
| <p:with-option name="arguments" select="('--headless')"/>
| <p:with-input>
| <p:inline content-type="text/plain">
|script version 0.2 .
|page "https://testdata.xmlcalabash.com/cities/" .
|
|# Wait until the table has been populated
|until "not(empty($row))" do
| find $row by selector = "table tbody tr" .
| pause PT0.25S .
|done
|
|# Search for Appleton, hit more until we find it
|find $city by xpath = "//td[. = 'Appleton']".
|while "empty($city)" do
| call clickNext .
| find $city by xpath = "//td[. = 'Appleton']".
|done
|
|find $row by xpath "//tr[td[. = 'Appleton']]" .
|
|output xpath "normalize-space($row/*:td[3])" to result .
|output xpath "normalize-space($row/h:td[4])" to result .
|
|close .
|
|subroutine clickNext
| find $button by selector = "button" .
| scroll to $button .
| click $button .
| pause PT0.25S .
|end
| </p:inline>
| </p:with-input>
| </cx:selenium>
|</p:declare-step>
It’s not written in an especially efficient way. It’s written to demonstrate a variety of statements and features.
Dependencies
This step is included in the XML Calabash application. If you are getting XML Calabash from Maven, you will also need to include the extension dependency:
com.xmlcalabash:selenium:3.0.0-alpha23
The following third-party dependencies will also be included transitively:
org.seleniumhq.selenium:selenium-java:4.28.1