WebDriver

Abstract

WebDriver is a remote control interface that enables introspection and control of user agents. It provides a platform- and language-neutral wire protocol as a way for out-of-process programs to remotely instruct the behavior of web browsers.

It provides a set of interfaces to discover and manipulate DOM elements in web documents and to control the behavior of a user agent. It is primarily intended to allow web authors to write tests that automate a user agent from a separate controlling process, but may also be used to allow in-browser scripts to control a (possibly separate) browser.

WebDriver remote ends must provide an HTTP-compliant wire protocol where the endpoints map to different commands.

As this standard only defines the remote end protocol, it places no demands on how local ends should be implemented. Local ends are only expected to be compatible to the extent that they can speak the remote end's protocol; no requirements are made upon their exposed user-facing API.

Various parts of this specification are written in terms of step-by-step algorithms. The details of these algorithms do not have any normative significance; implementations are free to adopt any implementation strategy that produces equivalent output to the specification. In particular, algorithms in this document are optimized for readability rather than performance.

Where algorithms that return values are fallible, they are written in terms of returning either success or error. A success value has an associated data field which encapsulates the value returned, whereas an error value has an associated error code.

When calling a fallible algorithm, the construct “Let result be the result of trying to call algorithm” is equivalent to:

1. Let temp be the result of calling algorithm.
2. If temp is an error, return temp; otherwise, let result be temp's data field.
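
The “trying” construct can be illustrated with a minimal Python sketch. The Success and Error classes and both functions are hypothetical names chosen for illustration; the specification does not define any concrete types.

```python
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class Success:
    data: Any  # the value returned by the algorithm


@dataclass
class Error:
    error_code: str  # for example "invalid argument"


Result = Union[Success, Error]


def parse_positive(text: str) -> Result:
    """Hypothetical fallible algorithm: parse a positive integer."""
    if not text.isdigit() or int(text) == 0:
        return Error("invalid argument")
    return Success(int(text))


def double(text: str) -> Result:
    """Caller that uses 'Let result be the result of trying to call parse_positive'."""
    # Let temp be the result of calling algorithm.
    temp = parse_positive(text)
    # If temp is an error, return temp ...
    if isinstance(temp, Error):
        return temp
    # ... otherwise, let result be temp's data field.
    result = temp.data
    return Success(result * 2)
```

The error propagates unchanged to the caller, so each algorithm only needs to handle the success path explicitly.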

The result of getting a property with name from object is defined as being the same as the result of calling Object.[[GetOwnProperty]](name) on object.

The result of getting a property with default with arguments name and default from object is defined as being the same as the result of calling Object.[[GetOwnProperty]](name) on object if that results in a value other than undefined and default otherwise.

Setting a property with arguments name and value on object is defined as being the same as calling Object.[[Put]](name, value) on object.
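
As a rough analogue, the three property operations above might look as follows in Python, assuming a dict stands in for the object and a sentinel stands in for undefined. UNDEFINED and the function names are illustrative assumptions, not part of the protocol.

```python
UNDEFINED = object()  # stand-in for the JavaScript value undefined


def get_property(obj: dict, name: str):
    """Getting a property: only the object's own entries are consulted."""
    return obj.get(name, UNDEFINED)


def get_property_with_default(obj: dict, name: str, default):
    """Getting a property with default: fall back only when the result is undefined."""
    value = get_property(obj, name)
    return default if value is UNDEFINED else value


def set_property(obj: dict, name: str, value) -> None:
    """Setting a property: store value under name."""
    obj[name] = value
```

In this sketch a property that is present but null (None in Python) is returned as-is; the default applies only when the property is absent.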

The result of JSON serialization with object of type JSON Object is defined as the result of calling stringify(object).

The result of JSON deserialization with text is defined as the result of calling parse(text).
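
For illustration only, Python's standard json module behaves comparably to the stringify and parse operations referenced above. The helper names here are assumptions made for this sketch.

```python
import json


def json_serialize(obj: dict) -> str:
    """Rough analogue of JSON serialization of a JSON Object."""
    return json.dumps(obj)


def json_deserialize(text: str):
    """Rough analogue of JSON deserialization; raises ValueError on malformed input."""
    return json.loads(text)
```

A malformed input such as `{` makes json.loads raise json.JSONDecodeError, a subclass of ValueError, which corresponds to the parse step throwing an exception.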

The WebDriver protocol is organized into commands. Each HTTP request with a method and template defined in this specification represents a single command, and therefore each command produces a single HTTP response.

In response to a command, a remote end will run a series of actions known as remote end steps. These provide the sequences of actions that a remote end takes when it receives a particular command.

The remote end is an HTTP server reading requests from the client and writing responses, typically over a TCP socket. For the purposes of this specification we model the data transmission between a particular local end and remote end with a connection to which the remote end may write bytes and read bytes. However, the exact details of how this connection works and how it is established are out of scope.

After a connection is established, the remote end must run the following steps:

While the connection is not closed:
1. Read bytes from the connection until a complete HTTP request can be constructed from the data. Let request be a request constructed from the received data, according to the requirements of [RFC7230]. If it is not possible to construct a complete HTTP request, the remote end must either close the connection, return an HTTP response with status code 500, or return an error with error code unknown error.
2. Let request match be the result of the algorithm to match a request with request's method and URL as arguments.
3. If request match is of type error, send an error with request match's error code and continue.
  Otherwise, let command and URL variables be request match's data.
4. Let session be null.
5. If URL variables contains "session id":
  Note
  This condition is intended to exclude the New Session and Status commands and any extension commands which do not operate on a particular session.
  1. Let session id be URL variables["session id"].
  2. For each active session in the list of active sessions:
    1. If active session's session ID is equal to session id, then let session be active session, and break.
  3. If session is null, send an error with error code invalid session id, then continue.
6. Enqueue a task on remote end's request queue to run the following steps:
  1. If session is not null and is no longer in the list of active sessions, then send an error with error code invalid session id and return.
  2. Let parameters be null.
  3. If request's method is POST:
    1. Let parse result be the result of parsing as JSON with request's body as the argument. If this process throws an exception, send an error with error code invalid argument and jump back to step 1 in this overall algorithm.
    2. If parse result is not an Object, send an error with error code invalid argument and jump back to step 1 in this overall algorithm.
      Otherwise, let parameters be parse result.
  4. Let navigate result be the result of wait for navigation to complete with session.
  5. If navigate result is an error, send an error with error code equal to navigate result's error code and return.
  6. Let response result be the return value obtained by running the remote end steps for command with session, URL variables, and parameters.
  7. If response result is an error, send an error with error code equal to response result's error code and return.
  8. Assert: response result is a success.
  9. Let response data be response result's data.
  10. Send a response with status 200 and response data.
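
Two parts of the steps above, the session lookup (step 5) and the parsing of POST parameters (steps 6.2 and 6.3), can be sketched in Python. This is a simplified illustration, not a complete processing loop: sessions are modelled as dicts with an "id" key and results as ("success", value) or ("error", code) tuples, both of which are assumptions of this sketch.

```python
import json


def find_session(url_variables: dict, active_sessions: list) -> tuple:
    """Step 5: resolve the session named in the URL, if there is one."""
    if "session id" not in url_variables:
        # New Session, Status, and sessionless extension commands end up here.
        return ("success", None)
    session_id = url_variables["session id"]
    for active_session in active_sessions:
        if active_session["id"] == session_id:
            return ("success", active_session)
    return ("error", "invalid session id")


def parse_parameters(method: str, body: bytes) -> tuple:
    """Steps 6.2 and 6.3: only POST requests carry parameters."""
    if method != "POST":
        return ("success", None)
    try:
        parse_result = json.loads(body)
    except ValueError:  # malformed JSON or undecodable bytes
        return ("error", "invalid argument")
    if not isinstance(parse_result, dict):  # must be a JSON Object
        return ("error", "invalid argument")
    return ("success", parse_result)
```

Note that a POST body of `[1]` is valid JSON but is still rejected, because the parameters must be a JSON Object.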

When required to send an error, with error code and an optional error data dictionary, a remote end must run the following steps:

1. Let status and name be the error response data for error code.
2. Let message be an implementation-defined string containing a human-readable description of the reason for the error.
3. Let stacktrace be an implementation-defined string containing a stack trace report of the active stack frames at the time when the error occurred.
4. Let body be a new JSON Object initialized with the following properties:
  "error": name
  "message": message
  "stacktrace": stacktrace
5. If the error data dictionary contains any entries, set the "data" field on body to a new JSON Object populated with the dictionary.
6. Send a response with status and body as arguments.
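
The construction of the error body might be sketched as follows. The ERROR_RESPONSE_DATA mapping reproduces only a few rows of the error table that the specification defines outside this section, and build_error_body is a hypothetical helper name.

```python
# Partial mapping from error code to (HTTP status, JSON error code).
# Only a handful of entries are shown; the full table lives elsewhere in the specification.
ERROR_RESPONSE_DATA = {
    "invalid argument": (400, "invalid argument"),
    "invalid session id": (404, "invalid session id"),
    "unknown command": (404, "unknown command"),
    "unknown method": (405, "unknown method"),
    "unknown error": (500, "unknown error"),
}


def build_error_body(error_code, message, stacktrace="", data=None):
    """Return (status, body) ready to be passed to the send-a-response steps."""
    status, name = ERROR_RESPONSE_DATA[error_code]
    body = {"error": name, "message": message, "stacktrace": stacktrace}
    if data:  # only add "data" when the dictionary has entries
        body["data"] = dict(data)
    return status, body
```

For example, build_error_body("unknown method", "DELETE is not supported here") yields status 405 and a body without a "data" field.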

When required to send a response, with arguments status and data, a remote end must run the following steps:

1. Let response be a new response.
2. Set response's HTTP status to status, and status message to the string corresponding to the description of status in the status code registry.
3. Set the response's header with name and value with the following values:
  Content-Type: "application/json; charset=utf-8"
  Cache-Control: "no-cache"
4. Let response's body be the UTF-8 encoded JSON serialization of a JSON Object with a key "value" set to data.
5. Let response bytes be the byte sequence resulting from serializing response according to the rules in [RFC7230].
6. Write response bytes to the connection.
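
A minimal sketch of these steps in Python, producing the bytes to write to the connection. The function name is hypothetical, and the Content-Length header is an addition assumed here for HTTP message framing; it is not one of the headers listed in the steps above.

```python
import json
from http import HTTPStatus


def serialize_response(status: int, data) -> bytes:
    """Build the HTTP/1.1 response bytes for a WebDriver response."""
    # The body wraps data under the "value" key and is UTF-8 encoded.
    body = json.dumps({"value": data}).encode("utf-8")
    # Status message taken from the registered description of the status code.
    reason = HTTPStatus(status).phrase
    head = (
        f"HTTP/1.1 {status} {reason}\r\n"
        "Content-Type: application/json; charset=utf-8\r\n"
        "Cache-Control: no-cache\r\n"
        f"Content-Length: {len(body)}\r\n"  # assumption: added for framing
        "\r\n"
    )
    return head.encode("ascii") + body
```

A command that returns null therefore still produces a body, namely the JSON Object with "value" set to null.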

Request routing is the process of going from an HTTP request to the series of steps needed to implement the command represented by that request.

A remote end has an associated URL prefix, which is used as a prefix on all WebDriver-defined URLs on that remote end. This must either be undefined or a path-absolute URL.

In order to match a request given a method and URL, the following steps must be taken:

1. Let endpoints be a list containing each row in the table of endpoints.
2. Remove each entry from endpoints for which the concatenation of the URL prefix and the entry's URI template does not have a valid expansion equal to URL's path.
3. If there are no entries in endpoints, return error with error code unknown command.
4. Remove each entry in endpoints for which the method column is not equal to method.
5. If there are no entries in endpoints, return error with error code unknown method.
6. There is now exactly one entry in endpoints; let entry be this entry.
7. Let URI template be the concatenation of URL prefix with entry's URI template.
8. Let command be entry's command.
9. Let URL variables be a map with one entry for each variable defined in URI template, with the entry name equal to the template variable name, and the entry value being the variable value required to expand the URI template to match URL's path.
10. Return success with data command and URL variables.
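
The matching steps can be sketched in Python using regular expressions in place of a full URI template implementation. Only a few rows of the table of endpoints are included, each template variable is assumed to match a single non-empty path segment, and all function names are hypothetical.

```python
import re

# A small excerpt of the table of endpoints: (method, URI template, command).
ENDPOINTS = [
    ("POST", "/session", "New Session"),
    ("DELETE", "/session/{session id}", "Delete Session"),
    ("GET", "/status", "Status"),
    ("GET", "/session/{session id}/url", "Get Current URL"),
    ("POST", "/session/{session id}/url", "Navigate To"),
    ("GET", "/session/{session id}/element/{element id}/text", "Get Element Text"),
]


def template_to_regex(template: str):
    """Compile a template into a regex and return it with its variable names."""
    names = re.findall(r"\{([^}]+)\}", template)
    literal_parts = re.split(r"\{[^}]+\}", template)
    pattern = "([^/]+)".join(re.escape(part) for part in literal_parts)
    return re.compile(pattern), names


def match_request(method: str, path: str, url_prefix: str = ""):
    # Keep only the entries whose prefixed template expands to the path.
    candidates = []
    for entry_method, template, command in ENDPOINTS:
        regex, names = template_to_regex(url_prefix + template)
        match = regex.fullmatch(path)
        if match:
            url_variables = dict(zip(names, match.groups()))
            candidates.append((entry_method, command, url_variables))
    if not candidates:
        return ("error", "unknown command")
    # Then keep only the entries with the requested method.
    candidates = [c for c in candidates if c[0] == method]
    if not candidates:
        return ("error", "unknown method")
    _, command, url_variables = candidates[0]
    return ("success", (command, url_variables))
```

Filtering by path before method is what distinguishes the two errors: a known path with the wrong method yields unknown method, while an unrecognized path yields unknown command.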

The following table of endpoints lists the method and URI template for each endpoint node command. Extension commands are implicitly appended to this table.

Method	URI Template	Command
POST	/session	New Session
DELETE	/session/{`session id`}	Delete Session
GET	/status	Status
GET	/session/{`session id`}/timeouts	Get Timeouts
POST	/session/{`session id`}/timeouts	Set Timeouts
POST	/session/{`session id`}/url	Navigate To
GET	/session/{`session id`}/url	Get Current URL
POST	/session/{`session id`}/back	Back
POST	/session/{`session id`}/forward	Forward
POST	/session/{`session id`}/refresh	Refresh
GET	/session/{`session id`}/title	Get Title
GET	/session/{`session id`}/window	Get Window Handle
DELETE	/session/{`session id`}/window	Close Window
POST	/session/{`session id`}/window	Switch To Window
GET	/session/{`session id`}/window/handles	Get Window Handles
POST	/session/{`session id`}/window/new	New Window
POST	/session/{`session id`}/frame	Switch To Frame
POST	/session/{`session id`}/frame/parent	Switch To Parent Frame
GET	/session/{`session id`}/window/rect	Get Window Rect
POST	/session/{`session id`}/window/rect	Set Window Rect
POST	/session/{`session id`}/window/maximize	Maximize Window
POST	/session/{`session id`}/window/minimize	Minimize Window
POST	/session/{`session id`}/window/fullscreen	Fullscreen Window
GET	/session/{`session id`}/element/active	Get Active Element
GET	/session/{`session id`}/element/{`element id`}/shadow	Get Element Shadow Root
POST	/session/{`session id`}/element	Find Element
POST	/session/{`session id`}/elements	Find Elements
POST	/session/{`session id`}/element/{`element id`}/element	Find Element From Element
POST	/session/{`session id`}/element/{`element id`}/elements	Find Elements From Element
POST	/session/{`session id`}/shadow/{`shadow id`}/element	Find Element From Shadow Root
POST	/session/{`session id`}/shadow/{`shadow id`}/elements	Find Elements From Shadow Root
GET	/session/{`session id`}/element/{`element id`}/selected	Is Element Selected
GET	/session/{`session id`}/element/{`element id`}/attribute/{`name`}	Get Element Attribute
GET	/session/{`session id`}/element/{`element id`}/property/{`name`}	Get Element Property
GET	/session/{`session id`}/element/{`element id`}/css/{`property name`}	Get Element CSS Value
GET	/session/{`session id`}/element/{`element id`}/text	Get Element Text
GET	/session/{`session id`}/element/{`element id`}/name	Get Element Tag Name
GET	/session/{`session id`}/element/{`element id`}/rect	Get Element Rect
GET	/session/{`session id`}/element/{`element id`}/enabled	Is Element Enabled
GET	/session/{`session id`}/element/{`element id`}/computedrole	Get Computed Role
GET	/session/{`session id`}/element/{`element id`}/computedlabel	Get Computed Label
POST	/session/{`session id`}/element/{`element id`}/click	Element Click
POST	/session/{`session id`}/element/{`element id`}/clear	Element Clear
POST	/session/{`session id`}/element/{`element id`}/value	Element Send Keys
GET	/session/{`session id`}/source	Get Page Source
POST	/session/{`session id`}/execute/sync	Execute Script
POST	/session/{`session id`}/execute/async	Execute Async Script
GET	/session/{`session id`}/cookie	Get All Cookies
GET	/session/{`session id`}/cookie/{`name`}	Get Named Cookie
POST	/session/{`session id`}/cookie	Add Cookie
DELETE	/session/{`session id`}/cookie/{`name`}	Delete Cookie
DELETE	/session/{`session id`}/cookie	Delete All Cookies
POST	/session/{`session id`}/actions	Perform Actions
DELETE	/session/{`session id`}/actions	Release Actions
POST	/session/{`session id`}/alert/dismiss	Dismiss Alert
POST	/session/{`session id`}/alert/accept	Accept Alert
GET	/session/{`session id`}/alert/text	Get Alert Text
POST	/session/{`session id`}/alert/text	Send Alert Text
GET	/session/{`session id`}/screenshot	Take Screenshot
GET	/session/{`session id`}/element/{`element id`}/screenshot	Take Element Screenshot
POST	/session/{`session id`}/print	Print Page
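
From the local end's side, a row of this table might be turned into an HTTP request as sketched below. The specification places no requirements on local ends, so everything here is an assumption: the helper name, the base URL http://localhost:4444, and the choice of urllib. The request is only constructed, not sent.

```python
import json
import urllib.parse
import urllib.request


def build_command_request(base_url, method, template, variables=None, parameters=None):
    """Expand a URI template from the table and build the matching request."""
    path = template
    for name, value in (variables or {}).items():
        # Percent-encode each variable so it stays within one path segment.
        path = path.replace("{" + name + "}", urllib.parse.quote(value, safe=""))
    body = None
    headers = {}
    if method == "POST":
        # POST bodies must parse as a JSON Object, so send {} when there are no parameters.
        body = json.dumps(parameters or {}).encode("utf-8")
        headers["Content-Type"] = "application/json; charset=utf-8"
    return urllib.request.Request(base_url + path, data=body, method=method, headers=headers)


# Example: the Navigate To command for a hypothetical session "abc".
request = build_command_request(
    "http://localhost:4444",
    "POST",
    "/session/{session id}/url",
    variables={"session id": "abc"},
    parameters={"url": "https://example.org/"},
)
```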

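Errors are represented in the WebDriver protocol by an HTTP response with an HTTP status in the 4xx or 5xx range, and a JSON body containing details of the error. The body is a JSON Object with a "value" field whose value is an object with the fields "error", "message", and "stacktrace", and optionally "data".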
