HTML element Path


HyperText Markup Language (HTML) is the standard markup language for creating web pages and web applications. HTML elements are the building blocks of HTML pages. With HTML constructs, images and other objects, such as interactive forms, can be embedded in a web page. HTML provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items. HTML elements are delineated by tags, written using angle brackets. All elements in a web page are hierarchically related to each other and organized in a tree structure. The terms parent, child, sibling and path are used to describe these relationships.

  • In the element tree, the top element is called the root (usually the <HTML> element)
  • Every element node has exactly one parent, except the root (which has no parent)
  • An element can have a number of children
  • Siblings (brothers or sisters) are elements with the same parent
  • An element path lists all the tree elements that must be traversed to reach a specific element, starting at the top element
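The relationships listed above can be illustrated with a minimal Python sketch. It uses the standard library's html.parser to build a small tree; the Node and TreeBuilder classes and the sample markup are hypothetical helpers written for this illustration, not part of the product described here, and the sketch assumes well-formed HTML without void elements.

```python
from html.parser import HTMLParser


class Node:
    """One element in the tree: a tag name, a parent, and child elements."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.children = []

    def siblings(self):
        # Siblings share the same parent; the root has none.
        if self.parent is None:
            return []
        return [c for c in self.parent.children if c is not self]

    def path(self):
        # Walk up to the root, then reverse to get a top-down path.
        chain, node = [], self
        while node is not None:
            chain.append("<%s>" % node.tag.upper())
            node = node.parent
        return ".".join(reversed(chain))


class TreeBuilder(HTMLParser):
    """Builds a Node tree from well-formed HTML (void elements not handled)."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.current = None

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        if self.current is None:
            self.root = node
        else:
            self.current.children.append(node)
        self.current = node

    def handle_endtag(self, tag):
        # Closing a tag moves back up to the parent element.
        if self.current is not None:
            self.current = self.current.parent


builder = TreeBuilder()
builder.feed(
    "<html><body><div><a href='about.htm'>About</a></div><p>Hi</p></body></html>"
)
root = builder.root
body = root.children[0]
div, p = body.children
link = div.children[0]

print(link.path())                      # <HTML>.<BODY>.<DIV>.<A>
print([s.tag for s in div.siblings()])  # ['p']
```

Here the root has no parent, the DIV and P elements are siblings because they share BODY as their parent, and the path of the link names every element from the root down to it.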

First Example

<HTML>.<BODY>.<DIV>(3).<A[href ct ='about.htm']>

Example Explained

  • There are four nodes in this path, separated from each other by the period character (.)
  • Each node is surrounded by angle brackets, and the third node is followed by an ordinal number enclosed in parentheses
  • The text between the angle brackets is the element type and, optionally, a filter expression; the filter expression is surrounded by square brackets

Syntax of HTML element Path

<element type [Filter expression]>(Ordinal number).<element type [Filter expression]>(Ordinal number)...

A node in a path consists of the element tag name, a filter expression and an ordinal number. The node begins with a left angle bracket (<), followed immediately by the tag name. An optional filter expression surrounded by square brackets can follow the tag name. A right angle bracket (>) closes the node. An optional ordinal number surrounded by parentheses can immediately follow the closing character (>). Multiple nodes in a path are separated by the period character (.). The wildcard characters * and ? can be used in place of one or more nodes.
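As a rough illustration of this syntax, the following Python sketch splits a path into its nodes. The names NODE_RE, PathNode and parse_path are hypothetical and written for this example only. The sketch assumes that a filter expression never contains a right square bracket, and it keeps the filter as uninterpreted text. A regular expression is used instead of splitting on periods because periods can also appear inside filter expressions (for example in 'about.htm').

```python
import re
from typing import NamedTuple, Optional

# One node: either a bare wildcard, or <tag[filter]>(ordinal).
NODE_RE = re.compile(
    r"""
    (?P<wild>[*?])                      # a bare wildcard standing in for nodes
    |
    <\s*(?P<tag>[^\s\[\]<>]+)\s*        # left angle bracket, then the tag name
    (?:\[(?P<filter>[^\]]*)\])?\s*      # optional filter in square brackets
    >                                   # right angle bracket closes the node
    (?:\((?P<ordinal>\d+)\))?           # optional ordinal in parentheses
    """,
    re.VERBOSE,
)


class PathNode(NamedTuple):
    tag: str
    filter: Optional[str] = None
    ordinal: Optional[int] = None


def parse_path(path: str) -> list:
    """Split an element path into PathNode tuples, left to right."""
    nodes, pos = [], 0
    while pos < len(path):
        m = NODE_RE.match(path, pos)
        if m is None:
            raise ValueError("bad node at offset %d: %r" % (pos, path[pos:]))
        if m.group("wild"):
            nodes.append(PathNode(m.group("wild")))
        else:
            ordinal = m.group("ordinal")
            nodes.append(PathNode(
                m.group("tag"),
                m.group("filter"),
                int(ordinal) if ordinal is not None else None,
            ))
        pos = m.end()
        if pos < len(path):
            # Nodes must be separated by a period, and a path may not end in one.
            if path[pos] != "." or pos + 1 == len(path):
                raise ValueError("expected '.' followed by a node at offset %d" % pos)
            pos += 1
    return nodes


for node in parse_path("<HTML>.<BODY>.<DIV>(3).<A[href ct ='about.htm']>"):
    print(node)
```

Each printed tuple shows the tag name, the raw filter text (or None) and the ordinal (or None) of one node.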

Term / Description

element type

Specifies the HTML element tag name. The wildcard characters * and ? are allowed.

Filter expression

Optionally, specifies an expression for the element. The filter expression is used to exclude elements that do not match the condition. The filter expression must be surrounded by square brackets.

Note: Variables cannot be used in the filter expression.

Ordinal number

Optionally, specifies the ordinal number that identifies a matched element. Ordinal numbers start at zero.

Examples

The simplest element path, pointing to a DIV element.

<HTML>.<BODY>.<DIV>

A path that points to an element using a filter expression.

<HTML>.<BODY>.<DIV>.<A[href ct ='about.htm']>

The filter expression has two conditions, and the path uses the ordinal number 3 for the DIV element.

<HTML>.<BODY>.<DIV>(3).<A[href ct ='about.htm' and style.border=1]>

Wildcard characters are used in the path.

<HTML>.<BODY>.*.<A[href ct ='about.htm']>
<HTML>.<BODY>.?.<A[href ct ='about.htm']>
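The text does not define exactly how many nodes each wildcard stands for, so the following Python sketch rests on stated assumptions: a bare ? stands for exactly one node, a bare * stands for one or more nodes, and wildcards inside a tag name behave like shell-style globbing. The function chain_matches is hypothetical; it compares tag names only and ignores filter expressions and ordinal numbers.

```python
from fnmatch import fnmatchcase


def chain_matches(pattern, chain):
    """Return True if the list of concrete tag names `chain` fits `pattern`.

    Assumed semantics (not confirmed by the text): a bare '?' stands for
    exactly one node, a bare '*' for one or more nodes, and wildcards inside
    a tag name follow shell-style globbing. Tag names compare case-insensitively.
    """
    if not pattern:
        return not chain
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Let '*' absorb 1, 2, ... leading nodes of the remaining chain.
        return any(chain_matches(rest, chain[n:]) for n in range(1, len(chain) + 1))
    if not chain:
        return False
    if head == "?" or fnmatchcase(chain[0].upper(), head.upper()):
        return chain_matches(rest, chain[1:])
    return False


# One element between BODY and A: both wildcards fit under these assumptions.
print(chain_matches(["HTML", "BODY", "*", "A"], ["html", "body", "div", "a"]))
print(chain_matches(["HTML", "BODY", "?", "A"], ["html", "body", "div", "a"]))

# Two elements between BODY and A: only '*' fits under these assumptions.
print(chain_matches(["HTML", "BODY", "*", "A"], ["html", "body", "div", "p", "a"]))
print(chain_matches(["HTML", "BODY", "?", "A"], ["html", "body", "div", "p", "a"]))
```

If the actual tool lets * stand for zero nodes as well, the range in the sketch would start at 0 instead of 1.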

Attributes in Filter Expression

All attributes of the HTML element can be used in a filter expression. For a CSS property, the name must be prefixed with "style." For example:

<HTML>.<BODY>.<DIV>.<A[style.border = 1]>
<HTML>.<BODY>.<DIV>.<A[style.color = 'red']>
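To show how the "style." prefix separates CSS properties from ordinary attributes, here is a minimal Python sketch that evaluates equality conditions against one element. Everything in it is an assumption made for illustration: the element is modelled as a plain dict of attributes with an inline style string, values are compared as exact strings, conditions are joined only by "and", and quoted values do not contain the word "and". The ct operator seen in earlier examples is deliberately left out because its semantics are not defined in this text. The names parse_style, lookup and matches are hypothetical.

```python
import re

# name = value, where value is either 'quoted' or a bare token such as 1
CONDITION_RE = re.compile(r"\s*(?P<name>[\w.-]+)\s*=\s*(?P<value>'[^']*'|\S+)\s*")


def parse_style(style):
    """Split an inline style string such as 'color: red; border: 1' into a dict."""
    props = {}
    for decl in style.split(";"):
        if ":" in decl:
            key, value = decl.split(":", 1)
            props[key.strip().lower()] = value.strip()
    return props


def lookup(attrs, name):
    # Names starting with "style." refer to CSS properties, others to attributes.
    if name.lower().startswith("style."):
        return parse_style(attrs.get("style", "")).get(name[len("style."):].lower())
    return attrs.get(name.lower())


def matches(attrs, expression):
    """Return True if every 'name = value' condition holds for the element."""
    for part in re.split(r"\s+and\s+", expression.strip()):
        m = CONDITION_RE.fullmatch(part)
        if m is None:
            raise ValueError("unsupported condition: %r" % part)
        expected = m.group("value").strip("'")
        if lookup(attrs, m.group("name")) != expected:
            return False
    return True


link = {"href": "about.htm", "style": "color: red; border: 1"}
print(matches(link, "style.color = 'red'"))                     # True
print(matches(link, "style.border=1 and style.color = 'red'"))  # True
print(matches(link, "style.color = 'blue'"))                    # False
```

A missing attribute or property is treated as a non-match, which is a design choice of this sketch rather than documented behavior.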

Notes

  • Variables cannot be used in the filter expression.