XPath is a set of syntax rules for defining parts of an XML document.
XPath is a major element in the W3C XSLT standard. Without XPath knowledge you will not be able to create XSLT documents.
What is XPath?
XPath is a syntax for defining parts of an XML document
XPath uses paths to define XML elements
XPath defines a library of standard functions
XPath is a major element in XSLT
XPath is not written in XML
XPath is a W3C Standard
Like Traditional File Paths
XPath uses path expressions to identify nodes in an XML document. These path expressions look very much like the expressions you see when you work with a computer file system:
w3schools/xpath/default.asp
XPath Example
Look at this simple XML document:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="USA">
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <price>9.90</price>
  </cd>
</catalog>
The XPath expression below selects the ROOT element catalog:
/catalog
The XPath expression below selects all the cd elements of the catalog element:
/catalog/cd
The XPath expression below selects all the price elements of all the cd elements of the catalog element:
/catalog/cd/price
Note: If the path starts with a slash ( / ) it represents an absolute path to an element!
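As a minimal sketch, the three expressions above can be tried with Python's standard xml.etree.ElementTree module. ElementTree implements only a subset of XPath and evaluates paths relative to an element, so the absolute path /catalog/cd is written here as ./cd relative to the catalog element. The variable names are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)  # root is the <catalog> element

# /catalog : the root element itself
root_name = root.tag

# /catalog/cd : written relative to the root element
cds = root.findall("./cd")

# /catalog/cd/price : collect the text of every price element
prices = [p.text for p in root.findall("./cd/price")]

print(root_name, len(cds), prices)
```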
XPath Defines a Library of Standard Functions
XPath defines a library of standard functions for working with strings, numbers and Boolean expressions.
The XPath expression below selects all the cd elements that have a price element with a value larger than 10.80:
/catalog/cd[price>10.80]
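The comparison above can be approximated in Python. This is only a sketch: ElementTree's XPath subset has no relational operators, so the path selects the cd elements and the price comparison is applied in plain Python instead of inside the expression.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# Emulates /catalog/cd[price>10.80]: convert each price to a number, then compare
expensive = [cd for cd in root.findall("./cd")
             if float(cd.findtext("price")) > 10.80]
titles = [cd.findtext("title") for cd in expensive]

print(titles)
```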
XPath is Used in XSLT
XPath is a major element of the XSLT standard. Without XPath knowledge you will not be able to create XSLT documents.
You can read more about XSLT in our XSLT tutorial.
XPath is a W3C Standard
XPath was released as a W3C Recommendation on 16 November 1999 as a language for addressing parts of an XML document.
XPath was designed to be used by XSLT, XPointer and other XML parsing software.
You can read more about XML and XSL standards in our W3C tutorial.
XPath uses path expressions to locate nodes within XML documents.
XML Example Document
We will use this simple XML document to describe the XPath syntax:
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd country="USA">
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <price>10.90</price>
  </cd>
  <cd country="UK">
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <price>9.90</price>
  </cd>
  <cd country="USA">
    <title>Greatest Hits</title>
    <artist>Dolly Parton</artist>
    <price>9.90</price>
  </cd>
</catalog>
Locating Nodes
XML documents can be represented as a tree view of nodes (very similar to the tree view of folders you can see on your computer).
XPath uses a pattern expression to identify nodes in an XML document. An XPath pattern is a slash-separated list of child element names that describe a path through the XML document. The pattern "selects" elements that match the path.
The following XPath expression selects all the price elements of all the cd elements of the catalog element:
/catalog/cd/price
Note: If the path starts with a slash ( / ) it represents an absolute path to an element!
Note: If the path starts with two slashes ( // ) then all elements in the document that fulfill the criteria will be selected (even if they are at different levels in the XML tree)!
The following XPath expression selects all the cd elements in the document:
//cd
Selecting Unknown Elements
Wildcards ( * ) can be used to select unknown XML elements.
The following XPath expression selects all the child elements of all the cd elements of the catalog element:
/catalog/cd/*
The following XPath expression selects all the price elements that are grandchild elements of the catalog element:
/catalog/*/price
The following XPath expression selects all price elements which have exactly two ancestor elements:
/*/*/price
The following XPath expression selects all elements in the document:
//*
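A rough way to experiment with // and the * wildcard is Python's xml.etree.ElementTree, sketched here under the assumption that paths are written relative to the catalog element. Note one difference: .//* evaluated on the catalog element returns only its descendants, whereas //* in full XPath also includes catalog itself, so root.iter() is used for the complete count.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

all_cds = root.findall(".//cd")            # //cd
cd_children = root.findall("./cd/*")       # /catalog/cd/*
grandchild_prices = root.findall("./*/price")  # /catalog/*/price
descendants = root.findall(".//*")         # every element below catalog
all_elements = list(root.iter())           # //* : catalog plus all its descendants

print(len(all_cds), len(cd_children), len(grandchild_prices),
      len(descendants), len(all_elements))
```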
Selecting Branches
By using square brackets in an XPath expression you can further restrict which elements are selected.
The following XPath expression selects the first cd child element of the catalog element:
/catalog/cd[1]
The following XPath expression selects the last cd child element of the catalog element (Note: There is no function named first()):
/catalog/cd[last()]
The following XPath expression selects all the cd elements of the catalog element that have a price element:
/catalog/cd[price]
The following XPath expression selects all the cd elements of the catalog element that have a price element with a value of 10.90:
/catalog/cd[price=10.90]
The following XPath expression selects all the price elements of all the cd elements of the catalog element that have a price element with a value of 10.90:
/catalog/cd[price=10.90]/price
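The bracket expressions above map fairly directly onto the predicate subset supported by Python's xml.etree.ElementTree. This is a sketch under one stated assumption: ElementTree compares the price text as a string ('10.90'), whereas full XPath compares price=10.90 numerically.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

first_title = root.find("./cd[1]").findtext("title")       # /catalog/cd[1]
last_title = root.find("./cd[last()]").findtext("title")   # /catalog/cd[last()]
with_price = root.findall("./cd[price]")                   # /catalog/cd[price]
exact = root.findall("./cd[price='10.90']")                # string match on the price text
exact_prices = [p.text for p in root.findall("./cd[price='10.90']/price")]

print(first_title, last_title, len(with_price), len(exact), exact_prices)
```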
Selecting Several Paths
By using the | operator in an XPath expression you can select several paths.
The following XPath expression selects all the title and artist elements of the cd elements of the catalog element:
/catalog/cd/title | /catalog/cd/artist
The following XPath expression selects all the title and artist elements in the document:
//title | //artist
The following XPath expression selects all the title, artist and price elements in the document:
//title | //artist | //price
The following XPath expression selects all the title elements of the cd elements of the catalog element, and all the artist elements in the document:
/catalog/cd/title | //artist
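Python's xml.etree.ElementTree does not support the | operator, so the sketch below emulates it with a hypothetical helper named union. It merges the results of several paths, drops duplicates, and returns the nodes in document order, which is how a union node-set behaves.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)


def union(root, *paths):
    """Emulate path1 | path2 | ...: no duplicates, document order."""
    selected = {id(el) for path in paths for el in root.findall(path)}
    # Walking the tree once restores document order
    return [el for el in root.iter() if id(el) in selected]


# /catalog/cd/title | //artist
titles_and_artists = union(root, "./cd/title", ".//artist")
tags = [el.tag for el in titles_and_artists]

# Overlapping paths still yield each node only once
titles_once = union(root, ".//title", "./cd/title")

print(tags, len(titles_once))
```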
Selecting Attributes
In XPath all attributes are specified by the @ prefix.
This XPath expression selects all attributes named country:
//@country
This XPath expression selects all cd elements which have an attribute named country:
//cd[@country]
This XPath expression selects all cd elements which have any attribute:
//cd[@*]
This XPath expression selects all cd elements which have an attribute named country with a value of 'UK':
//cd[@country='UK']
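A short sketch of the attribute tests in Python's xml.etree.ElementTree follows. ElementTree can filter elements by attribute but cannot return attribute nodes themselves, so //@country is emulated by reading the attribute from each element that carries it.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# //@country : emulated by collecting the attribute value from every element that has it
countries = [el.get("country") for el in root.iter() if "country" in el.attrib]

# //cd[@country]
cds_with_country = root.findall(".//cd[@country]")

# //cd[@country='UK']
uk_titles = [cd.findtext("title") for cd in root.findall(".//cd[@country='UK']")]

print(countries, len(cds_with_country), uk_titles)
```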
A location path expression results in a node-set.
Location Path Expression
A location path can be absolute or relative.
An absolute location path starts with a slash ( / ) and a relative location path does not. In both cases the location path consists of one or more location steps, each separated by a slash:
An absolute location path:
/step/step/...
A relative location path:
step/step/...
The location steps are evaluated in order one at a time, from left to right. Each step is evaluated against the nodes in the current node-set. If the location path is absolute, the current node-set consists of the root node. If the location path is relative, the current node-set consists of the node where the expression is being used. Location steps consist of:
an axis (specifies the tree relationship between the nodes selected by the location step and the current node)
a node test (specifies the node type and expanded-name of the nodes selected by the location step)
zero or more predicates (use expressions to further refine the set of nodes selected by the location step)
The syntax for a location step is:
axisname::nodetest[predicate]
Example:
child::cd[price=9.90]
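The left-to-right evaluation of location steps described above can be sketched as a tiny evaluator. This is an illustration only: the hypothetical function evaluate_child_path handles absolute paths that use just the child axis with name tests or *, and ignores predicates and all other axes.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)


def evaluate_child_path(root, path):
    """Evaluate an absolute child-axis path such as /catalog/cd/price."""
    steps = path.strip("/").split("/")
    # The first step is matched against the root element
    current = [root] if steps[0] in ("*", root.tag) else []
    for name in steps[1:]:
        # Each step is evaluated against every node in the current node-set
        current = [child for node in current for child in node
                   if name == "*" or child.tag == name]
    return current


price_texts = [p.text for p in evaluate_child_path(root, "/catalog/cd/price")]
titles = evaluate_child_path(root, "/catalog/*/title")
no_match = evaluate_child_path(root, "/cd/price")  # the root element is not cd

print(price_texts, len(titles), no_match)
```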
Axes and Node Tests
An axis defines a node-set relative to the current node. A node test is used to identify a node within an axis. We can perform a node test by name or by type.
| AxisName | Description |
| --- | --- |
| ancestor | Contains all ancestors (parent, grandparent, etc.) of the current node. Note: This axis will always include the root node, unless the current node is the root node |
| ancestor-or-self | Contains the current node plus all its ancestors (parent, grandparent, etc.) |
| attribute | Contains all attributes of the current node |
| child | Contains all children of the current node |
| descendant | Contains all descendants (children, grandchildren, etc.) of the current node. Note: This axis never contains attribute or namespace nodes |
| descendant-or-self | Contains the current node plus all its descendants (children, grandchildren, etc.) |
| following | Contains everything in the document after the closing tag of the current node |
| following-sibling | Contains all siblings after the current node. Note: If the current node is an attribute node or namespace node, this axis will be empty |
| namespace | Contains all namespace nodes of the current node |
| parent | Contains the parent of the current node |
| preceding | Contains everything in the document that is before the starting tag of the current node (excluding its ancestors) |
| preceding-sibling | Contains all siblings before the current node. Note: If the current node is an attribute node or namespace node, this axis will be empty |
| self | Contains the current node |
Examples
| Example | Result |
| --- | --- |
| child::cd | Selects all cd elements that are children of the current node (if the current node has no cd children, it will select an empty node-set) |
| attribute::src | Selects the src attribute of the current node (if the current node has no src attribute, it will select an empty node-set) |
| child::* | Selects all child elements of the current node |
| attribute::* | Selects all attributes of the current node |
| child::text() | Selects the text node children of the current node |
| child::node() | Selects all the children of the current node |
| descendant::cd | Selects all the cd element descendants of the current node |
| ancestor::cd | Selects all cd ancestors of the current node |
| ancestor-or-self::cd | Selects all cd ancestors of the current node and, if the current node is a cd element, the current node as well |
| child::*/child::price | Selects all price grandchildren of the current node |
| / | Selects the document root |
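Some of these axes can be emulated in Python to see which nodes they contain. The sketch below assumes xml.etree.ElementTree, which keeps no parent pointers, so a parent map is built first. The helper names ancestors and following_siblings are illustrative, and only element nodes are modelled (ElementTree has no separate root node object).

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

# Map every element to its parent so the upward axes can be walked
parent_of = {child: parent for parent in root.iter() for child in parent}


def ancestors(node):
    """ancestor axis: parent, grandparent, ... (nearest first)."""
    result = []
    while node in parent_of:
        node = parent_of[node]
        result.append(node)
    return result


def following_siblings(node):
    """following-sibling axis: siblings after the node, in document order."""
    parent = parent_of.get(node)
    if parent is None:
        return []
    siblings = list(parent)
    return siblings[siblings.index(node) + 1:]


first_price = root.find("./cd/price")
ancestor_tags = [el.tag for el in ancestors(first_price)]

first_title = root.find("./cd/title")
sibling_tags = [el.tag for el in following_siblings(first_title)]

# descendant axis of catalog: everything below it, excluding catalog itself
descendant_count = len(list(root.iter())) - 1

print(ancestor_tags, sibling_tags, descendant_count)
```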
Predicates
A predicate filters a node-set into a new node-set. A predicate is placed inside square brackets ( [ ] ).
Examples
| Example | Result |
| --- | --- |
| child::cd[price=9.90] | Selects all cd children of the current node that have a price child whose value equals 9.90 |
| child::cd[position()=1] | Selects the first cd child of the current node |
| child::cd[position()=last()] | Selects the last cd child of the current node |
| child::cd[position()=last()-1] | Selects the last but one cd child of the current node |
| child::cd[position()<6] | Selects the first five cd children of the current node |
| /descendant::cd[position()=7] | Selects the seventh cd element in the document |
| child::cd[attribute::type="classic"] | Selects all cd children of the current node that have a type attribute with value classic |
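The positional predicates can be tried on a small generated document. This sketch assumes Python's xml.etree.ElementTree, which supports [n], [last()] and [last()-n] directly. The position()<6 and /descendant::cd[position()=7] cases are emulated with list slicing, and the n attribute is a hypothetical label added only to identify each cd.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog with seven cd children, labelled 1..7 by an "n" attribute
xml_text = "<catalog>" + "".join(f'<cd n="{i}"/>' for i in range(1, 8)) + "</catalog>"
root = ET.fromstring(xml_text)

first = root.findall("./cd[1]")                # child::cd[position()=1]
last = root.findall("./cd[last()]")            # child::cd[position()=last()]
last_but_one = root.findall("./cd[last()-1]")  # child::cd[position()=last()-1]
first_five = root.findall("./cd")[:5]          # child::cd[position()<6], via slicing
seventh = list(root.iter("cd"))[6:7]           # /descendant::cd[position()=7], via slicing


def labels(nodes):
    return [node.get("n") for node in nodes]


print(labels(first), labels(last), labels(last_but_one),
      labels(first_five), labels(seventh))
```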
Location Path Abbreviated Syntax
Abbreviations can be used when describing a location path.
The most important abbreviation is that child:: can be omitted from a location step.
| Abbr | Meaning | Example |
| --- | --- | --- |
| none | child:: | cd is short for child::cd |
| @ | attribute:: | cd[@type="classic"] is short for child::cd[attribute::type="classic"] |
| . | self::node() | .//cd is short for self::node()/descendant-or-self::node()/child::cd |
| .. | parent::node() | ../cd is short for parent::node()/child::cd |
| // | /descendant-or-self::node()/ | //cd is short for /descendant-or-self::node()/child::cd |
Examples
| Example | Result |
| --- | --- |
| cd | Selects all the cd elements that are children of the current node |
| * | Selects all child elements of the current node |
| text() | Selects all text node children of the current node |
| @src | Selects the src attribute of the current node |
| @* | Selects all the attributes of the current node |
| cd[1] | Selects the first cd child of the current node |
| cd[last()] | Selects the last cd child of the current node |
| */cd | Selects all cd grandchildren of the current node |
| /book/chapter[3]/para[1] | Selects the first para of the third chapter of the book |
| //cd | Selects all the cd descendants of the document root and thus selects all cd elements in the same document as the current node |
| . | Selects the current node |
| .//cd | Selects the cd element descendants of the current node |
| .. | Selects the parent of the current node |
| ../@src | Selects the src attribute of the parent of the current node |
| cd[@type="classic"] | Selects all cd children of the current node that have a type attribute with value classic |
| cd[@type="classic"][5] | Selects the fifth cd child of the current node that has a type attribute with value classic |
| cd[5][@type="classic"] | Selects the fifth cd child of the current node if that child has a type attribute with value classic |
| cd[@type and @country] | Selects all the cd children of the current node that have both a type attribute and a country attribute |
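The difference between cd[@type="classic"][5] and cd[5][@type="classic"] is easy to miss, so here is a sketch that emulates both with plain Python list operations. The document is a made-up catalog of seven cd elements, and the n attribute is a hypothetical label. ElementTree is used only to parse and list the children, since it implements just a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: the fifth cd is "pop", and there are five "classic" cds in total
types = ["classic", "pop", "classic", "classic", "pop", "classic", "classic"]
xml_text = "<catalog>" + "".join(
    f'<cd n="{i}" type="{t}"/>' for i, t in enumerate(types, 1)
) + "</catalog>"
root = ET.fromstring(xml_text)
cds = root.findall("./cd")

# cd[@type="classic"][5]: filter by type first, then take the fifth of what remains
classic = [cd for cd in cds if cd.get("type") == "classic"]
fifth_classic = classic[4:5]

# cd[5][@type="classic"]: take the fifth cd first, then keep it only if it is classic
fifth_if_classic = [cd for cd in cds[4:5] if cd.get("type") == "classic"]

print([cd.get("n") for cd in fifth_classic], fifth_if_classic)
```

Each predicate filters the node-set produced by the one before it, which is why the order of the brackets changes the result.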
XPath supports numerical, equality, relational, and Boolean expressions.
Numerical Expressions
Numerical expressions are used to perform arithmetic operations on numbers.
| Operator | Description | Example | Result |
| --- | --- | --- | --- |
| + | Addition | 6 + 4 | 10 |
| - | Subtraction | 6 - 4 | 2 |
| * | Multiplication | 6 * 4 | 24 |
| div | Division | 8 div 4 | 2 |
| mod | Modulus (division remainder) | 5 mod 2 | 1 |
Note: XPath always converts each operand to a number before performing an arithmetic expression.
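The operand conversion in the note above can be sketched in Python, together with one detail worth knowing: XPath 1.0 defines mod as the remainder of a truncating division, so its result takes the sign of the left operand, unlike Python's % operator. The function names xpath_number and xpath_mod are hypothetical, and the conversion is an approximation (Python's float() accepts some strings, such as '1e3', that XPath would turn into NaN).

```python
import math


def xpath_number(value):
    """Approximate XPath number(): non-numeric strings become NaN."""
    if isinstance(value, bool):
        return 1.0 if value else 0.0
    try:
        return float(str(value).strip())
    except ValueError:
        return math.nan


def xpath_mod(a, b):
    """Remainder of a truncating division, after converting both operands."""
    x, y = xpath_number(a), xpath_number(b)
    if y == 0 or math.isinf(x):
        return math.nan
    return math.fmod(x, y)


total = xpath_number("6") + xpath_number("4")  # operands converted before adding
remainder = xpath_mod(5, 2)
negative_remainder = xpath_mod(-5, 2)          # sign follows the left operand
python_remainder = -5 % 2                      # Python's % behaves differently

print(total, remainder, negative_remainder, python_remainder)
```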
Equality Expressions
Equality expressions are used to test the equality between two values.
| Operator | Description | Example | Result |
| --- | --- | --- | --- |
| = | Equal | price=9.80 | true (if price is 9.80) |
| != | Not equal | price!=9.80 | false (if price is 9.80) |
Testing Against a Node-Set
If the test value is tested for equality against a node-set, the result is true if the node-set contains any node with a value that matches the test value.
If the test value is tested for not equal against a node-set, the result is true if the node-set contains any node with a value that is different from the test value.
As a result, a node-set can be both equal and not equal to the same test value at the same time.
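The two rules above amount to an "any node matches" test, which can be sketched in Python with any(). The helper names are hypothetical. The price values come from the three-CD example document, where one price differs from the other two.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)
prices = [float(p.text) for p in root.findall("./cd/price")]  # the "node-set" values


def nodeset_equals(values, test):
    """node-set = test: true if any node's value matches."""
    return any(v == test for v in values)


def nodeset_not_equals(values, test):
    """node-set != test: true if any node's value differs."""
    return any(v != test for v in values)


is_equal = nodeset_equals(prices, 9.90)          # two prices are 9.90
is_not_equal = nodeset_not_equals(prices, 9.90)  # one price is 10.90

print(is_equal, is_not_equal)
```

For an empty node-set both tests are false, since there is no node that could match or differ.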
Relational Expressions
Relational expressions are used to compare two values.
| Operator | Description | Example | Result |
| --- | --- | --- | --- |
| < | Less than | price<9.80 | false (if price is 9.80) |
| <= | Less or equal | price<=9.80 | true |
| > | Greater than | price>9.80 | false |
| >= | Greater or equal | price>=9.80 | true |
Note: XPath always converts each operand to a number before performing the evaluation.
Boolean Expressions
Boolean expressions are used to combine two conditions.
| Operator | Description | Example | Result |
| --- | --- | --- | --- |
| or | or | price=9.80 or price=9.70 | true (if price is 9.80) |
| and | and | price<=9.80 and price=9.70 | false |
XPath contains a function library for converting data.
XPath Function Library
The XPath function library contains a set of core functions for converting and translating data.
Node Set Functions
| Name | Description | Syntax |
| --- | --- | --- |
| count() | Returns the number of nodes in a node-set | number=count(node-set) |
| id() | Selects elements by their unique ID | node-set=id(value) |
| last() | Returns the position number of the last node in the processed node list | number=last() |
| local-name() | Returns the local part of a node's name. A qualified name usually consists of a prefix, a colon, and the local name | string=local-name(node) |
| name() | Returns the name of a node | string=name(node) |
| namespace-uri() | Returns the namespace URI of a specified node | uri=namespace-uri(node) |
| position() | Returns the position in the node list of the node that is currently being processed | number=position() |
String Functions
| Name | Description | Syntax | Example | Result |
| --- | --- | --- | --- | --- |
| concat() | Returns the concatenation of all its arguments | string=concat(val1, val2, ..) | concat('The',' ','XML') | 'The XML' |
| contains() | Returns true if the second string is contained within the first string, otherwise it returns false | bool=contains(val,substr) | contains('XML','X') | true |
| normalize-space() | Removes leading and trailing whitespace from a string, and replaces each internal sequence of whitespace with a single space | string=normalize-space(string) | normalize-space(' The XML ') | 'The XML' |
| starts-with() | Returns true if the first string starts with the second string, otherwise it returns false | bool=starts-with(string,substr) | starts-with('XML','X') | true |
| string() | Converts the value argument to a string | string(value) | string(314) | '314' |
| string-length() | Returns the number of characters in a string | number=string-length(string) | string-length('Beatles') | 7 |
| substring() | Returns a part of the string in the string argument | string=substring(string,start,length) | substring('Beatles',1,4) | 'Beat' |
| substring-after() | Returns the part of the string in the string argument that occurs after the substring in the substr argument | string=substring-after(string,substr) | substring-after('12/10','/') | '10' |
| substring-before() | Returns the part of the string in the string argument that occurs before the substring in the substr argument | string=substring-before(string,substr) | substring-before('12/10','/') | '12' |
| translate() | Performs a character-by-character replacement. It looks in the value argument for characters contained in string1, and replaces each one with the character in the same position in string2 | string=translate(value,string1,string2) | translate('12:30','30','45') | '12:45' |
| | | | translate('12:30','03','54') | '12:45' |
| | | | translate('12:30','0123','abcd') | 'bc:da' |
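To make the string functions concrete, here is a sketch of a few of them in Python. These are approximations written for illustration, not a conforming implementation. The xpath_ function names are hypothetical, xpath_substring handles only integer arguments with start of 1 or more, and the substring-before/after helpers assume a non-empty substr.

```python
def xpath_translate(value, string1, string2):
    """Replace each character of string1 with the one at the same position in string2.

    Characters of string1 with no counterpart in string2 are removed.
    """
    table = {}
    for i, ch in enumerate(string1):
        if ord(ch) not in table:  # the first occurrence in string1 wins
            table[ord(ch)] = string2[i] if i < len(string2) else None
    return value.translate(table)


def xpath_substring(value, start, length):
    """Positions are 1-based, unlike Python slices."""
    return value[start - 1:start - 1 + length]


def xpath_substring_before(value, substr):
    head, sep, _ = value.partition(substr)
    return head if sep else ""


def xpath_substring_after(value, substr):
    _, sep, tail = value.partition(substr)
    return tail if sep else ""


def xpath_normalize_space(value):
    return " ".join(value.split())


print(xpath_translate("12:30", "0123", "abcd"))
print(xpath_substring("Beatles", 1, 4))
print(xpath_substring_before("12/10", "/"), xpath_substring_after("12/10", "/"))
```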
Number Functions
| Name | Description | Syntax | Example | Result |
| --- | --- | --- | --- | --- |
| ceiling() | Returns the smallest integer that is not less than the number argument | number=ceiling(number) | ceiling(3.14) | 4 |
| floor() | Returns the largest integer that is not greater than the number argument | number=floor(number) | floor(3.14) | 3 |
| number() | Converts the value argument to a number | number=number(value) | number('100') | 100 |
| round() | Rounds the number argument to the nearest integer | integer=round(number) | round(3.14) | 3 |
| sum() | Returns the total value of a set of numeric values in a node-set | number=sum(nodeset) | sum(/catalog/cd/price) | |
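The number functions correspond closely to Python's math module, with one caveat sketched below: XPath 1.0 round() rounds halves toward positive infinity, while Python's built-in round() rounds halves to the nearest even integer. The helper xpath_round is a hypothetical name and ignores NaN and infinity. The sum is computed over the prices of the three-CD example document.

```python
import math
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd country="USA"><title>Empire Burlesque</title><artist>Bob Dylan</artist><price>10.90</price></cd>
  <cd country="UK"><title>Hide your heart</title><artist>Bonnie Tyler</artist><price>9.90</price></cd>
  <cd country="USA"><title>Greatest Hits</title><artist>Dolly Parton</artist><price>9.90</price></cd>
</catalog>"""

root = ET.fromstring(CATALOG)


def xpath_round(x):
    """Nearest integer; exact halves go toward positive infinity."""
    return math.floor(x + 0.5)


ceiling_value = math.ceil(3.14)   # ceiling(3.14)
floor_value = math.floor(3.14)    # floor(3.14)
rounded = xpath_round(3.14)       # round(3.14)
half_up = xpath_round(2.5)        # differs from Python's round(2.5)

# sum(/catalog/cd/price): each node's text is converted to a number, then added
total = sum(float(p.text) for p in root.findall("./cd/price"))

print(ceiling_value, floor_value, rounded, half_up, round(2.5))
```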
Boolean Functions
| Name | Description | Syntax | Example | Result |
| --- | --- | --- | --- | --- |
| boolean() | Converts the value argument to Boolean and returns true or false | bool=boolean(value) | | |
| false() | Returns false | false() | number(false()) | 0 |
| lang() | Returns true if the language argument matches the language specified by the xml:lang attribute in effect for the current node, otherwise it returns false | bool=lang(language) | | |
| not() | Returns true if the condition argument is false, and false if the condition argument is true | bool=not(condition) | not(false()) | true |
| true() | Returns true | true() | number(true()) | 1 |
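The conversion performed by boolean() follows simple rules in XPath 1.0: a number is true unless it is zero or NaN, a string is true unless it is empty, and a node-set is true unless it is empty. The sketch below models these rules in Python with a hypothetical xpath_boolean function, treating a node-set as a list. It differs from Python's own truthiness for NaN.

```python
import math


def xpath_boolean(value):
    """Approximate XPath boolean() for booleans, numbers, strings and lists."""
    if isinstance(value, bool):
        return value
    if isinstance(value, (int, float)):
        return value != 0 and not math.isnan(value)
    if isinstance(value, str):
        return len(value) > 0
    return len(value) > 0  # a node-set, modelled here as a list


text_false = xpath_boolean("false")  # any non-empty string is true
empty_text = xpath_boolean("")
zero = xpath_boolean(0)
nan_value = xpath_boolean(float("nan"))
empty_set = xpath_boolean([])

# number(true()) and number(false()) from the table
true_as_number = float(True)
false_as_number = float(False)

print(text_false, empty_text, zero, nan_value, empty_set)
```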
We will use the CD catalog from our XML tutorial to demonstrate some XPath examples.
The CD catalog
If you have studied our XML tutorial, you will remember this XML document:
(A fraction of the CD catalog)
<?xml version="1.0" encoding="ISO-8859-1"?>
<catalog>
  <cd>
    <title>Empire Burlesque</title>
    <artist>Bob Dylan</artist>
    <country>USA</country>
    <company>Columbia</company>
    <price>10.90</price>
    <year>1985</year>
  </cd>
  <cd>
    <title>Hide your heart</title>
    <artist>Bonnie Tyler</artist>
    <country>UK</country>
    <company>CBS Records</company>
    <price>9.90</price>
    <year>1988</year>
  </cd>
  .
  .
  .
  .
</catalog>
If you have IE 5 or higher you can look at the cdcatalog.xml.
Selecting Nodes
We will demonstrate how to select nodes from the XML document by using the selectNodes function in Internet Explorer. This function takes a location path expression as an argument:
xmlobject.selectNodes(XPath expression)
Selecting cd Nodes
The following example selects all the cd nodes from the CD catalog:
xmlDoc.selectNodes("/catalog/cd")
If you have IE 5 or higher you can try it yourself.
Selecting the First cd Node
The following example selects only the first cd node from the CD catalog:
xmlDoc.selectNodes("/catalog/cd[0]")
Note: IE 5 treats [0] as the first node, but according to the W3C standard the first node is [1].
Selecting price Nodes
The following example selects all the price nodes from the CD catalog:
xmlDoc.selectNodes("/catalog/cd/price")
Selecting price Text Nodes
The following example selects only the text from the price nodes:
xmlDoc.selectNodes("/catalog/cd/price/text()")
Selecting cd Nodes with Price>10.80
The following example selects all the cd nodes with a price>10.80:
xmlDoc.selectNodes("/catalog/cd[price>10.80]")
Selecting price Nodes with Price>10.80
The following example selects all the price nodes with a price>10.80:
xmlDoc.selectNodes("/catalog/cd[price>10.80]/price")
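Outside Internet Explorer, the same selections can be sketched with Python's xml.etree.ElementTree. This is an approximation rather than an equivalent of selectNodes: paths are relative to the catalog element, positions are 1-based as in the W3C standard, text() is emulated by reading each element's text, and the price>10.80 test is applied in Python. Only the two cd entries shown in the fragment above are included.

```python
import xml.etree.ElementTree as ET

CATALOG = """<catalog>
  <cd><title>Empire Burlesque</title><artist>Bob Dylan</artist><country>USA</country>
      <company>Columbia</company><price>10.90</price><year>1985</year></cd>
  <cd><title>Hide your heart</title><artist>Bonnie Tyler</artist><country>UK</country>
      <company>CBS Records</company><price>9.90</price><year>1988</year></cd>
</catalog>"""

root = ET.fromstring(CATALOG)

cds = root.findall("./cd")                                    # /catalog/cd
first_cd = root.find("./cd[1]")                               # first cd, 1-based
price_nodes = root.findall("./cd/price")                      # /catalog/cd/price
price_text = [p.text for p in price_nodes]                    # /catalog/cd/price/text()
over_cds = [cd for cd in cds
            if float(cd.findtext("price")) > 10.80]           # /catalog/cd[price>10.80]
over_prices = [cd.findtext("price") for cd in over_cds]       # /catalog/cd[price>10.80]/price

print(len(cds), first_cd.findtext("title"), price_text, over_prices)
```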