Selecting XML with XQuery and XPath

You can use XQuery and XPath to retrieve specific pieces of XML as you might retrieve data from a database. XQuery and XPath provide a syntax for specifying which elements and attributes you're interested in. The XMLBeans API provides two methods for executing XQuery and XPath expressions, selectPath and execQuery, and two ways to call each: from XmlObject (or an object inheriting from it) or from XmlCursor. The results differ somewhat depending on which method and which caller you use.

Using the selectPath Method

The selectPath method is optimized for XPath and is the most efficient way to execute XPath expressions. When you use XPath with the selectPath method, the value returned is an array of values from the current document. In contrast, when you use execQuery, the value returned is a new document.

Calling from XmlObject

When called from XmlObject (or a type that inherits from it), this method returns an array of objects. If the expression is executed against types generated from schema, then the type for the returned array is one of the Java types corresponding to the schema.

For example, imagine you have the following XML containing employee information. You've compiled the schema describing this XML and the types generated from schema are available to your code.

<xq:employees xmlns:xq="http://openuri.org/selectPath">
    <xq:employee>
        <xq:name>Fred Jones</xq:name>
        <xq:address location="home">
            <xq:street>900 Aurora Ave.</xq:street>
            <xq:city>Seattle</xq:city>
            <xq:state>WA</xq:state>
            <xq:zip>98115</xq:zip>
        </xq:address>
        <xq:address location="work">
            <xq:street>2011 152nd Avenue NE</xq:street>
            <xq:city>Redmond</xq:city>
            <xq:state>WA</xq:state>
            <xq:zip>98052</xq:zip>
        </xq:address>
        <xq:phone location="work">(425)555-5665</xq:phone>
        <xq:phone location="home">(206)555-5555</xq:phone>
        <xq:phone location="mobile">(206)555-4321</xq:phone>
    </xq:employee>
</xq:employees>

If you wanted to find the phone numbers whose area code was 206, you could capture the XPath expression in this way:

String queryExpression =
    "declare namespace xq='http://openuri.org/selectPath';" +
    "$this/xq:employees/xq:employee/xq:phone[contains(., '(206)')]";

Notice in the query expression that the variable $this represents the current context node (the XmlObject that you are querying from). In this example you are querying from the document level XmlObject.

You could then print the results with code such as the following:

/*
 * Retrieve the matching phone elements and assign the results to the corresponding
 * generated type.
 */
PhoneType[] phones = (PhoneType[])empDoc.selectPath(queryExpression);
/*
 * Loop through the results, printing the value of the phone element.
 */
for (int i = 0; i < phones.length; i++)
{
    System.out.println(phones[i].stringValue());
}  
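
To check what the path expression itself selects without the generated types, the following sketch evaluates the same path with the JDK's javax.xml.xpath API rather than XMLBeans. This is a swapped-in technique for illustration only: the JDK's XPath 1.0 engine does not understand "declare namespace" or $this, so the sketch binds the xq prefix through a NamespaceContext and starts the path at the document root. The class name PhonePathCheck, the method matchingPhones, and the abbreviated SAMPLE document are assumptions made for this example.

```java
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

public class PhonePathCheck {
    static final String NS = "http://openuri.org/selectPath";

    // Abbreviated copy of the employee document (addresses omitted).
    static final String SAMPLE =
        "<xq:employees xmlns:xq='" + NS + "'><xq:employee>"
        + "<xq:name>Fred Jones</xq:name>"
        + "<xq:phone location='work'>(425)555-5665</xq:phone>"
        + "<xq:phone location='home'>(206)555-5555</xq:phone>"
        + "<xq:phone location='mobile'>(206)555-4321</xq:phone>"
        + "</xq:employee></xq:employees>";

    // Returns the text of every phone element whose value contains "(206)".
    static List<String> matchingPhones(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Plays the role of the "declare namespace" prefix in the XMLBeans expression.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "xq".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return NS.equals(uri) ? "xq" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                        ? Collections.singletonList("xq").iterator()
                        : Collections.<String>emptyIterator();
                }
            });

            NodeList nodes = (NodeList) xpath.evaluate(
                "/xq:employees/xq:employee/xq:phone[contains(., '(206)')]",
                doc, XPathConstants.NODESET);

            List<String> values = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                values.add(nodes.item(i).getTextContent());
            }
            return values;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate the path", e);
        }
    }

    public static void main(String[] args) {
        // Prints the home and mobile numbers, which are the two with a 206 area code.
        for (String phone : matchingPhones(SAMPLE)) {
            System.out.println(phone);
        }
    }
}
```

With XMLBeans, the same selection comes back as XmlObject instances from the document being queried, not as DOM nodes, so the results can be cast to the generated types.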

Calling from XmlCursor

When called from an XmlCursor instance, the selectPath method retrieves a list of selections, or locations in the XML. The selections are remembered by the cursor instance. You can use methods such as toNextSelection to navigate among them.

The selectPath method takes an XPath expression. If the expression returns any results, each of those results is added as a selection to the cursor's list of selections. You can move through these selections in the way you might use java.util.Iterator methods to move through a collection.

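For example, for a path such as $this/employees/employee, the results would include a selection for each employee element found by the expression. Note that the variable $this is always bound to the current context node, which in this example is the document. After calling the selectPath method, you would use various "selection"-related methods to work with the results. These methods include getSelectionCount, which returns the number of selections; toNextSelection, which moves the cursor to the next selection in the list; toSelection(int), which moves the cursor to the selection at the given index; hasNextSelection, which reports whether any selections remain after the current one; and clearSelections, which empties the selection list without changing the XML itself.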

The following example shows how you might use selectPath, in combination with the push and pop methods, to maneuver through XML, retrieving specific values.

public void printZipsAndWorkPhones(XmlObject xml)
{
    // Declare the namespace that will be used.
    String xqNamespace =
        "declare namespace xq='http://openuri.org/selectPath';";

    // Create a cursor and move it to the first element.
    XmlCursor cursor = xml.newCursor();
    cursor.toFirstChild();
    /*
     * Save the cursor's current location by pushing it
     * onto a stack of saved locations.
     */
    cursor.push();
    // Query for zip elements.
    cursor.selectPath(xqNamespace + "$this//xq:zip");
    /*
     * Loop through the list of selections, getting the value of
     * each element.
     */
    while (cursor.toNextSelection())
    {
        System.out.println(cursor.getTextValue());
    }
    // Pop the saved location off the stack.
    cursor.pop();
    // Query again from the top, this time for work phone numbers.
    cursor.selectPath(xqNamespace + "$this//xq:phone[@location='work']");
    /*
     * Move the cursor to the first selection, if there is one, then print
     * that element's value.
     */
    if (cursor.toNextSelection())
    {
        System.out.println(cursor.getTextValue());
    }
    // Dispose of the cursor.
    cursor.dispose();
}

Using selections is somewhat like tracking the locations of multiple cursors with a single cursor. This becomes especially clear when you remove the XML associated with a selection. The selection stays where the removed XML was, so it now sits immediately before the XML that followed the removed content. In other words, the XML after the removed content shifts up to fill the gap and ends up immediately after the selection's location. This is exactly what would happen if the selection were another cursor.

Finally, keep in mind that the list of selections is in a sense "live": the cursor keeps tracking every selection in the list for as long as the list exists. For that reason, be sure to call the clearSelections method when you're finished with the selections, just as you should call the XmlCursor.dispose() method when you're finished using the cursor.
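
The selection methods discussed in this section can be combined as in the following minimal sketch, which assumes the XMLBeans library is on the classpath. It visits the selections by index with getSelectionCount and toSelection(int), then clears the list and disposes of the cursor. The class name SelectionSketch, the method phoneValues, and the abbreviated SAMPLE document are hypothetical names introduced for illustration. The path is kept to a simple descendant step with no predicate so that it does not depend on an additional XPath engine.

```java
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

import java.util.ArrayList;
import java.util.List;

public class SelectionSketch {
    static final String NS_DECL =
        "declare namespace xq='http://openuri.org/selectPath';";

    // Abbreviated copy of the employee document (addresses omitted).
    static final String SAMPLE =
        "<xq:employees xmlns:xq='http://openuri.org/selectPath'><xq:employee>"
        + "<xq:name>Fred Jones</xq:name>"
        + "<xq:phone location='work'>(425)555-5665</xq:phone>"
        + "<xq:phone location='home'>(206)555-5555</xq:phone>"
        + "<xq:phone location='mobile'>(206)555-4321</xq:phone>"
        + "</xq:employee></xq:employees>";

    // Collects the text of every phone element by walking the cursor's selections.
    static List<String> phoneValues(String xmlText) {
        XmlObject xml;
        try {
            xml = XmlObject.Factory.parse(xmlText);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }

        List<String> values = new ArrayList<>();
        XmlCursor cursor = xml.newCursor();
        try {
            cursor.selectPath(NS_DECL + "$this//xq:phone");

            // Visit each selection by index instead of calling toNextSelection.
            int count = cursor.getSelectionCount();
            for (int i = 0; i < count; i++) {
                cursor.toSelection(i);
                values.add(cursor.getTextValue());
            }

            // Stop tracking the selections; the XML itself is not changed.
            cursor.clearSelections();
        } finally {
            // Release the cursor whether or not the query succeeded.
            cursor.dispose();
        }
        return values;
    }

    public static void main(String[] args) {
        for (String phone : phoneValues(SAMPLE)) {
            System.out.println(phone);
        }
    }
}
```

Putting dispose in a finally block is a design choice of this sketch: it ensures the cursor is released even if the path expression fails to compile.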

Using the execQuery Method

Use the execQuery method to execute XQuery expressions that are more sophisticated than paths, such as loops and FLWOR (for, let, where, order by, and return) expressions.

Note: Be sure to see the simpleExpressions sample in the SamplesApp application for a sampling of XQuery expressions in use.

Calling from XmlObject

Like selectPath, execQuery returns an XmlObject array when you call it from an XmlObject instance. Unlike selectPath, the array holds newly created XML rather than values from the current document. If the XmlObject instances resulting from the XQuery match a recognized XMLBeans type (the namespace and top-level element name match up with an XMLBeans type), then the XmlObject will be typed; otherwise the XmlObject will be untyped.

Calling from XmlCursor

Calling execQuery from an XmlCursor instance returns a new XmlCursor instance. The cursor returned is positioned at the beginning of a new XML document representing the query results, and you can use it to move through the results, cursor-style (for more information, see Navigating XML with Cursors). If the document resulting from the query execution represents a recognized XMLBeans type (the namespace and top-level element name match up with an XMLBeans type), then the document resulting from the XQuery will have that Java type; otherwise the resulting document will be untyped.
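
As a rough illustration of the execQuery method described in this section, the following sketch runs a small for/where/return query from an XmlObject. It assumes that XMLBeans and an XQuery engine that XMLBeans supports, such as Saxon, are on the classpath; without such an engine the call fails at run time. The class name WorkPhoneQuerySketch, the method workPhones, the workPhone result element, and the abbreviated SAMPLE document are hypothetical names introduced for this example.

```java
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

import java.util.ArrayList;
import java.util.List;

public class WorkPhoneQuerySketch {
    static final String NS_DECL =
        "declare namespace xq='http://openuri.org/selectPath'; ";

    // Abbreviated copy of the employee document (addresses omitted).
    static final String SAMPLE =
        "<xq:employees xmlns:xq='http://openuri.org/selectPath'><xq:employee>"
        + "<xq:name>Fred Jones</xq:name>"
        + "<xq:phone location='work'>(425)555-5665</xq:phone>"
        + "<xq:phone location='home'>(206)555-5555</xq:phone>"
        + "</xq:employee></xq:employees>";

    // Returns the serialized XML of each result produced by the query.
    static List<String> workPhones(String xmlText) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xmlText);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }

        // Build a new workPhone element for each phone marked as a work number.
        String query = NS_DECL
            + "for $p in $this//xq:phone "
            + "where $p/@location = 'work' "
            + "return <workPhone>{data($p)}</workPhone>";

        // The results are new XML, not references into the queried document.
        XmlObject[] results = doc.execQuery(query);

        List<String> serialized = new ArrayList<>();
        for (XmlObject result : results) {
            serialized.add(result.xmlText());
        }
        return serialized;
    }

    public static void main(String[] args) {
        for (String xml : workPhones(SAMPLE)) {
            System.out.println(xml);
        }
    }
}
```

Because workPhone does not correspond to any compiled schema type, the results in this sketch are untyped XmlObject instances.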

Related Topics

Getting Started with XMLBeans