Selecting XML with XQuery and XPath
You can use XPath and XQuery to retrieve specific pieces of XML as you might retrieve data from a database. XQuery and XPath provide a syntax for specifying which elements and attributes you're interested in. The XMLBeans API provides two methods for executing XQuery and XPath expressions, and two ways to use them. The methods are selectPath for XPath and execQuery for XQuery.
You can call them from an XmlObject instance (or a generated type inheriting from it) or from an XmlCursor instance. As noted below, each of the four methods works slightly differently; keep these differences in mind when choosing your approach.
Using XPath with the selectPath Method
You can execute XPath expressions using the selectPath method. When you use XPath with the selectPath method, the value returned is a view of values from the current document, not a copy of those values. In other words, changes your code makes to XML returned by the selectPath method change the XML in the document queried against. In contrast, with XQuery executed using the execQuery method, the value returned is a copy of values in the XML queried against.
Note that XPath itself does not provide syntax for declaring prefix to URI bindings. For user convenience, we allow XQuery syntax to be used for such purposes. You can consult the latest XQuery draft when using syntax for declaring namespaces.
Calling XmlObject.selectPath
When called from XmlObject (or a type that inherits from it), the selectPath method returns an array of objects. If the expression is executed against types generated from schema, then the type for the returned array is one of the Java types corresponding to the schema, and you can cast it accordingly.
For example, imagine you have XML containing employee information, such as each employee's name, addresses, and phone numbers. You've compiled the schema describing this XML, and the types generated from the schema are available to your code.
If you wanted to find the phone numbers whose area code was 206, you could capture the XPath expression in this way:
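The original listing is not reproduced here, so the following is only a minimal sketch of what such an expression might look like. The `xq` prefix, the namespace URI, and the element names are assumptions chosen for illustration; substitute the ones from your own schema. The sketch also uses the XQuery-style namespace declaration described earlier.

```java
class PhoneQuery {
    // Hypothetical namespace for the employee schema; replace with your own.
    static final String NAMESPACE_DECL =
        "declare namespace xq='http://example.com/employees';";

    // Builds an XPath expression that selects phone elements whose text
    // contains the given area code in parentheses, for example "(206)".
    static String phonesInAreaCode(String areaCode) {
        return NAMESPACE_DECL
            + "$this/xq:employees/xq:employee/xq:phone"
            + "[contains(., '(" + areaCode + ")')]";
    }

    public static void main(String[] args) {
        System.out.println(phonesInAreaCode("206"));
    }
}
```

Because the expression uses a predicate and a function call, it probably needs a full XPath engine available to XMLBeans; the built-in engine is intended for simple paths.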
Notice in the query expression that the variable $this represents the current context node (the XmlObject that you are querying from). In this example you are querying from the document level XmlObject.
You could then print the results with code such as the following:
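As a hedged sketch of this step, the code below parses a small, made-up employee document, runs the expression, and prints each match. The sample XML, namespace URI, and class name are assumptions. It works with untyped XmlObject results; with generated types you would cast the returned array to the corresponding schema type instead. It also assumes an XPath engine that supports predicates is available to XMLBeans.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

class PrintPhones {
    // Hypothetical sample data; the namespace and element names are assumptions.
    static final String SAMPLE_XML =
        "<xq:employees xmlns:xq='http://example.com/employees'>"
        + "<xq:employee>"
        + "<xq:name>Fred Jones</xq:name>"
        + "<xq:phone location='work'>(425)555-5665</xq:phone>"
        + "<xq:phone location='home'>(206)555-5555</xq:phone>"
        + "</xq:employee>"
        + "</xq:employees>";

    static final String QUERY =
        "declare namespace xq='http://example.com/employees';"
        + "$this/xq:employees/xq:employee/xq:phone[contains(., '(206)')]";

    // Returns the text of each phone element matched by QUERY.
    static List<String> matchingPhones(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        // $this is bound to doc, the object selectPath is called on.
        XmlObject[] phones = doc.selectPath(QUERY);
        List<String> values = new ArrayList<>();
        for (XmlObject phone : phones) {
            // Read the element text through a short-lived cursor.
            XmlCursor cursor = phone.newCursor();
            try {
                values.add(cursor.getTextValue());
            } finally {
                cursor.dispose();
            }
        }
        return values;
    }

    public static void main(String[] args) {
        for (String value : matchingPhones(SAMPLE_XML)) {
            System.out.println(value);
        }
    }
}
```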
Calling XmlCursor.selectPath
When called from an XmlCursor instance, the selectPath method retrieves a list of selections, or locations in the XML. The selections are remembered by the cursor instance. You can use methods such as toNextSelection to navigate among them.
The selectPath method takes an XPath expression. If the expression returns any results, each of those results is added as a selection to the cursor's list of selections. You can move through these selections in the way you might use java.util.Iterator methods to move through a collection.
For example, for a path such as $this/employees/employee, the cursor instance from which you called selectPath would include a selection for each employee element found by the expression. Note that the variable $this is always bound to the current context node, which in this example is the document. After calling the selectPath method, you would use various "selection"-related methods to work with the results. These methods include:
- getSelectionCount() to retrieve the number of selections resulting from the query.
- toNextSelection() to move the cursor to the next selection in the list (such as to the one pointing at the next employee element found).
- toSelection(int) to move the cursor to the selection at the specified index (such as to the third employee element in the selection).
- hasNextSelection() to find out if there are more selections after the cursor's current position.
- clearSelections() to clear the selections from the current cursor. This doesn't modify the document (in other words, it doesn't delete the selected XML); it merely clears the selection list so that the cursor is no longer keeping track of those positions.
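The selection methods listed above might be combined as in the following sketch. The sample XML and the class and method names are assumptions made for illustration; the path is a simple one with no predicates.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

class SelectionWalk {
    // Hypothetical sample data with three employees.
    static final String SAMPLE_XML =
        "<employees>"
        + "<employee><name>Fred Jones</name></employee>"
        + "<employee><name>Sally Smith</name></employee>"
        + "<employee><name>Gladys Cravitz</name></employee>"
        + "</employees>";

    // Collects the text of every name element by walking the selections.
    static List<String> employeeNames(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> names = new ArrayList<>();
        XmlCursor cursor = doc.newCursor();
        try {
            // The cursor starts at the document, so $this is the document.
            cursor.selectPath("$this/employees/employee/name");
            System.out.println("Selections: " + cursor.getSelectionCount());
            // Each call moves the cursor to the next selected name element.
            while (cursor.toNextSelection()) {
                names.add(cursor.getTextValue());
            }
            // Stop tracking the selections; the document is unchanged.
            cursor.clearSelections();
        } finally {
            cursor.dispose();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(employeeNames(SAMPLE_XML));
    }
}
```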
The following example shows how you might use selectPath, in combination with the push and pop methods, to maneuver through XML, retrieving specific values.
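The original listing is missing, so what follows is a minimal sketch rather than the original example. The sample XML and all class and method names are assumptions. It saves the cursor's position with push, visits one set of selections, restores the position with pop, and then repeats the pattern for a second path.

```java
import java.util.ArrayList;
import java.util.List;
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

class ZipsAndPhones {
    // Hypothetical sample data; element and attribute names are assumptions.
    static final String SAMPLE_XML =
        "<employees>"
        + "<employee>"
        + "<name>Fred Jones</name>"
        + "<address location='work'><zip>98033</zip></address>"
        + "<phone location='work'>(425)555-5665</phone>"
        + "<phone location='home'>(206)555-5555</phone>"
        + "</employee>"
        + "</employees>";

    // Visits the selections for one path, then returns the cursor to where it started.
    static void collect(XmlCursor cursor, String path, String label, List<String> out) {
        cursor.push();                  // remember the current position
        cursor.selectPath(path);        // $this is the cursor's current node
        while (cursor.toNextSelection()) {
            out.add(label + ": " + cursor.getTextValue());
        }
        cursor.pop();                   // go back to the remembered position
        cursor.clearSelections();       // stop tracking the old selections
    }

    static List<String> zipsAndPhones(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        List<String> found = new ArrayList<>();
        XmlCursor cursor = doc.newCursor();
        try {
            // Move from the start of the document to the <employees> element.
            cursor.toFirstChild();
            collect(cursor, "$this/employee/phone", "phone", found);
            collect(cursor, "$this/employee/address/zip", "zip", found);
        } finally {
            cursor.dispose();
        }
        return found;
    }

    public static void main(String[] args) {
        zipsAndPhones(SAMPLE_XML).forEach(System.out::println);
    }
}
```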
Using selections is somewhat like tracking the locations of multiple cursors with a single cursor. This becomes especially clear when you remove the XML associated with a selection. The selection stays where the removed XML was, so it now sits immediately before whatever XML followed the removed content. In other words, the XML after the removed content shifts up to fill the gap and ends up immediately after the selection's location. This is exactly what would happen if the selection were another cursor.
Finally, when using selections keep in mind that the list of selections is in a sense "live": the cursor you're working with keeps track of every selection in the list. For that reason, be sure to call the clearSelections method when you're finished with the selections, just as you should call the XmlCursor.dispose() method when you're finished using the cursor.
Using XQuery with the execQuery Method
You use the execQuery method to execute XQuery expressions. With XQuery expressions, XML returned is a copy of XML in the document queried against. In other words, changes your code makes to the values returned by execQuery are not reflected in the document queried against.
Calling XmlObject.execQuery
As with selectPath, calling execQuery from an XmlObject instance will return an XmlObject array.
The following example retrieves work <zip> elements from the incoming XML, adding the elements as children to a new <zip-list> element.
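The original listing is not available, so the following is a hedged sketch of such a query. The sample XML, element names, and class name are assumptions, and the code assumes that an XQuery engine supported by your XMLBeans release (for example Saxon) is on the classpath.

```java
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

class WorkZipList {
    // Hypothetical sample data with one home and one work address.
    static final String SAMPLE_XML =
        "<employees>"
        + "<employee>"
        + "<name>Fred Jones</name>"
        + "<address location='home'><zip>98115</zip></address>"
        + "<address location='work'><zip>98033</zip></address>"
        + "</employee>"
        + "</employees>";

    // XQuery that wraps every work zip element in a new zip-list element.
    static final String QUERY =
        "<zip-list>{ "
        + "$this/employees/employee/address[@location='work']/zip"
        + " }</zip-list>";

    // Returns the XML text of the first query result, or "" if there is none.
    static String workZips(String xml) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        // The results are copies; editing them leaves doc untouched.
        XmlObject[] results = doc.execQuery(QUERY);
        return results.length > 0 ? results[0].xmlText() : "";
    }

    public static void main(String[] args) {
        System.out.println(workZips(SAMPLE_XML));
    }
}
```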
Calling XmlCursor.execQuery
Unlike the selectPath method called from a cursor, the execQuery method doesn't return void. Instead it returns an XmlCursor instance positioned at the beginning of a new XML document representing the query results. Rather than accessing results as selections, you use the cursor to move through the results in typical cursor fashion (for more information, see Navigating XML with Cursors). The two models are very different: selections track locations in the document you queried, whereas query results live in a separate document.
As always, you can cast the results to a type generated from schema if you know that the results conform to that type.
The following example retrieves work <phone> elements from the incoming XML, then changes the number in the results.
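Since the original listing is missing, here is a minimal sketch under stated assumptions: the sample XML, the replacement number, and the class and method names are all made up, and an XQuery engine supported by your XMLBeans release (for example Saxon) is assumed to be on the classpath. It changes the number in the results and returns both documents so you can confirm that the original is unchanged.

```java
import org.apache.xmlbeans.XmlCursor;
import org.apache.xmlbeans.XmlException;
import org.apache.xmlbeans.XmlObject;

class ChangeWorkPhoneCopy {
    // Hypothetical sample data; element and attribute names are assumptions.
    static final String SAMPLE_XML =
        "<employees>"
        + "<employee>"
        + "<phone location='work'>(425)555-5665</phone>"
        + "<phone location='home'>(206)555-5555</phone>"
        + "</employee>"
        + "</employees>";

    static final String QUERY =
        "$this/employees/employee/phone[@location='work']";

    // Returns { original document text, modified result text }.
    static String[] changeCopy(String xml, String newNumber) {
        XmlObject doc;
        try {
            doc = XmlObject.Factory.parse(xml);
        } catch (XmlException e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
        XmlCursor docCursor = doc.newCursor();
        XmlCursor resultCursor = null;
        try {
            // The returned cursor starts at the beginning of a new document.
            resultCursor = docCursor.execQuery(QUERY);
            String resultText = "";
            // Move to the first phone element in the results and edit it.
            if (resultCursor.toFirstChild()) {
                resultCursor.setTextValue(newNumber);
                resultText = resultCursor.xmlText();
            }
            return new String[] { doc.xmlText(), resultText };
        } finally {
            if (resultCursor != null) {
                resultCursor.dispose();
            }
            docCursor.dispose();
        }
    }

    public static void main(String[] args) {
        String[] texts = changeCopy(SAMPLE_XML, "(206)555-1234");
        System.out.println("Original: " + texts[0]);
        System.out.println("Results:  " + texts[1]);
    }
}
```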