Getting Started with XMLBeans

XMLBeans provides intuitive ways to handle XML that make it easier for you to access and manipulate XML data and documents in Java.

Characteristics of the XMLBeans approach to XML:

The starting point for XMLBeans is XML schema. A schema (contained in an XSD file) is an XML document that defines a set of rules to which other XML documents must conform. The XML Schema specification provides a rich data model that allows you to express sophisticated structure and constraints on your data. For example, an XML schema can enforce control over how data is ordered in a document, or constraints on particular values (for example, a birth date that must be later than 1900). Unfortunately, the ability to enforce rules like this is typically not available in Java without writing custom code. XMLBeans honors schema constraints.

Note: Where an XML schema defines rules for an XML document, an XML instance is an XML document that conforms to the schema.
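To make the idea of a schema-enforced constraint concrete, here is a minimal sketch that uses the JDK's own javax.xml.validation API rather than XMLBeans. The inline schema, the birth-date element, and the BirthDateCheck class are assumptions invented for this illustration; the schema accepts only dates on or after 1900-01-01.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class BirthDateCheck {
    // Hypothetical schema: a birth-date element restricted to 1900-01-01 or later.
    static final String XSD =
          "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "  <xs:element name='birth-date'>"
        + "    <xs:simpleType>"
        + "      <xs:restriction base='xs:date'>"
        + "        <xs:minInclusive value='1900-01-01'/>"
        + "      </xs:restriction>"
        + "    </xs:simpleType>"
        + "  </xs:element>"
        + "</xs:schema>";

    // Returns true when the instance document satisfies the schema above.
    static boolean isValid(String xml) {
        try {
            SchemaFactory factory =
                SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // The default error handler reports a schema violation
            // (or malformed XML) by throwing.
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<birth-date>1985-04-12</birth-date>"));
        System.out.println(isValid("<birth-date>1850-04-12</birth-date>"));
    }
}
```

The second document is well-formed XML, yet it is rejected because the value falls outside the range the schema allows. No custom date-checking code is involved.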

You compile a schema (XSD) file to generate a set of Java interfaces that mirror the schema. With these types, you process XML instance documents that conform to the schema. You bind an XML instance document to these types; changes made through the Java interface change the underlying XML representation.

Previous options for handling XML include using XML programming interfaces (such as DOM or SAX) or an XML marshalling/binding tool (such as JAXB). Because a DOM-oriented model lacks strong schema-oriented typing, navigating it is more tedious and requires an understanding of the complete object model. JAXB supports the XML schema specification, but handles only a subset of it; XMLBeans supports all of it. Also, by storing the data in memory as XML, XMLBeans is able to reduce the overhead of marshalling and unmarshalling.
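As a rough illustration of what DOM-oriented navigation involves, the following sketch uses only the JDK's DOM parser to add up quantity values in a small order document. The DomQuantityTotal class, its SAMPLE document, and the totalQuantity method are assumptions made up for this example; they are not part of XMLBeans.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomQuantityTotal {
    static final String NS = "http://openuri.org/easypo";

    // Hypothetical, abbreviated instance document used only by this sketch.
    static final String SAMPLE =
          "<po:purchase-order xmlns:po='" + NS + "'>"
        + "  <po:line-item><po:quantity>2</po:quantity></po:line-item>"
        + "  <po:line-item><po:quantity>2</po:quantity></po:line-item>"
        + "</po:purchase-order>";

    // Sums the quantity of every line-item using generic DOM calls.
    static int totalQuantity(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            NodeList items = doc.getElementsByTagNameNS(NS, "line-item");
            int total = 0;
            for (int i = 0; i < items.getLength(); i++) {
                Element item = (Element) items.item(i);
                Element quantity =
                    (Element) item.getElementsByTagNameNS(NS, "quantity").item(0);
                // DOM hands back text; converting it to a number is up to the caller.
                total += Integer.parseInt(quantity.getTextContent().trim());
            }
            return total;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Total items: " + totalQuantity(SAMPLE));
    }
}
```

Every element is looked up by string name and every value is converted by hand, so a misspelled element name or an unexpected type surfaces only at run time.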

Accessing XML Using Its Schema

To get a glimpse of the kinds of things you can do with XMLBeans, take a look at an example using XML for a purchase order. The purchase order XML contains data exchanged by two parties, such as two companies. Both parties need to be able to rely on a consistent message shape, and a schema specifies the common ground.

Here's what a purchase order XML instance might look like.

<po:purchase-order xmlns:po="http://openuri.org/easypo">
    <po:customer>
        <po:name>Gladys Kravitz</po:name>
        <po:address>Anytown, PA</po:address>
    </po:customer>
    <po:date>2003-01-07T14:16:00-05:00</po:date>
    <po:line-item>
        <po:description>Burnham's Celestial Handbook, Vol 1</po:description>
        <po:per-unit-ounces>5</po:per-unit-ounces>
        <po:price>21.79</po:price>
        <po:quantity>2</po:quantity>
    </po:line-item>
    <po:line-item>
        <po:description>Burnham's Celestial Handbook, Vol 2</po:description>
        <po:per-unit-ounces>5</po:per-unit-ounces>
        <po:price>19.89</po:price>
        <po:quantity>2</po:quantity>
    </po:line-item>
    <po:shipper>
        <po:name>ZipShip</po:name>
        <po:per-ounce-rate>0.74</po:per-ounce-rate>
    </po:shipper>
</po:purchase-order>

This XML includes a root element, purchase-order, that has four kinds of child elements: customer, date, line-item, and shipper. An intuitive, object-based view of this XML would provide an object representing the purchase-order element, and it would have methods for getting the date and for getting subordinate objects for customer, line-item, and shipper elements. Each of the last three would have its own methods for getting the data inside it as well.

Looking at the Schema

The following XML is the schema for the preceding purchase order XML. It defines the XML's "shape": what its elements are, what order they appear in, which are children of which, and so on.

<xs:schema targetNamespace="http://openuri.org/easypo"
    xmlns:po="http://openuri.org/easypo"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    elementFormDefault="qualified">

    <xs:element name="purchase-order">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="customer" type="po:customer"/>
                <xs:element name="date" type="xs:dateTime"/>
                <xs:element name="line-item" type="po:line-item" minOccurs="0" maxOccurs="unbounded"/>
                <xs:element name="shipper" type="po:shipper" minOccurs="0"/>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
    <xs:complexType name="customer">
        <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="address" type="xs:string"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="line-item">
        <xs:sequence>
            <xs:element name="description" type="xs:string"/>
            <xs:element name="per-unit-ounces" type="xs:decimal"/>
            <xs:element name="price" type="xs:double"/>
            <xs:element name="quantity" type="xs:int"/>
        </xs:sequence>
    </xs:complexType>
    <xs:complexType name="shipper">
        <xs:sequence>
            <xs:element name="name" type="xs:string"/>
            <xs:element name="per-ounce-rate" type="xs:decimal"/>
        </xs:sequence>
    </xs:complexType>
</xs:schema>

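This schema describes the purchase order XML instance by defining the following: a global purchase-order element; three complex types (customer, line-item, and shipper) that give its child elements their structure; and, within those types, child elements based on built-in simple types such as xs:string, xs:dateTime, xs:decimal, xs:double, and xs:int.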

In other words, the schema defines types for the child elements and describes their position as subordinate to the root element, purchase-order.

When you run the XMLBeans schema compiler on an XSD file such as this one, it produces a JAR file containing the interfaces generated from the schema.

Writing Java Code That Uses the Interfaces

With the XMLBeans interfaces in your application, you can write code that uses the new types to handle XML based on the schema. Here's an example that extracts information about each of the ordered items in the purchase order XML, counts the items, and calculates a total of their prices. In particular, look at the use of types generated from the schema and imported as part of the org.openuri.easypo package.

The printItems method receives a File object containing the purchase order XML file.

package docs.xmlbeans;

import java.io.File;
import org.apache.xmlbeans.*;
import org.openuri.easypo.PurchaseOrderDocument;
import org.openuri.easypo.PurchaseOrderDocument.PurchaseOrder;
import org.openuri.easypo.LineItem;

public class POHandler
{
    public static void printItems(File po) throws Exception
    {
        /*
         * All XMLBeans schema types provide a nested Factory class you can
         * use to bind XML to the type, or to create new instances of the type.
         * Note that a "Document" type such as this one is an XMLBeans
         * construct for representing a global element. It provides a way
         * for you to get and set the contents of the entire element.
         *
         * Also, note that the parse method will only succeed if the
         * XML you're parsing appears to conform to the schema.
         */
        PurchaseOrderDocument poDoc =
            PurchaseOrderDocument.Factory.parse(po);

        /*
         * The PurchaseOrder type represents the purchase-order element's
         * complex type.
         */
        PurchaseOrder purchaseOrder = poDoc.getPurchaseOrder();

        /*
         * When an element may occur more than once as a child element,
         * the schema compiler will generate methods that refer to an
         * array of that element. The line-item element is defined with
         * a maxOccurs attribute value of "unbounded", meaning that
         * it may occur as many times in an instance document as needed.
         * So there are methods such as getLineItemArray and setLineItemArray.
         */
        LineItem[] lineitems = purchaseOrder.getLineItemArray();
        System.out.println("Purchase order has " + lineitems.length + " line items.");

        double totalAmount = 0.0;
        int numberOfItems = 0;

        /*
         * Loop through the line-item elements, using generated accessors to
         * get values for child elements such as description, quantity, and
         * price.
         */
        for (int j = 0; j < lineitems.length; j++)
        {
            System.out.println(" Line item: " + j);
            System.out.println(
                "   Description: " + lineitems[j].getDescription());
            System.out.println("   Quantity: " + lineitems[j].getQuantity());
            System.out.println("   Price: " + lineitems[j].getPrice());
            numberOfItems += lineitems[j].getQuantity();
            totalAmount += lineitems[j].getPrice() * lineitems[j].getQuantity();
        }
        System.out.println("Total items: " + numberOfItems);
        System.out.println("Total amount: " + totalAmount);
    }
}

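Notice that types generated from the schema reflect what's in the XML: PurchaseOrderDocument represents the global purchase-order element, getPurchaseOrder returns the PurchaseOrder type that holds its child elements, getLineItemArray returns a LineItem array containing the line-item elements, and accessors such as getDescription, getQuantity, and getPrice return the values of a line item's child elements.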

Capitalization and punctuation for generated type names follow Java convention. Also, while this example parses the XML from a file, other parse methods support a Java InputStream object, a Reader object, and so on.

The preceding Java code prints the following to the console:

Purchase order has 2 line items.
 Line item: 0
   Description: Burnham's Celestial Handbook, Vol 1
   Quantity: 2
   Price: 21.79
 Line item: 1
   Description: Burnham's Celestial Handbook, Vol 2
   Quantity: 2
   Price: 19.89
Total items: 4
Total amount: 83.36

Creating New XML Instances from Schema

As you've seen, XMLBeans provides a "factory" class you can use to create new instances. The following example creates a new purchase-order element and adds a customer child element. It then inserts name and address child elements, creating the elements and setting their values with a single call to their set methods.

public PurchaseOrderDocument createPO()
{
    PurchaseOrderDocument newPODoc = PurchaseOrderDocument.Factory.newInstance();
    PurchaseOrder newPO = newPODoc.addNewPurchaseOrder();
    Customer newCustomer = newPO.addNewCustomer();
    newCustomer.setName("Doris Kravitz");
    newCustomer.setAddress("Bellflower, CA");
    return newPODoc;
}

The following is the XML that results. Note that XMLBeans assigns the correct namespace based on the schema, using an "ns1" (or, "namespace 1") prefix. For practical purposes, the prefix itself doesn't really matter — it's the namespace URI (http://openuri.org/easypo) that defines the namespace. The prefix is merely a marker that represents it.

<ns1:purchase-order xmlns:ns1="http://openuri.org/easypo">
    <ns1:customer>
        <ns1:name>Doris Kravitz</ns1:name>
        <ns1:address>Bellflower, CA</ns1:address>
    </ns1:customer>
</ns1:purchase-order>
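To see why the prefix doesn't matter, consider this small sketch built on the JDK's namespace-aware DOM parser (not on XMLBeans). The PrefixDemo class and its qualifiedRootName helper are assumptions made up for the illustration; the helper reports a root element as its namespace URI plus local name and ignores the prefix.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class PrefixDemo {
    // Returns "{namespaceURI}localName" for the root element of the given XML.
    static String qualifiedRootName(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Element root = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return "{" + root.getNamespaceURI() + "}" + root.getLocalName();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not read the XML", e);
        }
    }

    public static void main(String[] args) {
        String withNs1 =
            "<ns1:purchase-order xmlns:ns1='http://openuri.org/easypo'/>";
        String withPo =
            "<po:purchase-order xmlns:po='http://openuri.org/easypo'/>";

        // Different prefixes, same namespace URI and local name.
        System.out.println(qualifiedRootName(withNs1));
        System.out.println(qualifiedRootName(withNs1).equals(qualifiedRootName(withPo)));
    }
}
```

Both documents name the same element as far as a namespace-aware processor is concerned, which is why a document written with the "ns1" prefix is interchangeable with one written with "po".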

Note that all types (including those generated from schema) inherit from XmlObject, and so provide a Factory class. For an overview of the type system in which XmlObject fits, see XMLBeans Support for Built-In Schema Types. For reference information, see XmlObject Interface.

XMLBeans Hierarchy

The generated types you saw used in the preceding example are actually part of a hierarchy of XMLBeans types. This hierarchy is one of the ways in which XMLBeans presents an intuitive view of schema. At the top of the hierarchy is XmlObject, the base interface for XMLBeans types. Beneath this level, there are two main type categories: generated types that represent user-derived schema types, and included types that represent built-in schema types.

This topic has already introduced generated types. For more information, see Java Types Generated from User-Derived Schema Types.

Built-In Type Support

In addition to types generated from a given schema, XMLBeans provides 46 Java types that mirror the 46 built-in types defined by the XML schema specification. Where schema defines xs:string, xs:decimal, and xs:int, for example, XMLBeans provides XmlString, XmlDecimal, and XmlInt. Each of these also inherits from XmlObject, which corresponds to the built-in schema type xs:anyType.

XMLBeans provides a way for you to handle XML data as these built-in types. Where your schema includes an element whose type is, for example, xs:int, XMLBeans will provide a generated method designed to return an XmlInt. In addition, as you saw in the preceding example, for most types there will also be a method that returns a natural Java type such as int. The following two lines of code return the quantity element's value, but return it as different types.

// Methods that return simple types begin with an "x".
XmlInt xmlQuantity = lineitems[j].xgetQuantity();
// Methods that return a natural Java type are unadorned.
int javaQuantity = lineitems[j].getQuantity();

In a sense, both get methods navigate to the quantity element; the getQuantity method goes a step further and converts the element's value to the most appropriate natural Java type before returning it. (XMLBeans also provides a means for validating the XML as you work with it.)

If you know a bit about XML schema, XMLBeans types should seem fairly intuitive. If you don't, you'll learn a lot by experimenting with XMLBeans using your own schemas and XML instances based on them.

For more information on the methods of types generated from schema, see Methods for Types Generated From Schema. For more about how XMLBeans represents built-in schema types, see XMLBeans Support for Built-In Schema Types.

Using XQuery Expressions

With XMLBeans you can use XQuery to query XML for specific pieces of data. XQuery is sometimes referred to as "SQL for XML" because it provides a mechanism to access data directly from XML documents, much as SQL provides a mechanism for accessing data in traditional databases.

XQuery borrows some of its syntax from XPath, a syntax for specifying nested data in XML. The following example returns all of the line-item elements whose price child elements have values less than or equal to 20.00:

PurchaseOrderDocument doc = PurchaseOrderDocument.Factory.parse(po);

/*
 * The XQuery expression is the following two strings combined. They're
 * declared separately here for convenience. The first string declares
 * the namespace prefix that's used in the query expression; the second
 * declares the expression itself.
 */
String nsText = "declare namespace po = 'http://openuri.org/easypo'; ";
String pathText = "$this/po:purchase-order/po:line-item[po:price <= 20.00]";
String queryText = nsText + pathText;

XmlCursor itemCursor = doc.newCursor().execQuery(queryText);
System.out.println(itemCursor.xmlText());

This code creates a new cursor at the start of the document. From there, it uses the XmlCursor interface's execQuery method to execute the query expression. In this example, the method's parameter is an XQuery expression that simply says, "From my current location, navigate through the purchase-order element and retrieve those line-item elements whose price child element has a value less than or equal to 20.00." The $this variable means "the current position."
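The path portion of that query is ordinary XPath, so you can try the same predicate without XMLBeans. The sketch below is an assumption-laden stand-in that uses the JDK's javax.xml.xpath API in place of execQuery; the CheapItems class, its abbreviated SAMPLE document, and the descriptionsAtOrBelow20 method are hypothetical names introduced for the illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class CheapItems {
    static final String NS = "http://openuri.org/easypo";

    // Hypothetical, abbreviated purchase order used only by this sketch.
    static final String SAMPLE =
          "<po:purchase-order xmlns:po='" + NS + "'>"
        + "<po:line-item><po:description>Vol 1</po:description>"
        + "<po:price>21.79</po:price></po:line-item>"
        + "<po:line-item><po:description>Vol 2</po:description>"
        + "<po:price>19.89</po:price></po:line-item>"
        + "</po:purchase-order>";

    // Returns the description of each line-item priced at 20.00 or less.
    static List<String> descriptionsAtOrBelow20(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Plays the role of the "declare namespace" line in the XQuery text.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "po".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return NS.equals(uri) ? "po" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                        ? Collections.singletonList("po").iterator()
                        : Collections.<String>emptyIterator();
                }
            });

            NodeList hits = (NodeList) xpath.evaluate(
                "/po:purchase-order/po:line-item[po:price <= 20.00]",
                doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                Element item = (Element) hits.item(i);
                result.add(item.getElementsByTagNameNS(NS, "description")
                    .item(0).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not query the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(descriptionsAtOrBelow20(SAMPLE));
    }
}
```

Only the second line item passes the predicate, because its price (19.89) is the only one at or below 20.00.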

For more information about XQuery, see XQuery 1.0: An XML Query Language at the W3C web site.

Using XML Cursors

In the preceding example you may have noticed the XmlCursor interface. In addition to providing a way to execute XQuery expressions, an XML cursor offers a fine-grained model for manipulating data. The XML cursor API, analogous to the DOM's object API, is simply a way to point at a particular piece of data. So, just as a cursor helps you navigate through a word processing document, the XML cursor defines a location in XML where you can perform actions on the selected XML.

Cursors are ideal for moving through an XML document when there's no schema available. Once you've got the cursor at the location you're interested in, you can perform a variety of operations with it. For example, you can set and get values, insert and remove fragments of XML, copy fragments of XML to other parts of the document, and make other fine-grained changes to the XML document.

The following example uses an XML cursor to navigate to the customer element's name child element.

PurchaseOrderDocument doc =
    PurchaseOrderDocument.Factory.parse(po);

XmlCursor cursor = doc.newCursor();
cursor.toFirstContentToken();
cursor.toFirstChildElement();
cursor.toFirstChildElement();
System.out.println(cursor.getTextValue());

cursor.dispose();

What's happening here? As with the earlier example, the code loads the XML from a File object. After loading the document, the code creates a cursor at its beginning. Moving the cursor a few times takes it to the nested name element. Once there, the getTextValue method retrieves the element's value.
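If you'd like to experiment with the cursor idea before setting up XMLBeans, the JDK's StAX XMLStreamReader is a loosely comparable cursor-style API. Unlike XmlCursor, it only moves forward and cannot modify the document, so treat the following as a sketch of the concept rather than an equivalent. The StaxCursorDemo class, its abbreviated SAMPLE document, and the firstNameElementText method are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxCursorDemo {
    // Hypothetical, abbreviated purchase order used only by this sketch.
    static final String SAMPLE =
          "<po:purchase-order xmlns:po='http://openuri.org/easypo'>"
        + "<po:customer>"
        + "<po:name>Gladys Kravitz</po:name>"
        + "<po:address>Anytown, PA</po:address>"
        + "</po:customer>"
        + "</po:purchase-order>";

    // Advances the reader to the first element whose local name is "name"
    // and returns its text, or null if the document has no such element.
    static String firstNameElementText(String xml) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT
                            && "name".equals(reader.getLocalName())) {
                        // The reader now points at the element; read its content.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read the XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNameElementText(SAMPLE));
    }
}
```

In this document the customer element comes first, so the first name element the reader reaches is the customer's name.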

This is just an introduction to XML cursors. For more information about using cursors, see Navigating XML with Cursors.

Where to Go Next

Note: The xbean.jar file that contains the XMLBeans library is fully functional as a standalone library.

Related Topics

XMLBeans Samples