The Complete Magazine on Open Source

Phase two delivered! Thanks to Apache XMLBeans


In this article, the author shares his experience about how the use of Apache XMLBeans helped him to overcome many development issues and deliver his project without hiccups.

Starting a new project is usually a thrilling experience as well as a platform for learning. But when the bugs start rolling in, the enthusiasm wanes pretty quickly. I had decided to join a project in my organisation, and was designated to take over as the project manager. As a run-up to joining the team, I was informed that the project was in Phase One of development and would be delivered shortly for testing. In fact, the Phase One code was delivered for testing just a few days after I joined the team. But the delivered code had a lot of bugs, which resulted in an escalation to senior management. Though I was new to the team, as a senior member I had to take responsibility for the execution of the project and ensure its smooth delivery.

Phase One of the project was supposed to be straightforward, but the task became complicated when the team decided to implement a custom XML parser. Today, the task of reading an XML file appears simple. We only have to locate the necessary libraries, and use them to read the input XML. But at that time (around 2006), when few libraries were available, the team members had decided to implement XML parsing on their own. This decision turned into the monster that came back to bite us.

The code developed by the team had bugs, due to which the input XML was not being read properly. We spent quite a few long days in the office trying to resolve the issues and get the project back on track. Finally, a few days and many frayed tempers later, we managed to complete the task of reading the XML file, followed by validation of the business logic (which, interestingly, had fewer bugs), and delivered Phase One for integration testing.

Apache XMLBeans
Based on the experience of delivering Phase One, I knew that I had a tough task on my hands. The reason for this anxiety was the increased complexity of the input XML in Phase Two of the project. To try and avoid the difficulties faced during Phase One, I discussed this problem with a few colleagues, and one of them suggested that I look at Apache XMLBeans. After an initial exploration of XMLBeans and trying it on a sample XML, I took it for a spin on one of the XMLs of Phase Two. I was pleasantly surprised by the ease with which I was able to parse the input XML, without any issues. Based on this experience, I trained the team to use the library for delivering Phase Two of the project.

Generating the JAR
Apache XMLBeans is a Java-based library that allows us to read XML documents. To use the library, we need to generate a JAR file of Java classes that correspond to the schema (XSD) of the input XML. The JAR file can be created from the XSD using the ‘scomp’ utility provided by the library. In case the XSD is not available, it is possible to create one from a sample input XML, either by using one of the many XML tools like XMLSpy or by using an online application like XMLGrid.

The command used to generate the JAR from the XSD is given below (the path to javac may be different on your system):

scomp -compiler "c:\jdk1.8.0_92\bin\javac.exe" catalog.xsd -out book.jar

On successful execution, the command creates book.jar, as specified by the -out parameter. To parse XML files that conform to the specification given in catalog.xsd, we will need to create a Java project and include the generated JAR file as a supporting library. Please note that we will also need to include the XMLBeans libraries in the project.

Input XML
Before taking a look at the sample application, let us look at a snippet of the input XML. For brevity, the XML is shown with a single record:

<?xml version="1.0"?>
<catalog>
   <book id="bk101">
      <author>Gambardella, Matthew</author>
      <title>XML Developer's Guide</title>
      <description>An in-depth look at creating applications with XML.</description>
   </book>
</catalog>
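
The article does not reproduce catalog.xsd, so here is a minimal sketch of what such a schema might look like for the snippet above, together with a check that a document conforms to it. The schema text, the class name CatalogSchemaCheck and the method conforms are assumptions made for illustration; the check uses only the JDK's javax.xml.validation API, not XMLBeans. The nested anonymous types are a guess that happens to be consistent with the nested class names (CatalogDocument.Catalog.Book) used in the sample application.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

public class CatalogSchemaCheck {
  // Hypothetical stand-in for catalog.xsd (an assumption, not the author's schema):
  // a catalog holds one or more books, each with an id attribute and three elements.
  static final String XSD =
      "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
    + "<xs:element name='catalog'><xs:complexType><xs:sequence>"
    + "<xs:element name='book' maxOccurs='unbounded'><xs:complexType><xs:sequence>"
    + "<xs:element name='author' type='xs:string'/>"
    + "<xs:element name='title' type='xs:string'/>"
    + "<xs:element name='description' type='xs:string'/>"
    + "</xs:sequence><xs:attribute name='id' type='xs:string'/>"
    + "</xs:complexType></xs:element>"
    + "</xs:sequence></xs:complexType></xs:element>"
    + "</xs:schema>";

  // Returns true if the XML text is well-formed and valid against XSD.
  static boolean conforms(String xml) {
    try {
      Schema schema = SchemaFactory
          .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
          .newSchema(new StreamSource(new StringReader(XSD)));
      schema.newValidator().validate(new StreamSource(new StringReader(xml)));
      return true;
    } catch (SAXException | IOException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    String valid =
        "<catalog><book id='bk101'>"
      + "<author>Gambardella, Matthew</author>"
      + "<title>XML Developer's Guide</title>"
      + "<description>An in-depth look at creating applications with XML.</description>"
      + "</book></catalog>";
    // Same record with the title element left out.
    String missingTitle =
        "<catalog><book id='bk101'>"
      + "<author>Gambardella, Matthew</author>"
      + "<description>An in-depth look at creating applications with XML.</description>"
      + "</book></catalog>";
    System.out.println(conforms(valid));        // prints true
    System.out.println(conforms(missingTitle)); // prints false
  }
}
```

A check like this is a quick way to confirm that a sample file and the XSD agree before generating the JAR with scomp.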

Sample application
Let us now look at an application to read the sample XML. For the sake of brevity, not all attributes of the book are listed or added.

While creating the JAR file from the XSD, XMLBeans encapsulates the root XML element inside a ‘document’ object. In this case, the name of the object is ‘CatalogDocument’. We need to load this object using the factory class. Once the object has been loaded, we can navigate and manipulate it as we would a typical Java object. An annotated sample for displaying elements from the input XML, as well as adding one record, is shown below.

import java.io.File;
import java.io.IOException;
import org.apache.xmlbeans.XmlException;
import noNamespace.CatalogDocument;
import noNamespace.CatalogDocument.Catalog;
import noNamespace.CatalogDocument.Catalog.Book;

public class BookSample {
  // args[0]: input XML file, args[1]: output XML file
  public static void main(String[] args) {
    CatalogDocument bookDoc;
    try {
      // read the XML using the factory class
      bookDoc = CatalogDocument.Factory.parse(new File(args[0]));
      // get the catalog element
      Catalog catalog = bookDoc.getCatalog();
      // get books from the catalog
      Book[] books = catalog.getBookArray();
      // iterate over the books
      for (Book bk : books) {
        System.out.println("Id: " + bk.getId() + ", author: " + bk.getAuthor() + ", title: " + bk.getTitle());
      }
      // create a new book by adding it to the catalog
      Book newBk = catalog.addNewBook();
      // populate the fields (the values here are illustrative placeholders)
      newBk.setId("bk-new");
      newBk.setAuthor("Sample Author");
      newBk.setTitle("Sample Title");
      // save the updated catalog
      bookDoc.save(new File(args[1]));
    } catch (XmlException | IOException e) {
      // report parsing or file access problems
      e.printStackTrace();
    }
  }
}

You may have noted that the class CatalogDocument belongs to the Java package noNamespace. As the input XML does not use an explicit XML namespace, XMLBeans places the generated classes in the default package, noNamespace. If the input XML declares a namespace, XMLBeans will name the Java packages accordingly.
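
To give a feel for what 'accordingly' means, here is a rough sketch of the kind of mapping involved. It is a simplified approximation written for this article, not the XMLBeans implementation: the class name PackageNameSketch and the method packageFor are hypothetical, and the library's real rules cover more cases (non-HTTP URIs, characters that are not legal in Java identifiers, and so on). The one behaviour it aims to reproduce is the one seen in the XMLBeans tutorial schema, where the namespace http://openuri.org/easypo ends up in the package org.openuri.easypo.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PackageNameSketch {
  // Simplified approximation only; handles empty and http-style namespace URIs.
  static String packageFor(String namespaceUri) {
    if (namespaceUri == null || namespaceUri.isEmpty()) {
      // no namespace in the schema: the default package seen in the sample
      return "noNamespace";
    }
    URI uri = URI.create(namespaceUri);
    List<String> parts = new ArrayList<>();
    if (uri.getHost() != null) {
      // reverse the host name: openuri.org becomes org.openuri
      List<String> host = new ArrayList<>(Arrays.asList(uri.getHost().split("\\.")));
      Collections.reverse(host);
      parts.addAll(host);
    }
    if (uri.getPath() != null) {
      // append each non-empty path segment
      for (String segment : uri.getPath().split("/")) {
        if (!segment.isEmpty()) {
          parts.add(segment);
        }
      }
    }
    return String.join(".", parts);
  }

  public static void main(String[] args) {
    System.out.println(packageFor(""));                          // prints noNamespace
    System.out.println(packageFor("http://openuri.org/easypo")); // prints org.openuri.easypo
  }
}
```

In practice, the simplest way to find the package is to look inside the generated JAR, or at the import suggestions offered by the IDE, rather than to predict it by hand.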

By using Apache XMLBeans, parsing the input XML became a breeze. After generating the XSD from the sample XMLs and the JAR from the XSD, reading the input XML was simple, and we could concentrate on fixing bugs in the business logic. In other words, once we overcame the input XML hurdle, it was business as usual for a software development project. Parsing XML, an area of intense headaches in Phase One, was so easy in Phase Two that even 10 years later, I have continued to use XMLBeans in most projects.