I have been working on Java with XML since 2002. It used to be quite confusing and complex to get the value of a tag for example as it required to get into
Node level. Here’s an example:
<books> <book> <title>Book1</title> <prod id="33-657" media="paper"></prod> <chapter name="Introduction to XML"> <para>What is HTML</para> <para>What is XML</para> </chapter> <chapter name="XML Syntax"> <para>Elements must have a closing tag</para> <para>Elements must be properly nested</para> </chapter> </book> <book> <title>Book2</title> ... ... </book> </books>
From functional point of view, I may be interested to get book names. So instead of going into familiar low-level
Node route, here’s what I prefer:
List<String> bookNames = DomEditor.getTagValues("title", document);
If you see the underlying implementation of
DomEditor, you will realize that you were actually not interested in its implementation as it seems to be complex enough. We created these utility classes a long time back and because of lack of similar APIs I kept on copying these APIs in different projects. As still I couldn’t find APIs in
commons-xml the way I could for
commons-lang, I thought it will be a good idea to create a library just for that.
Some more APIs Examples
For getting the first Element with tagname “chapter” of the XML, we can use following API.
Element elmt = DomEditor.getElement("chapter", doc);
As this XML contains more than one elements for “chapter” tag we can get the array of these elements too if needed using following API.
List<Element> elements = DomEditor.getElements("chapter", doc);
Within “chapter” element we have multiple “para” elements which we can retrieve using
getElements(String, Element) method by passing elmt element.
List<Element> paraElements = DomEditor.getElements("para", elmt);
We can directly retrieve the value of all “para” tags using
DomEditor.getTagValues(Element element, String tagName) method.
List<String> paraTagValues = DomEditor.getTagValues(elmt, "para");
DomEditor also provides APIs for some main XML operations on DOM:
- deleting a tag from a XML Element
- checking whether tag exists in DOM object
- inserting a new tag in the DOM object
insertNewTag(Document, String, String)
- getting a value of a node based on XPath provided in “.” separated String. So in above XML using
DomEditor.getNodeValue("title.prod.para", elmt)will give “What is HTML” as resultant.
Useful for printing formatted XML to a file or to console.
Direct APIs for parsing a file, String, InputStream etc.