By Jimmy Zhang
February 20, 2008 02:15 PM EST
Application Scenarios
There are at least two ways to make sense of VTD+XML as a practical solution to real problems. The first is the traditional view of native XML indexing. Alternatively, you can think of VTD+XML as a binary data format that is backward-compatible with XML.
Native XML Indexing
In this view, you simply use VTD+XML as the basis for native XML data stores that serve the back-end data needs of XML/SOA applications. By saving a VTD+XML file as a BLOB (Binary Large OBject) in a more traditional database table, you gain additional capabilities such as concurrency, data integrity, and replication. Because it avoids the awkward shredding-based mapping of XML to relational data, VTD+XML fits exceptionally well in a pure XML/SOA environment. Do you have a lot of XBRL (Extensible Business Reporting Language) documents, or big GML (Geography Markup Language) files? VTD+XML should equip you with horsepower never before available.
Binary Enhanced XML
VTD+XML also naturally extends the core capabilities of XML by boosting its processing efficiency to a whole new level. In other words, as a wire format, XML now has it all: not only is it easy to learn, human-readable, interoperable, and loosely encoded by design, but performance-wise it also leads CORBA, DCOM, and RMI by a mile. When applied to XML pipelining, VTD+XML can potentially eliminate the repetitive parsing at each stage of the pipeline, an issue that none of the existing XML pipeline specs (e.g., XProc and the XML Pipeline Definition Language) addresses.
If it takes too long to push large documents over your DOM-based ESB (Enterprise Service Bus), how does 100MB in around a single second sound?
Benchmark
This section shows quantitatively the performance gain achievable with VTD+XML. The benchmark code measures the combined latency of loading the VTD+XML index (as in VTD-XML 2.0) and evaluating an XPath expression, retrieving a specified number of nodes (the first five) from the resulting node set. The same code is also rewritten using the Xerces DOM parser with Xalan or Jaxen, both of which are popular XPath engines. The benchmark code used for the test can be downloaded here.
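To make the measurement concrete, the following is a minimal sketch of what the DOM side of such a test might look like. It is not the article's benchmark code: the class name DomLatencySketch, the method parseAndQuery, and the tiny inline document are all assumptions made for illustration, and it uses only the DOM parser and XPath engine that ship with the JDK. It times parsing and XPath evaluation together and touches at most the first five nodes of the result.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical DOM-side timing harness; an illustration, not the article's benchmark.
final class DomLatencySketch {

    // Outcome of one run: how many nodes were visited and the elapsed time.
    static final class Result {
        final int visited;
        final long nanos;

        Result(int visited, long nanos) {
            this.visited = visited;
            this.nanos = nanos;
        }
    }

    // Parses the bytes into a DOM tree, evaluates the XPath expression, and
    // visits at most maxNodes result nodes, timing all three steps as one unit.
    static Result parseAndQuery(byte[] xml, String xpath, int maxNodes) {
        try {
            long start = System.nanoTime();
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml));
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, doc, XPathConstants.NODESET);
            int visited = 0;
            for (int i = 0; i < nodes.getLength() && visited < maxNodes; i++) {
                nodes.item(i).getNodeName(); // touch the node, as a consumer would
                visited++;
            }
            return new Result(visited, System.nanoTime() - start);
        } catch (Exception e) {
            throw new IllegalStateException("parse or XPath evaluation failed", e);
        }
    }

    public static void main(String[] args) {
        // Tiny stand-in document; a real run would read one of the test files.
        String xml = "<purchaseOrder><items>"
                + "<item><comment>a</comment></item>"
                + "<item><comment>b</comment></item>"
                + "</items></purchaseOrder>";
        Result r = parseAndQuery(xml.getBytes(StandardCharsets.UTF_8), "//item/comment", 5);
        System.out.println(r.visited + " nodes in " + r.nanos + " ns");
    }
}
```

A single timed call like this is dominated by JVM warm-up, so a real comparison would repeat the call many times and average the later runs.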
Setup
The environment for the benchmark has the following setup:
- Hardware: A Sony VAIO notebook featuring a 1.7GHz Pentium M processor with 2MB of integrated cache memory, 512MB of DDR2 RAM, and a 400MHz front-side bus.
- OS/JVM setting: The notebook runs Windows XP, and the test applications are run with version 1.6.0.6-b105 of the JDK/JVM.
- XML parsers and XPath engines: The DOM code uses both Xalan (bundled in the JDK) and Jaxen over Xerces DOM (full node expansion). VTD-XML, on the other hand, uses the built-in XPath engine.
Three XML files of similar structure, but different sizes, are used for the test.
<?xml version="1.0"?>
<purchaseOrder orderDate="1999-10-20">
  <shipTo country="US">
    <name>Alice Smith</name>
    <street>123 Maple Street</street>
    <city>Mill Valley</city>
    <state>CA</state>
    <zip>90952</zip>
  </shipTo>
  <billTo country="US">
    <name> Robert Smith </name>
    <street>8 Oak Avenue</street>
    <city>Old Town</city>
    <state>PA</state>
    <zip>95819</zip>
  </billTo>
  <comment>Hurry, my lawn is going wild!</comment>
  <items>
    <item partNum="872-AA">
      <productName>Lawnmower</productName>
      <quantity></quantity>
      <USPrice>148.95</USPrice>
      <comment>Confirm this is electric</comment>
    </item>
    <item partNum="926-AA">
      <productName>Baby Monitor</productName>
      <quantity>1</quantity>
      <USPrice>39.98</USPrice>
      <shipDate>1999-05-21</shipDate>
    </item>
    ...
  </items>
</purchaseOrder>
The respective file sizes are:
- "po_small.xml": 6,780 bytes
- "po_medium.xml": 112,238 bytes
- "po_big.xml": 1,219,388 bytes
The following XPath expressions are used for the test:
- /*/*/*[position() mod 2 = 0]
- /purchaseOrder/items/item[USPrice<100]
- /*/*/*/quantity/text()
- //item/comment
- //item/comment/../quantity
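To get a feel for what each expression selects, here is a small sketch that runs all five against a trimmed copy of the sample document containing only the two items shown (the elided items are left out, so the counts say nothing about the real test files). The class XPathCountSketch and its members are hypothetical names introduced for illustration, and the code uses the XPath engine bundled with the JDK rather than VTD-XML.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: counts the nodes each test expression selects.
final class XPathCountSketch {

    // The five expressions listed in the article.
    static final String[] EXPRESSIONS = {
        "/*/*/*[position() mod 2 = 0]",
        "/purchaseOrder/items/item[USPrice<100]",
        "/*/*/*/quantity/text()",
        "//item/comment",
        "//item/comment/../quantity"
    };

    // Trimmed two-item version of the sample purchase order (an assumption:
    // the real files contain more items where the article shows "...").
    static final String SAMPLE =
          "<purchaseOrder orderDate='1999-10-20'>"
        + "<shipTo country='US'><name>Alice Smith</name><street>123 Maple Street</street>"
        + "<city>Mill Valley</city><state>CA</state><zip>90952</zip></shipTo>"
        + "<billTo country='US'><name> Robert Smith </name><street>8 Oak Avenue</street>"
        + "<city>Old Town</city><state>PA</state><zip>95819</zip></billTo>"
        + "<comment>Hurry, my lawn is going wild!</comment>"
        + "<items>"
        + "<item partNum='872-AA'><productName>Lawnmower</productName><quantity></quantity>"
        + "<USPrice>148.95</USPrice><comment>Confirm this is electric</comment></item>"
        + "<item partNum='926-AA'><productName>Baby Monitor</productName><quantity>1</quantity>"
        + "<USPrice>39.98</USPrice><shipDate>1999-05-21</shipDate></item>"
        + "</items>"
        + "</purchaseOrder>";

    // Parses the document once, then evaluates every expression against it.
    static Map<String, Integer> countMatches(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            Map<String, Integer> counts = new LinkedHashMap<>();
            for (String expr : EXPRESSIONS) {
                NodeList hits = (NodeList) xpath.evaluate(expr, doc, XPathConstants.NODESET);
                counts.put(expr, hits.getLength());
            }
            return counts;
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate expressions", e);
        }
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Integer> e : countMatches(SAMPLE).entrySet()) {
            System.out.println(e.getValue() + "  " + e.getKey());
        }
    }
}
```

On this trimmed document the first expression selects five elements (the even-positioned children of shipTo, billTo, and items), and each of the other four selects a single node. Note that the third expression skips the empty quantity element of the first item because it has no text node.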
Copyright © 2008 SYS-CON Media, Inc. — All Rights Reserved.
Syndicated stories and blog feeds, all rights reserved by the author.
Jimmy Zhang is a cofounder of XimpleWare, a provider of high-performance XML processing solutions. He has worked in the fields of electronic design automation and Voice over IP for a number of Silicon Valley high-tech companies. He holds both a BS and an MS from the Department of EECS at U.C. Berkeley.