Here are the results of my comparison of various SAX parsers and
the XML Pull Parser, a non-SAX parser. Tests were run on an AMD Athlon 1.8 GHz machine with 512 MB of RAM running Windows XP and Java 1.4.2.
The SOAP test documents were obtained from Aleksander
Slominski's site.
Summary
Below is a description of the parsers I tested. Conformance reports for some of these are available at http://cafeconleche.org/SAXTest, though that page has not yet been updated for Piccolo 1.04. The latest release brings Piccolo into compliance with almost all of the XML 1.0 rules.
Crimson 1.1.3 | The XML parser built into the JDK 1.4. Supports SAX2 extensions 1.0, DOM level 2, and JAXP (except transform).
GNU JAXP 1.0b1 w/AElfred2 | The classic AElfred parser, which now supports DOM level 2 and JAXP 1.1.
kXML 2.1.9 | One of the two popular XML Pull parsers.
Oracle V2 (9.2.0.6) | Oracle's XML parser, supporting the full suite of XML APIs.
Piccolo 1.04 | My parser, supporting SAX1, SAX2 extensions 1.0, and JAXP's SAX functionality. No DOM support.
Resin 3.0.8 XML | Caucho's XML parser, which comes with the Resin application server.
Xerces 2.6.2 | A reliable, full-featured parser with reasonable performance but large size.
XP 0.5 | James Clark's very fast, small parser supporting SAX1.
XPP3 1.1.3.4.G | The other popular XML Pull parser. It also comes with a SAX wrapper.
Testing Methodology
These performance benchmarks attempt to predict performance in
a server environment. In such an environment, the parser
will typically be instantiated once and then run many times, often
on similar documents. In such a scenario, Java's Just-In-Time
(JIT) compiler will compile the code early on, drastically improving
performance for all but the first few parse operations. For this
reason, SAXBench parses a document continuously for five seconds
before starting to measure performance.
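The warm-up-then-measure approach described above can be sketched roughly as follows. This is not the SAXBench source: the class name `WarmupBenchSketch`, the method `meanParseMillis`, and the inline sample document are hypothetical, and the code assumes a modern JDK (it uses `System.nanoTime()`, which Java 1.4.2 does not have) with whatever JAXP parser the platform provides.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch of a warm-up-then-measure loop; not the SAXBench code.
class WarmupBenchSketch {

    // Parses doc repeatedly for warmupMillis so the JIT can compile the hot
    // paths, then times `runs` further parses and returns the mean time per
    // parse in milliseconds.
    static double meanParseMillis(byte[] doc, long warmupMillis, int runs) {
        try {
            // The parser is instantiated once and reused, as in a server.
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            DefaultHandler handler = new DefaultHandler(); // ignores all events

            // Warm-up phase: results are discarded.
            long warmupNanos = warmupMillis * 1_000_000L;
            long warmupStart = System.nanoTime();
            while (System.nanoTime() - warmupStart < warmupNanos) {
                parser.parse(new ByteArrayInputStream(doc), handler);
            }

            // Measurement phase.
            long start = System.nanoTime();
            for (int i = 0; i < runs; i++) {
                parser.parse(new ByteArrayInputStream(doc), handler);
            }
            long elapsed = System.nanoTime() - start;
            return elapsed / 1_000_000.0 / runs;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        byte[] doc = "<list><item>1</item><item>2</item></list>"
                .getBytes(StandardCharsets.UTF_8);
        // Five seconds of warm-up, matching the description above.
        System.out.println(meanParseMillis(doc, 5000, 1000) + " ms per parse");
    }
}
```

Timing a fixed number of runs after the warm-up is only one possible design; a fixed measurement window would work equally well.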
Most of the test documents are also read into memory first, to
keep I/O variability out of the performance results. I have specifically
chosen to leave the documents in a byte array and let the parsers
do the character decoding rather than pre-read the documents into
a String. Some parsers have optimized character decoding over
Java's default decoding; I feel the decoding should be included
in the results, because the input source in a server environment
will often be an InputStream, which requires character decoding.
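To illustrate the difference between the two input styles discussed above, here is a minimal sketch using the standard JAXP and SAX APIs. The class `PreloadSketch` and its methods are hypothetical names, not part of SAXBench. `parseFromBytes` hands the parser raw bytes, so the parser performs the character decoding itself; `parseFromString` hands it already-decoded characters, which would leave decoding out of the measurement.

```java
import java.io.ByteArrayInputStream;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch contrasting byte input with pre-decoded String input.
class PreloadSketch {

    // Counts start tags so the two parses can be compared.
    static class ElementCounter extends DefaultHandler {
        int count;

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            count++;
        }
    }

    static int countElements(InputSource source) {
        try {
            XMLReader reader =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            ElementCounter counter = new ElementCounter();
            reader.setContentHandler(counter);
            reader.parse(source);
            return counter.count;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    // Byte stream: the parser detects the encoding and decodes the bytes.
    static int parseFromBytes(byte[] doc) {
        return countElements(new InputSource(new ByteArrayInputStream(doc)));
    }

    // Character stream: decoding already happened when the String was built.
    static int parseFromString(String doc) {
        return countElements(new InputSource(new StringReader(doc)));
    }

    public static void main(String[] args) {
        // An inline document stands in for a preloaded test file.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><a><b/><c/></a>";
        byte[] preloaded = xml.getBytes(StandardCharsets.UTF_8);
        System.out.println(parseFromBytes(preloaded));
        System.out.println(parseFromString(xml));
    }
}
```

In a real run the byte array would be filled once from the test file before the timing loop starts, so that disk access does not affect the timings.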
Downloads
Download SAXBench source
Performance Benchmarks
Piccolo 1.04 | 0.034 | 0.194 | 1.789 | 180.64
Resin 3.0.8 XML | 0.063 | 0.289 | 2.734 | 251.24
Xerces 2.6.2 | 0.073 | 0.281 | 2.328 | 237.48
kXML 2.1.9 (XmlPull) | 0.079 | 0.431 | 3.969 | 401.88
XPP3 1.1.3.4G (XmlPull) | 0.082 | 0.317 | 2.546 | 261.28
Oracle 9.2.0.6 w/NS | 0.117 | 0.344 | 2.664 | 268.12
XP 0.5 | 0.124 | 0.325 | 2.633 | 283.12
Crimson JDK 1.4 | 0.156 | 0.38 | 2.734 | 270
GNU JAXP/AElfred2 1.0b1 | 0.234 | 0.481 | 2.828 | 263.76

Piccolo 1.04 | 0.041 | 0.225 | 2.07 | 206.24
Xerces 2.6.2 | 0.082 | 0.336 | 3.07 | 291.88
kXML 2.1.9 (XmlPull) | 0.084 | 0.456 | 4.218 | 429.36
XPP3 1.1.3.4G (XmlPull) | 0.084 | 0.324 | 2.68 | 263.76
XPP3 1.1.3.4G (SAX) | 0.095 | 0.395 | 3.39 | 337.52
Resin 3.0.8 XML | 0.103 | 0.656 | 6.5 | 615
Oracle 9.2.0.6 | 0.117 | 0.344 | 2.664 | 268.12
Crimson JDK 1.4 | 0.205 | 0.639 | 5.031 | 480.64
GNU JAXP/AElfred2 1.0b1 | 0.297 | 0.767 | 5.711 | 527.52

Piccolo 1.04 | 1.711 | 160.433 | 1743.6
Oracle 9.2.0.6 w/NS | 2 | 180.2 | 1890.6
Xerces 2.6.2 | 2.274 | 203.633 | 2443.8
Crimson JDK 1.4 | 2.688 | 237.5 | 2372
GNU JAXP/AElfred2 1.0b1 | 2.688 | 221.867 | 2350
Resin 3.0.8 XML | 2.914 | 264.567 | 2690.8
XPP3 1.1.3.4G (XmlPull) | 2.977 | 272.933 | 2856.2
XP 0.5 | 3.914 | 641.667 | 17890.6
kXML 2.1.9 (XmlPull) | 4.625 | 418.733 | 4262.4

Piccolo 1.04 | 119.38
Piccolo 1.04 w/NS | 129.38
Xerces 2.6.2 | 148.44
Oracle 9.2.0.6 w/NS | 150.32
Xerces 2.6.2 w/NS | 161.56
XPP3 1.1.3.4G (XmlPull) | 169.38
GNU JAXP/AElfred2 1.0b1 | 170.3
XPP3 1.1.3.4G (XmlPull) w/NS | 175
Crimson JDK 1.4 | 176.26
Resin 3.0.8 XML | 177.18
Resin 3.0.8 XML w/NS | 180.32
GNU JAXP/AElfred2 1.0b1 w/NS | 203.44
XP 0.5 | 204.68
Crimson JDK 1.4 w/NS | 209.06
XPP3 1.1.3.4G (SAX) w/NS | 216.26
kXML 2.1.9 (XmlPull) | 270.3
kXML 2.1.9 (XmlPull) w/NS | 283.12

Piccolo 1.04 | 263.76
XP 0.5 | 267.52
Piccolo 1.04 w/NS | 269.4
Crimson JDK 1.4 | 328.76
Xerces 2.6.2 | 335.64
Xerces 2.6.2 w/NS | 343.76
GNU JAXP/AElfred2 1.0b1 | 348.12
Crimson JDK 1.4 w/NS | 366.24
GNU JAXP/AElfred2 1.0b1 w/NS | 366.24
Oracle 9.2.0.6 w/NS | 380
Resin 3.0.8 XML | 476.88
Resin 3.0.8 XML w/NS | 483.12
XPP3 1.1.3.4G (XmlPull) | 485
XPP3 1.1.3.4G (XmlPull) w/NS | 489.36
XPP3 1.1.3.4G (SAX) w/NS | 518.76
kXML 2.1.9 (XmlPull) | 794.4
kXML 2.1.9 (XmlPull) w/NS | 806.24
Parser Options
Crimson JDK 1.4 | SAX2 | ns_off
Crimson JDK 1.4 w/NS | SAX2 | ns_on
GNU JAXP/AElfred2 1.0b1 | SAX2 | ns_off
GNU JAXP/AElfred2 1.0b1 w/NS | SAX2 | ns_on
kXML 2.1.9 (XmlPull) | XmlPull | ns_off
kXML 2.1.9 (XmlPull) w/NS | XmlPull | ns_on
Oracle 9.2.0.6 w/NS | SAX2 | ns_on
Piccolo 1.04 | SAX2 | ns_off
Piccolo 1.04 w/NS | SAX2 | ns_on
Resin 3.0.8 XML | JAXP | ns_off
Resin 3.0.8 XML w/NS | JAXP | ns_on
Xerces 2.6.2 | SAX2 | ns_off
Xerces 2.6.2 w/NS | SAX2 | ns_on
XP 0.5 | SAX1 |
XPP3 1.1.3.4G (SAX) w/NS | SAX2 | ns_on
XPP3 1.1.3.4G (XmlPull) | XmlPull | ns_off
XPP3 1.1.3.4G (XmlPull) w/NS | XmlPull | ns_on
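For readers unfamiliar with the ns_on / ns_off distinction, the sketch below shows one way namespace processing can be toggled through JAXP and what changes in the reported element names. It is an assumption that this resembles how SAXBench configures each parser; the class `NamespaceOptionSketch` and its method are hypothetical, and the sample document is a made-up SOAP-style fragment.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: toggling namespace processing via JAXP.
class NamespaceOptionSketch {

    // Returns {uri, localName, qName} as reported for the root element.
    static String[] rootElementNames(byte[] doc, boolean namespaceAware) {
        final String[] names = new String[3];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(namespaceAware); // ns_on or ns_off
            factory.newSAXParser().parse(new ByteArrayInputStream(doc),
                    new DefaultHandler() {
                        @Override
                        public void startElement(String uri, String localName,
                                                 String qName, Attributes atts) {
                            if (names[2] == null) { // record the root only
                                names[0] = uri;
                                names[1] = localName;
                                names[2] = qName;
                            }
                        }
                    });
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] doc = ("<s:Envelope xmlns:s=\"http://schemas.xmlsoap.org/soap/envelope/\">"
                + "<s:Body/></s:Envelope>").getBytes(StandardCharsets.UTF_8);
        for (boolean ns : new boolean[] {false, true}) {
            String[] n = rootElementNames(doc, ns);
            System.out.println("namespaceAware=" + ns + " uri=" + n[0]
                    + " localName=" + n[1] + " qName=" + n[2]);
        }
    }
}
```

With namespace processing on, the parser must resolve each prefix to its URI and split qualified names, which is extra work per element.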
Test Options
SOAP 1 (0.5K) | data/list_soapized_1.xml | preload
SOAP 10 (2.5K) | data/list_soapized_10.xml | preload
SOAP 100 (26K) | data/list_soapized_100.xml | preload
SOAP 10K (2.7MB) | data/list_soapized_10000.xml | preload
Random 100 (33K) | data/rand_100.xml | preload
Random 10K (3.6MB) | data/rand_10000.xml | preload
Random 100K (36MB) | data/rand_100000.xml | no
Topic Map (2MB) | data/topicmap.xml | preload
Religious Text (7MB) | data/testaments.xml | preload