A Table-Driven Streaming XML Parsing Methodology for High-Performance Web Services
==================================================================================
By Wei Zhang and Robert A. van Engelen. Published in the Proceedings of the IEEE International Conference on Web Services, 2006.
This paper presents a table-driven streaming XML parsing methodology, called TDX. TDX expedites XML parsing by pre-recording the states of an XML parser in tabular form and by using an efficient runtime streaming parsing engine based on a push-down automaton. The parsing tables are produced automatically from the XML schemas of a WSDL service description. Because the schema constraints are pre-encoded in a parsing table, the approach effectively implements a schema-specific XML parsing technique that combines parsing and validation into a single pass. Combining the two steps significantly increases the performance of XML Web services, which improves response time and may reduce the impact of the flash-crowd effect. To implement TDX, we developed a parser construction toolkit that automatically constructs parsers in C code from WSDLs and XML schemas. We applied the toolkit to an example Web services application and measured its raw performance against popular high-performance parsers written in C/C++, such as eXpat, gSOAP, and Xerces. The performance results show that TDX can be an order of magnitude faster.
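To make the idea of a table-driven push-down automaton concrete, here is a minimal sketch in C. It is not the TDX toolkit's generated code or its table layout. It assumes a hypothetical schema `order := id item+`, a hypothetical scanner that has already turned the input into tokens, and a hand-written table where the paper's approach would generate one from a WSDL and its XML schemas. All names (`sketch_validate`, `T_ORDER_OPEN`, `S_AFTER_ID`, and so on) are illustrative assumptions.

```c
#include <stddef.h>

/* Hypothetical tokens for the assumed schema: order := id item+ */
enum token { T_ORDER_OPEN, T_ORDER_CLOSE, T_ID_OPEN, T_ID_CLOSE,
             T_ITEM_OPEN, T_ITEM_CLOSE, T_TEXT, T_COUNT };

/* A_ERR is 0, so every table cell not listed below rejects the input. */
enum action { A_ERR, A_PUSH, A_POP, A_STAY };

enum state { S_START, S_ORDER, S_ID, S_AFTER_ID,
             S_ITEM, S_ITEMS, S_DONE, S_COUNT };

/* One table cell: what to do, the state to enter, and (for A_PUSH)
   the state to resume once the opened element has been closed. */
struct entry { enum action act; int next; int ret; };

/* Hand-written parsing table; it encodes the schema's content model,
   so walking it parses and validates in the same pass. */
static const struct entry table[S_COUNT][T_COUNT] = {
    [S_START][T_ORDER_OPEN]    = { A_PUSH, S_ORDER, S_DONE },
    [S_ORDER][T_ID_OPEN]       = { A_PUSH, S_ID, S_AFTER_ID },
    [S_ID][T_TEXT]             = { A_STAY, S_ID, 0 },
    [S_ID][T_ID_CLOSE]         = { A_POP, 0, 0 },
    [S_AFTER_ID][T_ITEM_OPEN]  = { A_PUSH, S_ITEM, S_ITEMS },
    [S_ITEM][T_TEXT]           = { A_STAY, S_ITEM, 0 },
    [S_ITEM][T_ITEM_CLOSE]     = { A_POP, 0, 0 },
    [S_ITEMS][T_ITEM_OPEN]     = { A_PUSH, S_ITEM, S_ITEMS },
    [S_ITEMS][T_ORDER_CLOSE]   = { A_POP, 0, 0 },
};

#define MAX_DEPTH 16

/* Returns 1 if the token stream is a valid <order>, 0 otherwise. */
int sketch_validate(const enum token *toks, size_t n)
{
    int stack[MAX_DEPTH];   /* resume states of the open elements */
    int sp = 0;
    int state = S_START;

    for (size_t i = 0; i < n; i++) {
        if ((unsigned)toks[i] >= T_COUNT)
            return 0;                       /* unknown token */
        const struct entry *e = &table[state][toks[i]];
        switch (e->act) {
        case A_PUSH:                        /* start tag: remember where to resume */
            if (sp == MAX_DEPTH)
                return 0;
            stack[sp++] = e->ret;
            state = e->next;
            break;
        case A_POP:                         /* end tag: resume the parent's state */
            if (sp == 0)
                return 0;
            state = stack[--sp];
            break;
        case A_STAY:                        /* character data: state unchanged */
            break;
        default:                            /* A_ERR: schema violation */
            return 0;
        }
    }
    return state == S_DONE && sp == 0;
}
```

Under these assumptions, a stream such as `T_ORDER_OPEN, T_ID_OPEN, T_TEXT, T_ID_CLOSE, T_ITEM_OPEN, T_ITEM_CLOSE, T_ORDER_CLOSE` is accepted, while an order with no `item` element is rejected as soon as `T_ORDER_CLOSE` arrives in state `S_AFTER_ID`. The runtime loop performs one table lookup per token and contains no schema-specific branching, which is the property the abstract attributes to pre-encoding schema constraints in a table.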
![Download](images/pdf.png) [Article](articles/icws2006tdx.pdf)
Copyright (c) 2018, Robert van Engelen, Genivia Inc. All rights reserved.