StAX (Streaming API for XML) and Other Parsers
For a long time there were mainly two APIs for handling XML in Java: DOM and SAX. StAX is now also part of the Java XML parsing APIs. Here I am writing down some of the differences I found while learning about parsers for the PojoXML open source project.
StAX
The main goal of the StAX API (http://jcp.org/en/jsr/detail?id=173) is to give “parsing control to the programmer” by providing a simple iterator-based API. The programmer asks for the next event (pulls the event), which is why StAX is called a pull parser. StAX was created to address limitations in the two most prevalent parsing APIs, SAX and DOM.
DOM vs SAX vs StAX
DOM builds a complete tree structure of the XML document in memory, which you can then traverse in any order (random access). Because the whole document is held in memory, DOM uses a lot of memory, so it is inefficient for processing large XML documents.
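To make the random access point concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name DomSketch, the method titleAt and the sample books document are made up for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical example class, not part of any library.
public class DomSketch {

    // Loads the whole document into memory, then jumps straight to the
    // <title> element at the given position.
    static String titleAt(String xml, int index) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        NodeList titles = doc.getElementsByTagName("title");
        return titles.item(index).getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<books>"
                + "<book><title>First</title></book>"
                + "<book><title>Second</title></book>"
                + "</books>";
        // The tree is already in memory, so we can read nodes in any order.
        System.out.println(titleAt(xml, 1));
        System.out.println(titleAt(xml, 0));
    }
}
```

Note that titleAt re-parses the document on every call only to keep the sketch short; in real code you would parse once and keep the Document around.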
A SAX parser is more efficient in terms of memory usage. SAX is a streaming API: it parses the document sequentially and fires an event whenever it finds an item of the XML Infoset (an element with its attributes, text or CDATA). SAX doesn’t provide random access to the XML data, and we don’t have control over the parsing. Whenever an event occurs, the SAX parser sends the data to your program (push parsing).
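A small sketch of push parsing with the JDK's SAX parser follows. The parser drives the loop and calls back into the handler; our code only reacts. The class name SaxSketch, the method titles and the sample document are assumptions made for this example.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not part of any library.
public class SaxSketch {

    // Collects the text of every <title> element as the parser pushes events.
    static List<String> titles(String xml) throws Exception {
        final List<String> result = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            private StringBuilder text; // non-null only while inside <title>

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("title".equals(qName)) {
                    text = new StringBuilder();
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called several times for one text node.
                if (text != null) {
                    text.append(ch, start, length);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("title".equals(qName)) {
                    result.add(text.toString());
                    text = null;
                }
            }
        };
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        // parse() runs through the whole document; we cannot pause it
        // between events.
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<books>"
                + "<book><title>First</title></book>"
                + "<book><title>Second</title></book>"
                + "</books>";
        System.out.println(titles(xml));
    }
}
```

The handler has to keep its own state (the text field) between callbacks, which is the usual price of not controlling the parse loop yourself.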
With the StAX (pull parsing) API, our code calls the parser to get the next event whenever we want it, so the parsing is controlled from your own code. With a pull parser we can also filter the XML document, meaning we can skip the elements we don't need. Pull parsing libraries also tend to be smaller than push parsing libraries.
When memory is limited, for example in a cell phone application, you should go for a streaming API.
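The pull style can be sketched with the cursor API (XMLStreamReader) that ships with the JDK. Here our code owns the loop, ignores the elements it does not care about, and stops as soon as it has enough. The class name StaxSketch, the method titles, its limit parameter and the sample document are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical example class, not part of any library.
public class StaxSketch {

    // Pulls events one at a time and returns at most 'limit' <title> texts.
    static List<String> titles(String xml, int limit) throws Exception {
        List<String> result = new ArrayList<>();
        XMLStreamReader reader =
                XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
        try {
            // We decide when to ask for the next event and when to stop.
            while (reader.hasNext() && result.size() < limit) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // getElementText() reads up to the matching end tag.
                    result.add(reader.getElementText());
                }
                // Every other event (for example <price>) is simply ignored.
            }
        } finally {
            reader.close();
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample data invented for this sketch.
        String xml = "<books>"
                + "<book><title>First</title><price>10</price></book>"
                + "<book><title>Second</title><price>20</price></book>"
                + "</books>";
        System.out.println(titles(xml, 1)); // stops after the first title
        System.out.println(titles(xml, 10));
    }
}
```

Compared with the callback style of SAX, no handler state is needed: the loop itself knows where it is in the document, and leaving the loop early means the rest of the input is never read.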
Woodstox is one open source implementation of StAX.
Which parser are you using? And how do you handle XML parsing in your application?