The PySide.QtCore.QXmlStreamReader class provides a fast parser for reading well-formed XML via a simple streaming API.
PySide.QtCore.QXmlStreamReader is a faster and more convenient replacement for Qt’s own SAX parser (see PySide.QtXml.QXmlSimpleReader ). In some cases it might also be a faster and more convenient alternative for use in applications that would otherwise use a DOM tree (see PySide.QtXml.QDomDocument ). PySide.QtCore.QXmlStreamReader reads data either from a PySide.QtCore.QIODevice (see PySide.QtCore.QXmlStreamReader.setDevice() ), or from a raw PySide.QtCore.QByteArray (see PySide.QtCore.QXmlStreamReader.addData() ).
Qt provides PySide.QtCore.QXmlStreamWriter for writing XML.
The basic concept of a stream reader is to report an XML document as a stream of tokens, similar to SAX. The main difference between PySide.QtCore.QXmlStreamReader and SAX is how these XML tokens are reported. With SAX, the application must provide handlers (callback functions) that receive so-called XML events from the parser at the parser’s convenience. With PySide.QtCore.QXmlStreamReader , the application code itself drives the loop and pulls tokens from the reader, one after another, as it needs them. This is done by calling PySide.QtCore.QXmlStreamReader.readNext() , where the reader reads from the input stream until it completes the next token, at which point it returns the PySide.QtCore.QXmlStreamReader.tokenType() . A set of convenient functions including PySide.QtCore.QXmlStreamReader.isStartElement() and PySide.QtCore.QXmlStreamReader.text() can then be used to examine the token to obtain information about what has been read. The big advantage of this pulling approach is the possibility to build recursive descent parsers with it, meaning you can split your XML parsing code easily into different methods or classes. This makes it easy to keep track of the application’s own state when parsing XML.
A typical loop with PySide.QtCore.QXmlStreamReader looks like this:

    from PySide.QtCore import QXmlStreamReader

    xml = QXmlStreamReader()
    ...
    while not xml.atEnd():
        xml.readNext()
        ...  # do processing
    if xml.hasError():
        ...  # do error handling

PySide.QtCore.QXmlStreamReader is a well-formed XML 1.0 parser that does not include external parsed entities. As long as no error occurs, the application code can thus be assured that the data provided by the stream reader satisfies the W3C’s criteria for well-formed XML. For example, you can be certain that all tags are indeed nested and closed properly, that references to internal entities have been replaced with the correct replacement text, and that attributes have been normalized or added according to the internal subset of the DTD.
If an error occurs while parsing, PySide.QtCore.QXmlStreamReader.atEnd() and PySide.QtCore.QXmlStreamReader.hasError() return true, and PySide.QtCore.QXmlStreamReader.error() returns the error that occurred. The functions PySide.QtCore.QXmlStreamReader.errorString() , PySide.QtCore.QXmlStreamReader.lineNumber() , PySide.QtCore.QXmlStreamReader.columnNumber() , and PySide.QtCore.QXmlStreamReader.characterOffset() are for constructing an appropriate error or warning message. To simplify application code, PySide.QtCore.QXmlStreamReader contains a PySide.QtCore.QXmlStreamReader.raiseError() mechanism that lets you raise custom errors that trigger the same error handling described.
The QXmlStream Bookmarks Example illustrates how to use the recursive descent technique to read an XML bookmark file (XBEL) with a stream reader.
QXmlStream understands and resolves XML namespaces. E.g. in case of a StartElement , PySide.QtCore.QXmlStreamReader.namespaceUri() returns the namespace the element is in, and PySide.QtCore.QXmlStreamReader.name() returns the element’s local name. The combination of namespaceUri and name uniquely identifies an element. If a namespace prefix was not declared in the XML entities parsed by the reader, the namespaceUri is empty.
If you parse XML data that does not utilize namespaces according to the XML specification or doesn’t use namespaces at all, you can use the element’s PySide.QtCore.QXmlStreamReader.qualifiedName() instead. A qualified name is the element’s PySide.QtCore.QXmlStreamReader.prefix() , followed by a colon, followed by the element’s local PySide.QtCore.QXmlStreamReader.name() , exactly as the element appears in the raw XML data. Since the mapping from namespaceUri to prefix is neither unique nor universal, PySide.QtCore.QXmlStreamReader.qualifiedName() should be avoided for namespace-compliant XML data.
In order to parse standalone documents that do use undeclared namespace prefixes, you can turn off namespace processing completely with the PySide.QtCore.QXmlStreamReader.namespaceProcessing() property.
PySide.QtCore.QXmlStreamReader is an incremental parser. It can handle the case where the document can’t be parsed all at once because it arrives in chunks (e.g. from multiple files, or over a network connection). When the reader runs out of data before the complete document has been parsed, it reports a PrematureEndOfDocumentError . When more data arrives, either because of a call to PySide.QtCore.QXmlStreamReader.addData() or because more data is available through the network PySide.QtCore.QXmlStreamReader.device() , the reader recovers from the PrematureEndOfDocumentError error and continues parsing the new data with the next call to PySide.QtCore.QXmlStreamReader.readNext() .
For example, if your application reads data from the network using a network access manager , you would issue a network request to the manager and receive a network reply in return. Since a PySide.QtNetwork.QNetworkReply is a PySide.QtCore.QIODevice , you connect its PySide.QtNetwork.QNetworkReply.readyRead() signal to a custom slot, e.g. slotReadyRead() in the code snippet shown in the discussion for PySide.QtNetwork.QNetworkAccessManager . In this slot, you read all available data with PySide.QtNetwork.QNetworkReply.readAll() and pass it to the XML stream reader using PySide.QtCore.QXmlStreamReader.addData() . Then you call your custom parsing function that reads the XML events from the reader.
PySide.QtCore.QXmlStreamReader is memory-conservative by design, since it doesn’t store the entire XML document tree in memory, but only the current token at the time it is reported. In addition, PySide.QtCore.QXmlStreamReader avoids the many small string allocations that it normally takes to map an XML document to a convenient and Qt-ish API. It does this by reporting all string data as PySide.QtCore.QStringRef rather than real PySide.QtCore.QString objects. PySide.QtCore.QStringRef is a thin wrapper around PySide.QtCore.QString substrings that provides a subset of the PySide.QtCore.QString API without the memory allocation and reference-counting overhead. Calling QStringRef.toString() on any of those objects returns an equivalent real PySide.QtCore.QString object.
Constructs a stream reader.
Creates a new stream reader that reads from device .
Creates a new stream reader that reads from data .
See also
PySide.QtCore.QXmlStreamReader.addData() PySide.QtCore.QXmlStreamReader.clear() PySide.QtCore.QXmlStreamReader.setDevice()
Creates a new stream reader that reads from data .
See also
PySide.QtCore.QXmlStreamReader.addData() PySide.QtCore.QXmlStreamReader.clear() PySide.QtCore.QXmlStreamReader.setDevice()
Creates a new stream reader that reads from data .
This enum specifies the different error cases.
Constant | Description |
---|---|
QXmlStreamReader.NoError | No error has occurred. |
QXmlStreamReader.CustomError | A custom error has been raised with PySide.QtCore.QXmlStreamReader.raiseError() |
QXmlStreamReader.NotWellFormedError | The parser internally raised an error due to the read XML not being well-formed. |
QXmlStreamReader.PrematureEndOfDocumentError | The input stream ended before a well-formed XML document was parsed. Recovery from this error is possible if more XML arrives in the stream, either by calling PySide.QtCore.QXmlStreamReader.addData() or by waiting for it to arrive on the PySide.QtCore.QXmlStreamReader.device() . |
QXmlStreamReader.UnexpectedElementError | The parser encountered an element that was different to those it expected. |
This enum specifies the type of token the reader just read.
Constant | Description |
---|---|
QXmlStreamReader.NoToken | The reader has not yet read anything. |
QXmlStreamReader.Invalid | An error has occurred, reported in PySide.QtCore.QXmlStreamReader.error() and PySide.QtCore.QXmlStreamReader.errorString() . |
QXmlStreamReader.StartDocument | The reader reports the XML version number in PySide.QtCore.QXmlStreamReader.documentVersion() , and the encoding as specified in the XML document in PySide.QtCore.QXmlStreamReader.documentEncoding() . If the document is declared standalone, PySide.QtCore.QXmlStreamReader.isStandaloneDocument() returns true; otherwise it returns false. |
QXmlStreamReader.EndDocument | The reader reports the end of the document. |
QXmlStreamReader.StartElement | The reader reports the start of an element with PySide.QtCore.QXmlStreamReader.namespaceUri() and PySide.QtCore.QXmlStreamReader.name() . Empty elements are also reported as StartElement , followed directly by EndElement . The convenience function PySide.QtCore.QXmlStreamReader.readElementText() can be called to concatenate all content until the corresponding EndElement . Attributes are reported in PySide.QtCore.QXmlStreamReader.attributes() , namespace declarations in PySide.QtCore.QXmlStreamReader.namespaceDeclarations() . |
QXmlStreamReader.EndElement | The reader reports the end of an element with PySide.QtCore.QXmlStreamReader.namespaceUri() and PySide.QtCore.QXmlStreamReader.name() . |
QXmlStreamReader.Characters | The reader reports characters in PySide.QtCore.QXmlStreamReader.text() . If the characters are all white-space, PySide.QtCore.QXmlStreamReader.isWhitespace() returns true. If the characters stem from a CDATA section, PySide.QtCore.QXmlStreamReader.isCDATA() returns true. |
QXmlStreamReader.Comment | The reader reports a comment in PySide.QtCore.QXmlStreamReader.text() . |
QXmlStreamReader.DTD | The reader reports a DTD in PySide.QtCore.QXmlStreamReader.text() , notation declarations in PySide.QtCore.QXmlStreamReader.notationDeclarations() , and entity declarations in PySide.QtCore.QXmlStreamReader.entityDeclarations() . Details of the DTD declaration are reported in PySide.QtCore.QXmlStreamReader.dtdName() , PySide.QtCore.QXmlStreamReader.dtdPublicId() , and PySide.QtCore.QXmlStreamReader.dtdSystemId() . |
QXmlStreamReader.EntityReference | The reader reports an entity reference that could not be resolved. The name of the reference is reported in PySide.QtCore.QXmlStreamReader.name() , the replacement text in PySide.QtCore.QXmlStreamReader.text() . |
QXmlStreamReader.ProcessingInstruction | The reader reports a processing instruction in PySide.QtCore.QXmlStreamReader.processingInstructionTarget() and PySide.QtCore.QXmlStreamReader.processingInstructionData() . |
This enum specifies the different behaviours of PySide.QtCore.QXmlStreamReader.readElementText() .
Constant | Description |
---|---|
QXmlStreamReader.ErrorOnUnexpectedElement | Raise an UnexpectedElementError and return what was read so far when a child element is encountered. |
QXmlStreamReader.IncludeChildElements | Recursively include the text from child elements. |
QXmlStreamReader.SkipChildElements | Skip child elements. |
Note
This enum was introduced or modified in Qt 4.6
Parameters: | data – str |
---|
Adds more data for the reader to read. This function does nothing if the reader has a PySide.QtCore.QXmlStreamReader.device() .
Parameters: | data – unicode |
---|
Adds more data for the reader to read. This function does nothing if the reader has a PySide.QtCore.QXmlStreamReader.device() .
Parameters: | data – PySide.QtCore.QByteArray |
---|
Adds more data for the reader to read. This function does nothing if the reader has a PySide.QtCore.QXmlStreamReader.device() .
Parameters: | extraNamespaceDeclaraction – PySide.QtCore.QXmlStreamNamespaceDeclaration |
---|
Adds an extraNamespaceDeclaration . The declaration will be valid for children of the current element, or - should the function be called before any elements are read - for the entire XML document.
Parameters: | extraNamespaceDeclaractions – |
---|
Return type: | PySide.QtCore.bool |
---|
Returns true if the reader has read until the end of the XML document, or if an PySide.QtCore.QXmlStreamReader.error() has occurred and reading has been aborted. Otherwise, it returns false.
When PySide.QtCore.QXmlStreamReader.atEnd() and PySide.QtCore.QXmlStreamReader.hasError() return true and PySide.QtCore.QXmlStreamReader.error() returns PrematureEndOfDocumentError , it means the XML has been well-formed so far, but a complete XML document has not been parsed. The next chunk of XML can be added with PySide.QtCore.QXmlStreamReader.addData() , if the XML is being read from a PySide.QtCore.QByteArray , or by waiting for more data to arrive if the XML is being read from a PySide.QtCore.QIODevice . Either way, PySide.QtCore.QXmlStreamReader.atEnd() will return false once more data is available.
Return type: | PySide.QtCore.QXmlStreamAttributes |
---|
Returns the attributes of a StartElement .
Return type: | PySide.QtCore.qint64 |
---|
Returns the current character offset, starting with 0.
Removes any PySide.QtCore.QXmlStreamReader.device() or data from the reader and resets its internal state to the initial state.
Return type: | PySide.QtCore.qint64 |
---|
Returns the current column number, starting with 0.
Return type: | PySide.QtCore.QIODevice |
---|
Returns the current device associated with the PySide.QtCore.QXmlStreamReader , or None if no device has been assigned.
Return type: | PySide.QtCore.QStringRef |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is StartDocument, this function returns the encoding string as specified in the XML declaration. Otherwise an empty string is returned.
Return type: | PySide.QtCore.QStringRef |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is StartDocument, this function returns the version string as specified in the XML declaration. Otherwise an empty string is returned.
Return type: | PySide.QtCore.QStringRef |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is DTD, this function returns the DTD’s name. Otherwise an empty string is returned.
Return type: | PySide.QtCore.QStringRef |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is DTD, this function returns the DTD’s public identifier. Otherwise an empty string is returned.
Return type: | PySide.QtCore.QStringRef |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is DTD, this function returns the DTD’s system identifier. Otherwise an empty string is returned.
Return type: |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is DTD, this function returns the DTD’s unparsed (external) entity declarations. Otherwise an empty vector is returned.
The QXmlStreamEntityDeclarations class is defined to be a QVector of PySide.QtCore.QXmlStreamEntityDeclaration .
Return type: | PySide.QtCore.QXmlStreamEntityResolver |
---|
Returns the entity resolver, or None if there is no entity resolver.
Return type: | PySide.QtCore.QXmlStreamReader.Error |
---|
Returns the type of the current error, or NoError if no error occurred.
Return type: | unicode |
---|
Returns the error message that was set with PySide.QtCore.QXmlStreamReader.raiseError() .
Return type: | PySide.QtCore.bool |
---|
Returns true if an error has occurred, otherwise false .
Return type: | PySide.QtCore.bool |
---|
Returns true if the reader reports characters that stem from a CDATA section; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals Characters ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals Comment ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals DTD ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals EndDocument ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals EndElement ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals EntityReference ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals ProcessingInstruction ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if this document has been declared standalone in the XML declaration; otherwise returns false.
If no XML declaration has been parsed, this function returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals StartDocument ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if PySide.QtCore.QXmlStreamReader.tokenType() equals StartElement ; otherwise returns false.
Return type: | PySide.QtCore.bool |
---|
Returns true if the reader reports characters that only consist of white-space; otherwise returns false.
Return type: | PySide.QtCore.qint64 |
---|
Returns the current line number, starting with 1.
Return type: | PySide.QtCore.QStringRef |
---|
Returns the local name of a StartElement , EndElement , or an EntityReference .
Return type: |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is StartElement, this function returns the element’s namespace declarations. Otherwise an empty vector is returned.
The QXmlStreamNamespaceDeclarations class is defined to be a QVector of PySide.QtCore.QXmlStreamNamespaceDeclaration .
Return type: | PySide.QtCore.bool |
---|
Return type: | PySide.QtCore.QStringRef |
---|
Returns the namespaceUri of a StartElement or EndElement .
Return type: |
---|
If PySide.QtCore.QXmlStreamReader.tokenType() is DTD, this function returns the DTD’s notation declarations. Otherwise an empty vector is returned.
The QXmlStreamNotationDeclarations class is defined to be a QVector of PySide.QtCore.QXmlStreamNotationDeclaration .
Return type: | PySide.QtCore.QStringRef |
---|
Returns the prefix of a StartElement or EndElement .
Return type: | PySide.QtCore.QStringRef |
---|
Returns the data of a ProcessingInstruction .
Return type: | PySide.QtCore.QStringRef |
---|
Returns the target of a ProcessingInstruction .
Return type: | PySide.QtCore.QStringRef |
---|
Returns the qualified name of a StartElement or EndElement .
A qualified name is the raw name of an element in the XML data. It consists of the namespace prefix, followed by a colon, followed by the element’s local name. Since the namespace prefix is not unique (the same prefix can point to different namespaces and different prefixes can point to the same namespace), you shouldn’t rely on PySide.QtCore.QXmlStreamReader.qualifiedName() ; use the resolved PySide.QtCore.QXmlStreamReader.namespaceUri() and the element’s local PySide.QtCore.QXmlStreamReader.name() instead.
Parameters: | message – unicode |
---|
Raises a custom error with an optional error message .
Parameters: | behaviour – PySide.QtCore.QXmlStreamReader.ReadElementTextBehaviour |
---|---|
Return type: | unicode |
Convenience function to be called in case a StartElement was read. Reads until the corresponding EndElement and returns all text in-between. In case of no error, the current token (see PySide.QtCore.QXmlStreamReader.tokenType() ) after having called this function is EndElement .
The function concatenates PySide.QtCore.QXmlStreamReader.text() when it reads either Characters or EntityReference tokens, but skips ProcessingInstruction and Comment . If the current token is not StartElement , an empty string is returned.
The behaviour defines what happens in case anything else is read before reaching EndElement . The function can include the text from child elements (useful for example for HTML), ignore child elements, or raise an UnexpectedElementError and return what was read so far.
Return type: | unicode |
---|
This function overloads PySide.QtCore.QXmlStreamReader.readElementText() .
Calling this function is equivalent to calling readElementText( ErrorOnUnexpectedElement ).
Return type: | PySide.QtCore.QXmlStreamReader.TokenType |
---|
Reads the next token and returns its type.
With one exception, once an PySide.QtCore.QXmlStreamReader.error() is reported by PySide.QtCore.QXmlStreamReader.readNext() , further reading of the XML stream is not possible. Then PySide.QtCore.QXmlStreamReader.atEnd() returns true, PySide.QtCore.QXmlStreamReader.hasError() returns true, and this function returns QXmlStreamReader.Invalid .
The exception is when PySide.QtCore.QXmlStreamReader.error() returns PrematureEndOfDocumentError . This error is reported when the end of an otherwise well-formed chunk of XML is reached, but the chunk doesn’t represent a complete XML document. In that case, parsing can be resumed by calling PySide.QtCore.QXmlStreamReader.addData() to add the next chunk of XML, when the stream is being read from a PySide.QtCore.QByteArray , or by waiting for more data to arrive when the stream is being read from a PySide.QtCore.QXmlStreamReader.device() .
Return type: | PySide.QtCore.bool |
---|
Reads until the next start element within the current element. Returns true when a start element was reached. When the end element was reached, or when an error occurred, false is returned.
The current element is the element matching the most recently parsed start element of which a matching end element has not yet been reached. When the parser has reached the end element, the current element becomes the parent element.
This is a convenience function for when you’re only concerned with parsing XML elements. The QXmlStream Bookmarks Example makes extensive use of this function.
Parameters: | device – PySide.QtCore.QIODevice |
---|
Sets the current device to device . Setting the device resets the stream to its initial state.
Parameters: | resolver – PySide.QtCore.QXmlStreamEntityResolver |
---|
Makes resolver the new PySide.QtCore.QXmlStreamReader.entityResolver() .
The stream reader does not take ownership of the resolver. It’s the caller’s responsibility to ensure that the resolver is valid during the entire lifetime of the stream reader object, or until another resolver or None is set.
Parameters: | arg__1 – PySide.QtCore.bool |
---|
Reads until the end of the current element, skipping any child nodes. This function is useful for skipping unknown elements.
The current element is the element matching the most recently parsed start element of which a matching end element has not yet been reached. When the parser has reached the end element, the current element becomes the parent element.
Return type: | PySide.QtCore.QStringRef |
---|
Returns the text of Characters , Comment , DTD , or EntityReference .
Return type: | unicode |
---|
Returns the reader’s current token as a string.
Return type: | PySide.QtCore.QXmlStreamReader.TokenType |
---|
Returns the type of the current token.
The current token can also be queried with the convenience functions PySide.QtCore.QXmlStreamReader.isStartDocument() , PySide.QtCore.QXmlStreamReader.isEndDocument() , PySide.QtCore.QXmlStreamReader.isStartElement() , PySide.QtCore.QXmlStreamReader.isEndElement() , PySide.QtCore.QXmlStreamReader.isCharacters() , PySide.QtCore.QXmlStreamReader.isComment() , PySide.QtCore.QXmlStreamReader.isDTD() , PySide.QtCore.QXmlStreamReader.isEntityReference() , and PySide.QtCore.QXmlStreamReader.isProcessingInstruction() .