I tried using the lxml parser target interface to incrementally parse XML into a "custom" tree, and I ran into the following problem: if you instantiate the parser and immediately feed it the opening tag of the root element, the target's start callback is not triggered until some other event occurs (for example, incoming data, a closing tag, another opening tag, etc.). This is unlike any other (nested) element.
Demonstration:
from lxml import etree

class EchoTarget(object):
    def start(self, tag, attrib):
        print("start %s %s" % (tag, attrib))
    def end(self, tag):
        print("end %s" % tag)
    def data(self, data):
        print("data %r" % data)
    def comment(self, text):
        print("comment %s" % text)
    def close(self):
        print("close")
        return "closed!"
>>> p = etree.XMLParser(target=EchoTarget())
>>> p.feed('<a>')
>>> p.feed(' ')
start a {}
>>> p.feed('<b>')
data u' '
start b {}
There is a workaround: feed some whitespace before the root element's opening tag:
>>> p = etree.XMLParser(target=EchoTarget())
>>> p.feed(' ')
>>> p.feed('<a>')
start a {}
Is this a bug? A "feature"? And if it is expected, is there a proper way to get "start" fired right away?
Also, feeding the opening tag in two pieces triggers the event immediately:
>>> p = etree.XMLParser(target=EchoTarget())
>>> p.feed('<a')
>>> p.feed('>')
start a {}
2- , -, .
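For anyone who wants to reproduce this outside the interactive prompt, here is a minimal sketch that records which target events fire during each feed() call. RecordingTarget and trace are names I made up for illustration; they are not part of lxml. I am also assuming that falling back to the standard library's xml.etree.ElementTree (whose XMLParser accepts the same kind of target object) is acceptable when lxml is not installed; its event timing may well differ from lxml's, so only the lxml run says anything about the behaviour described above.

```python
try:
    from lxml import etree  # the parser this question is about
except ImportError:
    # Assumption: fall back to the standard library so the script still runs.
    # Its XMLParser takes the same kind of target, but timing may differ.
    import xml.etree.ElementTree as etree


class RecordingTarget(object):
    """Hypothetical target that stores events in a list instead of printing."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, data):
        self.events.append(("data", data))

    def close(self):
        return self.events


def trace(chunks):
    """Feed chunks one by one; return [(chunk, events fired by that feed)]."""
    target = RecordingTarget()
    parser = etree.XMLParser(target=target)
    seen = 0
    result = []
    for chunk in chunks:
        parser.feed(chunk)
        result.append((chunk, target.events[seen:]))
        seen = len(target.events)
    parser.close()
    # Anything still pending is flushed by close(); report it under None.
    result.append((None, target.events[seen:]))
    return result


if __name__ == "__main__":
    scenarios = (
        ["<a>", " ", "<b>", "</b></a>"],  # root tag fed first
        [" ", "<a>", "</a>"],             # whitespace fed before the root tag
        ["<a", ">", "</a>"],              # root tag split across two feeds
    )
    for chunks in scenarios:
        for chunk, events in trace(chunks):
            print("%r -> %r" % (chunk, events))
        print("---")
```

Each printed line pairs a fed chunk with the events that fired during that feed, so a delayed "start" shows up as an empty list next to '<a>' followed by the start event next to the following chunk.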