lxml: create an XML fragment without a root element?

Is it possible to use lxml (or the built-in ElementTree library) to create an object that represents an XML fragment but contains two (or more) disjoint trees (i.e. each tree has its own root, and the trees have no common ancestor)?

That is, is there anything that could represent the following without creating another element to hold both of them:

<tree id="A"><anotherelement/></tree>
<tree id="B"><yetanotherelement/></tree>

I don't see anything in the lxml documentation that would allow this, and Stack Overflow doesn't seem to have anything that addresses it directly.

In this case, I create the XML programmatically, and the fragments will be assembled into a single document for output. I need an object that I don't have to iterate over or special-case, so that I can just pass it to lxml methods as if it were a proper tree.

(I know that such fragments are not a complete, well-formed XML document on their own; I only want to store intermediate products before assembling them into such a document.)

1 answer

Yes, the lxml.html package has such functionality: fragment_fromstring and fragments_fromstring. They are meant for HTML, but in most cases the HTML parser also does fine with XML:

from lxml import etree, html

xml = """
    <tree id="A"><anotherelement/></tree>
    <tree id="B"><yetanotherelement/></tree>
"""

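# fragments_fromstring returns a list of the top-level elements
fragments = html.fragments_fromstring(xml)

# attach the fragments to a single root only when assembling the final document
root = etree.Element("root")
for f in fragments:
    root.append(f)

# tostring returns bytes on Python 3, so decode before printing
print(etree.tostring(root, pretty_print=True).decode())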

output:

<root>
  <tree id="A">
    <anotherelement/>
  </tree>
  <tree id="B">
    <yetanotherelement/>
  </tree>
</root>
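
One case where the HTML parser does not do fine with XML is mixed-case names: HTML is case-insensitive, so the HTML parser lowercases tag names, while the XML parser keeps them as written. A small sketch of the difference (the sample string and variable names are made up for illustration):

```python
from lxml import etree, html

# Hypothetical sample with a capitalised tag name
mixed_case = '<Tree id="A"></Tree>'

# Parsed as an HTML fragment: the tag name is lowercased
html_tag = html.fragments_fromstring(mixed_case)[0].tag

# Parsed as XML: the tag name is preserved
xml_tag = etree.fromstring(mixed_case).tag

print(html_tag)
print(xml_tag)
```

If your element names are all lowercase, as in the example above, this does not matter.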

If you look at what's happening under the hood, it probably wouldn't be too hard to do the same with the XML parser if you are not happy with the HTML parser's result.
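
A minimal sketch of that idea, assuming it is acceptable to wrap the text in a throwaway element just for parsing. The helper name xml_fragments_fromstring and the wrapper tag are hypothetical, not part of lxml, and this is not necessarily how lxml.html does it internally:

```python
from lxml import etree

def xml_fragments_fromstring(text):
    """Hypothetical helper: parse sibling XML trees into a list of elements."""
    # Drop whitespace-only text between the elements
    parser = etree.XMLParser(remove_blank_text=True)
    # The temporary wrapper makes the input well-formed for the XML parser;
    # this assumes the text carries no XML declaration of its own
    wrapper = etree.fromstring("<wrapper>%s</wrapper>" % text, parser)
    # Return only the children; the wrapper itself is discarded
    return list(wrapper)

xml = """
    <tree id="A"><anotherelement/></tree>
    <tree id="B"><yetanotherelement/></tree>
"""

fragments = xml_fragments_fromstring(xml)

# Assemble the final document later, exactly as with the HTML variant
root = etree.Element("root")
root.extend(fragments)

print(etree.tostring(root, pretty_print=True).decode())
```

The result is still a plain Python list rather than a single tree-like object, so code that consumes it has to accept a list of elements.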
