I'm writing code that pulls XML from a web API and then parses that XML with Groovy. Unfortunately, it seems that both XmlParser and XmlSlurper drop newline characters from attribute values when .text() is called.
How can I get attribute text, including newlines?
Code example:
def xmltest = '''
<snippet>
<preSnippet att1="testatt1" code="This is line 1
This is line 2
This is line 3" >
<lines count="10" />
</preSnippet>
</snippet>'''
def parsed = new XmlParser().parseText( xmltest )
println "Parsed"
parsed.preSnippet.each { pre ->
println pre.attribute('code');
}
def slurped = new XmlSlurper().parseText( xmltest )
println "Slurped"
slurped.children().each { preSnip ->
println preSnip.@code.text()
}
The output is:
Parsed
This is line 1 This is line 2 This is line 3
Slurped
This is line 1 This is line 2 This is line 3
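OK, I was able to swap the newlines for a placeholder before parsing and then swap them back afterward, like so:
def newxml = xmltest.replaceAll( /code="[^"]*/ ) {
    // Replace each newline inside the code attribute with a placeholder token
    it.replaceAll( /\n/, "~#~" )
}
def parsed = new XmlParser().parseText( newxml )
parsed.preSnippet.each { pre ->
    // Restore the newlines once the attribute has been parsed
    def code = pre.attribute('code').replaceAll( "~#~", "\n" )
    println code
}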
Not my favorite hack, but it will do until they fix their XML output.
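A variant of the same idea, as a rough sketch: the XML spec's attribute-value normalization turns literal line breaks inside an attribute into spaces, but a character reference such as &#10; is kept as a real newline. So if the newlines are escaped as &#10; before parsing, converting back afterward should not be needed. The names escaped and codes are only for illustration, and the regex assumes the attribute is double-quoted, as in the sample above.

```groovy
// Same sample document as in the question
def xmltest = '''
<snippet>
<preSnippet att1="testatt1" code="This is line 1
This is line 2
This is line 3" >
<lines count="10" />
</preSnippet>
</snippet>'''

// Escape literal line breaks inside the code attribute as character references.
// Assumption: the attribute value is double-quoted and contains no escaped quotes.
def escaped = xmltest.replaceAll( /code="[^"]*/ ) { match ->
    match.replaceAll( /\r?\n/, '&#10;' )
}

// The parser resolves &#10; to a real newline instead of normalizing it to a space
def parsed = new XmlParser().parseText( escaped )
def codes = parsed.preSnippet.collect { it.attribute('code') }
codes.each { println it }
```

The same escaped string can be handed to XmlSlurper, since the normalization happens in the underlying XML parser rather than in either Groovy class.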