lxml's parsers provide a way to get a list of the errors that occurred while trying to parse a document. Combine this with the parser's recover keyword argument, and you get something like this:
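from lxml import etree

# recover=True tells the parser to keep going past errors instead of raising
parser = etree.XMLParser(recover=True)
it_would_be_a_tree = etree.parse(your_xml_data, parser)
total_errors = len(parser.error_log)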
Then you can work out how large total_errors is relative to the size of the file. A naive measure, for example errors per line or errors per character, works without any problems. More elaborate measures are also possible if it_would_be_a_tree actually is a tree structure (total_elements / total_errors, for example).
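As a rough sketch of those measures, something like the following could work. The function name error_metrics and the dictionary keys are my own for illustration, not part of lxml. It assumes the document is available as bytes, and that the tree may have no root if nothing could be recovered.

```python
import io

from lxml import etree


def error_metrics(xml_bytes):
    """Parse leniently and return rough error-density measures (hypothetical helper)."""
    parser = etree.XMLParser(recover=True)
    tree = etree.parse(io.BytesIO(xml_bytes), parser)
    total_errors = len(parser.error_log)

    # Naive size measures taken from the raw input.
    total_lines = xml_bytes.count(b"\n") + 1
    total_chars = max(len(xml_bytes), 1)

    # Count elements only if the parser managed to recover a root element.
    root = tree.getroot()
    total_elements = 0 if root is None else sum(1 for _ in root.iter("*"))

    return {
        "total_errors": total_errors,
        "errors_per_line": total_errors / total_lines,
        "errors_per_char": total_errors / total_chars,
        # Undefined when there are no errors, so report None instead of dividing by zero.
        "elements_per_error": total_elements / total_errors if total_errors else None,
    }
```

Call it as error_metrics(open(path, "rb").read()) and pick whichever ratio suits your threshold. Which cutoff counts as "too broken" is up to you.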