xml - XMLSchema: Is it possible to calculate how valid an invalid document is (e.g. as a percentage)?


I'm using lxml in Python to validate a number of XML documents against an XML Schema definition. A number of these documents do not validate (and, at the moment, they're not expected to), but it would be useful if I could calculate how valid they are, as a percentage, for reporting purposes. I also have the ability to use xmllint or other command-line tools, should they be able to provide a useful statistic.

lxml parsers provide a way to get a list of the errors that occurred while trying to parse a document. You can combine that with the parser's recover keyword argument, like this:

# Warning: untested, may not work
from lxml import etree

parser = etree.XMLParser(recover=True)  # keep parsing past errors
it_would_be_a_tree = etree.parse(your_xml_data, parser)
total_errors = len(parser.error_log)  # errors collected while parsing

You can then calculate what percentage of the file total_errors represents. You could use a naive measure, such as errors per line or errors per character, without much trouble. More sophisticated measures are possible if it_would_be_a_tree is actually a usable tree structure (total_elements / total_errors, for example).
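As a rough sketch of the measures described above, something like the following might work. The helper names error_rates and measure, and the xml_path / xsd_path parameters, are made up for illustration and are not part of lxml. It also assumes you want schema violations counted: the parser's error_log only records problems found while parsing, whereas the errors from the most recent schema validation are available on the XMLSchema object's own error_log. Treat the resulting numbers as approximate, since the validator may not list every violation in a badly broken document.

```python
def error_rates(total_errors, total_lines, total_elements):
    """Turn raw counts into naive 'how invalid is it' ratios."""
    return {
        "errors_per_line": total_errors / total_lines if total_lines else 0.0,
        "errors_per_element": total_errors / total_elements if total_elements else 0.0,
    }


def measure(xml_path, xsd_path):
    """Hypothetical helper: count parse and schema errors for one file."""
    # lxml is third-party; imported here so error_rates() works without it.
    from lxml import etree

    # recover=True makes the parser continue past well-formedness errors.
    parser = etree.XMLParser(recover=True)
    tree = etree.parse(xml_path, parser)
    total_errors = len(parser.error_log)

    root = tree.getroot()
    total_elements = 0
    if root is not None:  # recovery can fail to produce any root element
        # iter("*") visits elements only, skipping comments and PIs.
        total_elements = sum(1 for _ in root.iter("*"))
        schema = etree.XMLSchema(etree.parse(xsd_path))
        schema.validate(tree)
        total_errors += len(schema.error_log)

    with open(xml_path, "rb") as handle:
        total_lines = sum(1 for _ in handle)

    return error_rates(total_errors, total_lines, total_elements)
```

Calling measure("doc.xml", "schema.xsd") (placeholder file names) returns a dict of ratios that can be multiplied by 100 for reporting. Errors per element is used here rather than total_elements / total_errors so that a document with no errors gives 0 instead of a division by zero.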

