PerlMonks  

Comparison of XML files ignoring ordering of child elements

by adikan123 (Novice)
on Jan 17, 2019 at 05:09 UTC (#1228677)
adikan123 has asked for the wisdom of the Perl Monks concerning the following question:

I am currently using XML::SemanticDiff to compare two XML files. This module fails to check the differences when the ordering of child elements changes. I would like to know the best way to compare two XML files using Perl. One method would be to convert the XML into text and then compare the text files (looking up each line of the first file in the second file). I would be grateful for a good response, as I have been struggling with this for the past few weeks.

The code should cover the following:

1. The XML files are provided as input to the Perl script.
2. The XML files are compared for differences while ignoring the order of child elements.
3. The Perl script prints each difference with its line number in the baseline file.

PS: I am new to Perl and this is my first post on PM. Thanks.
Re: Comparison of XML files ignoring ordering of child elements
by Athanasius (Bishop) on Jan 17, 2019 at 08:06 UTC

    Hello adikan123, and welcome to the Monastery!

    Have a look at the Test::XML::Ordered module. The author says:

    This module is a test module which compares two XML files for equivalence in an ordered fashion. It was written after I ... realised that XML::SemanticDiff, which is the basis for Test::XML, ... compares two XML files for equivalence in a "semantic" fashion where elements can be present in several possible orders. (... [which] is not normally what I want.).

    Caveats:

    • The module’s interface is designed for use in conjunction with Test::More and its relatives.
    • I have no personal experience with either module.

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

      Thank you @Athanasius. I will try this approach and let you know whether it works.
        I need help understanding how to use this module, Test::XML::Ordered. I am new to Perl, so an example would help. Thank you!
Re: Comparison of XML files ignoring ordering of child elements
by GrandFather (Sage) on Jan 17, 2019 at 20:16 UTC

    All the replies you have so far assume that you want an error generated if the two documents have the same elements but in a different order:

    <root> <data>first</data> <data>second</data> </root>

    and

    <root> <data>second</data> <data>first</data> </root>

    should fail to match. However, although it's not clear, your question implies to me that you would like those two documents to match. That is what XML::SemanticDiff's documentation and name suggest to me that it does, but the current version doesn't behave as I would expect. The module's documentation doesn't spell out the er, um, semantics of the comparison process, so I suspect my expectations differ from the author's intentions.
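
    For illustration only, here is a minimal sketch of a comparison under which those two documents do match. It is not how XML::SemanticDiff works; it assumes XML::LibXML is installed, and the helper names signature and same_ignoring_order are made up for this example. Each element is reduced to a string built from its name, its sorted attributes, its own text and the sorted strings of its children, so sibling order stops mattering.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumptions: XML::LibXML is installed; signature() and
# same_ignoring_order() are hypothetical helper names.

# Build an order-independent string for an element: its name, its
# attributes sorted by name, its own trimmed text, and the sorted
# signatures of its child elements.
sub signature {
    my ($elem) = @_;
    my $attrs = join ' ',
        sort map { sprintf '%s="%s"', $_->nodeName, $_->value }
        $elem->findnodes('@*');
    my $text = join '', map { $_->nodeValue } $elem->findnodes('text()');
    $text =~ s/^\s+|\s+$//g;
    my $kids = join '', sort map { signature($_) } $elem->findnodes('*');
    my $name = $elem->nodeName;
    return "<$name" . ($attrs ne '' ? " $attrs" : '') . ">$text$kids</$name>";
}

# True when both XML strings have the same content, ignoring the order
# of sibling elements and of attributes.
sub same_ignoring_order {
    my ($left_xml, $right_xml) = @_;
    my $left  = XML::LibXML->load_xml(string => $left_xml);
    my $right = XML::LibXML->load_xml(string => $right_xml);
    return signature($left->documentElement) eq signature($right->documentElement);
}

my $first  = '<root> <data>first</data> <data>second</data> </root>';
my $second = '<root> <data>second</data> <data>first</data> </root>';
my $same   = same_ignoring_order($first, $second);
print $same ? "match\n" : "differ\n";
```

    Note that this sketch also discards the position of text relative to child elements, so it is only suitable for data-style XML, not for mixed content such as marked-up prose.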

    The module documentation does suggest the possibility of changing the way detected differences are handled which opens up the possibility that we could put our own spin on the matching process. But there are no examples and, without diving into the module's source code, how the handlers get hooked up is not at all clear.

    Bottom line: being very clear about what you mean is important both when asking questions and when documenting code. Well-constructed example code and data go a very long way toward helping with clarity. If what you want is the diff I describe, let us know and show us your current test code, because at present that's not the sort of answer you are getting!

    Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      Hello @GrandFather, I agree that my question was not very precise, and I will keep the points you mentioned in mind. The answer posted by Kan is what I actually expect. Expected outcome: the code should compare two XML files provided as input to the Perl script (ignoring the order of child elements) and should also print the differences along with line numbers. Thanks and regards, adikan123
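
      The expected outcome described above could be sketched roughly as follows. This is one possible approach under stated assumptions, not a tested solution from this thread: it assumes XML::LibXML is installed, and collect_keys and diff_ignoring_order are hypothetical names. Every element becomes a key (name path from the root, sorted attributes, own text), and baseline elements whose key has no counterpart in the other document are reported with their line number.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumptions: XML::LibXML is installed; collect_keys() and
# diff_ignoring_order() are hypothetical helper names.

# Record one entry per element: a key made of the name path from the
# root (without positions), the sorted attributes and the element's own
# trimmed text, plus the line on which the element was parsed.
sub collect_keys {
    my ($elem, $prefix, $out) = @_;
    my $path  = "$prefix/" . $elem->nodeName;
    my $attrs = join ' ',
        sort map { sprintf '%s="%s"', $_->nodeName, $_->value }
        $elem->findnodes('@*');
    my $text = join '', map { $_->nodeValue } $elem->findnodes('text()');
    $text =~ s/^\s+|\s+$//g;
    push @$out, {
        key  => sprintf("%s [%s] '%s'", $path, $attrs, $text),
        line => $elem->line_number,
    };
    collect_keys($_, $path, $out) for $elem->findnodes('*');
    return $out;
}

# Return one message per baseline element with no counterpart in the
# other document. Use "location => $file" instead of "string => ..."
# to read from files.
sub diff_ignoring_order {
    my ($base_xml, $other_xml) = @_;
    my $base_doc  = XML::LibXML->load_xml(string => $base_xml,  line_numbers => 1);
    my $other_doc = XML::LibXML->load_xml(string => $other_xml, line_numbers => 1);
    my $base  = collect_keys($base_doc->documentElement,  '', []);
    my $other = collect_keys($other_doc->documentElement, '', []);

    my %available;
    $available{ $_->{key} }++ for @$other;

    my @report;
    for my $entry (@$base) {
        if ($available{ $entry->{key} }) {
            $available{ $entry->{key} }--;    # consume one matching element
            next;
        }
        push @report, "line $entry->{line}: no match for $entry->{key}";
    }
    return @report;
}

my $baseline = "<root>\n  <data>first</data>\n  <data>second</data>\n</root>\n";
my $other    = '<root><data>second</data><data>third</data></root>';
print "$_\n" for diff_ignoring_order($baseline, $other);
```

      For this input the report should name the baseline element containing "first". The sketch reports only elements missing from the other file, not extras in it, and it does not track which parent instance a child belongs to when several parents share the same name path.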

        GrandFather already pointed out that if you ignore structural differences (like child ordering), you are after the semantic difference(s) between the two files. I haven't looked at the suggested modules, but just reading their names I guess that the one that will help you out the most is... I'll let you guess which one ;)

Re: Comparison of XML files ignoring ordering of child elements
by RonW (Vicar) on Jan 24, 2019 at 00:16 UTC
    One method would be to convert the XML into text and then compare the text files

    FYI, XML is text. If you expect the 2 XML files to be ordered and formatted identically, a simple text diff will work just fine. If you need to allow for formatting differences, you could filter the XML files through a "shallow parser", for example XML::Parser::REX, then do a text diff on the resulting files. But, this will only "normalize" the formatting of the tags. The content of attribute values and any "bare text" in a container element is left "as is". Any format normalizing of that content is outside the scope of XML.
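
    As a rough sketch of the "normalize the formatting, then diff as text" idea: the example below swaps in XML::LibXML for the shallow parser mentioned above, since that is the API I can vouch for, and normalize_xml is a hypothetical helper name. It re-serializes each document with whitespace-only text nodes dropped and uniform indentation, after which a plain string or line comparison ignores layout differences.

```perl
use strict;
use warnings;
use XML::LibXML;

# Assumptions: XML::LibXML stands in for the shallow parser mentioned
# above; normalize_xml() is a hypothetical helper name.

# Re-serialize a document with whitespace-only text nodes dropped and
# consistent indentation, so a text comparison ignores layout changes.
sub normalize_xml {
    my ($xml) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml, no_blanks => 1);
    return $doc->toString(1);
}

my $compact  = '<root><data>first</data><data>second</data></root>';
my $indented = "<root>\n    <data>first</data>\n\n    <data>second</data>\n</root>";

my $same = normalize_xml($compact) eq normalize_xml($indented);
print $same ? "same after normalizing\n" : "still different\n";
```

    This remains order-sensitive: two documents whose child elements are merely reordered will still compare as different after normalizing.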

Node Type: perlquestion [id://1228677]
Approved by Athanasius
Front-paged by haukex