http://www.perlmonks.org?node_id=986065


in reply to Re^5: XML::Twig traversing tree and storing in an array
in thread XML::Twig traversing tree and storing in an array

Thanks again, I learned more. Regarding your first code post, where Main( @ARGV ); is used to accept multiple XML files on the command line: is the most practical way of placing the output from each file into a separate hash to duplicate the "for my $file" loop and specify $_[0], $_[1], etc.?
    for my $file( $_[0] ){
        # dd
        $filename = $file; ##
        eval {
            $twig->parsefile( $file );
            1;
        } or warn "ERROR parsefile($file): $@ ";
    }

Re^7: XML::Twig traversing tree and storing in an array
by jccunning (Acolyte) on Aug 07, 2012 at 20:51 UTC
    Here is what I am trying to achieve: take the C++ API that Doxygen generates as Perl module output, convert it to XML, filter the XML, then put the old and new versions of the API into hashes or arrays so I can compare them and report what has changed in each class, namespace, etc., and which new classes have been added. I am still working on comparison ideas. Below is what I have so far. There must be a better way to put the output of the test files into separate hashes and arrays than duplicating everything as I did.
    #!/usr/bin/perl --
    #
    use strict;
    use warnings;
    use XML::Simple;
    use XML::Twig;
    use Data::Dump qw' dd ';

    my @panapi;
    my @allclasses;
    my @allfiles;
    my @allnsp;
    my %elements;

    # require "DoxyDocs.pm";
    # our $doxydocs;

    #########################################
    # Script takes DoxyDocs.pm converts to xml
    # then filters out unneeded tags from xml
    # then lists all classes, files, namespaces
    # and separates each class, file, and namespace
    # in hash on line with its related properties
    #
    # usage: apixml.pl xml1.xml xml2.xml > out.txt
    #
    #########################################

    # my $fh = 'xmlout.xml';
    # my $xs = new XML::Simple(RootName => "panoply");
    # add the NoAttr => 1, option to convert attributes to elements
    # $xs->XMLout($doxydocs, XMLDecl => 1, OutputFile => $fh);

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my %oldfile;
        my %newfile;
        my %class;
        my $filename;

        # Handler for the old API file: store each element's XML by path.
        my $ssprint = sub {
            my( $twig, $elt ) = @_;
            push @{ $oldfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
            # push (@panapi, $elt->sprint); # store in array instead
            return;
        };

        # Handler for the new API file: same as above, but fills %newfile.
        my $s2sprint = sub {
            my( $twig1, $elt ) = @_;
            push @{ $newfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
            # push (@panapi, $elt->sprint); # store in array instead
            return;
        };

        my $twig = XML::Twig->new(
            ignore_elts => {
                brief            => 'discard',
                detailed         => 'discard',
                includes         => 'discard',
                included_by      => 'discard',
                reimplemented_by => 'discard',
            },
            pretty_print => 'indented',
            TwigHandlers => {
                'panoply/classes'    => $ssprint,
                'panoply/files'      => $ssprint,
                'panoply/namespaces' => $ssprint,
            },
        );

        my $twig1 = XML::Twig->new(
            ignore_elts => {
                brief            => 'discard',
                detailed         => 'discard',
                includes         => 'discard',
                included_by      => 'discard',
                reimplemented_by => 'discard',
            },
            pretty_print => 'indented',
            TwigHandlers => {
                'panoply/classes'    => $s2sprint,
                'panoply/files'      => $s2sprint,
                'panoply/namespaces' => $s2sprint,
            },
        );

        for my $file( $_[0] ){
            # dd
            $filename = $file; ##
            eval {
                $twig->parsefile( $file );
                1;
            } or warn "ERROR parsefile($file): $@ ";

            my $root  = $twig->root;
            my @class = $root->children( 'classes' );
            print "Previous version of API\n";
            foreach my $cls (@class) {
                my $clsname = $cls->{'att'}->{'name'};
                print "classes: $clsname\n";
                push (@allclasses, $clsname);
            }
            my @files = $root->children( 'files' );
            foreach my $file (@files) {
                my $filename = $file->{'att'}->{'name'};
                print "files: $filename\n";
                push (@allfiles, $filename);
            }
            my @namesp = $root->children( 'namespaces' );
            foreach my $nsp (@namesp) {
                my $name = $nsp->{'att'}->{'name'};
                print "namespaces: $name\n";
                push (@allnsp, $name);
            }
            $twig->purge;
        }
        dd \%oldfile;
        # dd \@panapi; # store in array instead

        for my $file1( $_[1] ){
            # dd
            $filename = $file1; ##
            eval {
                $twig1->parsefile( $file1 );
                1;
            } or warn "ERROR parsefile($file1): $@ ";

            my $root  = $twig1->root;
            my @class = $root->children( 'classes' );
            print "\n\nNew version of API\n";
            foreach my $cls (@class) {
                my $clsname = $cls->{'att'}->{'name'};
                print "classes: $clsname\n";
                push (@allclasses, $clsname);
            }
            my @files = $root->children( 'files' );
            foreach my $file (@files) {
                my $filename = $file->{'att'}->{'name'};
                print "files: $filename\n";
                push (@allfiles, $filename);
            }
            my @namesp = $root->children( 'namespaces' );
            foreach my $nsp (@namesp) {
                my $name = $nsp->{'att'}->{'name'};
                print "namespaces: $name\n";
                push (@allnsp, $name);
            }
            $twig1->purge;
        }
        dd \%newfile;
    }
    Test files:
    <?xml version='1.0' standalone='yes'?>
    <panoply>
      <classes name="Panoply::AccessLogic">
        <all_members name="accessLogic" protection="public"/>
        <all_members name="DDR_VIA_JTAG" protection="public" scope="Panoply::AccessLogic" virtualness="non_virtual" />
        <all_members name="SPR" protection="public"/>
        <brief></brief>
        <detailed>
          <doc type="text">Models the access logic</doc>
        </detailed>
        <includes name="AccessLogic.hpp" local="no" />
        <public_methods>
          <members name="handledErrors" const="yes" kind="function" protection="public" static="no">
          </members>
          <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
            <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
          </members>
        </public_methods>
        <public_typedefs>
          <members name="AccessLogicTypes" kind="enum" protection="public">
            <values name="IO_CF8_CFC">
            </values>
          </members>
          <members name="MJTAG" kind="enumvalue">
          </members>
        </public_typedefs>
      </classes>
      <classes name="Panoply::Details::ActionResult">
        <derived name="Panoply::Details::CPUIDActionResult"/>
        <includes name="PlatformAction.hpp" local="no" />
      </classes>
      <classes name="Panoply::Details::AddressIndex">
        <includes name="Panoply.hpp" local="no" />
      </classes>
      <files name="panoplydoc.hpp">
      </files>
      <files name="PanoplyExports.hpp">
        <defines>
          <members name="_SCL_SECURE_NO_WARNINGS" kind="define">
          </members>
        </defines>
      </files>
      <namespaces name="AMD">
        <namespaces name="AMD::RegisterDef" />
      </namespaces>
      <namespaces name="AMD::RegisterDef">
        <classes name="AMD::RegisterDef::BaseDevice" />
        <enums>
          <members name="PermissionLevel" kind="enum" protection="public">
            <values name="PERMISSION_PUBLIC" initializer=" 10">
            </values>
            <values name="PERMISSION_NDA" initializer=" 20">
            </values>
          </members>
          <members name="RegisterType" kind="enum" protection="public">
            <values name="REGISTER_PCI" initializer=" 0">
            </values>
            <values name="REGISTER_MSR" initializer=" 1">
            </values>
          </members>
        </enums>
        <functions>
          <members name="Compare" const="no" kind="function" protection="public">
            <parameters declaration_name="first" type="T" />
            <parameters declaration_name="second" type="T" />
          </members>
        </functions>
      </namespaces>
    </panoply>
    File 2:
    <?xml version='1.0' standalone='yes'?>
    <panoply>
      <classes name="Panoply::AccessLogic">
        <all_members name="accessLogic" protection="public"/>
        <all_members name="DDR_VIA_JTAG" protection="public"/>
        <all_members name="SPR" protection="public" scope="Panoply::AccessLogic" />
        <all_members name="DBUS" protection="public"/>
        <brief></brief>
        <detailed>
          <doc type="text">Models the access logic</doc>
        </detailed>
        <includes name="AccessLogic.hpp" local="no" />
        <public_methods>
          <members name="handledErrors" const="yes" kind="function" protection="public" static="no">
          </members>
          <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
            <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
            <parameters declaration_name="handledErrors" type="const HandledErrors &amp;" />
          </members>
        </public_methods>
        <public_typedefs>
          <members name="AccessLogicTypes" kind="enum" protection="public">
            <values name="IO_CF8_CFC">
            </values>
          </members>
          <members name="MJTAG" kind="enumvalue">
          </members>
        </public_typedefs>
      </classes>
      <classes name="Panoply::Details::ActionBatch">
        <detailed></detailed>
        <includes name="PlatformActionBatch.hpp" local="no" />
      </classes>
      <classes name="Panoply::Details::ActionResult">
        <derived name="Panoply::Details::CPUIDActionResult"/>
        <includes name="PlatformAction.hpp" local="no" />
      </classes>
      <classes name="Panoply::Details::AddressIndex">
        <includes name="Panoply.hpp" local="no" />
      </classes>
      <files name="panoplydoc.hpp">
      </files>
      <files name="PanoplyExports.hpp">
        <defines>
          <members name="NOMINMAX" kind="define" protection="public">
          </members>
          <members name="_SCL_SECURE_NO_WARNINGS" kind="define">
          </members>
        </defines>
      </files>
      <files name="common.hpp">
      </files>
      <namespaces name="AMD">
        <namespaces name="AMD::RegisterDef" />
      </namespaces>
      <namespaces name="AMD::RegisterDef">
        <classes name="AMD::RegisterDef::BaseDevice" />
        <enums>
          <members name="PermissionLevel" kind="enum" protection="public">
            <values name="PERMISSION_PUBLIC" initializer=" 10">
            </values>
            <values name="PERMISSION_NDA" initializer=" 20">
            </values>
          </members>
          <members name="RegisterType" kind="enum" protection="public">
            <values name="REGISTER_PCI" initializer=" 0">
            </values>
            <values name="REGISTER_MSR" initializer=" 1">
            </values>
          </members>
        </enums>
        <functions>
          <members name="Compare" const="no" kind="function" protection="public">
            <parameters declaration_name="first" type="T" />
            <parameters declaration_name="second" type="T" />
          </members>
        </functions>
      </namespaces>
      <namespaces name="AMD::RegisterDef::Buffalo">
        <classes name="AMD::RegisterDef::Buffalo::BaseEntity" />
      </namespaces>
    </panoply>

      take the C++ API that Doxygen generates as Perl module output, convert it to XML, filter the XML, then put the old and new versions of the API into hashes or arrays so I can compare them and report what has changed in each class, namespace, etc., and which new classes have been added.

      Um, how about you forget about XML altogether?

      You swap one giant tree-type structure for another, with XML on top. XML complicates it; it doesn't simplify :)

      Below is what I have so far. There must be a better way to put the output of the test files into separate hashes and arrays than duplicating everything as I did.

      Well, the code I gave you already did that. Sure, the arrays were stuffed in a hash, and the hashes were stuffed in another hash, but it's all there. I even showed you how to retrieve any part you want and, say, put it into any hash you want.
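A minimal sketch of that layout, assuming the same panoply/classes, panoply/files and panoply/namespaces handlers used in the script above: a single loop fills one hash keyed by file name, and each file's data is then copied out as its own hash. The names parse_api_files, %api and %one_file are hypothetical, invented for this illustration, and only two of the ignored elements are listed to keep it short.

```perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: parse every file named in @_ and return
# { file name => { element path => [ XML strings ] } }.
sub parse_api_files {
    my %api;
    for my $file (@_) {
        # The handler closes over $file, so one handler definition
        # serves every file and no loop has to be duplicated.
        my $store = sub {
            my ( $twig, $elt ) = @_;
            push @{ $api{$file}{ $elt->path } }, $elt->sprint;
            return;
        };
        my $twig = XML::Twig->new(
            ignore_elts   => { brief => 'discard', detailed => 'discard' },
            pretty_print  => 'indented',
            twig_handlers => {
                'panoply/classes'    => $store,
                'panoply/files'      => $store,
                'panoply/namespaces' => $store,
            },
        );
        eval { $twig->parsefile($file); 1 }
            or warn "ERROR parsefile($file): $@";
    }
    return \%api;
}

if (@ARGV) {
    my $api = parse_api_files(@ARGV);
    for my $file (@ARGV) {
        my %one_file = %{ $api->{$file} || {} };    # a separate hash per file
        my $classes  = $one_file{'/panoply/classes'} || [];
        printf "%s: %d classes\n", $file, scalar @$classes;
    }
}
```

With two files on the command line, $api->{ $ARGV[0] } and $api->{ $ARGV[1] } play the roles of %oldfile and %newfile.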

      IMHO, a better idea is to ditch XML and learn to work with complex data structures (references) using straight Perl, or use Data::Diver, or even JSON::Path, to get at your data. You'll have to wrap your head around it one way or another, so you might as well do it now, without the headache of XML.
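To illustrate the straight-Perl route, here is a minimal sketch that walks nested hashrefs and arrayrefs recursively and flattens them into path => value pairs. The sample data only mimics the shape of the test files above; the real layout of $doxydocs is an assumption, and flatten is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: flatten nested hashrefs/arrayrefs into
# a flat hash of "/path/to/leaf" => scalar value.
sub flatten {
    my ( $data, $prefix, $out ) = @_;
    $prefix = '' unless defined $prefix;
    $out ||= {};
    if ( ref $data eq 'HASH' ) {
        flatten( $data->{$_}, "$prefix/$_", $out ) for sort keys %$data;
    }
    elsif ( ref $data eq 'ARRAY' ) {
        flatten( $data->[$_], "$prefix/$_", $out ) for 0 .. $#$data;
    }
    else {
        $out->{$prefix} = $data;    # a leaf: plain scalar
    }
    return $out;
}

# Sample data shaped like one <classes> element of the test XML (assumed layout).
my $sample = {
    classes => [
        {
            name        => 'Panoply::AccessLogic',
            all_members => [
                { name => 'accessLogic', protection => 'public' },
                { name => 'SPR',         protection => 'public' },
            ],
        },
    ],
};

my $flat = flatten($sample);
print "$_ = $flat->{$_}\n" for sort keys %$flat;
```

Array indices in the paths shift whenever an element is inserted, so for a real comparison it is probably better to key each level by the element's name rather than its position.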

      You might start by writing a single function called getClasses that traverses $doxydocs and pulls out classes (hashrefs), then getMembers, getParameters ... then a function called diffClasses that does the diff, or it might utilize diffMembers / diffParameters ... or even Data::Difference / Data::KeyDiff
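The function-per-level idea might look roughly like this. It is a sketch under the assumption that $doxydocs holds a classes arrayref of hashrefs with name and all_members keys, mirroring the test XML; check the real DoxyDocs.pm before relying on that. by_name, get_classes, get_members, diff_names and diff_classes are hypothetical names.

```perl
use strict;
use warnings;

# Index an arrayref of hashrefs by each element's 'name' key.
sub by_name {
    my ($list) = @_;
    my %index = map { ( $_->{name} => $_ ) } @{ $list || [] };
    return \%index;
}

sub get_classes { my ($docs)  = @_; return by_name( $docs->{classes} ) }
sub get_members { my ($class) = @_; return by_name( $class->{all_members} ) }

# Compare the keys of two hashrefs: what was added, removed, or kept.
sub diff_names {
    my ( $old, $new ) = @_;
    my %diff = (
        added   => [ sort grep { !exists $old->{$_} } keys %$new ],
        removed => [ sort grep { !exists $new->{$_} } keys %$old ],
        common  => [ sort grep { exists $new->{$_} } keys %$old ],
    );
    return \%diff;
}

sub diff_classes {
    my ( $old_docs, $new_docs ) = @_;
    my ( $old, $new ) = ( get_classes($old_docs), get_classes($new_docs) );
    my $diff = diff_names( $old, $new );

    # For classes present in both versions, diff their members too.
    for my $name ( @{ $diff->{common} } ) {
        $diff->{members}{$name} =
            diff_names( get_members( $old->{$name} ), get_members( $new->{$name} ) );
    }
    return $diff;
}

# Tiny stand-ins for the old and new API data (assumed layout).
my $old_docs = { classes => [
    { name => 'Panoply::AccessLogic', all_members => [ { name => 'SPR' } ] },
] };
my $new_docs = { classes => [
    { name => 'Panoply::AccessLogic',
      all_members => [ { name => 'SPR' }, { name => 'DBUS' } ] },
    { name => 'Panoply::Details::ActionBatch' },
] };

my $diff = diff_classes( $old_docs, $new_docs );
print "New classes: @{ $diff->{added} }\n";
print "New members of $_: @{ $diff->{members}{$_}{added} }\n"
    for @{ $diff->{common} };
```

The same diff_names routine could be reused for parameters, files, or namespaces once each has its own get-function.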

      See do, Including files, and Re^5: Evaluating subroutines from within data for a better way to load DoxyDocs.pm into your program.
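As a rough sketch of the do approach: this assumes DoxyDocs.pm assigns its data to a package variable named $doxydocs (as the commented-out require and our lines in the script above suggest) and that the calling code runs in package main. load_doxydocs is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: run a DoxyDocs.pm-style file with do and
# return the data structure it assigned to $doxydocs.
sub load_doxydocs {
    my ($path) = @_;
    our $doxydocs;
    local $doxydocs;    # keep data from one file out of the next call

    my $ok = do $path;
    die "Cannot parse $path: $@" if $@;
    die "Cannot read $path: $!"  unless defined $ok;

    my $data = $doxydocs;    # copy the reference before local is undone
    die "$path did not set \$doxydocs" unless ref $data;
    return $data;
}

if ( @ARGV == 2 ) {
    # Pass absolute or ./-prefixed paths; otherwise do searches @INC.
    my ( $old, $new ) = map { load_doxydocs($_) } @ARGV;
    printf "old: %d classes, new: %d classes\n",
        scalar @{ $old->{classes} || [] },
        scalar @{ $new->{classes} || [] };
}
```

Because each call returns its own reference, the old and new versions can be loaded side by side without renaming anything inside the generated files.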

      Good luck