Re^5: XML::Twig traversing tree and storing in an array

by Anonymous Monk
on Aug 07, 2012 at 07:44 UTC


in reply to Re^4: XML::Twig traversing tree and storing in an array
in thread XML::Twig traversing tree and storing in an array

Tried variations of ....

Why not post what you tried?

    #!/usr/bin/perl --
    use strict;
    use warnings;

    my $foops = {
        "xmlout.xml" => {
            "/panoply/classes" => [
                "\n <classes name=\"Panoply::AccessLogic\">\n <all_members name=\"accessLogic\" protection=\"public\" scope=\"Panoply::AccessLogic\" virtualness=\"non_virtual\"/>\n </classes>",
                "\n <classes name=\"Panoply::Details::ActionBatch\"></classes>",
            ],
            "/panoply/files" => [
                "\n <files name=\"AccessLogic.hpp\"></files>",
                "\n <files name=\"BaseDevice.h\"></files>",
            ],
            "/panoply/namespaces" => [
                "\n <namespaces name=\"std\"></namespaces>",
            ],
        },
    };

    # Each level of the structure is a reference until you reach the strings
    print $foops, "\n";
    print $foops->{"xmlout.xml"}, "\n";
    print $foops->{"xmlout.xml"}{"/panoply/classes"}, "\n";
    print $foops->{"xmlout.xml"}{"/panoply/classes"}[0], "\n";
    print $foops->{"xmlout.xml"}{"/panoply/classes"}[1], "\n\n";

    # Data::Diver walks the same path in one call
    use Data::Diver qw( Dive );
    my $classes = Dive( $foops, "xmlout.xml", "/panoply/classes" );
    print $classes, "\n";
    print scalar @{$classes}, "\n";
    print @{$classes}, "\n\n";

    # Walk every file, then every branch stored for that file
    while( my( $filename, $datahash ) = each %$foops ){
        print "$filename => $datahash\n";
        while( my( $branchname, $brancharray ) = each %$datahash ){
            printf "%-20s => n(%3d) => %s\n",
                $branchname, scalar(@$brancharray), $brancharray;
        }
    }

    __END__
    HASH(0x9ad1bc)
    HASH(0x99a36c)
    ARRAY(0x3f8a8c)

     <classes name="Panoply::AccessLogic">
     <all_members name="accessLogic" protection="public" scope="Panoply::AccessLogic" virtualness="non_virtual"/>
     </classes>

     <classes name="Panoply::Details::ActionBatch"></classes>

    ARRAY(0x3f8a8c)
    2

     <classes name="Panoply::AccessLogic">
     <all_members name="accessLogic" protection="public" scope="Panoply::AccessLogic" virtualness="non_virtual"/>
     </classes>
     <classes name="Panoply::Details::ActionBatch"></classes>

    xmlout.xml => HASH(0x99a36c)
    /panoply/namespaces  => n(  1) => ARRAY(0x99a2fc)
    /panoply/files       => n(  2) => ARRAY(0x99a26c)
    /panoply/classes     => n(  2) => ARRAY(0x3f8a8c)

Tutorials / Data Types and Variables / Data Type: Array / Data Type: Hash / References quick reference


Re^6: XML::Twig traversing tree and storing in an array
by jccunning (Acolyte) on Aug 07, 2012 at 19:33 UTC
    Thanks again, I learned more. Regarding your first code post, where Main( @ARGV ); is used to accept multiple XML files on the command line: is the most practical way of placing the output from each file into a separate hash to duplicate the "for my $file" loop and specify $_[0], $_[1], etc.?
        for my $file( $_[0] ){
            # dd
            $filename = $file; ##
            eval {
                $twig->parsefile( $file );
                1;
            } or warn "ERROR parsefile($file): $@ ";
        }
      Here is what I am trying to achieve: take the C++ API documentation that Doxygen generates as Perl module output, convert it to XML, filter the XML, then put the old and new versions of the API into hashes or arrays so I can compare them and report what has changed in each class, namespace, etc., and which new classes have been added. I am still working on comparison ideas. Below is what I have so far. There must be a better way to put the output of the test files into separate hashes and arrays than duplicating everything like I did.
          #!/usr/bin/perl --
          #
          use strict;
          use warnings;
          use XML::Simple;
          use XML::Twig;
          use Data::Dump qw' dd ';

          my @panapi;
          my @allclasses;
          my @allfiles;
          my @allnsp;
          my %elements;

          # require "DoxyDocs.pm";
          # our $doxydocs;

          #########################################
          # Script takes DoxyDocs.pm converts to xml
          # then filters out unneeded tags from xml
          # then lists all classes, files, namespaces
          # and separates each class, file, and namespace
          # in hash on line with its related properties
          #
          # usage: apixml.pl xml1.xml xml2.xml > out.txt
          #
          #########################################

          # my $fh = 'xmlout.xml';
          # my $xs = new XML::Simple(RootName => "panoply"); # add the NoAttr => 1, option to convert attributes to elements
          # $xs->XMLout($doxydocs, XMLDecl => 1, OutputFile => $fh);

          Main( @ARGV );
          exit( 0 );

          sub Main {
              my %oldfile;
              my %newfile;
              my %class;
              my $filename;

              # Handlers take the element as $elt; "my $_" no longer compiles on perl 5.24+
              my $ssprint = sub {
                  my( $twig, $elt ) = @_;
                  push @{ $oldfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
                  # push (@panapi, $elt->sprint); # store in array instead
                  return;
              };
              my $s2sprint = sub {
                  my( $twig1, $elt ) = @_;
                  push @{ $newfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
                  # push (@panapi, $elt->sprint); # store in array instead
                  return;
              };

              my $twig = XML::Twig->new(
                  ignore_elts => { brief => 'discard', detailed => 'discard', includes => 'discard', included_by => 'discard', reimplemented_by => 'discard' },
                  pretty_print => 'indented',
                  TwigHandlers => {
                      'panoply/classes'    => $ssprint,
                      'panoply/files'      => $ssprint,
                      'panoply/namespaces' => $ssprint,
                  },
              );
              my $twig1 = XML::Twig->new(
                  ignore_elts => { brief => 'discard', detailed => 'discard', includes => 'discard', included_by => 'discard', reimplemented_by => 'discard' },
                  pretty_print => 'indented',
                  TwigHandlers => {
                      'panoply/classes'    => $s2sprint,
                      'panoply/files'      => $s2sprint,
                      'panoply/namespaces' => $s2sprint,
                  },
              );

              for my $file( $_[0] ){
                  # dd
                  $filename = $file; ##
                  eval {
                      $twig->parsefile( $file );
                      1;
                  } or warn "ERROR parsefile($file): $@ \n";
                  my $root = $twig->root;
                  my @class = $root->children( 'classes' );
                  print "Previous version of API\n";
                  foreach my $cls (@class) {
                      my $clsname = $cls->{'att'}->{'name'};
                      print "classes: $clsname\n";
                      push (@allclasses, $clsname);
                  }
                  my @files = $root->children( 'files' );
                  foreach my $file (@files) {
                      my $filename = $file->{'att'}->{'name'};
                      print "files: $filename\n";
                      push (@allfiles, $filename);
                  }
                  my @namesp = $root->children( 'namespaces' );
                  foreach my $nsp (@namesp) {
                      my $name = $nsp->{'att'}->{'name'};
                      print "namespaces: $name\n";
                      push (@allnsp, $name);
                  }
                  $twig->purge;
              }
              dd \%oldfile;
              # dd \@panapi; # store in array instead

              for my $file1( $_[1] ){
                  # dd
                  $filename = $file1; ##
                  eval {
                      $twig1->parsefile( $file1 );
                      1;
                  } or warn "ERROR parsefile($file1): $@ \n";
                  my $root = $twig1->root;
                  my @class = $root->children( 'classes' );
                  print "\n\nNew version of API\n";
                  foreach my $cls (@class) {
                      my $clsname = $cls->{'att'}->{'name'};
                      print "classes: $clsname\n";
                      push (@allclasses, $clsname);
                  }
                  my @files = $root->children( 'files' );
                  foreach my $file (@files) {
                      my $filename = $file->{'att'}->{'name'};
                      print "files: $filename\n";
                      push (@allfiles, $filename);
                  }
                  my @namesp = $root->children( 'namespaces' );
                  foreach my $nsp (@namesp) {
                      my $name = $nsp->{'att'}->{'name'};
                      print "namespaces: $name\n";
                      push (@allnsp, $name);
                  }
                  $twig1->purge;
              }
              dd \%newfile;
          }
      Test files:
          <?xml version='1.0' standalone='yes'?>
          <panoply>
            <classes name="Panoply::AccessLogic">
              <all_members name="accessLogic" protection="public"/>
              <all_members name="DDR_VIA_JTAG" protection="public" scope="Panoply::AccessLogic" virtualness="non_virtual" />
              <all_members name="SPR" protection="public"/>
              <brief></brief>
              <detailed>
                <doc type="text">Models the access logic</doc>
              </detailed>
              <includes name="AccessLogic.hpp" local="no" />
              <public_methods>
                <members name="handledErrors" const="yes" kind="function" protection="public" static="no"> </members>
                <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
                  <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
                </members>
              </public_methods>
              <public_typedefs>
                <members name="AccessLogicTypes" kind="enum" protection="public">
                  <values name="IO_CF8_CFC"> </values>
                </members>
                <members name="MJTAG" kind="enumvalue"> </members>
              </public_typedefs>
            </classes>
            <classes name="Panoply::Details::ActionResult">
              <derived name="Panoply::Details::CPUIDActionResult"/>
              <includes name="PlatformAction.hpp" local="no" />
            </classes>
            <classes name="Panoply::Details::AddressIndex">
              <includes name="Panoply.hpp" local="no" />
            </classes>
            <files name="panoplydoc.hpp"> </files>
            <files name="PanoplyExports.hpp">
              <defines>
                <members name="_SCL_SECURE_NO_WARNINGS" kind="define"> </members>
              </defines>
            </files>
            <namespaces name="AMD">
              <namespaces name="AMD::RegisterDef" />
            </namespaces>
            <namespaces name="AMD::RegisterDef">
              <classes name="AMD::RegisterDef::BaseDevice" />
              <enums>
                <members name="PermissionLevel" kind="enum" protection="public">
                  <values name="PERMISSION_PUBLIC" initializer=" 10"> </values>
                  <values name="PERMISSION_NDA" initializer=" 20"> </values>
                </members>
                <members name="RegisterType" kind="enum" protection="public">
                  <values name="REGISTER_PCI" initializer=" 0"> </values>
                  <values name="REGISTER_MSR" initializer=" 1"> </values>
                </members>
              </enums>
              <functions>
                <members name="Compare" const="no" kind="function" protection="public">
                  <parameters declaration_name="first" type="T" />
                  <parameters declaration_name="second" type="T" />
                </members>
              </functions>
            </namespaces>
          </panoply>
      File 2:
          <?xml version='1.0' standalone='yes'?>
          <panoply>
            <classes name="Panoply::AccessLogic">
              <all_members name="accessLogic" protection="public"/>
              <all_members name="DDR_VIA_JTAG" protection="public"/>
              <all_members name="SPR" protection="public" scope="Panoply::AccessLogic" />
              <all_members name="DBUS" protection="public"/>
              <brief></brief>
              <detailed>
                <doc type="text">Models the access logic</doc>
              </detailed>
              <includes name="AccessLogic.hpp" local="no" />
              <public_methods>
                <members name="handledErrors" const="yes" kind="function" protection="public" static="no"> </members>
                <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
                  <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
                  <parameters declaration_name="handledErrors" type="const HandledErrors &amp;" />
                </members>
              </public_methods>
              <public_typedefs>
                <members name="AccessLogicTypes" kind="enum" protection="public">
                  <values name="IO_CF8_CFC"> </values>
                </members>
                <members name="MJTAG" kind="enumvalue"> </members>
              </public_typedefs>
            </classes>
            <classes name="Panoply::Details::ActionBatch">
              <detailed></detailed>
              <includes name="PlatformActionBatch.hpp" local="no" />
            </classes>
            <classes name="Panoply::Details::ActionResult">
              <derived name="Panoply::Details::CPUIDActionResult"/>
              <includes name="PlatformAction.hpp" local="no" />
            </classes>
            <classes name="Panoply::Details::AddressIndex">
              <includes name="Panoply.hpp" local="no" />
            </classes>
            <files name="panoplydoc.hpp"> </files>
            <files name="PanoplyExports.hpp">
              <defines>
                <members name="NOMINMAX" kind="define" protection="public"> </members>
                <members name="_SCL_SECURE_NO_WARNINGS" kind="define"> </members>
              </defines>
            </files>
            <files name="common.hpp"> </files>
            <namespaces name="AMD">
              <namespaces name="AMD::RegisterDef" />
            </namespaces>
            <namespaces name="AMD::RegisterDef">
              <classes name="AMD::RegisterDef::BaseDevice" />
              <enums>
                <members name="PermissionLevel" kind="enum" protection="public">
                  <values name="PERMISSION_PUBLIC" initializer=" 10"> </values>
                  <values name="PERMISSION_NDA" initializer=" 20"> </values>
                </members>
                <members name="RegisterType" kind="enum" protection="public">
                  <values name="REGISTER_PCI" initializer=" 0"> </values>
                  <values name="REGISTER_MSR" initializer=" 1"> </values>
                </members>
              </enums>
              <functions>
                <members name="Compare" const="no" kind="function" protection="public">
                  <parameters declaration_name="first" type="T" />
                  <parameters declaration_name="second" type="T" />
                </members>
              </functions>
            </namespaces>
            <namespaces name="AMD::RegisterDef::Buffalo">
              <classes name="AMD::RegisterDef::Buffalo::BaseEntity" />
            </namespaces>
          </panoply>

        Take C++ API generated from Doxygen into perl module output, convert to xml, filter xml, then put old and new version of API into hashes or arrays so I can compare and report what has changed in each class, namespace, etc. and new classes that have been added.

        Um, how about you forget about XML altogether?

        You swap one giant tree-type structure for another, with XML on top. XML complicates it; it doesn't simplify :)

        What I have so far. Must be better way to put output of test files into separate hashes and arrays instead of duplicating everything like I did.

        Well, the code I gave you already did that. Sure, the arrays were stuffed in a hash, and the hashes were stuffed in another hash, but it's all there. I even showed you how to retrieve any part you want and, say, put it into any hash you want.
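
        As a rough sketch of that point: if the handlers key everything by filename, one loop over @ARGV is enough and a hash slice hands back one inner hash per file. The %by_file data and the old.xml / new.xml names below are made up for illustration; they only mimic the filename => path => array shape shown earlier in the thread.

```perl
use strict;
use warnings;

# ASSUMPTION: hand-written stand-in for what the twig handlers build,
# i.e. filename => { element path => [ serialized elements ] }
my %by_file = (
    'old.xml' => { '/panoply/classes' => [ '<classes name="A"/>' ] },
    'new.xml' => { '/panoply/classes' => [ '<classes name="A"/>', '<classes name="B"/>' ] },
);

my @files = ( 'old.xml', 'new.xml' );    # would be @ARGV in the real script

# A hash slice returns one inner hashref per file, in command-line order,
# so no per-file copy of the loop or the handlers is needed.
my ( $old, $new ) = @by_file{@files};

# Copy a single branch into a hash of your own if you want it separate.
my %old_classes = map { $_ => 1 } @{ $old->{'/panoply/classes'} };

for my $file (@files) {
    my $count = @{ $by_file{$file}{'/panoply/classes'} };
    print "$file: $count classes\n";
}
```

        The same idea should carry over to any number of files, since nothing in the loop depends on there being exactly two.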

        IMHO, a better idea is to ditch XML and learn to work with complex data structures (references) using straight Perl, or use Data::Diver, or even JSON::Path, to get at your data. You'll have to wrap your head around it one way or another; you might as well do it now, without the headache of XML.

        You might start by writing a single function called getClasses that traverses $doxydocs and pulls out classes (hashrefs), then getMembers, getParameters ... then a function called diffClasses that does the diff; it might use diffMembers / diffParameters ... or even Data::Difference / Data::KeyDiff.
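
        A minimal sketch of that getClasses / diffClasses idea in straight Perl. It assumes a much-simplified $doxydocs shape (a classes array of hashrefs, each with a name and an all_members array), modelled on the test files above; the real DoxyDocs.pm has many more keys. diffKeys is a hypothetical helper, not something from a module.

```perl
use strict;
use warnings;

# ASSUMPTION: simplified stand-ins for the old and new $doxydocs structures.
my $old_docs = { classes => [
    { name => 'Panoply::AccessLogic',
      all_members => [ { name => 'accessLogic' }, { name => 'SPR' } ] },
    { name => 'Panoply::Details::ActionResult', all_members => [] },
] };
my $new_docs = { classes => [
    { name => 'Panoply::AccessLogic',
      all_members => [ { name => 'accessLogic' }, { name => 'SPR' }, { name => 'DBUS' } ] },
    { name => 'Panoply::Details::ActionBatch' },
    { name => 'Panoply::Details::ActionResult', all_members => [] },
] };

# Index the classes of one structure by name: { class name => class hashref }
sub getClasses {
    my ($docs) = @_;
    return { map { $_->{name} => $_ } @{ $docs->{classes} || [] } };
}

# Member names of one class hashref (empty list if it has none)
sub getMembers {
    my ($class) = @_;
    return map { $_->{name} } @{ $class->{all_members} || [] };
}

# Hypothetical helper: compare the keys of two hashrefs,
# returning ( \@added, \@removed ), each sorted
sub diffKeys {
    my ( $old, $new ) = @_;
    my @added   = sort grep { !exists $old->{$_} } keys %$new;
    my @removed = sort grep { !exists $new->{$_} } keys %$old;
    return ( \@added, \@removed );
}

# Report added/removed classes, plus member changes for classes in both
sub diffClasses {
    my ( $old_docs, $new_docs ) = @_;
    my ( $old, $new ) = ( getClasses($old_docs), getClasses($new_docs) );
    my ( $added, $removed ) = diffKeys( $old, $new );
    my %report = ( added => $added, removed => $removed, changed => {} );
    for my $name ( sort grep { exists $new->{$_} } keys %$old ) {
        my %o = map { $_ => 1 } getMembers( $old->{$name} );
        my %n = map { $_ => 1 } getMembers( $new->{$name} );
        my ( $m_added, $m_removed ) = diffKeys( \%o, \%n );
        $report{changed}{$name} = { added => $m_added, removed => $m_removed }
            if @$m_added || @$m_removed;
    }
    return \%report;
}

my $report = diffClasses( $old_docs, $new_docs );
print "added class: $_\n"   for @{ $report->{added} };
print "removed class: $_\n" for @{ $report->{removed} };
for my $name ( sort keys %{ $report->{changed} } ) {
    my $c = $report->{changed}{$name};
    print "$name: +[@{ $c->{added} }] -[@{ $c->{removed} }]\n";
}
```

        This only compares names. Comparing attributes such as protection or virtualness would need a deeper per-member diff, which is where something like Data::Difference might save work.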

        See do / Including files / Re^5: Evaluating subroutines from within data for a better how-to on loading DoxyDocs.pm into your program
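
        For what it's worth, here is one way loading with do might look. It assumes DoxyDocs.pm assigns to a package variable named $doxydocs and ends with a true value, as the commented-out require / our lines in your script suggest. The file written below is a tiny stand-in so the sketch runs on its own, and loadDocs is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use File::Spec;

# ASSUMPTION: a tiny stand-in for a Doxygen-generated DoxyDocs.pm,
# written to a temp dir so the sketch is self-contained.
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'DoxyDocs.pm' );
open my $fh, '>', $path or die "Cannot write $path: $!";
print {$fh} q{$doxydocs = { classes => [ { name => 'Panoply::AccessLogic' } ] }; 1;};
close $fh or die "Cannot close $path: $!";

# The file assigns to the package variable $main::doxydocs
our $doxydocs;

# Hypothetical helper: run the file with do, check for errors, and
# return whatever structure it left in $doxydocs. local keeps each
# load from clobbering the previous one, so old and new versions can
# live in separate lexicals.
sub loadDocs {
    my ($file) = @_;
    local $doxydocs;
    my $ok = do $file;
    die "Cannot parse $file: $@" if $@;
    die "Cannot read $file: $!"  unless defined $ok;
    return $doxydocs;
}

my $docs = loadDocs($path);
print "loaded class: $_->{name}\n" for @{ $docs->{classes} };
```

        Calling loadDocs once per version (old and new) would give two independent structures to hand to a diff function.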

        Good luck
