PerlMonks  

Re^7: XML::Twig traversing tree and storing in an array

by jccunning (Acolyte)
on Aug 07, 2012 at 20:51 UTC


in reply to Re^6: XML::Twig traversing tree and storing in an array
in thread XML::Twig traversing tree and storing in an array

Here is what I am trying to achieve: take the C++ API that Doxygen generates as Perl module output, convert it to XML, filter the XML, then put the old and new versions of the API into hashes or arrays so I can compare them and report what has changed in each class, namespace, etc., and which classes have been added. I am still working on comparison ideas. Here is what I have so far. There must be a better way to put the output of the test files into separate hashes and arrays instead of duplicating everything like I did.

#!/usr/bin/perl --
#
use strict;
use warnings;
use XML::Simple;
use XML::Twig;
use Data::Dump qw' dd ';

my @panapi;
my @allclasses;
my @allfiles;
my @allnsp;
my %elements;

# require "DoxyDocs.pm";
# our $doxydocs;

#########################################
# Script takes DoxyDocs.pm converts to xml
# then filters out unneeded tags from xml
# then lists all classes, files, namespaces
# and separates each class, file, and namespace
# in hash on line with its related properties
#
# usage: apixml.pl xml1.xml xml2.xml > out.txt
#
#########################################

# my $fh = 'xmlout.xml';
# my $xs = new XML::Simple(RootName => "panoply"); # add the NoAttr => 1, option to convert attributes to elements
# $xs->XMLout($doxydocs, XMLDecl => 1, OutputFile => $fh);

Main( @ARGV );
exit( 0 );

sub Main {
    my %oldfile;
    my %newfile;
    my %class;
    my $filename;

    # Handlers receive ( $twig, $element ). The element is taken as $elt
    # because "my $_" is not accepted by current versions of Perl.
    my $ssprint = sub {
        my( $twig, $elt ) = @_;
        push @{ $oldfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
        # push (@panapi, $elt->sprint); # store in array instead
        return;
    };

    my $s2sprint = sub {
        my( $twig1, $elt ) = @_;
        push @{ $newfile{ $filename }{ $elt->path } }, $elt->sprint; # store in hash
        # push (@panapi, $elt->sprint); # store in array instead
        return;
    };

    # Parser for the old version of the API
    my $twig = XML::Twig->new(
        ignore_elts => {
            brief            => 'discard',
            detailed         => 'discard',
            includes         => 'discard',
            included_by      => 'discard',
            reimplemented_by => 'discard',
        },
        pretty_print => 'indented',
        TwigHandlers => {
            'panoply/classes'    => $ssprint,
            'panoply/files'      => $ssprint,
            'panoply/namespaces' => $ssprint,
        },
    );

    # Parser for the new version of the API
    my $twig1 = XML::Twig->new(
        ignore_elts => {
            brief            => 'discard',
            detailed         => 'discard',
            includes         => 'discard',
            included_by      => 'discard',
            reimplemented_by => 'discard',
        },
        pretty_print => 'indented',
        TwigHandlers => {
            'panoply/classes'    => $s2sprint,
            'panoply/files'      => $s2sprint,
            'panoply/namespaces' => $s2sprint,
        },
    );

    for my $file( $_[0] ){
        # dd
        $filename = $file; ##
        eval {
            $twig->parsefile( $file );
            1;
        } or warn "ERROR parsefile($file): $@ ";

        my $root = $twig->root;

        my @class = $root->children( 'classes' );
        print "Previous version of API\n";
        foreach my $cls (@class) {
            my $clsname = $cls->att('name');
            print "classes: $clsname\n";
            push (@allclasses, $clsname);
        }

        my @files = $root->children( 'files' );
        foreach my $file (@files) {
            my $filename = $file->att('name');
            print "files: $filename\n";
            push (@allfiles, $filename);
        }

        my @namesp = $root->children( 'namespaces' );
        foreach my $nsp (@namesp) {
            my $name = $nsp->att('name');
            print "namespaces: $name\n";
            push (@allnsp, $name);
        }

        $twig->purge;
    }

    dd \%oldfile;
    # dd \@panapi; # store in array instead

    for my $file1( $_[1] ){
        # dd
        $filename = $file1; ##
        eval {
            $twig1->parsefile( $file1 );
            1;
        } or warn "ERROR parsefile($file1): $@ ";

        my $root = $twig1->root;

        my @class = $root->children( 'classes' );
        print "\n\nNew version of API\n";
        foreach my $cls (@class) {
            my $clsname = $cls->att('name');
            print "classes: $clsname\n";
            push (@allclasses, $clsname);
        }

        my @files = $root->children( 'files' );
        foreach my $file (@files) {
            my $filename = $file->att('name');
            print "files: $filename\n";
            push (@allfiles, $filename);
        }

        my @namesp = $root->children( 'namespaces' );
        foreach my $nsp (@namesp) {
            my $name = $nsp->att('name');
            print "namespaces: $name\n";
            push (@allnsp, $name);
        }

        $twig1->purge;
    }

    dd \%newfile;
}
Test files:
<?xml version='1.0' standalone='yes'?>
<panoply>
  <classes name="Panoply::AccessLogic">
    <all_members name="accessLogic" protection="public"/>
    <all_members name="DDR_VIA_JTAG" protection="public" scope="Panoply::AccessLogic" virtualness="non_virtual" />
    <all_members name="SPR" protection="public"/>
    <brief></brief>
    <detailed>
      <doc type="text">Models the access logic</doc>
    </detailed>
    <includes name="AccessLogic.hpp" local="no" />
    <public_methods>
      <members name="handledErrors" const="yes" kind="function" protection="public" static="no">
      </members>
      <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
        <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
      </members>
    </public_methods>
    <public_typedefs>
      <members name="AccessLogicTypes" kind="enum" protection="public">
        <values name="IO_CF8_CFC">
        </values>
      </members>
      <members name="MJTAG" kind="enumvalue">
      </members>
    </public_typedefs>
  </classes>
  <classes name="Panoply::Details::ActionResult">
    <derived name="Panoply::Details::CPUIDActionResult"/>
    <includes name="PlatformAction.hpp" local="no" />
  </classes>
  <classes name="Panoply::Details::AddressIndex">
    <includes name="Panoply.hpp" local="no" />
  </classes>
  <files name="panoplydoc.hpp">
  </files>
  <files name="PanoplyExports.hpp">
    <defines>
      <members name="_SCL_SECURE_NO_WARNINGS" kind="define">
      </members>
    </defines>
  </files>
  <namespaces name="AMD">
    <namespaces name="AMD::RegisterDef" />
  </namespaces>
  <namespaces name="AMD::RegisterDef">
    <classes name="AMD::RegisterDef::BaseDevice" />
    <enums>
      <members name="PermissionLevel" kind="enum" protection="public">
        <values name="PERMISSION_PUBLIC" initializer=" 10">
        </values>
        <values name="PERMISSION_NDA" initializer=" 20">
        </values>
      </members>
      <members name="RegisterType" kind="enum" protection="public">
        <values name="REGISTER_PCI" initializer=" 0">
        </values>
        <values name="REGISTER_MSR" initializer=" 1">
        </values>
      </members>
    </enums>
    <functions>
      <members name="Compare" const="no" kind="function" protection="public">
        <parameters declaration_name="first" type="T" />
        <parameters declaration_name="second" type="T" />
      </members>
    </functions>
  </namespaces>
</panoply>
File 2:
<?xml version='1.0' standalone='yes'?>
<panoply>
  <classes name="Panoply::AccessLogic">
    <all_members name="accessLogic" protection="public"/>
    <all_members name="DDR_VIA_JTAG" protection="public"/>
    <all_members name="SPR" protection="public" scope="Panoply::AccessLogic" />
    <all_members name="DBUS" protection="public"/>
    <brief></brief>
    <detailed>
      <doc type="text">Models the access logic</doc>
    </detailed>
    <includes name="AccessLogic.hpp" local="no" />
    <public_methods>
      <members name="handledErrors" const="yes" kind="function" protection="public" static="no">
      </members>
      <members name="AccessLogic" const="no" kind="function" protection="public" static="no">
        <parameters declaration_name="accessLogic" type="AccessLogicTypes" />
        <parameters declaration_name="handledErrors" type="const HandledErrors &amp;" />
      </members>
    </public_methods>
    <public_typedefs>
      <members name="AccessLogicTypes" kind="enum" protection="public">
        <values name="IO_CF8_CFC">
        </values>
      </members>
      <members name="MJTAG" kind="enumvalue">
      </members>
    </public_typedefs>
  </classes>
  <classes name="Panoply::Details::ActionBatch">
    <detailed></detailed>
    <includes name="PlatformActionBatch.hpp" local="no" />
  </classes>
  <classes name="Panoply::Details::ActionResult">
    <derived name="Panoply::Details::CPUIDActionResult"/>
    <includes name="PlatformAction.hpp" local="no" />
  </classes>
  <classes name="Panoply::Details::AddressIndex">
    <includes name="Panoply.hpp" local="no" />
  </classes>
  <files name="panoplydoc.hpp">
  </files>
  <files name="PanoplyExports.hpp">
    <defines>
      <members name="NOMINMAX" kind="define" protection="public">
      </members>
      <members name="_SCL_SECURE_NO_WARNINGS" kind="define">
      </members>
    </defines>
  </files>
  <files name="common.hpp">
  </files>
  <namespaces name="AMD">
    <namespaces name="AMD::RegisterDef" />
  </namespaces>
  <namespaces name="AMD::RegisterDef">
    <classes name="AMD::RegisterDef::BaseDevice" />
    <enums>
      <members name="PermissionLevel" kind="enum" protection="public">
        <values name="PERMISSION_PUBLIC" initializer=" 10">
        </values>
        <values name="PERMISSION_NDA" initializer=" 20">
        </values>
      </members>
      <members name="RegisterType" kind="enum" protection="public">
        <values name="REGISTER_PCI" initializer=" 0">
        </values>
        <values name="REGISTER_MSR" initializer=" 1">
        </values>
      </members>
    </enums>
    <functions>
      <members name="Compare" const="no" kind="function" protection="public">
        <parameters declaration_name="first" type="T" />
        <parameters declaration_name="second" type="T" />
      </members>
    </functions>
  </namespaces>
  <namespaces name="AMD::RegisterDef::Buffalo">
    <classes name="AMD::RegisterDef::Buffalo::BaseEntity" />
  </namespaces>
</panoply>


Re^8: XML::Twig traversing tree and storing in an array
by Anonymous Monk on Aug 08, 2012 at 10:06 UTC

    Take C++ API generated from Doxygen into perl module output, convert to xml, filter xml, then put old and new version of API into hashes or arrays so I can compare and report what has changed in each class, namespace, etc. and new classes that have been added.

    Um, how about you forget about XML altogether?

    You swap one giant tree-type structure for another, with XML on top -- XML complicates things, it doesn't simplify them :)

    What I have so far. Must be better way to put output of test files into separate hashes and arrays instead of duplicating everything like I did.

    Well, the code I gave you already did that. Sure, the arrays were stuffed in a hash, and the hashes were stuffed in another hash, but it's all there. I even showed you how to retrieve any part you want and, say, put it into any hash you want.
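    For illustration, here is a minimal sketch of that hash-of-hashes-of-arrays layout. The data is hand-written stand-in data, not real XML::Twig output, and the file names `old.xml` and `new.xml` are hypothetical. Keying one hash by file name is what removes the need for a separate `%oldfile` and `%newfile`.

```perl
use strict;
use warnings;

# Hypothetical stand-in for what a single twig handler could collect:
#   $api{ file name }{ element path } = [ serialized elements ... ]
my %api;
push @{ $api{'old.xml'}{'/panoply/classes'} },
    '<classes name="A"/>', '<classes name="B"/>';
push @{ $api{'old.xml'}{'/panoply/files'} },
    '<files name="a.hpp"/>';
push @{ $api{'new.xml'}{'/panoply/classes'} },
    '<classes name="A"/>', '<classes name="B"/>', '<classes name="C"/>';

# Retrieve any part and put it into a hash or array of its own.
my %oldfile    = %{ $api{'old.xml'} };                       # shallow copy of one file's hash
my @newclasses = @{ $api{'new.xml'}{'/panoply/classes'} };   # copy of one array

# One loop handles every file, however many there are.
for my $file ( sort keys %api ) {
    my $count = scalar @{ $api{$file}{'/panoply/classes'} || [] };
    print "$file has $count classes\n";
}
```

    Because the file name is just another hash key, the same handler and the same loop can serve both versions of the API.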

    IMHO, a better idea is to ditch XML and learn to work with complex data structures (references) using straight Perl, or use Data::Diver, or even JSON::Path, to get at your data -- you'll have to wrap your head around it one way or another, so you might as well do it now, without the headache of XML.
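    As a rough idea of what "straight Perl" means here, the sketch below walks a nested structure by following a list of hash keys and array indexes. The `dive` helper is a hypothetical hand-rolled function, not the Data::Diver API, and the shape of `$doxydocs` is an assumption modelled on the test XML in the parent post; the real DoxyDocs.pm layout may differ.

```perl
use strict;
use warnings;

# Assumed fragment of the Doxygen data, shaped like the test XML.
my $doxydocs = {
    classes => [
        {
            name           => 'Panoply::AccessLogic',
            public_methods => {
                members => [
                    { name => 'handledErrors', kind => 'function' },
                    { name => 'AccessLogic',   kind => 'function' },
                ],
            },
        },
    ],
};

# Hypothetical helper: follow each step into a hash (by key) or an
# array (by index); return undef as soon as a step is missing.
sub dive {
    my ( $data, @steps ) = @_;
    for my $step (@steps) {
        if    ( ref $data eq 'HASH' )  { $data = $data->{$step} }
        elsif ( ref $data eq 'ARRAY' ) { $data = $data->[$step] }
        else                           { return undef }
        return undef unless defined $data;
    }
    return $data;
}

my $class  = dive( $doxydocs, 'classes', 0, 'name' );
my $method = dive( $doxydocs, qw( classes 0 public_methods members 1 name ) );
my $none   = dive( $doxydocs, qw( classes 5 name ) );    # no such class

print "class:  $class\n";
print "method: $method\n";
print "missing path gives undef\n" unless defined $none;
```

    The same lookups can be written directly, e.g. `$doxydocs->{classes}[0]{name}`; the helper only saves the existence checks along the way.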

    You might start by writing a single function called getClasses that traverses $doxydocs and pulls out classes (hashrefs), then getMembers, getParameters ... then a function called diffClasses that does the diff, or it might utilize diffMembers / diffParameters ... or even Data::Difference / Data::KeyDiff
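    A minimal sketch of the first and last of those functions, assuming `$doxydocs` has a top-level `classes` array of hashrefs that each carry a `name` key (an assumption based on the test XML, not a confirmed DoxyDocs.pm layout). It only reports added and removed class names; a diffMembers or diffParameters step would follow the same pattern one level down.

```perl
use strict;
use warnings;

# Return a hash ref mapping class name => class hashref.
# Assumes { classes => [ { name => ..., ... }, ... ] }.
sub getClasses {
    my ($docs) = @_;
    return { map { ( $_->{name} => $_ ) } @{ $docs->{classes} || [] } };
}

# Compare two name-keyed hashes and list the names found on one side only.
sub diffClasses {
    my ( $old, $new ) = @_;
    return {
        added   => [ sort grep { !exists $old->{$_} } keys %$new ],
        removed => [ sort grep { !exists $new->{$_} } keys %$old ],
    };
}

# Stand-in data using the class names from the two test files.
my $old_docs = {
    classes => [
        map { +{ name => $_ } } qw(
            Panoply::AccessLogic
            Panoply::Details::ActionResult
            Panoply::Details::AddressIndex
        )
    ],
};
my $new_docs = {
    classes => [
        map { +{ name => $_ } } qw(
            Panoply::AccessLogic
            Panoply::Details::ActionBatch
            Panoply::Details::ActionResult
            Panoply::Details::AddressIndex
        )
    ],
};

my $diff = diffClasses( getClasses($old_docs), getClasses($new_docs) );
print "added: $_\n"   for @{ $diff->{added} };
print "removed: $_\n" for @{ $diff->{removed} };
```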

    See do / Including files / Re^5: Evaluating subroutines from within data for a better way to load DoxyDocs.pm into your program.
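    A hedged sketch of loading such a file with `do`, using the error checks described in the `do` documentation. It assumes DoxyDocs.pm assigns to a package global `$doxydocs` and ends with a true value, as the commented-out `require "DoxyDocs.pm"; our $doxydocs;` lines in the parent post suggest. The `load_doxydocs` name is hypothetical, and a tiny stand-in file is written to a temporary directory so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use File::Spec;

# Write a tiny stand-in for DoxyDocs.pm (assumed format: assigns $doxydocs, ends in 1;).
my $dir  = tempdir( CLEANUP => 1 );
my $path = File::Spec->catfile( $dir, 'DoxyDocs.pm' );
open my $out, '>', $path or die "Cannot write $path: $!";
print {$out} q{$doxydocs = { classes => [ { name => 'Panoply::AccessLogic' } ] }; 1;};
close $out or die "Cannot close $path: $!";

our $doxydocs;    # the package global the loaded file assigns to

# Hypothetical loader: run the file with do() and tell a compile error,
# an unreadable file, and a file that sets nothing useful apart.
sub load_doxydocs {
    my ($file) = @_;
    local $doxydocs;               # start clean, restore the old value afterwards
    my $ok = do $file;
    die "Cannot parse $file: $@" if $@;
    die "Cannot read $file: $!"  unless defined $ok;
    die "$file did not set \$doxydocs" unless ref $doxydocs eq 'HASH';
    return $doxydocs;
}

my $docs = load_doxydocs($path);
print "loaded class: $docs->{classes}[0]{name}\n";
```

    Using `local` means each call returns its own structure, so the old and new DoxyDocs.pm can be loaded one after the other without the second overwriting the first.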

    Good luck
