Re: Finding max value from a unique tag from XML

by vagabonding electron (Hermit)
on Nov 24, 2012 at 16:35 UTC (#1005396)


in reply to Finding max value from a unique tag from XML

My one cent, using XML::Rules (written as an exercise for myself). The XML is the version repaired by remiah.

#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

my $xml = <<'XML';
<all>
  <doc>
    <date name="processingtime">2011-04-09T11:12:22.049Z</date>
    <str name="docuid">121422</str>
    <str name="title">ABC</str>
  </doc>
  <doc>
    <date name="processingtime">2012-04-09T11:12:22.049Z</date>
    <str name="docuid">13427</str>
    <str name="title">CDE</str>
  </doc>
  <doc>
    <date name="processingtime">2010-04-09T11:12:22.049Z</date>
    <str name="docuid">89822</str>
    <str name="title">LKK</str>
  </doc>
</all>
XML

my @rules = (
    'str'  => 'as array',
    'date' => 'as is',
    'doc'  => 'as array no content',
    'all'  => 'no content'
);

my $parser = XML::Rules->new(rules => \@rules);
my $data   = $parser->parse( $xml );

my $max_value = 0;
for my $chunk ( @{ $data->{all}{doc} } ) {
    for my $str ( @{ $chunk->{str} } ) {
        $str->{name} eq 'docuid'
            and $str->{_content} > $max_value
            and $max_value = $str->{_content};
    }
}
print "The max value is: $max_value\n";


Re^2: Finding max value from a unique tag from XML
by Jenda (Abbot) on Nov 25, 2012 at 03:46 UTC

    If all you want from the document is the maximal docuid, you can set the rules to give you exactly that:

    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings qw(uninitialized);
    use XML::Rules;

    my $xml = <<'XML';
    <all>
      <doc>
        <date name="processingtime">2011-04-09T11:12:22.049Z</date>
        <str name="docuid">121422</str>
        <str name="title">ABC</str>
      </doc>
      <doc>
        <date name="processingtime">2012-04-09T11:12:22.049Z</date>
        <str name="docuid">13427</str>
        <str name="title">CDE</str>
      </doc>
      <doc>
        <date name="processingtime">2010-04-09T11:12:22.049Z</date>
        <str name="docuid">89822</str>
        <str name="title">LKK</str>
      </doc>
    </all>
    XML

    my @rules = (
        'str' => sub {
            return unless $_[1]->{name} eq 'docuid';
            my $id = $_[1]->{_content};
            $_[4]->{pad} = $id if ($id > $_[4]->{pad});
            return;
        },
        'all' => sub {
            return $_[4]->{pad};
        }
    );

    my $parser    = XML::Rules->new(rules => \@rules);
    my $max_value = $parser->parse( $xml );
    print "The max value is: $max_value\n";

    Note that this takes the maximum over every <str> tag with the attribute name="docuid", wherever it occurs in the document: the rule does not check the "path" to the <str> tag!
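If the path does matter, one way to narrow the rule is to look at the enclosing tag names as well. The sketch below is only an illustration: is_doc_level_docuid is a hypothetical helper, and it assumes that the third handler argument is an array of enclosing tag names with the direct parent last. The calls at the bottom are simulated by hand and do not go through XML::Rules.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true only for <str name="docuid"> directly inside <doc>.
#   $attrs   - attribute hash of the current tag (second handler argument)
#   $context - array ref of enclosing tag names (third handler argument);
#              assumed here to list the direct parent last
sub is_doc_level_docuid {
    my ($attrs, $context) = @_;
    return 0 unless defined $attrs->{name} && $attrs->{name} eq 'docuid';
    return 0 unless @$context && $context->[-1] eq 'doc';
    return 1;
}

# Simulated calls standing in for what the parser would hand to a 'str' rule.
for my $case (
    [ { name => 'docuid', _content => 121422 }, [ 'all', 'doc'  ] ],
    [ { name => 'docuid', _content => 999999 }, [ 'all', 'meta' ] ],
    [ { name => 'title',  _content => 'ABC'  }, [ 'all', 'doc'  ] ],
) {
    my ($attrs, $context) = @$case;
    my $verdict = is_doc_level_docuid($attrs, $context) ? 'counted' : 'skipped';
    print join('/', @$context, 'str'), " ($attrs->{name}): $verdict\n";
}
```

Inside a real rule the same check would read the attributes from $_[1] and the enclosing tag names from $_[2].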

    Update: With version 1.16 and later it's easy to give the specific parser a more readable interface:

    use XML::Rules max_docuid => {
        method => 'parse',
        rules  => {
            'str' => sub {
                return unless $_[1]->{name} eq 'docuid';
                my $id = $_[1]->{_content};
                $_[4]->{pad} = $id if ($id > $_[4]->{pad});
                return;
            },
            'all' => sub {
                return $_[4]->{pad};
            }
        }
    };

    #...
    print "The max value is: " . max_docuid($xml) . "\n";

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

      Thank you Jenda!
      Could you please point me to the documentation of the $_[4] approach?

        $_[4] is the fifth element of @_, the array containing subroutine arguments. $_[4] is used in both subroutines in the @rules array, and it is used as a hash reference. The @rules array is passed to XML::Rules->new(...). I would expect XML::Rules to call one or both of the subroutines while parsing the XML document.

        And, lo and behold, XML::Rules does call the subroutines passed in @rules. The documentation clearly states that the fifth parameter is the parser object (a blessed hash reference), documented with the name $parser. It offers two workspaces, $parser->{'pad'} and $parser->{'parameters'}, to "store any data you need".
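To make the positional arguments easier to read, here is a minimal sketch of the same 'str' rule with @_ unpacked into named variables. It does not load XML::Rules: $fake_parser is a plain hash reference standing in for the real parser object, the variable names other than $parser are my own labels, and the loop simulates by hand the calls the parser would make.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same logic as the 'str' rule, with the five arguments given names.
# $_[1] is $attrs and $_[4] is $parser in the positional version.
my $str_rule = sub {
    my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
    return unless defined $attrs->{name} && $attrs->{name} eq 'docuid';
    my $id = $attrs->{_content};
    # Keep the largest docuid seen so far in the parser's scratch space.
    $parser->{pad} = $id
        if !defined $parser->{pad} || $id > $parser->{pad};
    return;
};

# Stand-in for the parser object: a plain hash ref with a 'pad' slot.
# (Assumption for this sketch; the real object comes from XML::Rules->new.)
my $fake_parser = { pad => undef };

# Simulate the parser calling the rule once per <str> tag.
for my $attrs (
    { name => 'docuid', _content => 121422 },
    { name => 'title',  _content => 'ABC'  },
    { name => 'docuid', _content => 13427  },
    { name => 'docuid', _content => 89822  },
) {
    $str_rule->('str', $attrs, [ 'all', 'doc' ], [], $fake_parser);
}

print "The max value is: $fake_parser->{pad}\n";
```

The explicit defined check replaces the "no warnings qw(uninitialized)" line in the original script, since the pad is empty on the first call.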

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
