PerlMonks  

Re: Finding max value from a unique tag from XML

by vagabonding electron (Hermit)
on Nov 24, 2012 at 16:35 UTC (#1005396)


in reply to Finding max value from a unique tag from XML

My 1 cent, using XML::Rules (done as an exercise for myself). The XML was repaired by remiah.

#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

my $xml = <<'XML';
<all>
  <doc>
    <date name="processingtime">2011-04-09T11:12:22.049Z</date>
    <str name="docuid">121422</str>
    <str name="title">ABC</str>
  </doc>
  <doc>
    <date name="processingtime">2012-04-09T11:12:22.049Z</date>
    <str name="docuid">13427</str>
    <str name="title">CDE</str>
  </doc>
  <doc>
    <date name="processingtime">2010-04-09T11:12:22.049Z</date>
    <str name="docuid">89822</str>
    <str name="title">LKK</str>
  </doc>
</all>
XML

my @rules = (
    'str'  => 'as array',
    'date' => 'as is',
    'doc'  => 'as array no content',
    'all'  => 'no content'
);

my $parser = XML::Rules->new(rules => \@rules);
my $data = $parser->parse( $xml );

my $max_value = 0;
for my $chunk ( @{ $data->{all}{doc} } ) {
    for my $str ( @{ $chunk->{str} } ) {
        $str->{name} eq 'docuid'
            and $str->{_content} > $max_value
            and $max_value = $str->{_content};
    }
}
print "The max value is: $max_value\n";


Re^2: Finding max value from a unique tag from XML
by Jenda (Abbot) on Nov 25, 2012 at 03:46 UTC

    If all you want from the document is the maximal docuid, you can set the rules to give you exactly that:

    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings qw(uninitialized);
    use XML::Rules;

    my $xml = <<'XML';
    <all>
      <doc>
        <date name="processingtime">2011-04-09T11:12:22.049Z</date>
        <str name="docuid">121422</str>
        <str name="title">ABC</str>
      </doc>
      <doc>
        <date name="processingtime">2012-04-09T11:12:22.049Z</date>
        <str name="docuid">13427</str>
        <str name="title">CDE</str>
      </doc>
      <doc>
        <date name="processingtime">2010-04-09T11:12:22.049Z</date>
        <str name="docuid">89822</str>
        <str name="title">LKK</str>
      </doc>
    </all>
    XML

    my @rules = (
        'str' => sub {
            return unless $_[1]->{name} eq 'docuid';
            my $id = $_[1]->{_content};
            $_[4]->{pad} = $id if ($id > $_[4]->{pad});
            return;
        },
        'all' => sub {
            return $_[4]->{pad};
        }
    );

    my $parser = XML::Rules->new(rules => \@rules);
    my $max_value = $parser->parse( $xml );
    print "The max value is: $max_value\n";

    This assumes that you want the maximal value from any <str> tag with attribute name="docuid" as it doesn't check the "path" to the <str> tag!
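If the path does matter, one possible way to restrict the match is to look at the third argument passed to the rule. This is a minimal sketch, not Jenda's code: it assumes, from my reading of the XML::Rules documentation, that `$_[2]` is an array reference holding the names of the enclosing tags with the direct parent last. The `<other>` element in the sample XML is made up for illustration, and the maximum is kept in a closure variable rather than in the pad.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Sample data; the <other> element is hypothetical and only there
# to show a docuid that should be skipped.
my $xml = <<'XML';
<all>
  <doc><str name="docuid">121422</str></doc>
  <other><str name="docuid">999999</str></other>
  <doc><str name="docuid">13427</str></doc>
</all>
XML

my $max;    # closure variable holding the largest docuid seen so far

my @rules = (
    'str' => sub {
        my ($tag, $attrs, $context) = @_;

        # Only <str name="docuid"> is of interest.
        return unless ($attrs->{name} // '') eq 'docuid';

        # Assumption: $context lists the enclosing tags, parent last.
        return unless @$context && $context->[-1] eq 'doc';

        my $id = $attrs->{_content};
        $max = $id if !defined $max || $id > $max;
        return;
    },
    'all' => sub { return $max; },
);

my $parser    = XML::Rules->new(rules => \@rules);
my $max_value = $parser->parse($xml);
print "The max value is: $max_value\n";
```

With this check, a `<str name="docuid">` that sits directly under anything other than `<doc>` should be ignored.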

    Update: With version 1.16 and later it's easy to give the specific parser a more readable interface:

    use XML::Rules max_docuid => {
        method => 'parse',
        rules  => {
            'str' => sub {
                return unless $_[1]->{name} eq 'docuid';
                my $id = $_[1]->{_content};
                $_[4]->{pad} = $id if ($id > $_[4]->{pad});
                return;
            },
            'all' => sub {
                return $_[4]->{pad};
            }
        }
    };
    #...
    print "The max value is: " . max_docuid($xml) . "\n";

    Jenda
    Enoch was right!
    Enjoy the last years of Rome.

      Thank you Jenda!
      Could you please point me to the documentation for the $_[4] approach?

        $_[4] is the fifth element of @_, the array containing subroutine arguments. $_[4] is used in both subroutines in the @rules array, and it is used as a hash reference. The @rules array is passed to XML::Rules->new(...). I would expect XML::Rules to call one or both of the subroutines while parsing the XML document.

        And, lo and behold, XML::Rules does call the subroutines passed in @rules. The documentation clearly states that the fifth parameter is the parser object (a blessed hash reference), documented with the name $parser. It offers two workspaces, $parser->{'pad'} and $parser->{'parameters'}, to "store any data you need".
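To see the mechanics without XML::Rules, here is a small core-Perl sketch of a callback that unpacks @_ into named variables instead of indexing `$_[4]` directly. `$fake_parser` is a hypothetical stand-in for the real parser object (a plain, unblessed hash with a `pad` slot), and the names of the first four parameters are my assumption based on the module's documentation. Only `$parser` as the fifth is confirmed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the XML::Rules parser object.
my $fake_parser = { pad => undef };

# Same logic as the 'str' rule, but with @_ unpacked into names.
# $parser here is exactly what the original code reaches via $_[4].
my $rule = sub {
    my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
    return unless $attrs->{name} eq 'docuid';
    my $id = $attrs->{_content};
    $parser->{pad} = $id
        if !defined $parser->{pad} || $id > $parser->{pad};
    return;
};

# Call the rule by hand, the way a parser would call it once per tag.
for my $id (121422, 13427, 89822) {
    $rule->('str', { name => 'docuid', _content => $id },
            ['all', 'doc'], [], $fake_parser);
}

print "The max value is: $fake_parser->{pad}\n";
```

The explicit `defined` check replaces the `no warnings qw(uninitialized)` line from the original, since the pad starts out empty in this sketch.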

        Alexander

        --
        Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
