
Re^2: Finding max value from a unique tag from XML

by Jenda (Abbot)
on Nov 25, 2012 at 03:46 UTC (#1005460)

in reply to Re: Finding max value from a unique tag from XML
in thread Finding max value from a unique tag from XML

If all you want from the document is the maximal docuid, you can set the rules to give you exactly that:

#!/usr/bin/perl
use strict;
use warnings;
no warnings qw(uninitialized);
use XML::Rules;

my $xml = <<'XML';
<all>
  <doc>
    <date name="processingtime">2011-04-09T11:12:22.049Z</date>
    <str name="docuid">121422</str>
    <str name="title">ABC</str>
  </doc>
  <doc>
    <date name="processingtime">2012-04-09T11:12:22.049Z</date>
    <str name="docuid">13427</str>
    <str name="title">CDE</str>
  </doc>
  <doc>
    <date name="processingtime">2010-04-09T11:12:22.049Z</date>
    <str name="docuid">89822</str>
    <str name="title">LKK</str>
  </doc>
</all>
XML

my @rules = (
    'str' => sub {
        return unless $_[1]->{name} eq 'docuid';
        my $id = $_[1]->{_content};
        $_[4]->{pad} = $id if ($id > $_[4]->{pad});
        return;
    },
    'all' => sub { return $_[4]->{pad}; }
);

my $parser = XML::Rules->new(rules => \@rules);
my $max_value = $parser->parse( $xml );
print "The max value is: $max_value\n";

Note that this takes the maximum over every <str> tag with the attribute name="docuid", wherever it appears in the document, because the rule doesn't check the "path" to the <str> tag!
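If the path does matter, one way to restrict the rule is to look at the third handler argument, which XML::Rules documents as the list of enclosing tag names (the parent is the last element). The following is only a sketch under assumptions: the <other> element is made up for illustration, and the running maximum is kept in a lexical variable ($max) instead of the parser's pad to keep the example short.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Sample data; the <other> element is a made-up case that should be ignored.
my $xml = <<'XML';
<all>
  <doc><str name="docuid">121422</str></doc>
  <other><str name="docuid">999999</str></other>
  <doc><str name="docuid">13427</str></doc>
</all>
XML

my $max;    # running maximum, kept in a closure rather than in $parser->{pad}

my $parser = XML::Rules->new(
    rules => [
        str => sub {
            my ($tag, $attrs, $context, $parent_data, $parser) = @_;
            # Only <str name="docuid"> ...
            return unless ($attrs->{name} // '') eq 'docuid';
            # ... whose direct parent is <doc>
            return unless @$context && $context->[-1] eq 'doc';
            my $id = $attrs->{_content};
            $max = $id if !defined $max || $id > $max;
            return;
        },
        all => sub { return $max },
    ],
);

my $max_docuid = $parser->parse($xml);
print "The max value is: $max_docuid\n";
```

With this data the 999999 under <other> is skipped, so the script should report 121422.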

Update: With version 1.16 and later it's easy to give the specific parser a more readable interface:

use XML::Rules max_docuid => {
    method => 'parse',
    rules  => {
        'str' => sub {
            return unless $_[1]->{name} eq 'docuid';
            my $id = $_[1]->{_content};
            $_[4]->{pad} = $id if ($id > $_[4]->{pad});
            return;
        },
        'all' => sub { return $_[4]->{pad}; }
    }
};
#...
print "The max value is: " . max_docuid($xml) . "\n";

Enoch was right!
Enjoy the last years of Rome.

Re^3: Finding max value from a unique tag from XML
by vagabonding electron (Chaplain) on Nov 26, 2012 at 18:36 UTC
    Thank you Jenda!
    Could you please point me to the documentation of the $_[4] approach?

      $_[4] is the fifth element of @_, the array containing subroutine arguments. $_[4] is used in both subroutines in the @rules array, and it is used as a hash reference. The @rules array is passed to XML::Rules->new(...). I would expect XML::Rules to call one or both of the subroutines while parsing the XML document.

      And, lo and behold, XML::Rules does call the subroutines passed in @rules. The documentation clearly states that the fifth parameter is the parser object (a blessed hash reference), documented with the name $parser. It offers two workspaces, $parser->{'pad'} and $parser->{'parameters'}, to "store any data you need".
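      To make the @_ mechanics concrete, here is a small stand-alone sketch that does not use XML::Rules at all. The $fake_parser hash and the hand-written calls are hypothetical stand-ins for what the module would pass in; the point is only that $_[4] and a named fifth argument refer to the same hash.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the parser object (a plain hash reference here).
my $fake_parser = { pad => undef, parameters => {} };

my $handler = sub {
    # Same argument order as described above:
    # tag name, attributes, context, parent data, parser.
    my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;

    return unless $attrs->{name} eq 'docuid';
    my $id = $attrs->{_content};

    # $_[4] and $parser are the same reference, so either spelling
    # updates the same workspace.
    $_[4]->{pad} = $id if !defined $parser->{pad} || $id > $parser->{pad};
    return;
};

# Call the handler by hand, roughly the way a parser would for each <str> tag.
for my $id (121422, 13427, 89822) {
    $handler->('str', { name => 'docuid', _content => $id },
               ['all', 'doc'], [], $fake_parser);
}

print "pad now holds: $fake_parser->{pad}\n";
```

      Naming the arguments with my (...) = @_ is only a readability choice; the original code reads $_[1] and $_[4] directly, which works the same way.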


      Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
