Finding max value from a unique tag from XML

Shaveta_Chawla has asked for the wisdom of the Perl Monks concerning the following question:

Re: Finding max value from a unique tag from XML by remiah (Hermit) on Nov 22, 2012 at 10:45 UTC
Hello Shaveta_Chawla. It seems your XML has some problems, such as '<\doc>'. If it is valid XML, XML::Twig will do it like this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $xml = join('', <DATA>);
    my $max_docid = 0;

    XML::Twig->new(
        twig_roots => {
            'str[@name="docuid"]' => \&set_max_docid,
        },
    )->parse($xml);

    sub set_max_docid {
        my ($twig, $elt) = @_;
        my $docid = $elt->first_child_trimmed_text;
        if ($docid > $max_docid) { $max_docid = $docid }
    }

    print "max docid=$max_docid\n";

    __DATA__
    <all>
      <doc>
        <date name="processingtime">2011-04-09T11:12:22.049Z</date>
        <str name="docuid">121422</str>
        <str name="title">ABC</str>
      </doc>
      <doc>
        <date name="processingtime">2012-04-09T11:12:22.049Z</date>
        <str name="docuid">13427</str>
        <str name="title">CDE</str>
      </doc>
      <doc>
        <date name="processingtime">2010-04-09T11:12:22.049Z</date>
        <str name="docuid">89822</str>
        <str name="title">LKK</str>
      </doc>
    </all>

XML::Twig has a nice tutorial.
Re: Finding max value from a unique tag from XML by choroba (Cardinal) on Nov 22, 2012 at 13:19 UTC
A single XPath expression can find the maximum. Here is an example using XML::XSH2, a wrapper around XML::LibXML:

    open 1.xml ;
    echo //doc/str[@name="docuid" and not(text() < //doc/str[@name="docuid"]/text())] ;

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re^2: Finding max value from a unique tag from XML by tobyink (Canon) on Nov 22, 2012 at 16:08 UTC
That's more XPath wizardry than I'd normally manage. FWIW, here's the translation of the above to plain old XML::LibXML:

    use 5.010;
    use XML::LibXML 1.70;

    my $xml = XML::LibXML->load_xml(location => "1.xml");
    say $xml->findnodes(q{
        //doc/str[@name="docuid" and not(text() < //doc/str[@name="docuid"]/text())]
    });

`perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'`
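For anyone who finds the `not(text() < ...)` comparison hard to read, here is a minimal sketch of another route: select every docuid node with an ordinary XPath and let `max` from List::Util do the comparison in Perl. The sub name `max_docuid` and the inline sample document are assumptions made for illustration; the sample mirrors the repaired XML posted earlier in the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::LibXML;
use List::Util qw(max);

# Hypothetical helper: return the largest docuid found in an XML string,
# or undef when the document contains no matching <str> element.
sub max_docuid {
    my ($xml_string) = @_;
    my $doc = XML::LibXML->load_xml(string => $xml_string);

    # Collect the text of every <str name="docuid"> directly under a <doc>.
    my @ids = map { $_->textContent }
              $doc->findnodes('//doc/str[@name="docuid"]');

    return max(@ids);    # max() returns undef for an empty list
}

# Sample data assumed for illustration (same values as the thread's XML).
my $xml = <<'XML';
<all>
  <doc><str name="docuid">121422</str><str name="title">ABC</str></doc>
  <doc><str name="docuid">13427</str><str name="title">CDE</str></doc>
  <doc><str name="docuid">89822</str><str name="title">LKK</str></doc>
</all>
XML

print "max docid=", max_docuid($xml), "\n";    # prints "max docid=121422"
```

This visits each node once, whereas the single-expression XPath compares every docuid against every other one, which may matter on large files.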
Re: Finding max value from a unique tag from XML by jethro (Monsignor) on Nov 22, 2012 at 10:41 UTC
Yes, there are lots of XML modules around. For example, take a look at XML::Simple and read the Quick Start section of its manual page; you can adapt the sample code there for your application.
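A rough sketch of what that adaptation might look like, not taken from the XML::Simple manual itself. `ForceArray` and `KeyAttr` are real XML::Simple options; the sub name `max_docuid` and the sample data are assumptions for illustration. `KeyAttr => []` is set so the `<str>` elements are not folded into a hash keyed on their `name` attribute. Note that recent versions of the XML::Simple documentation discourage using the module in new code.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Simple qw(XMLin);
use List::Util qw(max);

# Hypothetical helper: largest docuid in an XML string, undef if none.
sub max_docuid {
    my ($xml_string) = @_;

    my $data = XMLin(
        $xml_string,
        ForceArray => [ 'doc', 'str' ],   # always arrays, even for one element
        KeyAttr    => [],                 # disable folding on name/key/id
    );

    # Each <str> becomes { name => ..., content => ... }.
    my @ids = map  { $_->{content} }
              grep { defined $_->{name} && $_->{name} eq 'docuid' }
              map  { @{ $_->{str} || [] } }
              @{ $data->{doc} || [] };

    return max(@ids);
}

# Sample data assumed for illustration (same values as the thread's XML).
my $xml = <<'XML';
<all>
  <doc>
    <date name="processingtime">2011-04-09T11:12:22.049Z</date>
    <str name="docuid">121422</str>
    <str name="title">ABC</str>
  </doc>
  <doc>
    <date name="processingtime">2012-04-09T11:12:22.049Z</date>
    <str name="docuid">13427</str>
    <str name="title">CDE</str>
  </doc>
</all>
XML

print "max docid=", max_docuid($xml), "\n";
```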
Re: Finding max value from a unique tag from XML by vagabonding electron (Curate) on Nov 24, 2012 at 16:35 UTC
My 1 cent using XML::Rules (used as an exercise for myself). The XML is the version repaired by remiah.

    #!/usr/bin/perl
    use strict;
    use warnings;
    use XML::Rules;

    my $xml = <<'XML';
    <all>
      <doc>
        <date name="processingtime">2011-04-09T11:12:22.049Z</date>
        <str name="docuid">121422</str>
        <str name="title">ABC</str>
      </doc>
      <doc>
        <date name="processingtime">2012-04-09T11:12:22.049Z</date>
        <str name="docuid">13427</str>
        <str name="title">CDE</str>
      </doc>
      <doc>
        <date name="processingtime">2010-04-09T11:12:22.049Z</date>
        <str name="docuid">89822</str>
        <str name="title">LKK</str>
      </doc>
    </all>
    XML

    my @rules = (
        'str'  => 'as array',
        'date' => 'as is',
        'doc'  => 'as array no content',
        'all'  => 'no content'
    );

    my $parser = XML::Rules->new(rules => \@rules);
    my $data = $parser->parse( $xml );

    my $max_value = 0;
    for my $chunk ( @{ $data->{all}{doc} } ) {
        for my $str ( @{ $chunk->{str} } ) {
            $str->{name} eq 'docuid'
                and $str->{_content} > $max_value
                and $max_value = $str->{_content};
        }
    }
    print "The max value is: $max_value\n";
Re^2: Finding max value from a unique tag from XML by Jenda (Abbot) on Nov 25, 2012 at 03:46 UTC
If all you want from the document is the maximal docuid, you can set the rules to give you exactly that:

    #!/usr/bin/perl
    use strict;
    use warnings;
    no warnings qw(uninitialized);
    use XML::Rules;

    my $xml = <<'XML';
    <all>
      <doc>
        <date name="processingtime">2011-04-09T11:12:22.049Z</date>
        <str name="docuid">121422</str>
        <str name="title">ABC</str>
      </doc>
      <doc>
        <date name="processingtime">2012-04-09T11:12:22.049Z</date>
        <str name="docuid">13427</str>
        <str name="title">CDE</str>
      </doc>
      <doc>
        <date name="processingtime">2010-04-09T11:12:22.049Z</date>
        <str name="docuid">89822</str>
        <str name="title">LKK</str>
      </doc>
    </all>
    XML

    my @rules = (
        'str' => sub {
            return unless $_[1]->{name} eq 'docuid';
            my $id = $_[1]->{_content};
            $_[4]->{pad} = $id if ($id > $_[4]->{pad});
            return;
        },
        'all' => sub { return $_[4]->{pad}; }
    );

    my $parser = XML::Rules->new(rules => \@rules);
    my $max_value = $parser->parse( $xml );
    print "The max value is: $max_value\n";

This assumes that you want the maximal value from any <str> tag with attribute name="docuid", as it doesn't check the "path" to the <str> tag!

Update: With version 1.16 and later it's easy to give the specific parser a more readable interface:

    use XML::Rules max_docuid => {
        method => 'parse',
        rules  => {
            'str' => sub {
                return unless $_[1]->{name} eq 'docuid';
                my $id = $_[1]->{_content};
                $_[4]->{pad} = $id if ($id > $_[4]->{pad});
                return;
            },
            'all' => sub { return $_[4]->{pad}; }
        }
    };
    #...
    print "The max value is: " . max_docuid($xml) . "\n";

Jenda
Enoch was right! Enjoy the last years of Rome.
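The positional `$_[1]` and `$_[4]` in the rules above can be unpacked into named variables. The sketch below assumes, as far as I understand the XML::Rules documentation, that a rule subroutine is called with `($tag_name, \%attrs, \@context, \@parent_data, $parser)`, so `$_[4]` is the parser object and `{pad}` is a slot on it left for the user's own data. The variable names, the wrapper sub `max_docuid` and the sample data are assumptions for illustration. The explicit `defined` check stands in for `no warnings qw(uninitialized)`, on the assumption that the pad starts out undefined.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Rules;

# Hypothetical wrapper: build a fresh parser per call so that a value left
# in the pad by an earlier parse cannot leak into the next one.
sub max_docuid {
    my ($xml_string) = @_;

    my $max_finder = XML::Rules->new(
        rules => [
            str => sub {
                # Named equivalents of $_[0] .. $_[4]
                my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
                return unless defined $attrs->{name}
                           && $attrs->{name} eq 'docuid';
                my $id = $attrs->{_content};
                # Keep the running maximum in the parser's pad.
                $parser->{pad} = $id
                    if !defined $parser->{pad} || $id > $parser->{pad};
                return;    # contribute nothing to the parent's data
            },
            all => sub {
                my ($tag_name, $attrs, $context, $parent_data, $parser) = @_;
                return $parser->{pad};    # root rule's value is what parse() returns
            },
        ],
    );

    return $max_finder->parse($xml_string);
}

# Sample data assumed for illustration (same values as the thread's XML).
my $xml = <<'XML';
<all>
  <doc><str name="docuid">121422</str><str name="title">ABC</str></doc>
  <doc><str name="docuid">13427</str><str name="title">CDE</str></doc>
  <doc><str name="docuid">89822</str><str name="title">LKK</str></doc>
</all>
XML

print "The max value is: ", max_docuid($xml), "\n";
```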
Re^3: Finding max value from a unique tag from XML by vagabonding electron (Curate) on Nov 26, 2012 at 18:36 UTC
Thank you Jenda! Could you please point me to the documentation of the `$_[4]` approach?
Re^4: Finding max value from a unique tag from XML by afoken (Chancellor) on Nov 26, 2012 at 19:32 UTC
Re^5: Finding max value from a unique tag from XML by vagabonding electron (Curate) on Nov 29, 2012 at 16:37 UTC

