perl_seeker has asked for the wisdom of the Perl Monks concerning the following question:

Hello!
I am new to XML and XPath (I am reading material on XPath, and know some basics but it will take me some time). I need
to parse some XML statements, and build some simple data structures using XML::XPath. I have never used this
module, so I have a lot to learn about it too. I have downloaded this module and can use it.

Could someone help me with some sample code to get started?

My XML file looks like this (a portion of it):
test.xml
<AnnualWeatherRecord>
  <DailyWeatherRecord>
    <date>1-1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="33.95" />
      <mindrybulb unit="degrees-centigrade" number="30.95" />
      <maxwetbulb unit="degrees-centigrade" number="33.53" />
      <minwetbulb unit="degrees-centigrade" number="30.53" />
    </temperature>
    <totalrainfall unit="mm" number="0.5" />
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>2-1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="23.34" />
      <mindrybulb unit="degrees-centigrade" number="23.20" />
      <maxwetbulb unit="degrees-centigrade" number="23.14" />
      <minwetbulb unit="degrees-centigrade" number="20.14" />
    </temperature>
    <totalrainfall unit="mm" number="0" />
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>1-2-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="44.25" />
      <mindrybulb unit="degrees-centigrade" number="39.25" />
      <maxwetbulb unit="degrees-centigrade" number="33.53" />
      <minwetbulb unit="degrees-centigrade" number="30.53" />
    </temperature>
    <totalrainfall unit="mm" number="0.5" />
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>2-2-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="50.25" />
      <mindrybulb unit="degrees-centigrade" number="49.25" />
      <maxwetbulb unit="degrees-centigrade" number="23.14" />
      <minwetbulb unit="degrees-centigrade" number="20.14" />
    </temperature>
    <totalrainfall unit="mm" number="0" />
  </DailyWeatherRecord>
  <MonthlyWeatherRecord>
    <date>1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="33.95" />
      <mindrybulb unit="degrees-centigrade" number="23.20" />
      <avgdrybulb unit="degrees-centigrade" number="33.95" />
    </temperature>
  </MonthlyWeatherRecord>
  <MonthlyWeatherRecord>
    <date>2-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="50.25" />
      <mindrybulb unit="degrees-centigrade" number="39.25" />
      <avgdrybulb unit="degrees-centigrade" number="44.25" />
    </temperature>
  </MonthlyWeatherRecord>
</AnnualWeatherRecord>

Using XML::XPath, I need to build a subroutine that does this:

Accepts a date as a parameter, and searches the XML file for the record with that particular date. It extracts the
value of the number attribute of the maxdrybulb element, and builds an array which stores the date and the max dry bulb temperature.

For example, a sample array, say @array, would look like:
$array[0] = '1-1-2004'; $array[1] = 33.95;
I need another subroutine which stores the date and the value of the mindrybulb attribute, but the code would
mostly be the same I guess.

Please help!
Any help would be appreciated.

Thanks,
perl_seeker:)

Replies are listed 'Best First'.
Re: Help with XML::XPath
by johnnywang (Priest) on Aug 19, 2004 at 07:26 UTC
    If I understand you correctly, the following is what you want:
    use strict;
    use XML::XPath;

    my $date   = "2-2004";
    my @values = findmax($date);
    print join(",", @values), "\n";

    # Return the date and the max dry bulb value of the monthly record for that date.
    sub findmax {
        my $date = shift;
        my $xp   = XML::XPath->new(filename => 'test.xml');
        my $max  = $xp->findvalue("/AnnualWeatherRecord/MonthlyWeatherRecord[./date=\"$date\"]/temperature/maxdrybulb/\@number");
        return ($date, $max);
    }
    __END__
    2-2004,50.25
      Hello johnny!
      Thanks a lot, this really helps. I needed to look in the daily weather records and not in the monthly
      ones, but that meant a very minor change in the code, and I'm getting the results I need.

      I'll have to come back here in case I need more help:)

      *Thanks to all the others for their code too.*

      Cheers
      perl_seeker:)
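The "very minor change" mentioned above is not shown in the thread. A minimal sketch of what it might look like follows, assuming the only differences are the record element name and which reading is requested. The helper name find_daily_reading() and the inline sample (the first daily record from the question) are illustrative, not from the original posts.

```perl
use strict;
use warnings;
use XML::XPath;

# Inline sample: the first daily record from the question, so the sketch runs on its own.
my $xml = <<'XML';
<AnnualWeatherRecord>
  <DailyWeatherRecord>
    <date>1-1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="33.95" />
      <mindrybulb unit="degrees-centigrade" number="30.95" />
    </temperature>
  </DailyWeatherRecord>
</AnnualWeatherRecord>
XML

# find_daily_reading() is a hypothetical helper: $reading is the element
# name under <temperature>, e.g. 'maxdrybulb' or 'mindrybulb'.
sub find_daily_reading {
    my ($xp, $date, $reading) = @_;
    my $path = qq{/AnnualWeatherRecord/DailyWeatherRecord[date="$date"]}
             . qq{/temperature/$reading/\@number};
    my $value = $xp->findvalue($path);
    return ($date, "$value");    # stringify the returned object
}

my $xp  = XML::XPath->new(xml => $xml);
my @max = find_daily_reading($xp, '1-1-2004', 'maxdrybulb');
my @min = find_daily_reading($xp, '1-1-2004', 'mindrybulb');
print join(',', @max), "\n";
print join(',', @min), "\n";
```

Passing the element name as a parameter covers both the max and the min subroutine from the original question with one piece of code.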
      Hello,
      Would you be able to tell me how to code this? I'm still not able to use XML::XPath much:

      I need to search the XML file in question for all the DailyWeatherRecords for a particular month.

      Say for the month of January, I need to search for those DailyWeatherRecords which have a "1" in the middle of the
      date value, for example "1-1-2004" and "2-1-2004". I'm not sure how to look in the middle of the date value.

      The input to the sub that does this is the month, i.e. January or "1". For all the DailyWeatherRecords for the month of
      January, I need to extract the value of maxdrybulb if it is greater than 30.0, and the date. The sub should
      return as an array the list of dates and maxdrybulb values (only if the value is greater than 30.0).

      For example the array might look like:
      1-1-2004 33.95 2-1-2004 30.95
      Thanks in advance.

      perl_seeker
      :)
        If you insist on using XPath, here's a version. But in this case, I'd say using something like XML::Simple is simpler.
        use strict;
        use XML::XPath;

        my $month = 1;
        my $min   = 30;
        my $match = findmax($month, $min);
        foreach my $t (@$match) {
            print "$t->[0],$t->[1]\n";
        }

        # Return [date, maxdrybulb] pairs for daily records in the given month
        # whose max dry bulb reading is above $min.
        sub findmax {
            my $month = shift;
            my $min   = shift;
            $month = "-$month-";    # match the month in the middle of d-m-yyyy
            my @result = ();
            my $xp    = XML::XPath->new(filename => 'test.xml');
            my $nodes = $xp->findnodes("//DailyWeatherRecord[contains(date,'$month') and temperature/maxdrybulb[\@number>$min]]");
            foreach my $node ($nodes->get_nodelist) {
                push @result, [$xp->findvalue("./date", $node), $xp->findvalue(".//maxdrybulb/\@number", $node)];
            }
            return \@result;
        }
        __END__
        1-1-2004,33.95
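An alternative to matching the month inside the XPath predicate is to select every DailyWeatherRecord and do the filtering in Perl. The sketch below assumes dates are always in day-month-year form; the name hot_days_in_month() and the trimmed inline sample are illustrative only. It returns a flat list of dates and values, as the question asked for.

```perl
use strict;
use warnings;
use XML::XPath;

# Trimmed sample based on the data in the question.
my $xml = <<'XML';
<AnnualWeatherRecord>
  <DailyWeatherRecord>
    <date>1-1-2004</date>
    <temperature><maxdrybulb unit="degrees-centigrade" number="33.95" /></temperature>
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>2-1-2004</date>
    <temperature><maxdrybulb unit="degrees-centigrade" number="23.34" /></temperature>
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>1-2-2004</date>
    <temperature><maxdrybulb unit="degrees-centigrade" number="44.25" /></temperature>
  </DailyWeatherRecord>
</AnnualWeatherRecord>
XML

# hot_days_in_month() is a hypothetical helper name.
sub hot_days_in_month {
    my ($xp, $month, $threshold) = @_;
    my @result;
    my $nodes = $xp->findnodes('//DailyWeatherRecord');
    for my $node ($nodes->get_nodelist) {
        # Concatenating with '' stringifies the objects findvalue returns.
        my $date = '' . $xp->findvalue('date', $node);
        my $max  = '' . $xp->findvalue('temperature/maxdrybulb/@number', $node);
        # Assumes day-month-year, so the month is the second field.
        my (undef, $rec_month) = split /-/, $date;
        next unless defined $rec_month && $rec_month == $month;
        push @result, $date, $max if $max > $threshold;
    }
    return @result;
}

my $xp  = XML::XPath->new(xml => $xml);
my @hot = hot_days_in_month($xp, 1, 30.0);
print "@hot\n";
```

Comparing the month numerically avoids building a substring pattern, at the cost of visiting every daily record.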
Re: Help with XML::XPath
by murugu (Curate) on Aug 19, 2004 at 10:11 UTC

    For this XML::Twig can be used.

    In the code below, the input date is given on the command line.

    You can get the mindrybulb value as well just by adding some more code in the handler part.

    use XML::Twig;

    undef $/;                 # slurp mode
    my $date = $ARGV[0];      # date to look for, e.g. 1-1-2004
    my $s    = <DATA>;
    my (@max, @min);

    my $t = XML::Twig->new(
        twig_handlers => {
            "AnnualWeatherRecord" => sub {
                my ($c) = $_[1]->get_xpath("//DailyWeatherRecord/date[string()=\"$date\"]/../temperature/maxdrybulb[\@number]");
                push @max, [$date, $c->att("number")]
                    if (defined $c and defined $c->att("number"));
            },
        },
    );
    $t->parse($s);

    # $" is the separator used when an array is interpolated into a string
    local $" = "\n";
    print "@$_\n" for @max;

    __DATA__
    <AnnualWeatherRecord>
      <DailyWeatherRecord>
        <date>1-1-2004</date>
        <temperature>
          <maxdrybulb unit="degrees-centigrade" number="30.95"/>
          <mindrybulb unit="degrees-centigrade" number="30.95"/>
          <maxwetbulb unit="degrees-centigrade" number="33.53"/>
          <minwetbulb unit="degrees-centigrade" number="30.53"/>
        </temperature>
        <totalrainfall unit="mm" number="0.5"/>
      </DailyWeatherRecord>
    </AnnualWeatherRecord>

      Just a quick comment: if you don't want to have to backslash quotes in the XPath expression, you can use custom delimiters, one of those features that make me love Perl:

      get_xpath( qq{//DailyWeatherRecord/date[string()="$date"]})
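To illustrate the point about custom delimiters, here is a small sketch in plain Perl (no XML module needed) that builds the same XPath string both ways and compares them. The variable names are illustrative. Note that only the double quotes stop needing a backslash inside qq{}; the @ still has to be escaped, because qq{} interpolates just like a double-quoted string.

```perl
use strict;
use warnings;

my $date = '1-1-2004';

# Double-quoted string: the inner quotes must be backslashed.
my $escaped = "//DailyWeatherRecord/date[string()=\"$date\"]/../temperature/maxdrybulb[\@number]";

# qq{} with custom delimiters: the inner quotes can stay as they are.
my $custom = qq{//DailyWeatherRecord/date[string()="$date"]/../temperature/maxdrybulb[\@number]};

print $custom, "\n";
print $escaped eq $custom ? "identical\n" : "different\n";
```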
Re: Help with XML::XPath
by bobf (Monsignor) on Aug 19, 2004 at 07:46 UTC
    I am not familiar with XML::XPath but I have used XML::Simple, which might be a suitable alternative (if you're restricted to using XPath, just file the rest of this post away for future reference). If your XML input file isn't too big, you could do something like this:
    use strict;
    use warnings;
    use XML::Simple;
    use Data::Dumper;

    # read in the XML data file
    open( my $in, '<', 'xml_input_file.xml' )
        or die "error opening input file: $!";
    my $infile = do { local $/; <$in> };    # slurp mode, limited to this read
    close $in;

    # parse the XML file using XML::Simple
    my $dataref = XMLin( $infile );

    # view the parsed data structure
    print Dumper( $dataref );
    I didn't code the rest (where you find a record given a date), but if you understand references it should be straightforward after seeing how the file is parsed (with Data::Dumper). If you decide to go this route and need help pulling out the data, just ask.
    HTH
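The part left uncoded above (finding a record for a given date) might look roughly like the following with XML::Simple. This is only a sketch: the helper name max_for_date() and the trimmed inline sample are illustrative, and the ForceArray and KeyAttr options are one reasonable choice rather than anything prescribed in the thread.

```perl
use strict;
use warnings;
use XML::Simple;

# Trimmed sample based on the data in the question.
my $xml = <<'XML';
<AnnualWeatherRecord>
  <DailyWeatherRecord>
    <date>1-1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="33.95" />
    </temperature>
  </DailyWeatherRecord>
  <DailyWeatherRecord>
    <date>2-1-2004</date>
    <temperature>
      <maxdrybulb unit="degrees-centigrade" number="23.34" />
    </temperature>
  </DailyWeatherRecord>
</AnnualWeatherRecord>
XML

# ForceArray keeps DailyWeatherRecord an array ref even if the file holds
# a single record; KeyAttr => [] turns off automatic folding into hashes.
my $data = XMLin($xml, ForceArray => ['DailyWeatherRecord'], KeyAttr => []);

# max_for_date() is a hypothetical helper name.
sub max_for_date {
    my ($data, $date) = @_;
    for my $rec (@{ $data->{DailyWeatherRecord} }) {
        next unless $rec->{date} eq $date;
        return ($date, $rec->{temperature}{maxdrybulb}{number});
    }
    return;    # empty list when no record matches
}

my @array = max_for_date($data, '2-1-2004');
print join(',', @array), "\n";
```

Dumping $data with Data::Dumper first, as suggested above, is the easiest way to confirm the hash and array layout before writing the lookup.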
Re: Help with XML::XPath
by gmpassos (Priest) on Aug 19, 2004 at 14:18 UTC
    Not an XPath solution, but something simple to do with XML::Smart:
    use XML::Smart;

    my $date = '1-2-2004';
    my $xml  = XML::Smart->new('your_file.xml');
    my $node = $xml->{AnnualWeatherRecord}{DailyWeatherRecord}('date', 'eq', $date);
    my $max  = $node->{temperature}{maxdrybulb}{number};
    my @array = ($date, $max);

    Graciliano M. P.
    "Creativity is the expression of the liberty".