I am new to Perl and attempting to parse XML into an array of hashes using XML::Parser.
The script parses each child element of a <story> element into a hash, then pushes a reference to that hash onto an array. When I print the array of hashes, only the values from the last story parsed are displayed, once per story.
I suspect the cause is that every reference I push points to the same global hash, which gets overwritten as each story is parsed.
I would like to either:
create a lexically scoped hash variable, which I am having difficulty with because of the event-driven XML::Parser subs (StartTag, EndTag, and Text), or
deal effectively with the global hash variable (by redefining it?).
Thanks in advance, monks.
use strict;
use XML::Parser;
use LWP::Simple;

my @curr;
my @stories;
# global hash
my %story;

my $xml = get("http://www.slashdot.org/slashdot.xml");
die "Failed to obtain xml " unless defined($xml);

my $p = XML::Parser->new(Style => 'Stream');
$p->parse($xml);

# print out the results
foreach my $hashref (@stories) {
    print "$hashref->{title}\n";
}

# StartTag - called when the start of an XML tag is found
sub StartTag {
    my($p, $tag) = @_;
    push @curr, $tag;
}

# EndTag - called when the end of a XML tag is seen
sub EndTag {
    my($p, $tag) = @_;
    # pushes the hash ref onto the array
    if ($tag eq 'story') {
        push @stories, \%story;
    }
    pop @curr;
}

# Text - called when text data is encountered
sub Text {
    unless ($curr[-1] eq 'story') {
        $story{$curr[-1]} = $_;
    }
}
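
The symptom described above (only the last story's values showing up) can be reproduced without XML::Parser at all. The following is a minimal sketch, not the original script: the `@parsed` data and the `@shared` and `@copied` arrays are hypothetical stand-ins used only to contrast pushing `\%story` with pushing a copy.

```perl
use strict;
use warnings;

# Hypothetical stand-in data for two parsed <story> elements.
my @parsed = (
    { title => 'First story',  author => 'A' },
    { title => 'Second story', author => 'B' },
);

my (@shared, @copied);
my %story;    # one hash reused for every story, as in the script above

for my $fields (@parsed) {
    %story = %$fields;           # overwrite the same hash in place
    push @shared, \%story;       # every element references that one hash
    push @copied, { %story };    # anonymous hash: an independent shallow copy
}

print "shared: $_->{title}\n" for @shared;
print "copied: $_->{title}\n" for @copied;
```

Both `shared` lines print "Second story", while the `copied` lines print "First story" and then "Second story". If that is indeed the cause here, one possible fix for the global-hash approach would be to push `{ %story }` in EndTag and then clear the hash with `%story = ();` so that fields do not leak into the next story.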
Example of a story element:
<story>
    <title>Debian And WineX</title>
    <url>http://slashdot.org/article.pl?sid=02/05/28/1515220</url>
    <time>2002-05-28 18:25:01</time>
    <author>Hemos</author>
    ...
</story>
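
For the lexically scoped option, one possible approach is to drop the 'Stream' style and pass closures through XML::Parser's `Handlers` option (`Start`, `End`, `Char`), so the handlers share lexical variables instead of globals. This is a sketch under assumptions: the `parse_stories` helper, the `<backslash>` wrapper element, and the second story are made up for illustration, and the inline XML only stands in for the real feed.

```perl
use strict;
use warnings;
use XML::Parser;

# Hypothetical helper: returns one hash reference per <story> element.
sub parse_stories {
    my ($xml) = @_;
    my (@stories, @curr, $story);    # lexicals shared by the closures below

    my $p = XML::Parser->new(Handlers => {
        Start => sub {
            my ($expat, $tag) = @_;
            push @curr, $tag;
            $story = {} if $tag eq 'story';    # fresh hash for each story
        },
        End => sub {
            my ($expat, $tag) = @_;
            pop @curr;
            if ($tag eq 'story') {
                push @stories, $story;
                undef $story;                  # nothing is collected between stories
            }
        },
        Char => sub {
            my ($expat, $text) = @_;
            # Skip text outside a story and whitespace directly inside <story>.
            return unless $story && @curr && $curr[-1] ne 'story';
            # Char may fire more than once per element, so append.
            $story->{ $curr[-1] } .= $text;
        },
    });

    $p->parse($xml);
    return @stories;
}

# Assumed sample input; the wrapper element and second story are invented.
my $xml = <<'XML';
<backslash>
  <story><title>Debian And WineX</title><author>Hemos</author></story>
  <story><title>Second story</title><author>Someone</author></story>
</backslash>
XML

my @stories = parse_stories($xml);
print "$_->{title}\n" for @stories;
```

Because each `<story>` start tag assigns a new anonymous hash to `$story`, every element of `@stories` refers to a distinct hash, and no package-level state is needed.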