PerlMonks  

Parsing Item Tag from RSS feed

by mr_p (Scribe)
on Jul 12, 2010 at 14:43 UTC (#849024=perlquestion)
mr_p has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I was trying to parse the <item> tags with a regex, but I am having problems because $capture->{_content}, as a string, is being translated to some other character set. So I am trying to pull out the <item> tags using the method below, and I keep getting this error. Can someone please let me know why?

Error: XPathContext: lost current node at link_ext2.pl line 30

#!/usr/bin/perl -w
#use strict;
use warnings;
use XML::RSS::LibXML;
use XML::LibXML;
use LWP::UserAgent;
use Data::Dumper;

#my ( $htmlInfile, $htmlOutfile, $cssOutfile ) = @ARGV;
my $html_link = "http://rss.news.yahoo.com/rss/topstories";
my $parser = XML::LibXML->new;
my $client = LWP::UserAgent->new();
my $capture = $client->get("$html_link") || die "$!\n";

useLibXmlParseXmlItems($capture->{_content});

sub useLibXmlParseXmlItems {
    my $rss = XML::RSS::LibXML->new;
    $rss->parse($_[0]) || die "Could not parse. <$!>";
    my $xp = XML::LibXML::XPathContext->new($rss);
    my @nodes = $xp->findnodes("/rss/channel/item");
    #print @nodes;
}

Re: Parsing Item Tag from RSS feed
by Corion (Pope) on Jul 12, 2010 at 15:14 UTC

    Why are you accessing $capture->{_content}? What is it supposed to contain? Where in LWP::UserAgent is it documented?

      I don't see any documentation on it, but $capture->{_content} is the whole webpage.

        Except it is not in a character set that you want.

        Maybe you want to use ->decoded_content instead?
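A minimal sketch of the difference between the raw body and ->decoded_content. To run without network access it builds an HTTP::Response by hand instead of fetching the feed; the sample text and the Content-Type header are assumptions made for illustration, not taken from the Yahoo feed.

```perl
use strict;
use warnings;
use HTTP::Response;
use Encode qw(encode);

# Hypothetical feed fragment with one non-ASCII character (e with acute accent).
my $text  = "<title>Caf\x{e9}</title>";
my $bytes = encode('UTF-8', $text);

# Hand-built response standing in for what $client->get(...) would return.
my $res = HTTP::Response->new(
    200, 'OK',
    [ 'Content-Type' => 'text/xml; charset=UTF-8' ],
    $bytes,
);

my $raw     = $res->content;          # undecoded octets, as sent on the wire
my $decoded = $res->decoded_content;  # character string, decoded per the charset

printf "raw length:     %d\n", length $raw;
printf "decoded length: %d\n", length $decoded;
print "regex matched the decoded text\n"
    if $decoded =~ m{<title>Caf\x{e9}</title>};
```

Both accessors are documented methods, unlike the internal {_content} hash key. The decoded string is the one to use for regex matching on characters.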

      Can you look at my code and tell me why the XPath is not working? Why do I get the error message? Thanks for your help.

        You are making things up that do not at all match any kind of documentation. That's not how programming works.

        The following code works for me, and all I needed to do was to read the documentation of XML::LibXML.

        #!/usr/bin/perl -w
        use strict;
        use warnings;
        use XML::LibXML;

        my $html_link = "http://rss.news.yahoo.com/rss/topstories";
        my $dom = XML::LibXML->load_xml(location => $html_link);
        my $xp = XML::LibXML::XPathContext->new($dom);
        my @nodes = $xp->findnodes("/rss/channel/item");
        print $_->toString for @nodes;

        As you haven't told us why you use XML::RSS::LibXML, I have removed it from the code.
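The working code differs from the original in what it hands to XML::LibXML::XPathContext->new: a parsed XML::LibXML document rather than an XML::RSS::LibXML object, which appears to be what triggers the "lost current node" error. The sketch below shows the same XPath approach against an inline feed so it can be run offline; the feed content and the @titles variable are made up for illustration.

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical two-item feed, inlined so the sketch runs without network access.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example feed</title>
    <item><title>First story</title><link>http://example.com/1</link></item>
    <item><title>Second story</title><link>http://example.com/2</link></item>
  </channel>
</rss>
XML

# Parse into a DOM document and build the XPath context on that document.
my $dom = XML::LibXML->load_xml(string => $xml);
my $xp  = XML::LibXML::XPathContext->new($dom);

# Collect the <title> of each <item>, evaluated relative to the item node.
my @titles;
for my $item ($xp->findnodes('/rss/channel/item')) {
    push @titles, $xp->findvalue('title', $item);
}
print "$_\n" for @titles;
```

Passing the item node as the second argument to findvalue keeps the inner XPath relative, so 'title' here means the item's own title and not the channel's.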
