PerlMonks  

Hmmm why does XML::RSS::Parser choke on PM RSS feed?

by Tommy (Chaplain)
on Nov 17, 2006 at 21:14 UTC (#584797)

Hmmm. Why does XML::RSS::Parser choke on the PM RSS feed (/var/www/atrixnet.com/cron/savepmnewestnodes.pl)? (When I say "choke", I mean it hangs, sucks RAM like a pig, and then produces output nothing like what the docs for that module would lead me to expect.)

The code below is part of a cron job that I use to collect the latest PM nodes and display them on my website (www.atrixnet.com).

#!/usr/bin/perl -w

use strict;
use warnings;

use constant SAVEAS => '/cgi-bin/dat/pmnewestnodes.htmlpart';

# auto-flush STDOUT
++$|;

# globals
use vars qw( $dbh );

# libraries
use DBI;
use XML::RSS::Parser;
use File::Util;

# connect to DB
$dbh = DBI->connect(
   q[DBI:mysql:]
      . qq[database=myrssfeeds;]
      . qq[host=localhost;]
      . qq[port=3306],
   'rssbot',          # username
   '^r$$p@$$w0rD!',   # password
   { 'RaiseError' => 0 }
) or die qq[Aborting! Failed to connect to database: $DBI::errstr];

# grab feed from DB
my($rss) = ($dbh->selectrow_array(
   q[SELECT content FROM feeds WHERE feedurl = ?],
   undef,
   'http://perlmonks.org/index.pl?node_id=30175&xmlstyle=rss'
))[0] or die q{Couldn't get RSS from DB! } . $DBI::errstr;

# parse feed
$rss = XML::RSS::Parser->new()->parse_string($rss);

die $rss->query('/channel/title');

# html-ify content
my($output) = '';

foreach my $i ( $rss->query('//item') ) {
   my($node) = $i->query('title');
   print $node->text_content, "\n";
}

File::Util->new->write_file( 'filename' => SAVEAS, 'content' => $output );

print $output;

print qq[DONE. RSS PARSED AND SAVED AS HTML IN "${\ SAVEAS }"\n];

# disconnect if not already disconnected
END { $dbh->disconnect() if defined $dbh }
--
Tommy

Reaped: Re: Hmmm why does XML::RSS::Parser choke on PM RSS feed?
by NodeReaper (Curate) on Nov 17, 2006 at 21:18 UTC
Re: Hmmm why does XML::RSS::Parser choke on PM RSS feed?
by jdporter (Canon) on Nov 17, 2006 at 21:18 UTC

    In a few older threads (such as rss feed corrupted by certain nodes and those linked in RSS feed fixed), the topic of what bad RSS can do to an RSS parser has been discussed. You might look through those for some insight. Bad RSS might not be the cause; but know we're prone to generating bad RSS, so...

    We're building the house of the future together.
      *groan* so now I need to start sanitizing?! pm rss--
      --
      Tommy Butler, a.k.a. Tommy
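      For anyone wondering what "sanitizing" might look like here, the following is a minimal sketch in core Perl, not anything posted in this thread. It assumes the trouble comes from two common kinds of bad feed text: control characters that XML 1.0 forbids, and bare ampersands that do not start an entity. The thread does not confirm that either is the actual cause. The helper name sanitize_feed_text is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): tidy raw feed text
# before handing it to an XML parser.
sub sanitize_feed_text {
    my ($xml) = @_;

    # Drop control characters that XML 1.0 does not allow:
    # everything below 0x20 except tab, line feed, and carriage return.
    $xml =~ s/[\x00-\x08\x0B\x0C\x0E-\x1F]//g;

    # Escape ampersands that do not begin a named entity or a
    # decimal/hex character reference.
    $xml =~ s/&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)/&amp;/g;

    return $xml;
}

# Made-up sample input, for illustration only.
my $raw   = "<title>Tom & Jerry\x01 &amp; friends &#169;</title>";
my $clean = sanitize_feed_text($raw);
print "$clean\n";
```

      The cleaned string could then be passed to parse_string in place of the raw database content. Note one limitation of this sketch: it also rewrites ampersands inside CDATA sections, where they are legal as-is, so it can alter content that was already well-formed.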
Re: Hmmm why does XML::RSS::Parser choke on PM RSS feed?
by rhesa (Vicar) on Nov 18, 2006 at 01:02 UTC
Re: Hmmm why does XML::RSS::Parser choke on PM RSS feed?
by BaldPenguin (Friar) on Nov 20, 2006 at 03:39 UTC
    Have you tried using XML::RSS::Headline::PerlMonks? I wrote it to do exactly what you are doing, and I am interested in feedback on the module. I use it to update an internal jabber channel with the latest PerlMonks 'Newest Nodes'.

    Don
    WHITEPAGES.COM | INC
    Everything I've learned in life can be summed up in a small perl script!
