PerlMonks  

Code for Perlmonks XML to RSS

by xdg (Monsignor)
on Mar 30, 2004 at 02:02 UTC ( [id://340820]=monkdiscuss )

As an XML/RSS exercise, I wrote some code that converts the latest questions from the PerlMonks Newest Nodes XML feed to RSS. It's quick and dirty, but it works, and a script using it now runs hourly from my crontab. (It parses the authors, too, but I haven't done anything with that yet.) I thought other users (or the gods) might be interested in seeing it. It should be easy to modify this example to pick out other parts of the newest nodes feed.

use warnings;
use strict;
use LWP::UserAgent;
use XML::Simple;
use XML::RSS;

# pass in the URL for the newest nodes page, i.e.
# http://perlmonks.org/index.pl?node_id=30175
sub process_xml {
    my $url = shift;
    my %authors;
    my @questions;

    my $browser = LWP::UserAgent->new();
    $browser->timeout(30);
    my $response = $browser->get($url);
    die unless $response->is_success;

    my $tree = XMLin( $response->content );
    foreach my $node ( @{ $tree->{AUTHOR} } ) {
        $authors{ $node->{'node_id'} } = $node->{'content'};
    }
    foreach my $node ( @{ $tree->{NODE} } ) {
        my $item = {
            author  => $authors{ $node->{'author_user'} },
            node_id => $node->{'node_id'},
            subject => $node->{'content'}
        };
        push @questions, $item if $node->{'nodetype'} eq 'perlquestion';
    }

    my $rss = new XML::RSS (version => '1.0');
    $rss->channel(
        title       => "perlmonks newest questions",
        link        => "http://perlmonks.org",
        description => "perlmonks newest questions",
    );
    for my $item ( @questions ) {
        $rss->add_item(
            title => $item->{subject},
            link  => "http://perlmonks.org/index.pl?node_id=" . $item->{node_id}
        );
    }
    print $rss->as_string();
}

-xdg

Code posted by xdg on PerlMonks is public domain. It has no warranties, express or implied. Posted code may not have been tested. Use at your own risk.

DG: Updated to add strict and warnings to set a good example for new monks.
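
The post says the example can be adapted to pick out other parts of the feed. A minimal sketch of that idea follows, assuming the parsed tree has the same shape the posted code relies on ($tree->{AUTHOR} and $tree->{NODE} as array references of hashes). The helper name nodes_of_type and the sample data are made up for illustration and are not part of xdg's script.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): return the nodes of one
# type from a tree shaped like the one XMLin() yields in the posted code.
sub nodes_of_type {
    my ( $tree, $wanted_type ) = @_;

    # Map author node_id => author name, as the posted code does.
    my %authors;
    for my $author ( @{ $tree->{AUTHOR} || [] } ) {
        $authors{ $author->{node_id} } = $author->{content};
    }

    my @items;
    for my $node ( @{ $tree->{NODE} || [] } ) {
        next unless defined $node->{nodetype};
        next unless $node->{nodetype} eq $wanted_type;
        push @items, {
            author  => $authors{ $node->{author_user} },
            node_id => $node->{node_id},
            subject => $node->{content},
        };
    }
    return @items;
}

# Hand-built sample standing in for the parsed feed (values are invented).
my $sample_tree = {
    AUTHOR => [
        { node_id => 1, content => 'alice' },
        { node_id => 2, content => 'bob' },
    ],
    NODE => [
        {
            node_id     => 10,
            author_user => 1,
            nodetype    => 'perlquestion',
            content     => 'How do I sort a hash?',
        },
        {
            node_id     => 11,
            author_user => 2,
            nodetype    => 'monkdiscuss',
            content     => 'An RSS feed idea',
        },
    ],
};

# Select a different node type than the posted script does.
for my $item ( nodes_of_type( $sample_tree, 'monkdiscuss' ) ) {
    print "$item->{node_id}: $item->{subject} (by $item->{author})\n";
}
```

Passing 'perlquestion' instead reproduces the selection made in the posted script. The resulting list could be handed to the same add_item loop.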

Replies are listed 'Best First'.
Re: Code for Perlmonks XML to RSS
by Vautrin (Hermit) on Mar 30, 2004 at 17:09 UTC

    I was just playing around with your++ script. It's very cool, but I have the following comments / suggestions:

    1. The line:

      die unless $response->is_success;

      Really should be:

      die ("Could not fetch the web page because: " . $response->status_line) unless $response->is_success;
      die ("Content type not text/xml. It was " . $response->content_type) unless ($response->content_type eq 'text/xml');

      Otherwise you aren't told why a get() fails, and the script will go on to parse a non-XML web page -- which leads to some humorous results.

    2. Why is it even necessary to pass in the URL for the newest nodes page? Your script doesn't look like it will work on any page besides the newest nodes page, so doing a shift (@_) for the URL seems useless. Why not just set my $url = "http://www.perlmonks.org/index.pl?node_id=30175"; and be done with it?
    3. Excellent work. I thoroughly enjoyed it. Also, not to nitpick, but why don't you put

      use strict; use warnings;

      at the top of your script? It runs with no problems under them, and it's good practice (plus I think it's a good idea to help any new monks seeing it to get in the habit of using them).

    Good job again!
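
The checks from suggestion 1 could be wrapped in a small helper so the calling code stays short. This is only a sketch: the name check_response is made up, and it assumes the object passed in behaves like the HTTP::Response that LWP::UserAgent's get() returns (only is_success, status_line and content_type are called on it).

```perl
use strict;
use warnings;

# Hypothetical helper (name invented for this sketch): die with a
# descriptive message unless the response succeeded and is text/xml.
# Returns the response unchanged when both checks pass.
sub check_response {
    my ($response) = @_;

    die "Could not fetch the web page because: "
      . $response->status_line . "\n"
      unless $response->is_success;

    my $type = $response->content_type;
    die "Content type not text/xml. It was $type\n"
      unless $type eq 'text/xml';

    return $response;
}

# Typical use, assuming LWP::UserAgent and XML::Simple are loaded
# and $url is set:
#   my $response = check_response( LWP::UserAgent->new->get($url) );
#   my $tree     = XMLin( $response->content );
```

The trailing newline in each message keeps Perl from appending the file and line number, which is usually noise in a cron mail.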



      Thanks for the comments. As I said, it was quick and dirty, and your suggestions are worthwhile additions. For the record, this was pasted out of a larger script that did have strict and warnings on and originally took URLs from the command line, which is why the URL is passed in as an argument. I probably should have cleaned it up further. Certainly, adding strict and warnings even to my example, to encourage good practice among others, is a great suggestion.

      -xdg
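
The two positions in this exchange (hard-code the URL, or take it from the command line) can be combined by falling back to a default. A minimal sketch, assuming the Newest Nodes URL quoted earlier in the thread; the helper name feed_url is hypothetical.

```perl
use strict;
use warnings;

# Default taken from the thread: the Newest Nodes XML page.
my $DEFAULT_URL = 'http://www.perlmonks.org/index.pl?node_id=30175';

# Hypothetical helper: use the first argument if one was given,
# otherwise fall back to the default feed URL.
sub feed_url {
    my @args = @_;
    return ( defined $args[0] && length $args[0] ) ? $args[0] : $DEFAULT_URL;
}

my $url = feed_url(@ARGV);
print "Fetching $url\n";

# process_xml($url);    # the sub from the original post
```

Run with no arguments, the script uses the default; a URL given on the command line overrides it.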


Re: Code for Perlmonks XML to RSS
by exussum0 (Vicar) on Mar 30, 2004 at 16:50 UTC
    And if you have XSLT, it can be done that way too. But as always, TIMTOWTDI.


    -- "So far my experience has been that most people who go for certification have broad but not deep knowledge in the field and the flavor of the knowledge is academic. But every once in a while one finds a gem of a person who learns all the time and uses certification to prove it." -- on Orkut

Node Type: monkdiscuss [id://340820]
Approved by VSarkiss
Front-paged by matija