http://www.perlmonks.org?node_id=413299

Install the modules this RSS aggregator needs (DBI, a MySQL driver such as DBD::mysql, and XML::RSS::TimingBotDBI), then create the MySQL table it uses with the layout shown below. Without this table the script cannot store the RSS feeds it polls, and the code will fail:

+---------+--------------+------+-----+---------+-------+
| Field   | Type         | Null | Key | Default | Extra |
+---------+--------------+------+-----+---------+-------+
| feedurl | varchar(255) |      | PRI |         |       |
| nextup  | int(11)      | YES  |     | NULL    |       |
| lastmod | varchar(40)  | YES  |     | NULL    |       |
| etag    | varchar(250) | YES  |     | NULL    |       |
| content | longtext     | YES  |     | NULL    |       |
+---------+--------------+------+-----+---------+-------+

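A table with that layout could be created with a statement along the following lines. This is only a sketch reconstructed from the column listing above: the table name feeds and the database name myrssfeeds are the ones the script connects to, NOT NULL on feedurl is inferred from the empty Null column, and the comments on what each column holds are assumptions, not something the original post states.

```sql
-- Sketch only: reconstructed from the column listing, not taken from the original post.
-- Run inside the database the script connects to (myrssfeeds).
CREATE TABLE feeds (
    feedurl VARCHAR(255) NOT NULL,  -- feed URL; the script looks rows up by this key
    nextup  INT(11)      NULL,      -- presumably the next allowed poll time (assumption)
    lastmod VARCHAR(40)  NULL,      -- presumably the Last-Modified header value (assumption)
    etag    VARCHAR(250) NULL,      -- presumably the ETag header value (assumption)
    content LONGTEXT     NULL,      -- raw feed body, written by the script's UPDATE
    PRIMARY KEY (feedurl)
);
```

Running DESCRIBE feeds; afterwards should show the same columns as the listing above, although the exact Default output can differ between MySQL versions.
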
Then set up the code to run as a cron job every hour.
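
An hourly cron entry might look like the sketch below. The script path and log path are hypothetical placeholders; substitute wherever you saved the code and wherever you want its output to go.

```crontab
# m h dom mon dow  command
# Hypothetical paths: adjust /path/to/rssbot.pl and the log file to your setup.
0 * * * * /usr/bin/perl /path/to/rssbot.pl >> /path/to/rssbot.log 2>&1
```

The script prints one status line per feed and dies on the first feed that fails, so appending both stdout and stderr to a log keeps a record of which feed stopped a run.
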
#!/usr/bin/perl -w

use strict;
use warnings;

# list your feeds below in the format shown; leave the rest of the file alone
my(@feeds) = (
   # feedurl                                                     # forced refresh in seconds
   ['http://rss.news.yahoo.com/rss/world', 60 * 60],                          # hourly
   ['http://www.microsite.reuters.com/rss/topNews', 60 * 60],                 # hourly
   ['http://feeds.feedburner.com/TommysNewsAndWorldReport', 60 * 60],         # hourly
   ['http://perlmonks.org/index.pl?node_id=30175&xmlstyle=rss', 60 * 60],     # hourly
   ['http://www.wordsmith.org/awad/rss1.xml', 60 * 60 * 24],                  # daily
   ['http://xml.education.yahoo.com/rss/wotd/', 60 * 60 * 24],                # daily
   ['http://netrn.net/spywareblog/feed/rss2/', 60 * 60 * 24],                 # daily
);

# globals
use vars qw( $dbh );

# libraries
use XML::RSS::TimingBotDBI;
use DBI;

# connect to DB
$dbh = DBI->connect(
   q[DBI:mysql:] .
   qq[database=myrssfeeds;] .
   qq[host=localhost;] .
   qq[port=3306],
   '[PUT YOUR USERNAME HERE]', # MySQL DB username
   '[PUT YOUR PASSWORD HERE]', # ...and password
   { 'RaiseError' => 0, 'AutoCommit' => 1 }
) or die qq[Aborting! Failed to connect to database: $DBI::errstr];

foreach (@feeds) {

   my($feed) = $_;

   # check for an entry in the db corresponding to this feed
   my($row) = ( $dbh->selectrow_array(<<__SQL__, undef, $feed->[0]) )[0];
SELECT feedurl
FROM feeds
WHERE feedurl = ?
__SQL__

   unless ($row) {
      # auto-create db entry for this feed if it doesn't exist
      $dbh->do(q[INSERT INTO feeds SET feedurl = ?], undef, $feed->[0])
   }

   # grab the feed and save it
   getfeed(@$_);
}

sub getfeed {

   my($rssurl,$maxage) = @_;

   # initialize the RSS bot!
   my($rssbot) = XML::RSS::TimingBotDBI->new;

   $rssbot->rssagent_dbh($dbh);
   $rssbot->rssagent_table('feeds');
   $rssbot->maxAge($maxage) if $maxage;

   # grab the RSS feed
   my($response) = $rssbot->get($rssurl);

   # check response code
   if ($response->code == 200) {

      # save RSS feed content if it was successfully retrieved
      my($sth) = $dbh->prepare(q[UPDATE feeds SET content = ? WHERE feedurl = ?])
         or die q[RSSBOT: Aborting! Problem encountered with MySQL: ]
            . $DBI::errstr;

      $sth->execute($response->content, $rssurl)
         or die q[RSSBOT: Aborting! Problem encountered with MySQL: ]
            . $DBI::errstr;

      $sth->finish();

      print qq[RSSBOT: RSS feed "$rssurl" freshly retrieved to database\n]
   }
   elsif ($response->code == 304) {

      print qq[RSSBOT: feed "$rssurl" already up to date. No need to refresh\n]
   }
   else {

      # report the error and abort if there was a problem getting the feed
      die qq[RSSBOT: Aborting! Problem accessing feed "$rssurl": ]
         . $response->status_line
   }

   # have the rss bot save its RSS lookup history...
   # $rssbot->commit; #<-- only necessary if MySQL auto-commit is off

   # ...or die trying
   die q[RSSBOT: Aborting! Problem encountered while working with MySQL: ]
      . $DBI::errstr if $DBI::errstr;

   # update OK
   print qq[RSSBOT: update OK at ${\ scalar localtime }\n];
}

# scram
exit;

# disconnect if not already disconnected
END { $dbh->disconnect() if defined $dbh }

Replies are listed 'Best First'.
Re: Perl RSS aggregator
by Anonymous Monk on Jan 07, 2009 at 00:19 UTC
    This is probably flame bait, but SQLite is not heavily used (in my experience). It is being used by Mac OS X native applications (e.g. AddressBook etc.). On the other hand, chances are very good that your webserver is running an instance of mysqld and that you can get to a mysql command-line prompt pretty easily. Every single time I've come across a tutorial that used SQLite, I had a hard time getting the correct packages installed, tweaking syntax, etc. I think most corporate/high-traffic applications are going to be using MySQL over SQLite. But hey, the beautiful thing about the DBI module in Perl is that it makes switching between underlying database engines very easy.
•Re: Perl RSS aggregator
by merlyn (Sage) on Dec 08, 2004 at 22:34 UTC
    Why MySQL and not PostgreSQL?

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.


    update: To those that downvoted this posting, just be aware that you'll be seeing more just like this one. PostgreSQL is horribly under-known, and I'm going to make sure that every casual mention of MySQL has PostgreSQL mentioned somewhere in the thread. MySQL has had its day. PostgreSQL is now leading in terms of functionality and support, not to mention having a much better license structure.

      Hi merlyn!

      Short answer: because PG just doesn't cut it for me. It's just so clunky imo. I don't like it. Maybe I will someday. But today doesn't seem to be that day.

      --
      Tommy Butler, a.k.a. TOMMY
      

        In that case, why MySQL and not SQLite :)

        --
        <http://www.dave.org.uk>

        "The first rule of Perl club is you do not talk about Perl club."
        -- Chip Salzenberg

      I guess you've not heard of MariaDB then? (Which does better than both.) PostgreSQL is "under-known" for a reason. Stuffing it in people's faces only makes it annoying, and naturally more people will avoid it. What you're suggesting, and doing ("...I'm going to make sure that every casual mention of MySQL has PostgreSQL mentioned..."), is a big disservice to the PostgreSQL community; you're hurting us. So please stop.

        ... So please stop.

        Pay attention to the dates; I doubt he's kept it up since 2004.