http://www.perlmonks.org?node_id=547054


in reply to Predictive HTTP caching in Perl

For RSS feeds, there will be little or no content to cache, so I'd see this approach as a lot of work for uncertain benefit.

Something that will work is parallelizing the retrieval of the pages/feeds. Create an application, say with Parallel::ForkManager, that spawns multiple processes, each one fetching one site and processing it. Then assemble the results from all the children into your composite feed. The total time taken will be only a little longer than the slowest website/feed.
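
A minimal sketch of what I mean, assuming a made-up feed list and a scratch directory (each child writes what it fetched to a file, and the parent assembles the files after wait_all_children):

    #!/usr/bin/perl
    use strict;
    use warnings;
    use Parallel::ForkManager;
    use LWP::UserAgent;
    use File::Temp qw(tempdir);

    my @urls = (                       # placeholder feed list
        'http://example.com/feed1.rss',
        'http://example.com/feed2.rss',
    );

    my $dir = tempdir( CLEANUP => 1 );
    my $pm  = Parallel::ForkManager->new(10);   # at most 10 children at once

    for my $i ( 0 .. $#urls ) {
        $pm->start and next;           # parent keeps looping; child continues
        my $ua  = LWP::UserAgent->new( timeout => 10 );
        my $res = $ua->get( $urls[$i] );
        if ( $res->is_success ) {
            open my $fh, '>', "$dir/$i" or die $!;
            print {$fh} $res->content;
            close $fh;
        }
        $pm->finish;                   # child exits here
    }
    $pm->wait_all_children;

    # Back in the parent: merge whatever the children fetched.
    for my $i ( 0 .. $#urls ) {
        next unless -e "$dir/$i";
        # ... parse "$dir/$i" and add it to the composite feed ...
    }

(Children can't hand Perl data structures straight back to the parent, hence the temp files.)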

-Mark

Re^2: Predictive HTTP caching in Perl
by ryantate (Friar) on May 03, 2006 at 05:57 UTC
    Why do you say RSS feeds have no content to cache? There are the title, the date and then, well, the content, either in the description element or in content:encoded. And even in cases where it's just a title and a date, it takes time to open the connection and download the file.

    The benefit is that once a feed is cached, I do not have to connect to the server and download the page again. When there are 30 feeds, this adds up.
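
    For what it's worth, a minimal sketch of the conditional-GET side of this, assuming one cache file per feed (the feed list and paths are made up; LWP's mirror() sends If-Modified-Since for us and leaves the file alone on a 304):

        use strict;
        use warnings;
        use LWP::UserAgent;

        # Hypothetical feed => cache-file mapping; the directory must exist.
        my %feeds = (
            'http://example.com/feed.rss' => '/tmp/feed_cache/example.rss',
        );

        my $ua = LWP::UserAgent->new( timeout => 10 );
        while ( my ( $url, $file ) = each %feeds ) {
            # mirror() adds If-Modified-Since from the file's mtime;
            # on 304 Not Modified the cached copy is left untouched.
            my $res = $ua->mirror( $url, $file );
            if ( $res->code == 304 ) {
                print "$url unchanged, using cache\n";
            }
            elsif ( $res->is_success ) {
                print "$url refreshed\n";
            }
            else {
                warn "$url failed: ", $res->status_line, "\n";
            }
        }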

    I'm already parallelizing the retrieval. I'm using LWP::Parallel after finding little additional speed benefit from either POE or HTTP::GHTTP with P::ForkManager.
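
    For reference, the LWP::Parallel pattern I mean looks roughly like this (adapted from the module's synopsis; the URLs are placeholders):

        use strict;
        use warnings;
        use LWP::Parallel::UserAgent;
        use HTTP::Request;

        my @requests = map { HTTP::Request->new( GET => $_ ) }
            'http://example.com/feed1.rss',
            'http://example.com/feed2.rss';

        my $pua = LWP::Parallel::UserAgent->new;
        $pua->timeout(10);     # per-connection timeout
        $pua->redirect(1);     # follow redirects

        # register() queues each request; it returns a response
        # object only if something went wrong up front.
        for my $req (@requests) {
            if ( my $err = $pua->register($req) ) {
                warn $err->error_as_HTML;
            }
        }

        # wait() blocks until all requests finish, then returns
        # a hash of entries, one per registered request.
        my $entries = $pua->wait;
        for my $key ( keys %$entries ) {
            my $res = $entries->{$key}->response;
            print $res->request->url, ' => ', $res->code, "\n";
        }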

    Thanks anyway.