PerlMonks  

Re: Predictive HTTP caching in Perl

by kvale (Monsignor)
on May 03, 2006 at 04:26 UTC (#547054)


in reply to Predictive HTTP caching in Perl

For RSS feeds, there will be little or no content to cache, so I'd see this approach as a lot of work for uncertain benefit.

Something that will work is parallelizing the retrieval of the pages/feeds. Create an application, say with Parallel::ForkManager, that spawns multiple processes, each one fetching one site and processing it. Then assemble the results from all the children into your composite feed. The total time will be only a little longer than that of the slowest website/feed.
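A minimal sketch of that fork-per-feed idea, assuming a placeholder @urls list and LWP::Simple for the fetch; note that passing data back through $pm->finish requires Parallel::ForkManager 0.7.6 or later:

```perl
use strict;
use warnings;
use Parallel::ForkManager;
use LWP::Simple qw(get);

# Placeholder feed list -- substitute your own
my @urls = ('http://example.com/feed1.rss', 'http://example.com/feed2.rss');
my @results;

my $pm = Parallel::ForkManager->new(10);   # at most 10 children at once

# Collect each child's result in the parent as the child exits
$pm->run_on_finish(sub {
    my ($pid, $exit, $url, $signal, $core, $data) = @_;
    push @results, { url => $url, content => $$data } if defined $data;
});

for my $url (@urls) {
    $pm->start($url) and next;             # parent loops on; child falls through
    my $content = get($url);               # fetch (and process) in the child
    $pm->finish(0, \$content);             # ship the result back to the parent
}
$pm->wait_all_children;

# @results now holds one entry per feed, ready to assemble
```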

-Mark


Re^2: Predictive HTTP caching in Perl
by ryantate (Friar) on May 03, 2006 at 05:57 UTC
    Why do you say RSS feeds have no content to cache? There are the title, the date, and then the content itself, either in the description element or in content:encoded. And even in cases where it's just a title and a date, it takes time to open the connection and download the file.

    The benefit is that once a feed is cached, I do not have to connect to the server and download the page again. With 30 feeds, this adds up.
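    The usual way to get that benefit is a conditional GET: send the server the validators from the last fetch and reuse the cached copy on a 304 response. A sketch with LWP::UserAgent, where the %cache hash (holding etag, last_modified, and content per URL) is a hypothetical in-memory store:

```perl
use strict;
use warnings;
use LWP::UserAgent;

my %cache;    # hypothetical per-URL store: etag, last_modified, content
my $ua = LWP::UserAgent->new;

sub fetch_feed {
    my ($url) = @_;
    my $req = HTTP::Request->new(GET => $url);
    if (my $c = $cache{$url}) {
        $req->header('If-None-Match'     => $c->{etag})          if $c->{etag};
        $req->header('If-Modified-Since' => $c->{last_modified}) if $c->{last_modified};
    }
    my $res = $ua->request($req);
    if ($res->code == 304) {               # Not Modified: reuse the cached copy
        return $cache{$url}{content};
    }
    $cache{$url} = {
        etag          => scalar $res->header('ETag'),
        last_modified => scalar $res->header('Last-Modified'),
        content       => $res->content,
    };
    return $res->content;
}
```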

    I'm already parallelizing the retrieval. I'm using LWP::Parallel after finding little additional speed benefit from either POE or HTTP::GHTTP with P::ForkManager.
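    For reference, parallel retrieval with LWP::Parallel looks roughly like the following sketch; @urls and the timeout are placeholders:

```perl
use strict;
use warnings;
use LWP::Parallel::UserAgent;
use HTTP::Request;

# Placeholder feed list -- substitute your own
my @urls = ('http://example.com/feed1.rss', 'http://example.com/feed2.rss');

my $pua = LWP::Parallel::UserAgent->new;
$pua->max_hosts(10);                        # contact up to 10 hosts at once
$pua->register(HTTP::Request->new(GET => $_)) for @urls;

my $entries = $pua->wait(30);               # block until done, 30 s timeout
for my $entry (values %$entries) {
    my $res = $entry->response;
    # process $res->content for each feed here
}
```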

    Thanks anyway.
