PerlMonks  

Re: Speeding up/parallelizing hundreds of HEAD requests

by aquarium (Curate)
on Sep 17, 2007 at 04:56 UTC (#639335)


in reply to Speeding up/parallelizing hundreds of HEAD requests

A fairly easy-to-implement cache that needs no extra code is Squid. Set up Squid properly and configure your LWP requests to use it as the proxy.
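A minimal sketch of the LWP side, assuming Squid listens on localhost:3128 (its default port). The helper names `make_ua` and `available` are made up for illustration. Whether this actually saves round trips depends on how Squid is configured and on the cache headers the origin server sends.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Assumption: a local Squid instance on its default port.
my $SQUID = 'http://localhost:3128/';

# Build a user agent that sends all plain-HTTP requests through the proxy.
sub make_ua {
    my ($proxy) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);
    $ua->proxy('http', $proxy);
    return $ua;
}

# Return the subset of URLs whose HEAD request succeeds.
sub available {
    my ($ua, @urls) = @_;
    return grep { $ua->head($_)->is_success } @urls;
}

# Only hit the network when URLs are given on the command line.
if (@ARGV) {
    my $ua = make_ua($SQUID);
    print "$_\n" for available($ua, @ARGV);
}
```

Run it as `perl check_heads.pl URL [URL ...]` (the script name is arbitrary); it prints only the URLs that answered the HEAD request successfully.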
Also, if the pdb format is always available, you could generate the other formats with your own software, provided that doesn't swamp the server with too many requests.
Another idea: write the file links with dynamic JavaScript that does a HEAD request and crosses out unavailable formats. This shifts the connections to the client, so if a search returns lots of links, it is up to the client's machine to resolve which file formats are available. It also ensures that the links are truly reachable from the client, not just from your server.
the hardest line to type correctly is: stty erase ^H

Replies are listed 'Best First'.
Re^2: Speeding up/parallelizing hundreds of HEAD requests
by hacker (Priest) on Sep 17, 2007 at 20:08 UTC

    Unfortunately, the latest versions of Squid are not SMP-aware (as noted by its core developers), and running it in front of Apache 2 yields a significant performance decrease.

    I did a lot of thorough tests on this exact point. I've run Squid in front of Apache 1.3.x for years, and found roughly a 400% increase in request response time on a uniprocessor machine.

    When I moved to Apache 2 on a dual-core SMP machine, I tested Squid in front of Apache 2.x, and found that my request responses dropped 75% as compared to Apache 2.x running natively on port 80. Apache is able to thread processes across multiple cores, but Squid is not.

    I do, however, have an internal Squid server running on my BSD machine, through which ALL outbound traffic on port 80 is transparently redirected (at the router, by some iptables rules), so my HEAD requests are already going there. I don't see any significant increase or decrease in performance when enabling or disabling that capability.

    It is an interesting idea, but I don't think it applies to this specific problem.
