Beefy Boxes and Bandwidth Generously Provided by pair Networks
good chemistry is complicated,
and a little bit messy -LW
 
PerlMonks  

Re^4: Imploding URLs

by mobiGeek (Beadle)
on Jun 11, 2005 at 03:54 UTC ( [id://465745]=note: print w/replies, xml ) Need Help??


in reply to Re^3: Imploding URLs
in thread Imploding URLs

Yes, the savings is quite significant. From a hand-selected list of one user's habits, I was able to reduce some pages by more than 20%.

The other thing is that this is not a collection of URLs from across the entire web. The URLs being crawled vary, but the proxy is part of a kind of "portal". So there are potentially thousands of URLs, but they come from a select list of sites. Thus the reason I am looking for the weighting of substrings.

If one URL or one particular site (i.e. a particular substring) is crawled extremely frequently, then imploding that string might be much more bandwidth saving than simply imploding "http://" on all URLs.

mG.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://465745]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others learning in the Monastery: (5)
As of 2024-04-19 15:02 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found