PerlMonks  

Re: Quicker way to batch grab images?

by stonecolddevin (Vicar)
on Feb 17, 2014 at 22:03 UTC (#1075225=note)


in reply to Quicker way to batch grab images?

Two things:

  1. Is there any way you could use rsync for this? Or are you effectively scraping another site for the images? If you are just connecting to a server and grabbing the image files, you should look into rsync.
  2. Are you doing any other operations whilst procuring said images? If you're updating database records in tandem, for example, you're going to want to move that work outside of your image-downloading logic.

I've done a ton of image moving through S3, and I've found a lot of success with something like Parallel::Runner. I think davido's response is probably your best bet. Parallelize, and if at all possible, do some horizontal scaling so you can have multiple worker machines chipping away at your queue.
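
To make the "parallelize, and keep the bookkeeping separate" idea concrete, here is a rough sketch. It is written in Python with a thread pool rather than Perl with Parallel::Runner (which forks worker processes), so treat it as an illustration of the shape, not of that module's API. The names fetch_to_file and fetch_all, the opener parameter, and the default worker count are all made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.request import urlopen


def fetch_to_file(url, dest_dir, opener=urlopen):
    """Download one URL into dest_dir and return the saved path.

    The file name is simply the last path segment of the URL (an assumption;
    real image URLs may need a smarter naming scheme).
    """
    dest = Path(dest_dir) / url.rsplit("/", 1)[-1]
    with opener(url) as response:
        dest.write_bytes(response.read())
    return dest


def fetch_all(urls, dest_dir, workers=8, opener=urlopen):
    """Download urls concurrently and return (saved, failed).

    Nothing else happens in here: no database writes, no resizing.
    The caller does that afterwards, using the returned dictionaries.
    """
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    saved, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each future back to its URL so results can be attributed.
        futures = {pool.submit(fetch_to_file, u, dest_dir, opener): u for u in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                saved[url] = future.result()
            except Exception as exc:
                # One bad image should not stop the rest of the batch.
                failed[url] = exc
    return saved, failed
```

Once fetch_all returns, the database records can be updated in a single pass over saved, and anything in failed can be retried or pushed back onto the queue. To spread the work over several machines, one option is to hand each worker box its own slice of the URL list and run the same function there.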

Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past

