Beefy Boxes and Bandwidth Generously Provided by pair Networks
Pathologically Eclectic Rubbish Lister

Re: Quicker way to batch grab images?

by stonecolddevin (Vicar)
on Feb 17, 2014 at 22:03 UTC ( #1075225=note: print w/replies, xml ) Need Help??

in reply to Quicker way to batch grab images?

2 things:

  1. Is there any way you could use rsync for this? Or are you effectively scraping another site for the images? If you are just connecting to a server and grabbing the image files, you should look into rsync.
  2. Are you doing any other operations whilst procuring said images? If you're updating database records in tandem, etc. then obviously you're going to want to move those outside of your image downloading logic.

I've done a ton of image moving through S3, and I've found a lot of success with something like Parallel::Runner. I think davido's response is probably your best bet. Parallelize, and if at all possible, do some horizontal scaling so you can have multiple worker machines chipping away at your queue.

Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://1075225]
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (2)
As of 2017-03-28 23:03 GMT
Find Nodes?
    Voting Booth?
    Should Pluto Get Its Planethood Back?

    Results (342 votes). Check out past polls.