Re: Quicker way to batch grab images?

by stonecolddevin (Vicar)
on Feb 17, 2014 at 22:03 UTC

in reply to Quicker way to batch grab images?

Two things:

  1. Could you use rsync for this, or are you effectively scraping another site for the images? If you are just connecting to a server you have shell or rsync access to and grabbing the image files, you should look into rsync.
  2. Are you doing any other operations while fetching the images? If you're updating database records in tandem, for example, you'll want to move that work out of your image-downloading logic.

I've moved a ton of images through S3, and I've had a lot of success with something like Parallel::Runner. I think davido's response is probably your best bet: parallelize and, if at all possible, scale horizontally so that multiple worker machines can chip away at your queue.
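To make the "parallelize" part concrete, here is a rough sketch of a small forked worker pool. It is not Parallel::Runner itself: it uses plain fork plus the core HTTP::Tiny module so it runs without anything from CPAN, and Parallel::Runner wraps the same idea behind a nicer interface. The helper names (partition, fetch_in_parallel), the worker count of 4, and the 30-second timeout are all made up for illustration. It also assumes each URL ends in a usable file name with no query string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTTP::Tiny;
use File::Basename qw(basename);

# Split a list into $n slices, dealing items out round-robin.
sub partition {
    my ($n, @items) = @_;
    my @slices = map { [] } 1 .. $n;
    push @{ $slices[ $_ % $n ] }, $items[$_] for 0 .. $#items;
    return @slices;
}

# Fork one child per non-empty slice; each child mirrors its URLs into $dir.
# Returns true only if every child exited cleanly.
sub fetch_in_parallel {
    my ($workers, $dir, @urls) = @_;
    my @pids;
    for my $slice (partition($workers, @urls)) {
        next unless @$slice;
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {
            # Child: download only, no database work in here.
            my $http   = HTTP::Tiny->new(timeout => 30);
            my $failed = 0;
            for my $url (@$slice) {
                my $res = $http->mirror($url, "$dir/" . basename($url));
                next if $res->{success};
                warn "failed: $url ($res->{status})\n";
                $failed = 1;
            }
            exit $failed;
        }
        push @pids, $pid;
    }
    my $ok = 1;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok = 0 if $? != 0;
    }
    return $ok;
}

# URLs come from the command line; with no arguments nothing is fetched.
if (@ARGV) {
    fetch_in_parallel(4, '.', @ARGV) or warn "some downloads failed\n";
}
```

Run it as something like `perl fetch.pl URL1 URL2 ...`. Since mirror sends If-Modified-Since for files that already exist locally, re-running the script should skip images the server reports as unchanged. Anything like database updates would happen in the parent after the children finish, not inside the download loop.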

Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past
