
Re: Quicker way to batch grab images?

by stonecolddevin (Vicar)
on Feb 17, 2014 at 22:03 UTC (#1075225)

in reply to Quicker way to batch grab images?

2 things:

  1. Could you use rsync for this, or are you effectively scraping another site for the images? If you are just connecting to a server and grabbing the image files, rsync is worth a look.
  2. Are you doing any other work while fetching the images? If you're updating database records in tandem, for example, move that work out of your image-downloading logic so it doesn't hold up the downloads.

I've moved a ton of images through S3, and I've had a lot of success with something like Parallel::Runner. I think davido's response is probably your best bet. Parallelize and, if at all possible, scale horizontally so multiple worker machines can chip away at your queue.
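
To make the "parallelize, and keep the other work out of the download path" idea concrete, here is a rough sketch in Python rather than Perl. It is not the original poster's code: `grab_all`, `fetch`, the `fetcher` parameter, and the worker count are all names and values made up for illustration, and it assumes each URL ends in a distinct file name.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import urlopen


def fetch(url, timeout=30):
    """Download one image and return its raw bytes (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read()


def grab_all(urls, dest_dir, workers=8, fetcher=fetch):
    """Download urls concurrently and return (saved, failed).

    Nothing but downloading and writing files happens in here; database
    updates and similar bookkeeping are left to the caller.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved, failed = {}, {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Map each pending download back to the URL it belongs to.
        futures = {pool.submit(fetcher, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                data = future.result()
            except Exception as exc:
                # Record the failure and keep going with the rest.
                failed[url] = exc
                continue
            # Assumption: the last path segment is a unique file name.
            target = dest / Path(urlparse(url).path).name
            target.write_bytes(data)
            saved[url] = target
    return saved, failed
```

Once `grab_all` returns, the `saved` and `failed` dictionaries can drive any database updates in a separate pass, so slow bookkeeping never blocks a download. The same split works if the URL list is sharded across several worker machines.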

Three thousand years of beautiful tradition, from Moses to Sandy Koufax, you're god damn right I'm living in the fucking past
