Re: Quicker way to batch grab images?

by stonecolddevin (Vicar)
in reply to Quicker way to batch grab images?

Two things:

  1. Is there any way you could use rsync for this? Or are you effectively scraping another site for the images? If you are just connecting to a server and grabbing the image files, you should look into rsync.
  2. Are you doing any other operations whilst procuring said images? If you're updating database records in tandem, for example, you're going to want to move that work outside of your image downloading logic so it doesn't hold up the downloads.

I've done a ton of image moving through S3, and I've found a lot of success with something like Parallel::Runner. I think davido's response is probably your best bet. Parallelize, and if at all possible, do some horizontal scaling so you can have multiple worker machines chipping away at your queue.
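To make the "parallelize" part concrete, here is a rough sketch of the kind of thing I mean. It is not the code I actually ran, just an illustration. It assumes Parallel::Runner is installed from CPAN and uses the core HTTP::Tiny module for the fetches. The fetch_images helper, the images directory, the worker count of 8, and the idea of deriving the filename from the URL are all made up for the example.

```perl
use strict;
use warnings;
use File::Basename qw(basename);
use File::Spec;
use HTTP::Tiny;
use Parallel::Runner;

# Hypothetical helper: download every URL in @$urls into $dir,
# running at most $max child processes at a time.
sub fetch_images {
    my ($urls, $dir, $max) = @_;

    -d $dir or mkdir $dir or die "Cannot create $dir: $!";

    my $runner = Parallel::Runner->new($max);

    for my $url (@$urls) {
        # Each job runs in a forked child; run() waits for a free
        # slot once $max children are already busy.
        $runner->run(sub {
            # Assumption: the last path segment is a usable filename.
            # URLs with query strings would need smarter handling.
            my $file = File::Spec->catfile($dir, basename($url));
            my $res  = HTTP::Tiny->new(timeout => 30)->mirror($url, $file);
            warn "Failed $url: $res->{status} $res->{reason}\n"
                unless $res->{success};
        });
    }

    # Wait for all outstanding children before returning.
    $runner->finish;
}

# Usage: perl fetch.pl http://example.com/a.jpg http://example.com/b.jpg
fetch_images(\@ARGV, 'images', 8) if @ARGV;
```

Note that the children only download. Anything like database updates would happen in a separate pass afterwards, per point 2 above. The right worker count depends on your bandwidth and on how much the remote server will tolerate, so treat 8 as a starting guess.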

