PerlMonks  

Using LWP instead of wget?

by kingram (Acolyte)
on Jul 26, 2012 at 22:42 UTC ( [id://983956]=perlquestion )

kingram has asked for the wisdom of the Perl Monks concerning the following question:

While not new to Perl, I have spent more time on databases and Java and decided to try using Perl to download podcast info. I really like Perl, but it takes a little more effort to get things going with it.

I want to download a large collection of podcasts which I have catalogued in a database. I downloaded the complete RSS feed and catalogued the shows, some of which I downloaded manually with wget.

I read in a previous post on here about someone using system(wget .....) in Perl, but a follow-up post suggested LWP.

I'm not really getting from the LWP documentation how to do this. I'll read it again a little more closely, but if I need to spend 5 hours reading about LWP to get 30 or so files, it's not an economical trade of time. It's easier to use system(wget -O...). However, if LWP is great, this would be my entry point to eventually becoming more fluent with it.

Would anyone be willing to show me example code using LWP to download podcasts from an array (of selected choices), to a specific directory, with a specific name for each file? I think that would help me get a better grip on LWP's usefulness over wget in a Perl script.

I have just about every other element of my automation done. This would be the last step.

Thanks for your time

Replies are listed 'Best First'.
Re: Using LWP instead of wget?
by BrowserUk (Patriarch) on Jul 26, 2012 at 23:01 UTC

    At its simplest, LWP::Simple will do what you've described:

    use LWP::Simple;

    my $url  = '...';
    my $file = '...';

    print "Getting $url and storing to $file returned: ", getstore( $url, $file );
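The original question asked for an array of selections, a specific directory, and a specific name per file. A minimal sketch of how getstore might be extended to cover that follows. The helper names target_path and fetch_all, the [ url, name ] pair layout, and the example URL are assumptions made for illustration, not part of LWP or of the reply above.

```perl
use strict;
use warnings;
use File::Spec;
use File::Path qw(make_path);
use LWP::Simple;    # exports getstore and is_success by default

# Build the local path for one download: target directory plus chosen name.
sub target_path {
    my ( $dir, $name ) = @_;
    return File::Spec->catfile( $dir, $name );
}

# Fetch every [ url, name ] pair into $dir and return the pairs that failed.
sub fetch_all {
    my ( $dir, @items ) = @_;
    make_path($dir) unless -d $dir;

    my @failed;
    for my $item (@items) {
        my ( $url, $name ) = @$item;
        my $status = getstore( $url, target_path( $dir, $name ) );
        if ( is_success($status) ) {
            print "OK   $status  $name\n";
        }
        else {
            warn "FAIL $status  $url\n";
            push @failed, $item;
        }
    }
    return @failed;
}

# Hypothetical usage (the URL and file name are placeholders):
#   my @failed = fetch_all( 'podcasts',
#       [ 'http://example.com/ep01.mp3', 'show-ep01.mp3' ],
#   );
```

getstore returns the HTTP status code, so collecting the failed pairs makes it possible to retry only those entries on a later run.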

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      I like it!

        I have to say, if you are simply downloading a big list of urls and don't mind waiting, I'd skip the Perl completely and use:

        wget -nd -i urls.list

        wget is fast enough, but serial.

        If I was in a real hurry, and the urls were spread across many servers, I might use something like (untested):

        #! perl -slw
        use strict;
        use threads ( stack_size => 4096 );
        use Thread::Queue;
        use LWP::Simple;

        my $Q = new Thread::Queue;

        sub worker {
            while( my $url = $Q->dequeue ) {
                chomp $url;
                ( my $file = $url ) =~ tr[/:?*"][_];    #"
                my $status = getstore $url, $file;
                $status == 200 or warn "$status : $url ($file)\n";
            }
        }

        our $THREADS //= 4;

        $Q->enqueue( <>, (undef) x $THREADS );

        $_->join for map threads->create( \&worker ), 1 .. $THREADS;

        ## use as thisScript -THREADS=8 < urls.list
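The threaded script above derives each local file name from the URL itself with tr. As a small illustration of what that mapping produces, here is the same transliteration wrapped in a function. The name url_to_filename and the example URL are assumptions for this sketch only.

```perl
use strict;
use warnings;

# Replace characters that are awkward in file names (/ : ? * ") with "_",
# using the same tr expression as the threaded script.
sub url_to_filename {
    my ($url) = @_;
    ( my $file = $url ) =~ tr[/:?*"][_];
    return $file;
}

print url_to_filename('http://example.com/feeds/ep01.mp3?x=1'), "\n";
# prints: http___example.com_feeds_ep01.mp3_x=1
```

The result is unique per URL but not especially readable, so when every file needs a specific name it is probably simpler to look the name up from the catalogue instead.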


Re: Using LWP instead of wget?
by bulk88 (Priest) on Jul 27, 2012 at 04:27 UTC
    If you know wget's syntax better than LWP's, and your target system has wget, use system() and wget. There is nothing wrong with using Perl for shell scripting. LWP has higher CPU/MEM overhead than the C wget. The core LWP (libwww-perl) doesn't support async requests. Backgrounding command line downloaders is an easy way to implement async internet requests in Perl. I personally use http://aria2.sourceforge.net/ as my command line downloader. Sometimes I run aria2 in daemon mode and control it through XML::RPC::Fast, having it download dozens of files to disk at the same time. Then I read the files synchronously from disk into Perl for processing; it's almost instantaneous because of OS write caching.
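For the system() route, a minimal sketch assuming wget is installed and on the PATH. The helper names wget_command and wget_fetch are made up for this example. Passing system() a list instead of a single string keeps the shell out of the picture, so URLs containing ? or & need no quoting.

```perl
use strict;
use warnings;

# Build the argument list for wget: -q silences progress output and
# -O names the output file.
sub wget_command {
    my ( $url, $path ) = @_;
    return ( 'wget', '-q', '-O', $path, $url );
}

# Run wget and report success; system() returns 0 when wget exits cleanly.
sub wget_fetch {
    my ( $url, $path ) = @_;
    my $rc = system( wget_command( $url, $path ) );
    warn "wget failed for $url (system returned $rc)\n" if $rc != 0;
    return $rc == 0;
}

# Hypothetical usage (the URL and path are placeholders):
#   wget_fetch( 'http://example.com/ep01.mp3', 'podcasts/show-ep01.mp3' );
```

This version still downloads one file at a time; the backgrounding described above would need extra process handling that is not shown here.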

      Nice!

      I'll play with that once I get my script working with wget and can update my selected podcast with automation. I really hate iTunes....

Re: Using LWP instead of wget?
by Anonymous Monk on Jul 26, 2012 at 23:03 UTC
      Jeez. Homework???
      *Sigh*
      Fine. I'll add them to my study list. In the meantime, I'm going to finish my little automation project.

      Good info though. Thanks
