How to read from an URL ?

by Ben Win Lue (Friar)
on Mar 31, 2005 at 15:26 UTC
Ben Win Lue has asked for the wisdom of the Perl Monks concerning the following question:


I would like to read from a URL.
The naive Perl approach would be something like:

open MYHANDLE "<"; while(<MYHANDLE>){ print $_; }
This doesn't work.
What would be the right approach?

Re: How to read from an URL ?
on Mar 31, 2005 at 15:28 UTC
    See the LWP module.
Re: How to read from an URL ?
on Mar 31, 2005 at 16:32 UTC
    OK, I know this is also LWP, but it can teach you something. If you have LWP installed with all its scripts, you have a GET command you can use in your shell, which dumps the page content to STDOUT.

    This means you can do

    open MYHANDLE, "GET|"; while(<MYHANDLE>) { print $_; }
    Also, you can use lynx for the same purpose (this is good because it has some options to remove tags, if you want that).
    open MYHANDLE, "lynx -dump|"; while(<MYHANDLE>) { print $_; }
    but, of course, use LWP. If you don't know where to start, look into LWP::Simple.
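For anyone starting from the LWP::Simple suggestion above, here is a minimal sketch of reading a URL line by line, much like the filehandle loop in the question. The helper name fetch_lines is made up for illustration, and the URLs are assumed to come from the command line; it also assumes LWP is installed.

```perl
use strict;
use warnings;
use LWP::Simple qw(get);

# Hypothetical helper: fetch a URL and return its content as a list of lines.
# LWP::Simple's get() returns undef on failure, so that case is checked.
sub fetch_lines {
    my ($url) = @_;
    my $content = get($url);
    die "Could not fetch $url\n" unless defined $content;
    # split /^/ breaks the text into lines and keeps the newlines
    return split /^/, $content;
}

# Fetched text may contain non-ASCII characters
binmode STDOUT, ':encoding(UTF-8)';

# Print every URL given on the command line, line by line
for my $url (@ARGV) {
    for my $line (fetch_lines($url)) {
        print $line;
    }
}
```

Saved under some name such as readurl.pl (also just an example name), it would be run with one or more URLs as arguments.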

      Thank you, this helped a lot.

      I can get the web page, but without the images.

      How can I get the images as well?
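One possible way to approach the image question, offered as a sketch rather than a definitive answer: fetching a page only returns its HTML, so the images have to be requested separately. The code below assumes HTML::LinkExtor and URI are available (they are normally installed alongside LWP); the helper name image_urls and the fallback file name are made up for illustration.

```perl
use strict;
use warnings;
use HTML::LinkExtor;
use URI;
use LWP::Simple qw(get getstore is_success);

# Hypothetical helper: return the absolute URLs of all <img src="...">
# found in $html, resolved against $base_url.
sub image_urls {
    my ($html, $base_url) = @_;
    my @urls;
    my $parser = HTML::LinkExtor->new(
        sub {
            my ($tag, %attr) = @_;
            push @urls, "$attr{src}" if $tag eq 'img' && defined $attr{src};
        },
        $base_url,    # with a base given, links are made absolute
    );
    $parser->parse($html);
    $parser->eof;
    return @urls;
}

# For every page URL on the command line, save its images
# into the current directory.
for my $page (@ARGV) {
    my $html = get($page);
    die "Could not fetch $page\n" unless defined $html;
    for my $img (image_urls($html, $page)) {
        # Use the last path segment as the local file name
        my $file = (URI->new($img)->path_segments)[-1] || 'image';
        my $status = getstore($img, $file);
        warn "Could not save $img (status $status)\n" unless is_success($status);
    }
}
```

This ignores images referenced from stylesheets or scripts, and images with the same file name would overwrite each other.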

Re: How to read from an URL ?
on Mar 31, 2005 at 15:29 UTC

    You want to look at LWP. This module allows you to read web pages and perform some task based on their content. There are other modules on CPAN that work with LWP as well.
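As a rough illustration of "read a page and do something based on its content", here is a sketch using LWP::UserAgent with explicit error checking. The task (pulling out the page title with a simple regex), the helper name page_title, and the 10 second timeout are all assumptions chosen for the example.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper: fetch $url and return the text of its <title>,
# or undef if the page has none. Dies if the request fails.
sub page_title {
    my ($url) = @_;
    my $ua = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->get($url);
    die "GET $url failed: ", $response->status_line, "\n"
        unless $response->is_success;
    my $html = $response->decoded_content // '';
    # Naive regex match; a real HTML parser is more robust
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Titles may contain non-ASCII characters
binmode STDOUT, ':encoding(UTF-8)';

# Report the title of every URL given on the command line
for my $url (@ARGV) {
    my $title = page_title($url);
    print "$url: ", (defined $title ? $title : '(no title)'), "\n";
}
```

Compared with LWP::Simple, the response object makes it possible to report why a request failed (via status_line) instead of only getting undef back.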

Re: How to read from an URL ?
on Mar 31, 2005 at 16:53 UTC

    This raises a very interesting question ... I'm actually somewhat surprised that IO::All doesn't do this already :-)

    After all ... IO::All is great at doing things the "naive" way...

    Update: Thanks, itub - I just looked at the IO::All docs, and didn't even notice the plethora of add-ons :-)

      It does, if you install IO::All::LWP:
      # examples from the SYNOPSIS
      use IO::All;
      "hello world\n" > io('ftp://localhost/test/x');   # save to FTP
      $content < io('');                                # GET webpage
      io('') > io('index.html');                        # save webpage

      And you can also tie it, treat it as a file, and use other IO::All tricks:

      my $io = io('')->tie;
      while (<$io>) {
          # do something
      }
        After already having succeeded with LWP::Simple, I tried IO::All.
        But it didn't compile. I got an error about a Spiffy constructor not being found. What did I do wrong?
Re: How to read from an URL ?
on Mar 31, 2005 at 16:39 UTC
    print `GET`;
      print `GET`;
      Erm, for that why not call GET directly from the command line?

