Re: Fetching an Image from HTTP

by Molt (Chaplain)
on Jun 07, 2002 at 14:29 UTC


in reply to Fetching an Image from HTTP

Have a look at LWP::Simple. It'll let you fetch the contents of any URL quickly and with minimal fuss, and from there you simply open a file and write the data out.
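
A minimal sketch (the URL and output filename here are only placeholders):

    use strict;
    use warnings;
    use LWP::Simple;

    # Placeholder URL and local filename.
    my $url  = 'http://www.example.com/someimage.jpg';
    my $file = 'someimage.jpg';

    # getstore() fetches the URL, writes the content straight to disk,
    # and returns the HTTP status code.
    my $status = getstore($url, $file);
    die "Couldn't fetch $url (got status $status)\n"
        unless is_success($status);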

There, that wasn't too painful.

Replies are listed 'Best First'.
(wil) Re: Re: Fetching an Image from HTTP
by wil (Priest) on Jun 07, 2002 at 14:38 UTC
    If you're just going to be fetching images, I would use Image::Grab instead of LWP::Simple. Of course, TMTOWTDI, and this way would be mine. =)

    Here's some example code. There's more in the POD documentation, of course.
    use strict;
    use warnings;
    use Image::Grab;

    # Construct the grabber, point it at the image, and fetch it.
    my $pic = Image::Grab->new;
    $pic->url('http://www.example.com/someimage.jpg');
    $pic->grab;

    # Write the image data out in binary mode.
    open(IMAGE, '>', 'image.jpg') || die "image.jpg: $!";
    binmode IMAGE;    # for MSDOS derivations.
    print IMAGE $pic->image;
    close IMAGE;

    It also supports a regex feature, which would be handy if you are unsure of the file extension of the image you're grabbing.

    You can instruct it to search a particular document on a website, and it will go through all the IMG tags looking for an image that matches your regex. It will then request that image using the document's URL as its referrer.

    Something like the following would look for .png images, but of course you can change the pattern to match a filename whose extension you don't know. That could be handy for documents that change the type of images they use, for some bizarre reason. =)

    $pic = Image::Grab->new(
        SEARCH_URL => 'http://localhost/gallery.html',
        REGEXP     => '.*\.png',
    );
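
    Put together, a minimal sketch of the whole search-and-save workflow (the gallery URL and output filename are just placeholders):

        use strict;
        use warnings;
        use Image::Grab;

        # Search the (placeholder) page for an image whose URL
        # matches the regex, then fetch it.
        my $pic = Image::Grab->new(
            SEARCH_URL => 'http://localhost/gallery.html',
            REGEXP     => '.*\.png',
        );
        $pic->grab;

        open(IMAGE, '>', 'found.png') || die "found.png: $!";
        binmode IMAGE;
        print IMAGE $pic->image;
        close IMAGE;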

    - wil
Re^2: Fetching an Image from HTTP
by Aldebaran (Curate) on Apr 07, 2012 at 08:35 UTC

    That was my first tack, and I thought I was on the right track, but what I ended up with using getstore() was files that kind of thought they were jpgs and kind of thought they were HTML docs. Here's the script I used:

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    open FILE, "text1.txt" or die $!;

    my $url;
    my $text;

    while (<FILE>) {
        $text = $_;
        $url  = 'http://www.nobeliefs.com/nazis/' . $text;
        $text =~ s#images/##;
        print "$url\n";
        print "$text\n";
        getstore($url, $text) or die "Can't download: $@\n";
    }

    An ls command shows question marks:

    $ ls
    ...
    prayingHitler.jpg?   PraysingCelebration.jpg?   priests-salute.jpg?
    received.jpg         reichchurch.gif?
    ...

    and when I open up a jpg it looks like this:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
    <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>Website Moved</title>
    <style type="text/css">
    .statusBox { width: 80px; }
    .fb {
        width: 43%; float: left; text-align: center;
        margin: 5px 20px 5px 20px; padding: 20px 0 20px 0px;
        background: #eef8fd; height: 110px;
        border: solid 1px #dff4fe;
    }
    .fb2 {
        width: 43%; float: right; text-align: center;
        margin: 5px 20px 5px 20px; padding: 20px 0 20px 0px;
        background: #eef8fd; height: 110px;
        border: solid 1px #dff4fe;
    ...

    I think the trick might be to find a way to define $params{URL} so that this works, but I haven't been able to do that yet (I only get errors):

    my $data = LWP::Simple::get $params{URL};

    my $filename = "image.jpg";
    open(FH, ">$filename");
    binmode(FH);
    print FH $data;
    close(FH);

      Since you're reading your URLs from a text file, each one has a newline on the end of it. There may be other problems with them. So you're requesting bad URLs from the server, and it's sending back an information page to tell you that, hence the "Website Moved" title of the HTML page you're getting back. Load the page you get back in a web browser (you might want to rename it to something.html first) to see what it's trying to tell you. (The same newline issue will cause weirdness with the local filenames you're saving to as well.)

      Inspect the actual URL you're requesting, right before requesting it, with a line like the following, and you should see the problem:

      print qq[ '$url' ];
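
      In other words, chomp each line before building the URL and the local filename. A minimal sketch of the corrected loop (assuming the same text1.txt layout as above):

          #!/usr/bin/perl
          use strict;
          use warnings;
          use LWP::Simple;

          open my $fh, '<', 'text1.txt' or die $!;

          while (my $text = <$fh>) {
              chomp $text;    # strip the trailing newline first
              my $url = 'http://www.nobeliefs.com/nazis/' . $text;
              $text =~ s#images/##;

              # getstore() returns an HTTP status code, not a boolean.
              my $status = getstore($url, $text);
              warn "Can't download $url (status $status)\n"
                  unless is_success($status);
          }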

      Aaron B.
      My Woefully Neglected Blog, where I occasionally mention Perl.

        Thanks so much, Aaron, that got me over the hump. I mistakenly posted on this thread because it looked similar to mine and I was using it as a reference. The original node is here: http://www.perlmonks.org/?node_id=963858 . I'd like to clean this up further by, for example, using WWW::Mechanize correctly as well as chomp, but I'll try to solicit comments back on the original.
