Re: Fetching an Image from HTTP

by Molt (Chaplain)
on Jun 07, 2002 at 14:29 UTC


in reply to Fetching an Image from HTTP

Have a look at LWP::Simple. It'll let you fetch the contents of any URL quickly and with minimal fuss, and from there you simply open a file and write the data out.
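
A minimal sketch (the URL and output filename here are only placeholders):

    use strict;
    use warnings;
    use LWP::Simple;

    # Placeholder URL and local filename.
    my $url  = 'http://www.example.com/someimage.jpg';
    my $file = 'someimage.jpg';

    # getstore() fetches the URL, writes the content straight to disk,
    # and returns the HTTP status code.
    my $status = getstore($url, $file);
    die "Couldn't fetch $url (got status $status)\n"
        unless is_success($status);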

There, that wasn't too painful.

Replies are listed 'Best First'.
(wil) Re: Re: Fetching an Image from HTTP
by wil (Priest) on Jun 07, 2002 at 14:38 UTC
    If you're just going to be fetching images, I would use Image::Grab instead of LWP::Simple. Of course, TMTOWTDI, and this way would be mine. =)

    Here's some example code. There's more in the POD documentation, of course.
    use strict;
    use warnings;
    use Image::Grab;

    # Construct the grabber, point it at the image, and fetch it.
    my $pic = Image::Grab->new;
    $pic->url('http://www.example.com/someimage.jpg');
    $pic->grab;

    # Write the image data out in binary mode.
    open(IMAGE, '>', 'image.jpg') || die "image.jpg: $!";
    binmode IMAGE;    # for MSDOS derivations.
    print IMAGE $pic->image;
    close IMAGE;

    It also supports a regex feature, which would be handy if you are unsure of the file extension of the image you're grabbing.

    You can instruct it to search a particular document on a website, and it will go through all the IMG tags looking for an image that matches your regex. It will then request that image using the document's URL as its referrer.

    Something like the following would look for .png images, but of course you can change the pattern to match a filename whose extension you don't know. That could be handy for documents that change the type of images they use, for some bizarre reason. =)

    $pic = Image::Grab->new(
        SEARCH_URL => 'http://localhost/gallery.html',
        REGEXP     => '.*\.png',
    );
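
    Put together, a minimal sketch of the whole search-and-save workflow (the gallery URL and output filename are just placeholders):

        use strict;
        use warnings;
        use Image::Grab;

        # Search the (placeholder) page for an image whose URL
        # matches the regex, then fetch it.
        my $pic = Image::Grab->new(
            SEARCH_URL => 'http://localhost/gallery.html',
            REGEXP     => '.*\.png',
        );
        $pic->grab;

        open(IMAGE, '>', 'found.png') || die "found.png: $!";
        binmode IMAGE;
        print IMAGE $pic->image;
        close IMAGE;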

    - wil
Re^2: Fetching an Image from HTTP
by Aldebaran (Curate) on Apr 07, 2012 at 08:35 UTC

    That was my first tack, and I thought I was on the right track, but what I ended up with using getstore() was files that kind of thought they were jpgs and kind of thought they were HTML docs. Here's the script I used:

    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;

    open FILE, "text1.txt" or die $!;

    my $url;
    my $text;

    while (<FILE>) {
        $text = $_;
        $url  = 'http://www.nobeliefs.com/nazis/' . $text;
        $text =~ s#images/##;
        print "$url\n";
        print "$text\n";
        getstore($url, $text) or die "Can't download: $@\n";
    }

    An ls command shows question marks:

    $ ls
    ...
    prayingHitler.jpg?   PraysingCelebration.jpg?   priests-salute.jpg?
    received.jpg         reichchurch.gif?
    ...

    and when I open up a jpg it looks like this:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
        "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html>
    <head>
    <meta http-equiv="Content-type" content="text/html; charset=utf-8">
    <title>Website Moved</title>
    <style type="text/css">
    .statusBox { width: 80px; }
    .fb {
        width: 43%; float: left; text-align: center;
        margin: 5px 20px 5px 20px; padding: 20px 0 20px 0px;
        background: #eef8fd; height: 110px;
        border: solid 1px #dff4fe;
    }
    .fb2 {
        width: 43%; float: right; text-align: center;
        margin: 5px 20px 5px 20px; padding: 20px 0 20px 0px;
        background: #eef8fd; height: 110px;
        border: solid 1px #dff4fe;
    ...

    I think the trick might be to find a way to define $params{URL} so that this works, but I haven't been able to do that yet (I only get errors):

    my $data = LWP::Simple::get $params{URL};

    my $filename = "image.jpg";
    open(FH, ">$filename");
    binmode(FH);
    print FH $data;
    close(FH);

      Since you're reading your URLs from a text file, each one has a newline on the end of it. There may be other problems with them. So you're requesting bad URLs from the server, and it's sending back an information page to tell you that, hence the "Website Moved" title of the HTML page you're getting back. Load the page you get back in a web browser (you might want to rename it to something.html first) to see what it's trying to tell you. (The same newline issue will cause weirdness with the local filenames you're saving to as well.)

      Inspect the actual URL you're requesting, right before requesting it, with a line like the following, and you should see the problem:

      print qq[ '$url' ];
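
      In other words, chomp each line before building the URL and the local filename. A minimal sketch of the corrected loop (assuming the same text1.txt layout as above):

          #!/usr/bin/perl
          use strict;
          use warnings;
          use LWP::Simple;

          open my $fh, '<', 'text1.txt' or die $!;

          while (my $text = <$fh>) {
              chomp $text;    # strip the trailing newline first
              my $url = 'http://www.nobeliefs.com/nazis/' . $text;
              $text =~ s#images/##;

              # getstore() returns an HTTP status code, not a boolean.
              my $status = getstore($url, $text);
              warn "Can't download $url (status $status)\n"
                  unless is_success($status);
          }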

      Aaron B.
      My Woefully Neglected Blog, where I occasionally mention Perl.

        Thanks so much, Aaron, that got me over the hump. I mistakenly posted on this thread because it looked similar to mine and I was using it as a reference. The original node is here: http://www.perlmonks.org/?node_id=963858 . I'd like to clean this up further by, for example, using WWW::Mechanize correctly as well as chomp, but I'll try to solicit comments back on the original.
