PerlMonks  

Re^2: Fetching an Image from HTTP

by Aldebaran (Curate)
on Apr 07, 2012 at 08:35 UTC ( [id://963899]=note )


in reply to Re: Fetching an Image from HTTP
in thread Fetching an Image from HTTP

That was my first tack, and I thought I was on the right track, but what I ended up with using getstore() was files that kind of thought they were JPEGs and kind of thought they were HTML documents. Here's the script I used:

#!/usr/bin/perl -w
use strict;
use LWP::Simple;

open FILE, "text1.txt" or die $!;
my $url;
my $text;
while (<FILE>) {
    $text = $_;
    $url = 'http://www.nobeliefs.com/nazis/' . $text;
    $text =~ s#images/##;
    print "$url\n";
    print "$text\n";
    getstore($url, $text) or die "Can't download: $@\n";
}

An ls command shows question marks at the end of most filenames:

$ ls
...
prayingHitler.jpg?
PraysingCelebration.jpg?
priests-salute.jpg?
received.jpg
reichchurch.gif?
...

and when I open up a jpg it looks like this:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html>
<head>
<meta http-equiv="Content-type" content="text/html; charset=utf-8">
<title>Website Moved</title>
<style type="text/css">
.statusBox { width: 80px; }
.fb { width:43%; float:left; text-align:center; margin:5px 20px 5px 20px; padding:20px 0 20px 0px; background:#eef8fd; height:110px; border:solid 1px #dff4fe; }
.fb2 { width:43%; float:right; text-align:center; margin:5px 20px 5px 20px; padding:20px 0 20px 0px; background:#eef8fd; height:110px; border:solid 1px #dff4fe;
...

I think the trick might be to find a way to define %params (the snippet reads $params{URL}) such that this works, but I haven't been able to do that yet. (I only get errors.)

my $data = LWP::Simple::get $params{URL};
my $filename = "image.jpg";
open (FH, ">$filename");
binmode (FH);
print FH $data;
close (FH);

Re^3: Fetching an Image from HTTP
by aaron_baugher (Curate) on Apr 07, 2012 at 10:57 UTC

    Since you're reading your URLs from a text file, each one has a newline on the end of it. There may be other problems with them. So you're requesting bad URLs from the server, and it's sending back an information page to tell you that, hence the "Website Moved" title of the HTML page you're getting back. Load the page you get back in a web browser (you might want to rename it to something.html first) to see what it's trying to tell you. (The same newline issue will cause weirdness with the local filenames you're saving to as well.)

    Inspect the actual URL you're requesting, right before requesting it, with a line like the following, and you should see the problem:

    print qq[ '$url' ];

    Aaron B.
    My Woefully Neglected Blog, where I occasionally mention Perl.
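To make the newline problem described in this reply concrete, here is a minimal sketch. It is an illustration only, not code from the thread: the helper name `show` and the sample line are assumptions, and nothing is fetched from the network. It prints the URL once as it is built from a raw input line and once after `chomp`, with the otherwise invisible characters spelled out.

```perl
use strict;
use warnings;

# Hypothetical helper: make invisible line-ending characters visible.
sub show {
    my ($s) = @_;
    $s =~ s/\n/\\n/g;
    $s =~ s/\r/\\r/g;
    return "'$s'";
}

# Assumed sample: what reading one line from the list file hands back.
my $line = "images/received.jpg\n";
my $url  = 'http://www.nobeliefs.com/nazis/' . $line;
print show($url), "\n";    # the trailing newline shows up inside the quotes

chomp $line;               # remove the trailing newline before using the line
$url = 'http://www.nobeliefs.com/nazis/' . $line;
print show($url), "\n";    # now the URL ends right after ".jpg"
```

The same stray character ends up in the local filename passed to getstore(), which is a likely explanation for the question marks in the ls listing.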

      Thanks so much, Aaron, that got me over the hump. I mistakenly posted on this thread, which looked similar to mine and which I was using as a reference. The original node is here: http://www.perlmonks.org/?node_id=963858 . I'd still like to clean this up, for example by using chomp and by using WWW::Mechanize correctly, but I'll try to solicit comment back on the original.
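One possible shape for the chomp cleanup mentioned above, as a hedged sketch rather than the poster's final script. The function name `download_list` and the injectable `$fetch` code ref are assumptions introduced here so that the loop can be tried without a network; by default it falls back to LWP::Simple's getstore(), loaded only when needed. Note that getstore() returns the HTTP status code, which is a true value even for 404, so `getstore(...) or die` never fires and `$@` is not set by it. The sketch checks for a 2xx code instead.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread. $fetch is any code ref that takes
# ($url, $file) and returns an HTTP status code; the default wraps
# LWP::Simple::getstore, which is only loaded when a download is attempted.
sub download_list {
    my ($list, $base, $fetch) = @_;
    $fetch ||= sub { require LWP::Simple; LWP::Simple::getstore(@_) };

    open my $fh, '<', $list or die "Can't open $list: $!";
    my @failed;
    while (my $line = <$fh>) {
        $line =~ s/\s+\z//;           # like chomp, but also drops a stray \r or spaces
        next unless length $line;     # skip blank lines
        my $url = $base . $line;
        (my $file = $line) =~ s#images/##;   # same substitution as the original script
        my $rc = $fetch->($url, $file);      # HTTP status code, e.g. 200 or 404
        push @failed, "$url ($rc)" unless $rc >= 200 && $rc < 300;
    }
    close $fh;
    return @failed;
}

# Process every list file named on the command line, e.g.: perl fetch.pl text1.txt
for my $list (@ARGV) {
    print "failed: $_\n" for download_list($list, 'http://www.nobeliefs.com/nazis/');
}
```

A status check like this only catches responses the server marks as errors. A server that answers a bad URL with a normal page, such as the "Website Moved" page above, may still report success, so inspecting a saved file remains worthwhile.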
