PerlMonks  

Getting pictures with LWP

by Amoe (Friar)
on Aug 27, 2001 at 20:32 UTC ( [id://108164]=perlquestion )

Amoe has asked for the wisdom of the Perl Monks concerning the following question:

This isn't so much a problem as a request for comments: comments on this code, to be precise. It gets pictures or webpages, and either saves pictures to $save_path or returns the body of webpages. Yeah, and it's strict and -w compatible, although it might not be -w compatible under all circumstances.
sub get_object
{
    my($agent, $url, $referer, $save_path); # args: LWP::UserAgent, url to get, (optional) referer to grab that url with, (optional) path to save it to
    my($request, $response, $content); # http stuff

    $agent = shift();
    $url = shift();
    $referer = shift();
    $save_path = shift();

    $request = HTTP::Request->new('GET', $url);
    $request->referer($referer) if ($referer); # if referer param is specified set it as the referer header
    $save_path ? print("Getting picture $url...") : print("Getting info from $url\n");
    $response = $agent->request($request);
    $content = $response->content();

    return($content) unless ($save_path); # if we're not saving it, return the html

    # now save it
    if (length($content) >= 5120) # check for a dud picture
    {
        open(OUT, ">$save_path") or die("Couldn\'t open picturefile to save: $!");
        binmode(OUT); # if we're saving it it must be a picture, therefore binary
        print(OUT $content);
        close(OUT);
        return(1);
    }
    else
    {
        print("Picture was a dud.");
        return(0);
    }
}
It assumes that valid pictures aren't going to be under 5KB, reckless perhaps, but it suffices for the job. I just know you're gonna cuss that part to pieces, so...fire away! I was also thinking maybe I should check for binary and text data in $content, but I'm not quite sure how to do that without first writing to file and using filetest. Okay, flame away!

Replies are listed 'Best First'.
Re: Getting pictures with LWP
by damian1301 (Curate) on Aug 27, 2001 at 21:19 UTC
    I think you may prefer the easier Image::Grab which has a really easy to use interface and easy commands.

    Here is your code rewritten (in the way I would write it :).

    sub get_object{
        use Image::Grab qw(grab);
        my($agent, $url, $referer, $save_path) = @_;
        my($request, $response, $content);
        $content = grab("$url");
        $save_path ? print("Getting picture: $url...") : print("Getting info from $url\n");
        return($content) unless ($save_path);
        open(OUT, ">$save_path") or die("Couldn\'t open picturefile to save: $!");
        binmode(OUT);
        print(OUT $content);
        close(OUT);
    }


    $_.=($=+(6<<1));print(chr(my$a=$_));$^H=$_+$_;$_=$^H; print chr($_-39); # Easy but its ok.
      I would use this, but being able to specify the "referer" parameter in the requests I make is key in the script this is part of, and Image::Grab doesn't provide that functionality, AFAIK.
Re: Getting pictures with LWP
by wog (Curate) on Aug 27, 2001 at 21:38 UTC
    A few pieces of advice.

    sub get_object
    {
        my($agent, $url, $referer, $save_path); # args: LWP::UserAgent, url to get, (optional) referer to grab that url with, (optional) path to save it to
        my($request, $response, $content); # http stuff

        $agent = shift();
        $url = shift();
        $referer = shift();
        $save_path = shift();

    Why do you need all these shifts? It's more concise, and still clear what you're doing, if you write:

    my($agent, $url, $referer, $save_path) = @_;
    # args:
    #   $agent     - LWP::UserAgent object
    #   $url       - URL to get
    #   $referer   - referer to grab URL with (optional)
    #   $save_path - path to save image/etc. to (optional)
    my($request, $response, $content); # for http stuff

    Personally, I'd recommend documenting this function with POD.
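For what it's worth, a minimal sketch of what such POD might look like. The section layout is only one common convention, the wording is inferred from the code as posted, and the sub body here is a placeholder stub rather than the real function.

```perl
use strict;
use warnings;

=head1 NAME

get_object - fetch a URL and return its body or save it to a file

=head1 SYNOPSIS

    my $html = get_object($agent, $url);
    my $ok   = get_object($agent, $url, $referer, $save_path);

=head1 ARGUMENTS

=over 4

=item $agent

An LWP::UserAgent object used to make the request.

=item $url

The URL to get.

=item $referer

Optional. Sent as the Referer header when given.

=item $save_path

Optional. Path to save the fetched content to.

=back

=head1 RETURN VALUE

Without $save_path, the response body. With $save_path, 1 if the file
was written, or 0 if the content was under 5120 bytes and was treated
as a dud.

=cut

# Placeholder stub for illustration only; the real body is the code
# under discussion in this thread.
sub get_object {
    my ($agent, $url, $referer, $save_path) = @_;
    return;
}
```

Running perldoc on the file should then render the documentation, and the Perl parser skips everything between the first =head1 and =cut.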

    # ...
    $save_path ? print("Getting picture $url...") : print("Getting info from $url\n");

    This is better written as:

    print $save_path ? "Getting picture $url..." : "Getting info from $url\n";

    (Remove redundant code like that.) update: Getting a little picky here, that's probably better written as:

    print defined $save_path ? "Getting picture $url..." : "Getting info from $url\n";

    ... because you don't want your code to fail if you try to save to a file called '0' in the current directory (though the null string, '', is an interesting case). At the same time you should change the unless ($save_path) test below to use defined as well. Fortunately your $referer test is probably safe because '0' and '' are not valid referers.
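To make the difference between a true value and a defined one concrete, here is a small sketch. The helper name progress_message and the sample paths are made up for illustration; only the ternary itself comes from the advice above.

```perl
use strict;
use warnings;

# Hypothetical helper: choose the progress message based on whether a
# save path was supplied at all, not on whether it is "true".
sub progress_message {
    my ($url, $save_path) = @_;
    return defined $save_path
        ? "Getting picture $url..."
        : "Getting info from $url\n";
}

# '0' is a legal file name but is false in boolean context;
# undef is both false and undefined.
for my $path ('0', 'pic.jpg', undef) {
    my $shown   = defined $path ? "'$path'" : 'undef';
    my $truthy  = $path         ? 'true'    : 'false';
    my $defined = defined $path ? 'yes'     : 'no';
    print "$shown: truthy=$truthy, defined=$defined\n";
}
```

With a plain truth test, a save path of '0' would be treated the same as no save path at all; the defined test keeps the two cases apart.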

    You say: I was also thinking maybe I should check for binary and text data in $content, but I'm not quite sure how to do that without first writing to file and using filetest.

    In most cases you can probably test for this by checking the HTTP response:

    my $type = $response->headers->header("Content-Type");
    if (defined $type and $type =~ /^text/) {
        # ... it's probably text
    } else {
        # ... probably not text
    }
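The same idea could stand in for the 5KB size guess when deciding whether to save. A rough sketch follows, assuming the server sends an honest Content-Type header. The helper name is_image_response is made up, the is_success check is an addition that is not in the code above, and the responses are built by hand so the example runs without network access.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hypothetical helper: true when the request succeeded and the
# Content-Type header names an image type.
sub is_image_response {
    my ($response) = @_;
    return 0 unless $response->is_success;
    # In scalar context content_type gives just the media type,
    # lowercased and without parameters such as charset.
    my $type = $response->content_type;
    return (defined $type && $type =~ m{^image/}) ? 1 : 0;
}

# Hand-built responses standing in for what $agent->request would return.
my $jpeg = HTTP::Response->new(200, 'OK',
    ['Content-Type' => 'image/jpeg'], 'fake image bytes');
my $html = HTTP::Response->new(200, 'OK',
    ['Content-Type' => 'text/html; charset=utf-8'], '<html></html>');
my $gone = HTTP::Response->new(404, 'Not Found',
    ['Content-Type' => 'image/png'], '');

for my $r ($jpeg, $html, $gone) {
    printf "%s %s: %s\n", $r->code, $r->content_type,
        is_image_response($r) ? 'save it' : 'skip it';
}
```

This only trusts what the server claims, so a mislabelled response would still slip through; it is a cheaper first check, not a replacement for inspecting the data.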
