PerlMonks  

Re^4: Scrappy user_agent error

by docster (Novice)
on Jan 03, 2012 at 18:45 UTC (#946111)


in reply to Re^3: Scrappy user_agent error
in thread Scrappy user_agent error

Yes, it is entirely possible. I took the code above from the author's blog post. It is hard to find examples of a working Scrappy script. As of now I am using Scrappy (0.94112090).

And CPAN's docs for module version 0.94112090, from http://search.cpan.org/dist/Scrappy/lib/Scrappy.pm#user_agent:

user_agent

The user_agent attribute holds the Scrappy::Scraper::UserAgent object which is used to set and manipulate the user-agent header of the scraper.

    my $scraper = Scrappy->new;
    $scraper->user_agent;

So in that context, how would I set the user_agent correctly to be Firefox using Scrappy 0.94112090? There used to be a way. Maybe it was removed. I seem to be missing the entire picture somehow :)


Re^5: Scrappy user_agent error
by roboticus (Chancellor) on Jan 04, 2012 at 04:01 UTC

    docster:

    I've not used Scrappy, but perhaps you could check out the tests for examples on how to change the user agent name.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re^5: Scrappy user_agent error
by jethro (Monsignor) on Jan 04, 2012 at 10:16 UTC
      Tested this, but it still gives an error. Maybe Scrappy is just not the right tool for this job.
      Can't locate object method "user_agent" via package "scraper" (perhaps you forgot to load "scraper"?) at ./scrappy.pl line 8.

      #!/opt/local/bin/perl
      use strict;
      use warnings;
      use Scrappy;

      my $url = 'http://google.com';
      my $scraper = Scrappy->new;
      scraper->user_agent("opera","Macintosh");
      $scraper->get("$url");
      print $scraper->domain, "\n"; # print www.google.com
      __END__
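
A note on the error text: it names the package "scraper" rather than Scrappy, which is what Perl reports when a method is called on a bareword (`scraper->...`) instead of on the object in `$scraper`. The sketch below reproduces that message without Scrappy at all. `Demo::Scraper` and its `user_agent` method are hypothetical stand-ins made up for illustration, so this says nothing about what arguments Scrappy's own `user_agent` accepts.

```perl
use strict;
use warnings;

# Hypothetical stand-in class, NOT part of Scrappy.
package Demo::Scraper;
sub new        { return bless {}, shift }
sub user_agent { my ($self, @args) = @_; return "user_agent(@args)" }

package main;

my $scraper = Demo::Scraper->new;

# Without the sigil, Perl treats "scraper" as a class name and looks
# for a package called "scraper", which does not exist.
my $err = '';
eval { scraper->user_agent('opera', 'Macintosh'); 1 } or $err = $@;
print "bareword call failed: $err";

# With the sigil, the method is dispatched on the object.
my $ok = $scraper->user_agent('opera', 'Macintosh');
print "object call returned: $ok\n";
```

If the same missing `$` is what triggers the error in the script above, restoring it would at least get past this particular message; whether Scrappy then accepts those arguments is a separate question.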
