
Re: perl regex or module that identifies bots/crawlers

by shigetsu (Hermit)
on Mar 20, 2007 at 19:18 UTC (#605732)

in reply to perl regex or module that identifies bots/crawlers

Perhaps HTTP::BrowserDetect's robot() method?

Replies are listed 'Best First'.
Re^2: perl regex or module that identifies bots/crawlers
by argv (Pilgrim) on Mar 20, 2007 at 21:56 UTC
    Perhaps HTTP::BrowserDetect's robot() method?
    While I retain my enthusiasm for this module, and while it does precisely what I wanted (it provides a simple, generic series of regexes that determine whether a user agent is a robot), it suffers from the problem that plagues everyone who ventures into this area: it's impossible to keep up with the robots. I've found numerous databases of known robot names, and every one of them states that its list is incomplete. It is an unsolvable problem, which is the primary reason for the cryptic glyphs you see on pages (the ones that make you type something to prove you're human). That said, the robot() method does a good enough job for now, and it was certainly worth not having to spend more time on this problem. Great bang for the buck. PerlMonks rescued me once again...
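
    For readers who want to see the general shape of a regex-based check, here is a minimal sketch. It is not HTTP::BrowserDetect's actual pattern list: the looks_like_robot() helper, the substrings in the pattern, and the choice to treat an empty User-Agent as a robot are all assumptions made for illustration, and the list is incomplete for exactly the reason described above.

```perl
use strict;
use warnings;

# Hypothetical, deliberately incomplete set of substrings that often
# appear in robot User-Agent strings. Not taken from any module.
my $ROBOT_RE = qr/(?:bot\b|crawl|spider|slurp|archiver|libwww-perl|wget|curl)/i;

# Returns 1 if the User-Agent string looks like a robot, 0 otherwise.
sub looks_like_robot {
    my ($ua) = @_;

    # Assumption: a missing or empty User-Agent is treated as non-human.
    return 1 if !defined $ua || $ua eq '';

    return $ua =~ $ROBOT_RE ? 1 : 0;
}

# Classify two sample User-Agent strings and print the verdict for each.
for my $ua (
    'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)',
    'Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.2) Gecko/20070219 Firefox/2.0.0.2',
) {
    printf "%-5s %s\n", (looks_like_robot($ua) ? 'robot' : 'human'), $ua;
}
```

    A robot that sends a browser-like User-Agent passes straight through a check like this, which is why a maintained module (or a CAPTCHA) is usually the better bet.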
