PerlMonks  

Re: Need to remove session IDs

by wolfi (Scribe)
on Apr 10, 2004 at 03:16 UTC ( [id://344073]=note )


in reply to Need to remove session IDs

I may be totally under-thinking this, but here's a brief and crude thought: if the request comes from a bot, just bypass your normal routines (I'm assuming you want to do less work for bots anyway). Split the script into two sections: bots and not-bots.
    if (($ENV{'HTTP_USER_AGENT'} || '') =~ /googlebot\/2\.1|slurp|some_other_bot_name/i) {
        # put any other routines you want to run on the request
        # (like logs, etc.) here, or just...
        print "Location: http://www.example.org/$ENV{'PATH_INFO'}\n\n";
    }
    # And if not in your bot-list, do what you originally planned...
    else {
        # your original script's body here
    }

It would probably be easier to keep patterns like googlebot\/2\.1 in an array and build the match from that than to pile too many | alternatives into one regex, but I'm being lazy and not thinking too hard at the moment.
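A minimal sketch of that array idea, assuming the patterns are stored as precompiled regexes. The names @bot_patterns and is_bot are made up for illustration, and the list itself is only a placeholder:

```perl
use strict;
use warnings;

# Hypothetical bot list; extend it with whatever crawlers matter to you.
my @bot_patterns = (
    qr{googlebot/2\.1}i,
    qr{slurp}i,
);

# Returns 1 if the User-Agent string matches any pattern in the list, else 0.
sub is_bot {
    my ($user_agent) = @_;
    return 0 unless defined $user_agent;
    for my $pattern (@bot_patterns) {
        return 1 if $user_agent =~ $pattern;
    }
    return 0;
}
```

The if in the script then becomes if (is_bot($ENV{'HTTP_USER_AGENT'})) { ... }, and adding a bot means adding one line to the array.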

One word of caution before using something like this: any and all %ENV values need to be cleaned up, because values such as PATH_INFO are supplied by the client. Make sure they contain no evil characters. As a thought for the directories you have there, only accept the path if it passes a check like: $ENV{'PATH_INFO'}=~/^([a-zA-Z_0-9]+\/?)*$/
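A rough sketch of that cleanup, assuming the path should contain only letters, digits, underscores, and single slashes. The helper name clean_path is hypothetical. The pattern is arranged so that every slash must follow at least one word character, which should accept the same strings as the check above, and it uses \A and \z so that a trailing newline is rejected too:

```perl
use strict;
use warnings;

# Returns the validated path, or undef if it contains anything unexpected.
sub clean_path {
    my ($path) = @_;
    return unless defined $path;
    # Segments of word characters, each followed by a slash, then an
    # optional final segment with no trailing slash.
    if ($path =~ m{\A((?:[a-zA-Z0-9_]+/)*[a-zA-Z0-9_]*)\z}) {
        return $1;    # the captured value is what taint mode treats as clean
    }
    return;
}
```

Call it once on $ENV{'PATH_INFO'} and only print the Location header when the result is defined; anything else can fall through to an error page.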

One can't rely on environment variables too much (the User-Agent header in particular is easy to fake), but in this case it would probably work.
