
Untainting cookies

by mothra (Hermit)
on Apr 11, 2001 at 06:00 UTC ( #71574=perlquestion )

mothra has asked for the wisdom of the Perl Monks concerning the following question:

I'm developing a site that uses cookies to identify users. Initially, to generate the ID that I bake up each cookie with, I use this commonly seen code:

    sub unique_id() {
        # Use Apache's mod_unique_id if available
        return $ENV{UNIQUE_ID} if exists $ENV{UNIQUE_ID};

        require Digest::MD5;
        my $md5 = new Digest::MD5;
        my $remote = $ENV{REMOTE_ADDR} . $ENV{REMOTE_PORT};

        # ** Note ** This is intended to be unique, not unguessable
        my $id = $md5->md5_base64(time, $$, $remote);
        $id =~ tr|+/=|-_.|;  # make non-word characters URL friendly
        return $id;
    }

Currently, I'm trying this cheap Camel ripoff to untaint a cookie that the client sends back to me (i.e. I've already generated the cookie for this client, so they pass it to my program, which makes it tainted):

    sub untaint_cart_id($) {
        my $old_id = shift;
        my $cart_id;
        #print "$old_id<BR>";
        if ($old_id =~ /^([-\@\w.]+)$/) {
            $cart_id = $1;
        } else {
            die("Bad Cart ID");
        }
        #print "$cart_id<BR>";
        return $cart_id;
    }

which dies often (in fact, anytime the cookie's ID doesn't contain a mix of -'s, @'s and word chars).

So how can I untaint the cookie when the user returns to the site? I obviously would rather not pull any /^(.*)$/ ugliness, because that doesn't get me anywhere.

Replies are listed 'Best First'.
Re: Untainting cookies
by merlyn (Sage) on Apr 11, 2001 at 06:08 UTC
    Are you so concerned about the size that you can't use hex instead of base64? Hex works fine, and has very safe characters which can be interpolated everywhere.

    Here's what Apache::Session used the last time I looked:

    require MD5;
    my $session = MD5->hexhash(MD5->hexhash(time.{}.rand().$$));
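
A minimal sketch of the hex approach together with a matching untaint check. It assumes Digest::MD5's `md5_hex` in place of the older MD5 module shown above; the sub names `hex_session_id` and `untaint_hex_id` are hypothetical, and the ID is still only meant to be unique, not unguessable.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Build a 32-character lowercase hex ID from time, PID and a random number.
# As in the original post, this aims to be unique, not unguessable.
sub hex_session_id {
    return md5_hex(join '|', time, $$, rand());
}

# Untaint by capturing exactly 32 lowercase hex digits; reject anything else.
sub untaint_hex_id {
    my ($raw) = @_;
    return $1 if defined $raw && $raw =~ /\A([0-9a-f]{32})\z/;
    die "Bad session ID\n";
}

my $id = untaint_hex_id(hex_session_id());
print "$id\n";
```

Because the alphabet and length are fixed, the untaint pattern can be exact rather than a loose character class.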

    -- Randal L. Schwartz, Perl hacker

      How about just tr'ing the initial Base64 ID like so:
      $id =~ tr|+/=|___|; # or $id =~ tr|+/=|000|;
      You would lose just a few bits of randomness (acceptable in this application), but would be left with a shorter ID that's an easy match with a /\w/.
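
Roughly, the tr idea could look like the sketch below. It assumes Digest::MD5's functional `md5_base64`, which returns a 22-character string with no `=` padding; `word_only_id` and `untaint_word_id` are hypothetical names.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_base64);

# Generate a base64 digest, then fold the non-word characters into "_"
# so the whole ID matches \w.
sub word_only_id {
    my $id = md5_base64(join '|', time, $$, rand());
    $id =~ tr|+/=|___|;
    return $id;
}

# Untaint: the ID must be exactly 22 word characters.
sub untaint_word_id {
    my ($raw) = @_;
    return $1 if defined $raw && $raw =~ /\A(\w{22})\z/;
    die "Bad Cart ID\n";
}

print untaint_word_id(word_only_id()), "\n";
```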
                     s aamecha.s a..a\u$&owag.print
Re: Untainting cookies
by kha0z (Scribe) on Apr 11, 2001 at 09:02 UTC
    Another thing to consider, depending on the level of security that you want, is using a simple digest or even the crypt function.

    Additionally, if all you are doing is tracking a user, a unique ID number (such as a 9-digit number similar to a social security number) might do the trick.

    As merlyn stated, the easiest way is to generate characters that are easier to match with a regex. If not, maybe you should revisit your regex and develop one that will "untaint" all base64 characters.

    Good hunting,

Re: Untainting cookies
by Masem (Monsignor) on Apr 11, 2001 at 17:19 UTC
    A somewhat unrelated problem, but as discussed in this node, using the IP address of the remote user is NOT a good way to guarantee unique session information, particularly if your user is behind a firewall or proxy. While it is highly unlikely that two users behind the same proxy will hit your site at the same time (and thus generate the same MD5 key), it could still happen. I would at least add another level of randomness to the key before md5'ing it (e.g. append ".(int rand 10000)" to $remote).
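
As a sketch, that change to the original generator might look like this. It assumes the functional `md5_base64` from Digest::MD5 and treats missing CGI variables as empty strings; `unique_id_with_salt` is a hypothetical name.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_base64);

sub unique_id_with_salt {
    # REMOTE_ADDR/REMOTE_PORT may be shared by users behind one proxy,
    # or absent outside a CGI environment (assumed empty here).
    my $addr = defined $ENV{REMOTE_ADDR} ? $ENV{REMOTE_ADDR} : '';
    my $port = defined $ENV{REMOTE_PORT} ? $ENV{REMOTE_PORT} : '';

    # Extra randomness so two same-second hits from one proxy are
    # less likely to hash to the same ID.
    my $remote = $addr . $port . int(rand 10000);

    my $id = md5_base64(time, $$, $remote);
    $id =~ tr|+/=|-_.|;  # make non-word characters URL friendly
    return $id;
}

print unique_id_with_salt(), "\n";
```

This only lowers the odds of a collision; it does not make the ID hard to guess.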

    update fixed node link

    Dr. Michael K. Neylon - || "You've left the lens cap of your mind on again, Pinky" - The Brain
      I would say that it's even more likely than you might initially suspect. Some large organisations, such as AOL, have been known to send all of their traffic through just a handful of gateways. I've run into this problem a few times.

      Typically, as Masem suggests, I add in some sort of random value and as precise a time value as I care to conjure up, just to even out the randomness a bit. Also, if the script runs on several machines behind a load balancer, I'll use a unique identifier of the machine (host id on Sun, for example) to limit my collision space further. Be creative, but be wary of this problem.
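
One way to combine those ingredients, as a sketch only: it assumes Time::HiRes for a microsecond timestamp and Sys::Hostname's `hostname()` as a stand-in for a machine identifier, and `collision_resistant_id` is a hypothetical name. Note that `rand()` is not a cryptographic source, so this still targets uniqueness rather than unguessability.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);
use Time::HiRes qw(gettimeofday);
use Sys::Hostname qw(hostname);

sub collision_resistant_id {
    my ($sec, $usec) = gettimeofday();   # seconds plus microseconds
    my @client = map { defined $_ ? $_ : '' }
                 @ENV{qw(REMOTE_ADDR REMOTE_PORT)};

    # Machine name, PID, precise time, a random value and the client
    # details all feed the digest.
    return md5_hex(join '|', hostname(), $$, $sec, $usec, rand(), @client);
}

print collision_resistant_id(), "\n";
```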

      In addition, the less formulaic the data that you hash, the less likely it is that someone will be able to hijack a session by computing another user's session identifier.
Re: Untainting cookies
by traveler (Parson) on Apr 11, 2001 at 19:15 UTC
    Forgive me if I'm misunderstanding, but you seem to be trying to untaint and semi-validate the cookie at once. Try untainting (and ignoring what is in the cookie) then validate it against a list of known cookies or (as you're trying to do now) a "syntax".
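
A sketch of that two-step idea, assuming the site keeps some record of the IDs it has issued. The `%issued` hash and the sub name are hypothetical stand-ins for whatever store is really used. The catch-all pattern is only acceptable here because the lookup that follows does the real checking.

```perl
use strict;
use warnings;

# Hypothetical store of IDs this site has already issued; in practice this
# might be a database or session-file lookup.
my %issued = map { $_ => 1 } qw(Xk3-9_aB.q7 0f3c9a);

sub untaint_then_validate {
    my ($raw, $known) = @_;
    die "Bad Cart ID\n" unless defined $raw;

    # Step 1: untaint only. This pattern accepts anything, so on its own
    # it proves nothing about the value.
    my ($id) = $raw =~ /\A(.*)\z/s;

    # Step 2: validate. Only IDs we issued ourselves are accepted.
    die "Bad Cart ID\n" unless exists $known->{$id};
    return $id;
}

print untaint_then_validate('0f3c9a', \%issued), "\n";
```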

    I agree with the others about cookie content, too. Use hex or something easier to deal with than base64. I think base64 is a bit of overkill for what you seem to be doing.


Re: Untainting cookies
by ask (Pilgrim) on Apr 12, 2001 at 11:58 UTC
    Try using Apache::Usertrack, which is a Perl version of mod_unique (and hence makes unique and not so ugly ids).

    - ask
