How safe is truncating an MD5 digest string?
by lemmett (Sexton) on Sep 10, 2001 at 22:20 UTC
lemmett has asked for the wisdom of the Perl Monks concerning the following question:
Let me warn you ahead of time: this is more of an algorithm question than a pure Perl question.
I recently got handed the job of maintaining a web application that registers visitors to a site for sweepstakes and free samples. It uses Apache::Session to store the visitor's information. This is the routine from Session.pm that creates a session id to use as a key.
For a key to be good for this use on the web, I'm primarily worried about two things: is it unique, and how likely is it that a person who has one key can guess another valid one?
The string normally returned from hexhash is 32 chars long. This routine is tossing 16 of those characters right out the window (64 of 128 bits).
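To make the numbers concrete, here is a minimal sketch of that kind of truncation. It is not the module's routine: it assumes Digest::MD5's md5_hex as a stand-in for hexhash, and the seed string is made up for illustration.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical input; the real module builds its own seed.
my $seed = "example-seed";

my $full  = md5_hex($seed);          # 32 hex chars, 4 bits each = 128 bits
my $short = substr($full, 0, 16);    # keep the first 16 chars = 64 bits

printf "full:  %s (%d chars, %d bits)\n", $full,  length($full),  4 * length($full);
printf "short: %s (%d chars, %d bits)\n", $short, length($short), 4 * length($short);
```

Each hex character carries 4 bits, so keeping 16 of the 32 characters keeps exactly half of the digest.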
If I convert the 128-bit quantity to a base64 representation (22 chars) instead, I lose only about 32 bits by truncating to the same 16 characters (the six dropped characters carry 32 bits, because the last base64 character holds only 2 bits of the digest). If I do what seems the logical thing and override the truncation being done by the module, our current database column definitions are too short, and several of the reporting tools have to be examined for bugs (best case) or partially rewritten (worst case). To get management sign-off for that, I need to present a pretty strong case for change.
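A rough sketch of the base64 variant, under the assumption that Digest::MD5's md5_base64 is acceptable and that the column stays 16 characters wide (the seed is again hypothetical):

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_base64);

# Hypothetical input, for illustration only.
my $seed = "example-seed";

my $b64       = md5_base64($seed);       # 22 chars, no "=" padding
my $b64_short = substr($b64, 0, 16);     # fit the existing 16-char column

my $kept_bits = 6 * length($b64_short);  # 6 bits per base64 char
my $lost_bits = 128 - $kept_bits;

printf "base64: %s (%d chars)\n", $b64, length($b64);
printf "kept %d bits, lost %d bits\n", $kept_bits, $lost_bits;
```

One caveat with this approach: the base64 alphabet includes "+" and "/", which may need escaping if the id travels in a URL.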
My questions are:
If I'm doing the math correctly, 28 bits gives me 2^28, or almost 270 million, buckets (very roughly a one-to-one mapping to the 285 million people in the US). I could code that in hex as 7 characters or in base64 as 5 chars (although then users have to get the case of the letters right when they type them in).
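The arithmetic can be checked with a short sketch. It also estimates the chance of at least one collision using the standard birthday approximation, which assumes ids are spread uniformly over the key space; the visitor counts and the collision_probability helper are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;
use POSIX qw(ceil);

my $bits    = 28;
my $buckets = 2 ** $bits;        # number of distinct 28-bit ids
my $hex_len = ceil($bits / 4);   # hex chars needed (4 bits each)
my $b64_len = ceil($bits / 6);   # base64 chars needed (6 bits each)

printf "%d bits: %d buckets, %d hex chars, %d base64 chars\n",
    $bits, $buckets, $hex_len, $b64_len;

# Birthday approximation: P(collision) ~ 1 - exp(-n(n-1) / (2N))
# for n ids drawn uniformly from N = 2**space_bits possibilities.
sub collision_probability {
    my ($n, $space_bits) = @_;
    my $space = 2 ** $space_bits;
    return 1 - exp(-$n * ($n - 1) / (2 * $space));
}

# Hypothetical numbers of registered visitors.
for my $n (10_000, 100_000, 1_000_000) {
    printf "%8d ids: 28 bits -> %.4f, 64 bits -> %.3e\n",
        $n, collision_probability($n, 28), collision_probability($n, 64);
}
```

Under these assumptions a 28-bit key starts colliding long before the number of ids approaches the number of buckets, so bucket count alone understates the uniqueness problem.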
Maybe a better second question would be:
BTW, if this offends you because it's not directly a Perl question, I'd love to take it elsewhere but I don't have access to newsgroups/IRC from work and don't know of a more appropriate place to ask. If you do, please suggest it.