Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Randomizing Unique ID?

by Martin A (Beadle)
on Aug 09, 2000 at 15:24 UTC ( #27006=perlquestion: print w/replies, xml ) Need Help??
Martin A has asked for the wisdom of the Perl Monks concerning the following question:

Last week i asked for the best website login solution here at the perlmonks, and I got alot of good answers. Thanks alot. So then I decided to go for the Unique Session ID solution, But now I have run into another problem.

I would like to get a randomized Unique ID that looks something like this: OZE43c6BXgIAAALlJX4.
(Useing the A-Z, a-z, 0-9 chars)

Which would be the best way of doing this?

// Martin

Replies are listed 'Best First'.
Re: Randomizing Unique ID?
by jettero (Monsignor) on Aug 09, 2000 at 15:36 UTC
    #!/usr/bin/perl use Time::HiRes qw /time/; use Crypt::Blowfish; $unique = &ecry(time); # the_line print "Unique ID: $unique\n"; sub ecry() { my ($e_msg, $m); my $msg = shift; my $block; my $cipher = new Crypt::Blowfish pack("H32", "0123456789abcdef"); $msg = time . "$msg"; while($msg =~ m/(.{1,8})/g) { $m = $1; $m = $m . " " while length($m) < 8; $block = $cipher->encrypt($m); $e_msg = $e_msg . $block; } return unpack("H*", $e_msg); }

    update: tilly suggests using some combination of time and Math::TrulyRandom on the_line (above).

Re: Randomizing Unique ID?
by ar0n (Priest) on Aug 09, 2000 at 16:32 UTC
    Considering the fact of me being lazy, I'd do something like this (just to save typing space):
    use Digest::MD5 qw(md5_hex); use Time::HiRes qw(time); my $id = md5_hex time();

    -- ar0n | Just Another Perl Joe

      In practice, I prepend a string ($msg = "vcmi$msg") before I encrypt. Then, when I'm given one of my unique id's back I can verify that it hasn't been fake'd by decrypting it and checking for the /vcmi/ at the beginning.

      Anyhoo, that's why I use'd blowfish.

RE: Randomizing Unique ID?
by Adam (Vicar) on Aug 09, 2000 at 21:24 UTC
    Any algorithm using "random" numbers risks repition. What you want is a unique number. Something like $$ . time (Thats the process ID concatenated with the epoch.) If you want it to use letters and such, change the base after generating the number. Here is an algorithm for that:
    # For a positive $number my @nums = (0..9,'a'..'z','A'..'Z'); my $base = @nums; my $rep = ""; # this will be the end value. while( $number > 0 ) { $rep = $nums[$number % $base] . $rep; $number = int( $number / $base ); }
    (Thanks Tye for looking over my shoulder and helping construct that simple algorithm)

      Note that this is barely adequate since a "typical" double has 53 bits of mantissa (not counting the sign bit) which can handle just under 16 decimal digits while time() currently returns 9 digits but will soon return 10, while $$ is typically no more than 5 digits (total will soon be 15 digits).

      Doing the conversion to base-62 in two steps would give you lots of room (and different results).

              - tye (but my friends call me "Tye")
        Well, If you want more room in the double, then do the base conversion before doing the concatenation, thusly:
        # Generate a truly unique ID sub GenerateBase { my $base = shift; $base = 62 if $base > 62; my @nums = (0..9,'a'..'z','A'..'Z')[0..$base-1]; return sub { my $number = shift; my $rep = ""; # this will be the end value. while( $number > 0 ) { $rep = $nums[$number % $base] . $rep; $number = int( $number / $base ); } return $rep; } } my $ToBase62 = GenerateBase( 62 ); my $ID = $ToBase62->( $$ ) . $ToBase62->( time );

      However, the risk of repetition is more than outweighed by the new risk of predictability, especially using time and a process ID (yes, some smarter OS's have non-sequential PIDs). I do not think the issue of repetition is much of a risk. In the example the original poster used, the chance of a repetition is about 1 in 10^31. I'll take those odds any day. :)

        The rand() solutions are nearly as predictable, since rand() is deterministic and the latest Perl's seed function is deterministic based on $$, time(), and PL_stack_sp (except on some platforms). That last value, at the time it is checked, is probably often constant across invocations of the same script.

        The odds are much worse than 1/10^31 as the entire sequence from rand() is deterministic based on the seed. Perhaps you meant 1/2^32, which presumes a completely random seed, which you don't have.

        If you want to avoid predictability, then use Math::TrulyRandom (or encrypt using a secret key, if you think you can keep it secret) as others suggested.

        At least Perl has a much better seed generator these days.

        Also, even 1/2^31 would be the odds of a single duplicate. If you have lots of IDs with overlapping lifetimes, then the odds rise surprising fast. If you rarely have more than a few IDs active at once, then 1/2^31 might well be acceptable. But if it is possible, for example, to have 6600 simultaneous IDs, you get about a 1% chance of a collisions per batch of IDs; which probably isn't acceptable.

                - tye (but my friends call me "Tye")
        You can have the best of both worlds. Use a combination of randomness and uniqueness to make the output unique. Something like, (and I am building on the code in RE: RE: RE: Randomizing Unique ID? ):
        my $ID = $ToBase62->( $$ ) . $ToBase62->( rand( $$ ) ) . $ToBase62->( +time );
        Not forgeting to call srand() at some point.
        And of course, you can use whatever rand() you want.

        Update: Don't bother seeding. Tye explains the internal seed here. Also, see the Base Conversion Utility.

Re: Randomizing Unique ID?
by BlaisePascal (Monk) on Aug 09, 2000 at 15:42 UTC
    I dunno, would something like:
    my $id = ""; for (1..18) { $id .= ('A'..'Z','a'..'z','0'..'9')[int(rand(62))];}
    work for you?
      Nice solution. Downsides:
      • This recreates the 62 element list repeatedly in memory, since there's no place to store it.
      • The number "62" is hardcoded, and must match the list exactly. That's a maintenance headache: suppose a requirement came along to ensure that underscore were included, or that digits were not.
      • The int is redundant.
      Implementing these three changes, I'd go with something like:
      BEGIN { my @source = ('A'..'Z', 'a'..'z', '0'..'9'); sub generate_random_password { my $id = ""; for (1..18) { $id .= $source[rand @source]; } $id; } }
      For so-called "pronounceable" passwords, try Crypt::RandPasswd. Looks nice, and based on a government standard.

      -- Randal L. Schwartz, Perl hacker

        About your downsides...

        1) Yup, I can see capturing the 62-element in a closure instead of re-creating it. I didn't think of that when I was coding off the top of my head.

        2) Nice catch

        3) Oops... I missed that. I looked in the library to verify the result of rand() and it said it returns a float, not an int. But I guess the [ takes care of that...

        Since this is supposed to be a "UniqueID" and not a password, using "pronounceable" passwords are probably not necessary.

        Why put it in a BEGIN block? Doesn't seem necessary to me.

Re: Randomizing Unique ID?
by turnstep (Parson) on Aug 09, 2000 at 16:35 UTC

    Here is what I use:

    my @letters = ('A'..'Z','a'..'z',0..9,qw(! # $ % ^ | _)); my $UID = join("", @letters[map {rand @letters} (1..8)]);
    I don't use the characters @ or & because it is sometimes passed through a cgi and I don't want to make things more confusing than they have to be. By the way, 8 characters should be plenty, as there are about 69^8 possible combinations. Or without the funky characters, 62^8, which is 218340105584896 possible unique IDs. Or get real paranoid and bump it up to 16 characters. :)

Re: Randomizing Unique ID?
by Martin A (Beadle) on Aug 09, 2000 at 19:57 UTC
    Thanx everybody.
    I Think I'll have a go at turnstep's or BlaisePascal's solution. Not that I don't think that the others are good to, but the thing is is that the sucky server that i use doesn't allow me to use any modules. :(

    But if there is any other monks with other (non module based) solutions, I am still open for new ideas.

    // Martin

      Note that both of these have a risk of collision that cannot be reduced below 1/2^(number of actually random bits in the seed), no matter how many characters you decide to use in the session ID. This is perhaps a small risk in this particular case, but I thought the general problem should be mentioned.

      On old versions of Perl, srand() isn't even called so that each solution will always produce the exact same ID.

      Adam's is a better idea and nearly as simple. It only requires about 5 characters of ID. You can't have a collision unless 32000 processes are created within a single second on your server. If you have multiple computers doing sophisticated load balancing of web requests such that IDs need to be unique across machines, then you'll need to append some characters that are unique to a machine (what to use depends on how your load balancing is done).

              - tye (but my friends call me "Tye")

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://27006]
Approved by root
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others lurking in the Monastery: (6)
As of 2017-06-28 07:48 GMT
Find Nodes?
    Voting Booth?
    How many monitors do you use while coding?

    Results (628 votes). Check out past polls.