PerlMonks  

Custom-length random/unique string generator

by RecursionBane (Sexton)
on Feb 02, 2013 at 17:15 UTC (#1016725=perlquestion)
RecursionBane has asked for the wisdom of the Perl Monks concerning the following question:

Greetings, Monks!
I would like your opinion on any weaknesses in this (sufficiently random) random string generator in order to build universally unique identifiers (UUIDs). These UUIDs are used only on the local network to identify frequent build processes, where each build should be individually addressable.
I understand that there are several CPAN modules claiming to provide near-random numbers and even UUIDs, but I'm looking for an alphanumeric string, so I went with a simple, custom approach.
sub generate_unique_id {
    # Creates and returns a "random" string of 56 alphanumeric
    # Usage: generate_unique_id()
    # This includes [a-z] [A-Z] [0-9], a range of 62 characters
    # The random number generator is seeded elsewhere with a permutation of this child's process ID and current machine time
    my $buildID = md5_base64 ( rand($$) ) . md5_base64 ( rand($$) ) . md5_base64 ( rand($$) );
    # Strip out "+" and "/"
    $buildID =~ s/\+//g;
    $buildID =~ s/\///g;
    # Truncate to fifty six characters (this is 2.4e+100 possible combinations, or approximately two googol)
    # It would take 187 sesvigintillion (187e+81) years at 4 billion builds/second for a conflict to arise, which makes it a fairly unique buildID
    return substr ($buildID, -56 );
}
Your suggestions for code improvement are most welcome!
Sincerely,
~RecursionBane

Re: Custom-length random/unique string generator
by davido (Archbishop) on Feb 02, 2013 at 17:30 UTC

    Use Bytes::Random::Secure. Seeding is superior, the CSPRNG is ISAAC, and its random_string_from function does exactly what you want, with very few dependencies.

    use Bytes::Random::Secure qw( random_string_from );
    my $string = random_string_from( join( '', ( 'a' .. 'z' ), ( 'A' .. 'Z' ), ( '0' .. '9' ) ), 56 );
    print $string, "\n";
    # Done; no md5 bias, no modulo bias, strong seeding, strong CSPRNG.

    Update: I didn't have time to elaborate earlier. But the point here is that seeding correctly is hard. Generating strong pseudo-randomness is hard. But this is a problem that has been solved already (on CPAN), with a good deal of research, and collaboration. And to get well seeded, high quality random bytes, you need one module, which has exactly three non-core dependencies in its heritage, if you exclude what Test::Warn drags along with it. ...and it works portably across many platforms, and back through Perl 5.8. In some cases even 5.6.

    As others have mentioned there are flaws in the seeding you're using. And an MD5 RNG is less than ideal.
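To illustrate what "no modulo bias" means in the reply above, here is a minimal core-only sketch. It is not how Bytes::Random::Secure works internally; it assumes a Unix-like system where /dev/urandom is readable, and `random_alnum` is a hypothetical helper name. Bytes that fall outside the largest multiple of 62 are rejected, so every character is equally likely.

```perl
use strict;
use warnings;

# Hypothetical helper (not from any module): draw $length characters
# uniformly from [a-zA-Z0-9] using bytes from /dev/urandom.
# Assumption: /dev/urandom exists and is readable (Unix-like systems).
sub random_alnum {
    my ($length) = @_;
    my @alphabet = ( 'a' .. 'z', 'A' .. 'Z', '0' .. '9' );    # 62 characters

    # Largest multiple of 62 that fits in one byte: 256 - (256 % 62) = 248.
    # Bytes >= 248 are discarded so that no character is favoured.
    my $limit = 256 - ( 256 % @alphabet );

    open my $fh, '<:raw', '/dev/urandom'
        or die "Cannot open /dev/urandom: $!";

    my $out = '';
    while ( length($out) < $length ) {
        read( $fh, my $byte, 1 ) == 1
            or die "Short read from /dev/urandom";
        my $n = ord $byte;
        next if $n >= $limit;                 # rejection step
        $out .= $alphabet[ $n % @alphabet ];
    }
    close $fh;
    return $out;
}

print random_alnum(56), "\n";
```

For real use, the module recommended above is still the better choice: it handles seeding and portability, which this sketch does not.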


    Dave

Re: Custom-length random/unique string generator
by Athanasius (Monsignor) on Feb 02, 2013 at 17:43 UTC

    A minor point:

    From the Camel Book (4th Edition, 2012, page 695), under the heading “Time Efficiency”:

    • If you’re deleting characters, tr/abc//d is faster than s/[abc]//g.

    So, if speed is a limiting factor, consider replacing:

    # Strip out "+" and "/"
    $buildID =~ s/\+//g;
    $buildID =~ s/\///g;

    with:

    $buildID =~ tr{+/}{}d;
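A small sketch to check that the two forms really are interchangeable and to measure the difference on your own machine. The sample string and the helper names `strip_with_s` and `strip_with_tr` are made up for illustration; Benchmark is a core module, and the timings it prints will vary by system.

```perl
use strict;
use warnings;
use Benchmark qw( cmpthese );

# Made-up sample input containing both characters to be removed.
my $sample = 'ab+cd/ef+/gh' x 6;

# The two-substitution form from the original post.
sub strip_with_s {
    my ($s) = @_;
    $s =~ s/\+//g;
    $s =~ s/\///g;
    return $s;
}

# A single transliteration with the /d (delete) modifier.
sub strip_with_tr {
    my ($s) = @_;
    $s =~ tr{+/}{}d;
    return $s;
}

# Both approaches must produce the same string.
die "results differ"
    unless strip_with_s($sample) eq strip_with_tr($sample);

# Run each variant for at least one CPU second and print a comparison table.
cmpthese( -1, {
    's///g' => sub { strip_with_s($sample) },
    'tr///d' => sub { strip_with_tr($sample) },
} );
```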

    Hope that helps,

    Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re: Custom-length random/unique string generator
by martell (Friar) on Feb 02, 2013 at 18:13 UTC

    Hi

    If you need a readable UUID, simply use Data::UUID like this: my $uuid = Data::UUID->new()->create_str(); This will create a UUID that looks like 4162F712-1DD2-11B2-B17E-C09EFE1DC403. Apart from the hyphens, this is an alphanumeric string.

    Kind Regards

    Martell

Re: Custom-length random/unique string generator
by BrowserUk (Pope) on Feb 02, 2013 at 19:33 UTC
    my $buildID = md5_base64 ( rand($$) ) . md5_base64 ( rand($$) ) . md5_base64 ( rand($$) );

    Let's see. On most *nix's, max_pid is 32768. That means that at most, you have 32768**4 = 1,152,921,504,606,846,976 inputs.

    But, if your process happens to get allocated pid=100; then you only have 100**4 = 100,000,000 possibles.

    The random number generator is seeded elsewhere with a permutation of this child's process ID and current machine time

    Again, pids ranging from ~ 2 .. 32768. Time() during the working day, say 8.00am to 6pm: 29,000 .. 64,000.

    You don't say how they are combined, but typically people use something like srand( $$ ^ time() ), and that produces a range of numbers far smaller than they intuitively expect.
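A rough sketch of why XOR-combining a pid and a timestamp yields fewer seeds than expected. To keep the brute-force loop fast it uses scaled-down, assumed values (8-bit pids and a 2,000-second window rather than 15-bit pids and a working day); `count_distinct_seeds` is a hypothetical helper. The XOR only scrambles the low bits of the timestamp, so many (pid, time) pairs land on the same seed.

```perl
use strict;
use warnings;

# Hypothetical helper: brute-force count of the distinct values that
# ( $pid ^ $time ) can take for pids of $pid_bits bits and a window of
# $window consecutive seconds starting at $t_start.
sub count_distinct_seeds {
    my ( $pid_bits, $t_start, $window ) = @_;
    my %seen;
    for my $t ( $t_start .. $t_start + $window - 1 ) {
        for my $pid ( 0 .. ( 1 << $pid_bits ) - 1 ) {
            $seen{ $pid ^ $t } = 1;
        }
    }
    return scalar keys %seen;
}

# Assumed, scaled-down parameters chosen only so the loop finishes quickly.
my ( $pid_bits, $t_start, $window ) = ( 8, 1_000_000, 2_000 );

my $pairs    = ( 1 << $pid_bits ) * $window;
my $distinct = count_distinct_seeds( $pid_bits, $t_start, $window );

printf "%d (pid, time) pairs collapse to %d distinct seeds\n",
    $pairs, $distinct;
```

With these parameters the 512,000 pairs collapse to 2,304 seeds: the window touches nine 256-wide blocks of timestamps, and within a block the XOR can only reach the same 256 values.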

    this is 2.4e+100 possible combinations,

    I think that is way optimistic.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Custom-length random/unique string generator
by RichardK (Priest) on Feb 02, 2013 at 19:42 UTC

    Using the process id for your random number doesn't look like a good idea.

    PIDs get reused so they're not unique and can be a very small value.

    It's a bit better to make a string from the date time plus other info then hash that.

    Or as other people have said, it's a lot better to use an existing module :)
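One way to read the "date time plus other info, then hash" suggestion is sketched below, using core modules only. The choice of inputs (hostname, pid, microsecond timestamp, a per-process counter) and the name `hashed_build_id` are assumptions for illustration. The result is not random: its uniqueness depends entirely on the inputs differing between builds, and it is made of hex digits rather than the full 62-character range.

```perl
use strict;
use warnings;
use Digest::SHA   qw( sha256_hex );
use Sys::Hostname qw( hostname );
use Time::HiRes   qw( gettimeofday );

# Per-process sequence number, so two calls in the same microsecond
# still hash different input.
my $counter = 0;

# Hypothetical helper: returns a 64-character lowercase hex identifier.
sub hashed_build_id {
    my ( $sec, $usec ) = gettimeofday();
    my $material = join '|', hostname(), $$, $sec, $usec, $counter++;
    return sha256_hex($material);
}

print hashed_build_id(), "\n";
```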

Re: Custom-length random/unique string generator
by RecursionBane (Sexton) on Mar 21, 2013 at 19:02 UTC
    Thank you for all your responses! It looks like using an existing module might be the way to go.
    Again, thanks, everyone!
