http://www.perlmonks.org?node_id=218821

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Would the following code produce an Id equvialnt to M$'s GUID and/or something that would be totally safe to use in a SQL insert?
#!/usr/local/bin/perl use MD5; use Time::HiRes qw /time/; my $UniqueId = substr(MD5->hexhash(time(). {}. rand(). $$. 'SomeText') +, 0, 32); $UniqueId=~tr/[a-z]/[A-Z]/; $UniqueId=~s/^((\d|\w){8})((\d|\w){4})((\d|\w){4})((\d|\w){4})((\d|\w) +*)$/\{$1\-$3\-$5\-$7\-$9\}/; print $UniqueId; exit(0);

Replies are listed 'Best First'.
Re: Safe for SQL
by valdez (Monsignor) on Dec 10, 2002 at 17:08 UTC

    If your script runs under Apache, then I would use $ENV{UNIQUE_ID}. Otherwise Data::UUID should do what you need. Here is an excerpt from its man page:

    This module provides a framework for generating UUIDs (Universally Unique Identifiers, also known as GUIDs (Globally Unique Identifiers). A UUID is 128 bits long, and is guaranteed to be different from all other UUIDs/GUIDs generated until 3400 A.D. UUIDs were originally used in the Network Computing System (NCS) and later in the Open Software Foundation's (OSF) Distributed Computing Environment. Currently many different technologies rely on UUIDs to provide unique identity for various software components, Microsoft COM/DCOM for instance, uses GUIDs very extensively to uniquely identify classes, applications and components across network-connected systems.

    HTH, Valerio

Re: Safe for SQL
by John M. Dlugosz (Monsignor) on Dec 10, 2002 at 17:05 UTC
    No, it's not equvialnt to either MS's GUID or the original UUID features. That has provisions for setting the clock back (e.g. daylight-savings time or corrections if you notice it's fast) while you could get the same time; The process ID will be the same for the life of the program and could certainly be the same on different machines running at the same time; a UUID includes the machine's MAC address and following the algorithm will always produce unique results.

    How random is rand()? Assuming it's 32 bits, that's the majority of your effect. The $$ will protect against multiple processes calling this at the same time on the same box, if Time::HiRes didn't already do that (e.g. multiple processors).

    I think it's a little better than a random number alone, but not a sure thing. It's longer than it needs to be for the amount of uniqueness it offers. I'm against formatting it like a GUID, since I like to think that if it looks like a GUID it must have been generated using the common algorithm.

    You can find the source code for UUID's in the DCE code.

    —John

Re: Safe for SQL
by hiseldl (Priest) on Dec 10, 2002 at 17:08 UTC

    You could use the MD5->hexdigest() as well:

    use strict; use Digest::MD5; sub generate_id { my $md5 = new Digest::MD5(); $md5->add($$ , time() , rand(9999) ); return $md5->hexdigest(); }

    On my machine, this produces a 32 byte string which appears to be what you're looking for. BTW, this method is used in CGI::Session to create session ID's so it should be relatively safe to use in a SQL insert.

    --
    hiseldl
    What time is it? It's Camel Time!

      It returns a 32 byte string, but you only have 10 bytes of entropy, 2 from the process id, 4 from the time, and 4 from the random number generator. Maybe this is enough, maybe it isn't. There's certainly no garantee it will create unique numbers.

      Abigail

        You are absolutely correct, from RFC 1321:

          ...It is conjectured that it is computationally infeasible to produce two messages having the same message digest, or to produce any message having a given prespecified target message digest...

        So, just using the MD5 algorithm means that there is no gaurantee of a unique message digest. ;-)

        For something such as a session id for a web script, this is usually sufficient. To resolve your argument, you could add more text until you are reasonably certain that your entropy message is not reproducible:

        ... while(<LINES_OF_TEXT>) { $md5->add($_); } ...
        ...the downside is that it takes longer to compute.

        One way to gaurantee that the number is unique for that 'insert' statement is to have that database generate the number either from a sequence or stored procedure.

        I was just trying to keep it simple and sufficient. :-)

        --
        hiseldl
        What time is it? It's Camel Time!

Re: Safe for SQL
by perlplexer (Hermit) on Dec 10, 2002 at 16:49 UTC
    Why not have a sequence set up together with a trigger to automatically populate the "unique id"?

    --perlplexer
      Well, explaining it all would be too much, but the real short version s that my script wouldn't do any inserting or have access to SQL or anything, but would send information from a web script (including this unique Id) to an(other) programmer and Im required to send a proper Id so he can insert with it. Sounds silly to me, but thats what Im asked to do, so...
        <opinion>

        why doesn't your DBA or whoever is in charge of the DB and tables.. just set up an identity column. that way, each time a record gets inserted - it will have a unique ID.

        </opinion>

        it just may be a bit easier for you and them.
Re: Safe for SQL
by Anonymous Monk on Dec 10, 2002 at 18:48 UTC

    I found it easier to use MS SQL's generated GUID using the following:

    use DBI; use DBD::ODBC; my $dsn = "driver={SQL Server};Server=$dbserver;database=$database;uid +=$username;pwd=$password;"; my $cnn = DBConnect($dsn); my $SQL = "SELECT NEWID()"; my $rs_guid = $cnn->prepare("$SQL"); $rs_guid->execute;

    Get the actual GUID from the obtained $rs_guid and the guid will be inserted back to SQL without any problem


    Regards,
    Nathan

      This brings up something I've been pondering for a while with regards to the predictability of using UUIDs.

      Back in early versions of MSSQL7, the newid() function would return predictable results when called consecutively. It appears that Data::UUID also does the same.

      Here's a quick run using create/to_string:

      #!/usr/bin/perl -w -T use Data::UUID; use strict; my $uuidgen = Data::UUID->new; for (1..5) { my $uuid = $uuidgen->create(); print $uuidgen->to_string($uuid) . "\n"; };
      yields
      1D888068-1DD2-11B2-AD1E-F3DCB6356242 1D8899D6-1DD2-11B2-AD1E-F3DCB6356242 1D88A142-1DD2-11B2-AD1E-F3DCB6356242 1D88A5F2-1DD2-11B2-AD1E-F3DCB6356242 1D88A994-1DD2-11B2-AD1E-F3DCB6356242

      I usually use UUID to guarantee uniqueness, and sometimes in things like ecommerce, you don't want these things to be predictable in a basic effort to prevent end user guessing games. Yes, I know securiyy is much larger than that, but preventing predictable sequential keys for some things help. :-P

      If I know the first line above, it's pretty easy to start generating the next 4 through trial and error.

      Somewhere along the line, MSSQL7 SP3+ and MSSQL 2K made the output non predictable when running a series

      select newid(), newid(), newid(), newid(), newid()
      yields
      F52C627C-E150-44AB-8CE9-4A4CBFF2C8EE 5E3190A5-670F-491A-A9E4-18E15B0129FD 227533E0-65A2-4668-B7F0-97F7BCC8625F 7C85C050-4956-4E1E-BFFB-2FE6E6378267 AAF3F967-30C3-418B-BD6F-2CC8B8440DD9

      So, what changed I wonder, and how can I make Data::UUID behave more like MSSQL2K when generating consecutive uuids?

      I tried feeding the previous uuid into the namespace/name of the new uuid during creation and it appears to work.

      #!/usr/bin/perl -w -T use Data::UUID (NameSpace_X500); use strict; my $uuidgen = Data::UUID->new; # Get first random seed uuid my $uuid = $uuidgen->create(); my $prevuuid = $uuidgen->to_string($uuid); for (1..5) { my $uuid = $uuidgen->create_from_name(NameSpace_X500, $prevuuid); $prevuuid = $uuidgen->to_string($uuid); print $prevuuid . "\n"; };
      yields
      997737A7-8390-360A-A356-92B94CC4F6BC 3CA5C37D-B2B7-343F-B1F5-A0F3DAED46F2 21CBBF52-8C16-3AFE-9F40-DCD60825842D 741F2107-D4C1-322C-AEFE-638F63502407 1D47E49F-7568-30D2-9E27-5AF90D407F58
      Certainly less predictable.

      So, the question becomes: have I sacrificed any randomness or uniqueness by doing this?

      -=Chris

Re: Safe for SQL
by Anonymous Monk on Dec 10, 2002 at 17:15 UTC
    Ive examined the Data::UUID, and it looks like it would solve everything. Everything except the fact that no modules will be installed to acomplish this. $%^&!@^#*.
      Everything except the fact that no modules will be installed to acomplish this.
      Not using modules is like being restricted to only odd numbers. Please explain to the powers-that-be that much of the productivity of Perl comes from using modules, especially from the CPAN. Would they expect you to use C without the STDIO library, or Java without the standard classes?

      At worst, grab the source code of that module, and include it in your program.

      There's absolutely no excuse for "no modules". Please stop bringing that up.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

Re: Safe for SQL
by Rudif (Hermit) on Dec 12, 2002 at 00:45 UTC
    On Win* machines, this will work <code> #!perl use strict; use warnings; use Win32; for (0..4) { printf "%s\n", Win32::GuidGen(); }
    Rudif