PerlMonks  

Re: Generate a unique ID

by DrHyde (Prior)
on Nov 15, 2010 at 10:31 UTC (#871435)


in reply to Generate a unique ID

Why avoid UUIDs, when they do exactly what you've asked for and someone else has already written and tested both the algorithm and the code? Data::UUID is the correct answer, unless you have some other constraint that you're not telling us about.


Re^2: Generate a unique ID
by BrowserUk (Pope) on Nov 15, 2010 at 13:20 UTC

    The basic one is that with any mechanism that has random numbers at its core, there is no guarantee of uniqueness, despite what the Data::UUID POD says. (If you look up the RFC, it uses the phrase: "is either guaranteed to be different from all other UUIDs/GUIDs generated until 3400 A.D. or extremely likely to be different".)

    The probability of collision might be very small, but it still exists. It would therefore be necessary to record enough information to disambiguate between chance collisions. And that means recording all the information that went into generating the number in the first place. As I wouldn't have access to all the information, that isn't possible.
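
    To put a rough number on "very small", here is a minimal sketch using the standard birthday-bound approximation. It assumes every bit of the ID is uniformly random, which is not how a time-based UUID is built, so treat it only as an illustration; collision_probability is a name made up for this example.

```perl
use strict;
use warnings;

# Birthday-bound approximation: probability that at least two of $n IDs
# collide when each ID carries $bits uniformly random bits.
#   p ~= 1 - exp( -n*(n-1) / (2 * 2**bits) )
# Hypothetical helper for illustration only.
sub collision_probability {
    my ($n, $bits) = @_;
    my $space = 2 ** $bits;
    my $x     = $n * ($n - 1) / (2 * $space);
    # For tiny $x, 1 - exp(-$x) loses all precision in floating point,
    # so fall back to the first-order term, which is $x itself.
    return $x < 1e-9 ? $x : 1 - exp(-$x);
}

# [ number of IDs, random bits per ID ]
printf "%d IDs, %d random bits: p ~ %.3g\n",
    $_->[0], $_->[1], collision_probability(@$_)
    for [1e5, 32], [1e5, 64], [1e9, 122];
```

    The probability never reaches zero for more than one ID, which is the point: small is not the same as guaranteed.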

    A second reason is that the timestamping mechanisms used by Data::UUID are broken.

    1. On Windows: it uses the QueryPerformanceCounter() API to get a timestamp to the required 100 nanosecond resolution.

      This high-resolution elapsed-time counter is known to drift with respect to the system clock, but the module makes no attempt at correction.

      See Time::HiRes for its correction mechanism.

    2. On other systems: it uses GetSystemTime(2).

      But that API is limited to microsecond resolution, so it just multiplies by 10.

    While the spec calls for a once-only randomly initialised, thereafter monotonically increasing, sequence component, I can see no provision for storing/retrieving this on Windows systems.

    The output of the true_random() function is suspect on systems where rand() is limited to 15 bits.

    As my needs are limited to a single system, using a deterministic value with a self-consistent, high resolution time component will suffice.
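
    As a rough sketch of what such a deterministic, single-system value could look like (not the code actually used; next_id and its field layout are invented for this example): combine a high-resolution timestamp from Time::HiRes with the process ID, a caller-supplied thread ID, and a sequence counter. It assumes the clock is not stepped backwards between runs and that the caller passes a distinct thread ID per thread.

```perl
use strict;
use warnings;
use Time::HiRes qw(gettimeofday);

# Hypothetical helper, not part of any module.
# Within one thread the counter never repeats; across threads the thread ID
# differs; across concurrent processes the PID differs; across restarts the
# timestamp differs (assuming the clock does not go backwards).
{
    my $counter = 0;    # under ithreads each thread gets its own copy

    sub next_id {
        my ($tid) = @_;
        $tid = 0 unless defined $tid;            # e.g. threads->tid() if threaded
        my ($sec, $usec) = gettimeofday();       # seconds + microseconds
        return sprintf '%010d%06d-%05d-%03d-%08d',
            $sec, $usec, $$, $tid, $counter++;
    }
}

print next_id(), "\n" for 1 .. 3;
```

    Nothing random is involved, so there is no collision probability to reason about; the price is that the value is only meaningful on the machine that generated it.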


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      I would say that your chance of a collision is very, very, very small, even on Windows.

      I wrote a benchmark once (using Benchmark) that used GUID and UUID (from UUID, not Data::UUID) to generate session keys for me. I ran it through 100k iterations on my Windows box; it ran fast (1.1 seconds) and produced 0 collisions. I even had a variant that would cut the result down to 16 characters (substr(0, 16)) to save storage space: 0 collisions in 1.1 seconds.

      I just cut it further to 8 characters and it still produced no collisions.

      Sure, installing UUID is a pain on Windows, and the code for using it is ugly, but I don't see a reason not to use it.
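
      A check of this kind can be sketched roughly as follows. This is not the original benchmark: count_collisions and both generators are stand-ins invented for the example, and neither generator calls a UUID module. The harness only counts how often a generator returns a string it has already returned.

```perl
use strict;
use warnings;

# Counts how many generated IDs duplicate an earlier one.
# $gen is any code ref returning an ID string; $n is the number of draws.
sub count_collisions {
    my ($gen, $n) = @_;
    my %seen;
    my $dups = 0;
    for (1 .. $n) {
        $dups++ if $seen{ $gen->() }++;
    }
    return $dups;
}

# Stand-in generators (assumptions, not the output of any UUID module):
my $i = 0;
my $sequential = sub { sprintf '%08x', $i++ };                # counter-like 8 hex chars
my $random16   = sub { sprintf '%04x', int(rand(0x10000)) };  # only 16 random bits

printf "sequential: %d collisions\n", count_collisions($sequential, 100_000);
printf "random16:   %d collisions\n", count_collisions($random16,   100_000);
```

      The counter-like generator cannot collide in a single sequential run, while the 16-bit random one must, since 100,000 draws cannot fit into 65,536 distinct values. A zero from a single process therefore says more about how the leading characters are built than about uniqueness in general.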

        Thanks. I hadn't seen UUID. But I don't see that installing it is possible on Windows?

        As for your testing, if you generate them sequentially, there is probably little danger of collision, but what happens if two (or more) concurrent processes or threads call it at the same time?

        My program is threaded, and different threads will be generating their own spill files concurrently, so this is a significant possibility. And most "UUID" implementations simulate the "clock sequence" value of the spec, which is meant to be initialised once per system/NIC change, using per-process globals derived from the same time source they use for the time component, which negates its purpose.


