http://www.perlmonks.org?node_id=158689

in reply to Rolling a biased die

While IO's solution is really cool, and really simple, it's run time is O(n) (where n = the number of distinct die values). More specificly it is EXACTLY n (i don't remember what that's called ... Theta? Omega?) -- it can never do better, because it allways iterates over all the possible die values. If you are rolling the same die over and over and over that may not be good enough for you (you asked for O(1) ... let's see if we can get there.)

abell's approach is a lot closer to a true O(1) solution. It requires some initialization cost and more space to build a lookup table -- but depending on the how you rank memory/speed it may be worth it. (especially if all of your frequencies are 'relatively' close, requiring you to have 'relatively' few buckets for each value after you've multipled the frequencies by the neccessary factor to get them above 1. The problem is: It's still not truely O(1). For a lot of distrobutions, it will be really close, but you still have to pay an extra cost to resolve th conflicts on all the transition points -- it may seem like that would be inconsequential to the run time, but it really depends on what is important to you. (Example: If you roll the die m times, and your frequencies are such that 1 roll out every k requires a transition point, and picking the riht number at a transtion requires t time, then the run time is O(1 + t(m/k)) ... even if t is 1 (constant time) that's breaks down to O(m/k) which is probably worse then O(n))

(UPDATE: abell made me see the light, his way is O(1) ... and requires a lot less space then the one below, even if the frequencies require that (almost) every bucket be a transition point.)

The only approach I know of that's truely O(1) for every lookup requires that all of the frequency values be integers, such that you can build an array (similiar to the one abell suggested) where each die value has the same number of buckets as it's frequency, and you just pick a random integer value less then the size of the array to use as an index for looking up your die value. This will allways be a true O(1) lookup time cost, but the memory costs can be seriously prohibitive. In MeowChow's orriginal six sided example, converting the frequencies to interger values requires an array of 1322822219 (31 + 20234 + 17 + 1542232 + 1321249563 + 10142) buckets. This appraoch is hardless worth using unless the frequencies are all small integer values and n is "big but not too big" (ie: n ~ 100, and most of the frequencies are 1 with a few 2s 3s, and other small integers)

All in all, IO's solution is the best when you have no idea what n will be, or what the distrobution of frequencies will be. BUT! ... If you know at coding time what n & the frequencies are, and n is relatively small, you can probably get the fastest possible lookup time by hardcodding in a 'switch statement'. If the frequencies are heavily skewed towards a few values, this should definitely give you a performance boost for the "common roll".

Replies are listed 'Best First'.
Re: Re: Rolling a biased die
by abell (Chaplain) on Apr 13, 2002 at 09:08 UTC
The problem is: It's still not truely O(1). For a lot of distrobutions, it will be really close, but you still have to pay an extra cost to resolve th conflicts on all the transition points -- it may seem like that would be inconsequential to the run time, but it really depends on what is important to you. (Example: If you roll the die m times, and your frequencies are such that 1 roll out every k requires a transition point, and picking the riht number at a transtion requires t time, then the run time is O(1 + t(m/k)) ... even if t is 1 (constant time) that's breaks down to O(m/k) which is probably worse then O(n))
Actually, O(1)=O(k) for any constant k, because the O(*) expression "eats" all constant multiplicative factors. Furthermore, you can't put the number of tries in the complexity count and expect it to remain constant. Even if a single computation requires just one CPU cicle, the complexity for computing m is O(m). I believe MeowChow asked for a solution with constant execution time regardless of the number of faces of the die and not of the number of rolls.

Best regards

Antonio

My bad ... you're totally right.

I don't know what i was thinking when i did that calculation. Even if every bucket is a transition point, resolving the transtion only requires that you pick between the two possible (weighted) values for that bucket.

O(1 + k) = O(1)