in reply to Rolling a biased die

abell's approach is a lot closer to a true O(1) solution. It
requires some initialization cost and more space to build a lookup
table -- but depending on the how you rank memory/speed it may be
worth it. (especially if all of your frequencies are 'relatively' close,
requiring you to have 'relatively' few buckets for each value after
you've multipled the frequencies by the neccessary factor to get them
above 1. ~~The problem is: It's still not truely O(1). For a lot of
distrobutions, it will be really close, but you still have to pay an
extra cost to resolve th conflicts on all the transition points -- it may
seem like that would be inconsequential to the run time, but it really
depends on what is important to you. (Example: If you roll the die m
times, and your frequencies are such that 1 roll out every k requires a
transition point, and picking the riht number at a transtion requires
t time, then the run time is O(1 + t(m/k)) ... even if t is 1
(constant time) that's breaks down to O(m/k) which is probably worse
then O(n))~~

(**UPDATE:** abell made me see the light, his way is
O(1) ... and requires a lot less space then the one
below, even if the frequencies require that (almost) every
bucket be a transition point.)

The only approach I know of that's truely O(1) for every lookup requires that all of the frequency values be integers, such that you can build an array (similiar to the one abell suggested) where each die value has the same number of buckets as it's frequency, and you just pick a random integer value less then the size of the array to use as an index for looking up your die value. This will allways be a true O(1) lookup time cost, but the memory costs can be seriously prohibitive. In MeowChow's orriginal six sided example, converting the frequencies to interger values requires an array of 1322822219 (31 + 20234 + 17 + 1542232 + 1321249563 + 10142) buckets. This appraoch is hardless worth using unless the frequencies are all small integer values and n is "big but not too big" (ie: n ~ 100, and most of the frequencies are 1 with a few 2s 3s, and other small integers)

All in all, IO's solution is the best when you have no idea what n will be, or what the distrobution of frequencies will be. BUT! ... If you know at coding time what n & the frequencies are, and n is relatively small, you can probably get the fastest possible lookup time by hardcodding in a 'switch statement'. If the frequencies are heavily skewed towards a few values, this should definitely give you a performance boost for the "common roll".