in reply to Drawing samples from a set
I'm not a math major, but I don't really see an alternative if your probabilities are not predictable in advance. If they are, you can tackle an algorithm without going through the list, but you imply this is not the case.
Here is one idea though. This is off the top of my head, doubtless it needs some refinement:
If your probabilities don't change often, you could pre-sort the list into "chunks" of recorded sizes (but near a target, frex 0.1) which would make your select time near constant to select the correct chunk, then linear inside that chunk (where you have notably fewer items). If your probabilities change often though, this sorting may not be worthwhile.
Example: a b c d e f g h i j with probabilities 0.02, 0.05, 0.20, 0.11, 0.1, 0.14, 0.08, 0.08, 0.09, 0.11)
We can sort this into a data structure like:
So the end result is that you can examine $sorted->[int($r*10)], move up down inside sorted (usually no more than one element, depending on the differences between your probablities), then go through the subset in that chunk linearly.[ { Total => 0.07, Set => [ { a=> 0.02, b => 0.07}] }, { Total => 0.27, Set => [ { c => 0.20}] }, { Total => 0.27, #repeat since it span's two "subsets" Set => [ { c=> 0.20 }] }, { Total => 0.38, Set => [ { d => 0.11 } ] }, #etc ]
But again, this sorting will be much more expensive than a single linear search, so you'll have to decide if it's worth it.
|
---|
Replies are listed 'Best First'. | |
---|---|
Re: Re: Drawing samples from a set
by John M. Dlugosz (Monsignor) on Jan 28, 2003 at 17:10 UTC |