dthacker has asked for the wisdom of the Perl Monks concerning the following question:

I'm not lazy enough. Consider the following case. I'm working on a soccer simulation that assigns each player a score for each of 4 attributes. For example, midfielders will have a passing attribute between 6 and 21, a shooting attribute between 6 and 12, and so on. I want to tweak the distribution to a "bell-shaped curve" so that most of the players are assigned a score near the midpoint of the range, and fewer are assigned a high or low score. My solution was to create a file with 100 values, read the file into an array, generate a random number, and assign the value in $array[$rand] to a player.

There must be an easier and more perlish way to do this, but I'm having trouble visualizing a data structure that would do it. Any suggestions?

Code On!

Edit by BazB, fix sig div tag.

Replies are listed 'Best First'.
Re: Efficient ways of storing a data set for random access
by Zaxo (Archbishop) on Jan 29, 2004 at 03:45 UTC

    You can get random numbers that follow a given distribution from Math::Random. The Math::Random::random_normal function is exported by default. It takes up to three arguments: the number of samples to generate, their mean, and their standard deviation. In scalar context it generates a single sample.
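A minimal sketch of how random_normal might be used for the 6-21 passing attribute. The helper names (normal_sample, attribute_score), the choice of standard deviation (one sixth of the range), and the rounding and clamping are assumptions for illustration, not something this reply specifies. Math::Random is a CPAN module rather than a core one, so the sketch falls back to a hand-rolled Box-Muller transform when it is not installed; that fallback is also an addition.

```perl
use strict;
use warnings;

# Math::Random comes from CPAN; load it only if it is available.
my $have_math_random = eval { require Math::Random; 1 };

# Return one normally distributed sample with the given mean and
# standard deviation.
sub normal_sample {
    my ($mean, $sd) = @_;
    if ($have_math_random) {
        # Scalar context: random_normal returns a single deviate.
        my $x = Math::Random::random_normal(1, $mean, $sd);
        return $x;
    }
    # Fallback (assumption): Box-Muller transform on two uniform samples.
    my $pi = 4 * atan2(1, 1);
    my $z  = sqrt(-2 * log(1 - rand())) * cos(2 * $pi * rand());
    return $mean + $sd * $z;
}

# Hypothetical helper: an integer score in [$low, $high], centred on
# the midpoint of the range.
sub attribute_score {
    my ($low, $high) = @_;
    my $mean = ($low + $high) / 2;
    my $sd   = ($high - $low) / 6;    # assumed spread, tune to taste
    my $score = int(normal_sample($mean, $sd) + 0.5);   # round to nearest
    $score = $low  if $score < $low;                    # clamp the tails
    $score = $high if $score > $high;
    return $score;
}

print join(' ', map { attribute_score(6, 21) } 1 .. 10), "\n";
```

A smaller standard deviation bunches more players around the midpoint; a larger one flattens the curve and makes the clamped end values more common.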

    After Compline,

Re: Efficient ways of storing a data set for random access
by l3nz (Friar) on Jan 29, 2004 at 12:35 UTC
    Your approach is not bad at all; you are trading memory for much faster access than generating each value on the fly might give you. The use of a data file allows for simple tweaking of the data distribution just by altering the number of elements in the file.

    I'd extend it by randomizing based on the actual number of items read from the file, and I'd load the file once at startup and not touch it again; this way your program will be fast.
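A rough sketch of that load-once idea. The sample values are made up for a 6-12 attribute and stand in for the real data file, which is read here from an in-memory string so the example runs on its own; load_table and pick are hypothetical names.

```perl
use strict;
use warnings;

# Made-up sample data standing in for the real file: values near the
# middle of the 6-12 range simply appear more often.
my $sample_data = <<'END';
6 7 7 8 8 8
9 9 9 9
10 10 10 11 11 12
END

# Read every whitespace-separated value from a filehandle, once.
sub load_table {
    my ($fh) = @_;
    my @table;
    while (my $line = <$fh>) {
        push @table, split ' ', $line;
    }
    return \@table;
}

# Pick a random entry; the index range follows the number of items
# actually read, so the file can grow or shrink freely.
sub pick {
    my ($table) = @_;
    return $table->[ int rand @$table ];
}

open my $fh, '<', \$sample_data or die "cannot open sample data: $!";
my $table = load_table($fh);
close $fh;

print join(' ', map { pick($table) } 1 .. 10), "\n";
```

With a real file, the open call would take the file name instead of a reference to a string; nothing else needs to change.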

    Or of course you could use one of the statistical modules...

Re: Efficient ways of storing a data set for random access
by Roy Johnson (Monsignor) on Jan 29, 2004 at 16:19 UTC
    A sum of random numbers will be more bell-shaped than a single random number. Hence, rolling two 6-sided dice will yield more sevens on average than rolling an 11-sided die numbered from 2 to 12. The more dice, the more heavily weighted toward the center of the range you'll be.

    For your 6-21 example, there are 16 numbers that need to be covered. You could roll two 16-sided dice, add them together, and divide by two to get a center-weighted result in the desired range.

    Here's a little program to demonstrate the distribution:

    my $range   = 16;
    my $low_end = 6;
    my %freq    = ();
    for (1..1000) {
        my $result = int((rand($range)+rand($range))/2)+$low_end;
        ++$freq{$result};
    }
    print "$_: $freq{$_}\n" for (sort {$a<=>$b} keys %freq);
    __END__
    6: 10
    7: 35
    8: 44
    9: 52
    10: 61
    11: 77
    12: 107
    13: 109
    14: 118
    15: 111
    16: 95
    17: 46
    18: 49
    19: 57
    20: 24
    21: 5
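The same idea extends to any number of "dice". The sketch below averages an arbitrary number of rolls; center_weighted is a hypothetical helper name, and the choice of three rolls in the demo is arbitrary.

```perl
use strict;
use warnings;

# Average $rolls uniform rolls over the range, then shift to $low.
# More rolls pull the results more strongly toward the centre.
sub center_weighted {
    my ($low, $high, $rolls) = @_;
    my $range = $high - $low + 1;        # 16 values for 6..21
    my $sum   = 0;
    $sum += rand($range) for 1 .. $rolls;
    return int($sum / $rolls) + $low;    # average lies in [0, $range)
}

my %freq;
++$freq{ center_weighted(6, 21, 3) } for 1 .. 1000;
print "$_: $freq{$_}\n" for sort { $a <=> $b } keys %freq;
```

With one roll this degenerates to a flat distribution; each extra roll narrows the peak, so the end values (6 and 21 here) become rarer.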

    The PerlMonk tr/// Advocate