in reply to Bag uniform distribution algorithms
After mulling on this one for a while, in particular thinking about how to track when a letter was due to get output again and how to maintain a distribution, I think I have something good. Compute a rolling score for each element, where it accumulates based upon probability of output and gets dinged when its element gets selected; maximum score always gets chosen.
use strict;
use warnings;
use List::Util 'sum';
use Data::Dump qw[ pp ];
sub gen {
my $href = shift;
my %score = %$href;
my $norm = sum values %score;
my @series;
for (1 .. $norm) {
my ($max, $elem) = 0;
for (keys %score) {
($max, $elem) = ($score{$_}, $_) if $score{$_} >= $max;
$score{$_} += $href>{$_};
}
push @series, $elem;
$score{$elem} = $norm;
}
return \@series;
}
pp gen( { A => 4, B => 2, C => 3, D => 1 } );
pp gen( { A => 5, B => 4, C => 3, D => 2, E => 1 } );
outputs
["A", "C", "B", "A", "D", "C", "A", "B", "C", "A"]
["A" .. "D", "A", "B", "E", "A", "C", "B", "A", "D", "C", "B", "A"]
It also naturally extends to infinite series and noninteger element counts. The infinite series case is why it's necessary to have a >= in the score comparison; you could alternatively initialize $max to something negative since the sum of all scores is necessarily zero at all times. N*M complexity, where N is the length of desired series and M is number of distinct elements.
This approach also lends itself to designing a metric for how smooth a series is:
use strict;
use warnings;
use List::Util 'sum';
sub measure {
my @series = @_;
my %count;
$count{$_}++ for @series;
my $norm = sum values %count;
$_ /= $norm for values %count;
my %score = %count;
my $metric = 0;
for my $elem (@series) {
$score{$elem} = 1;
for (keys %score) {
$metric += $score{$_}**2;
$score{$_} += $count{$_};
}
}
return $metric/@series;
}
print measure("A", "C", "B", "A", "D", "C", "A", "B", "C", "A"), "\n";
print measure("A", "C", "B", "A", "C", "D", "A", "B", "C", "A"), "\n";
print measure("A", "A", "A", "A", "B", "B", "C", "C", "C", "D"), "\n";
outputs
0.37
0.37
3.45
#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
