derekn has asked for the
wisdom of the Perl Monks concerning the following question:
I am trying to calculate percentages. For example, a user has 5 choices, and each choice is displayed as a percentage of the total votes. The problem is that the percentages do not come out as nice whole numbers (e.g., 92.84513%). When I round them to whole numbers (93%), the results sometimes don't add up to 100 as they should, which makes the displayed percentages look inaccurate. Sometimes the sum is 99, sometimes 101, and so on. I have used $percent=sprintf("%.0f", $value) to do the rounding, but no luck. Any ideas how to accomplish this so that they add up to 100%?
Derek
Re: accurately rounding numbers for percentages by Trimbach (Curate) on Aug 02, 2009 at 22:24 UTC 
What you're asking is not possible in general. Any time you round a number you introduce error; how much depends on how coarsely you round. Add enough of those errors together and your total will often be off from the "expected" total (in this case 100%).
The only way around this is to go ahead and round the individual entries to whole numbers for display, but when calculating the total don't add the rounded entries, add the unrounded entries, and then round the result for display, if you want.
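A minimal sketch of that idea, assuming made-up vote counts (the @votes values are hypothetical, picked only because they reproduce a 101% display total): round each entry for display, but compute the total from the unrounded values.

```perl
use strict;
use warnings;

# Hypothetical vote counts (not from the original question).
my @votes = (368, 232, 86, 157, 157);
my $total = 0;
$total += $_ for @votes;

my @exact   = map { 100 * $_ / $total } @votes;     # unrounded percentages
my @display = map { sprintf("%.0f", $_) } @exact;   # rounded only for display

my ($sum_exact, $sum_display) = (0, 0);
$sum_exact   += $_ for @exact;
$sum_display += $_ for @display;

print "Displayed: @display\n";
print "Sum of displayed values: $sum_display\n";
printf "Total from unrounded values: %.0f%%\n", $sum_exact;
```

The displayed entries come out as 37 23 9 16 16, which sum to 101, while the total computed from the unrounded values still prints as 100%.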

So i'm gonna have to live with "37%, 23%, 9%, 16%, 16%" (rounded values) equalling 101, even though it SHOULD equal 100%?

That's not what Trimbach said, by a long shot.
If you add the UNrounded percentages, they should total 100% (except that you'll sometimes run into value/count pairs whose decimal expansion has to be cut off at whatever length you use: 100/6, for example).
But, for cases such as I infer yours is, a quite standard and commonly accepted practice is to include the disclaimer "Totals may not equal 100% because of rounding."
Update: For clarity (in light of OP's next reply), "numbers" above was changed to "percentages".


 
Re: accurately rounding numbers for percentages by GrandFather (Cardinal) on Aug 02, 2009 at 22:40 UTC 
20.2, 20.2, 20.2, 20.2, 19.2
which would you change when rounded to integer values so the sum was 100?
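To see the dilemma concretely, here is a small sketch that rounds exactly those five values with sprintf, as in the original question:

```perl
use strict;
use warnings;

my @exact   = (20.2, 20.2, 20.2, 20.2, 19.2);
my @rounded = map { sprintf("%.0f", $_) } @exact;

my $sum = 0;
$sum += $_ for @rounded;

print "@rounded => $sum\n";
```

This prints 20 20 20 20 19 => 99, so one of the five entries has to be bumped up by a point to reach 100, and four of them are equally good candidates.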
True laziness is hard work

Lies, Damned Lies, and Statistics
 Benjamin Disraeli

The last one of course, minimizing the break of symmetry! ;)
I think 33 1/3 ,33 1/3 ,33 1/3 might make your point clearer... 8)
PS: this reminds me of the extra rules for the group phase in football tournaments to decide who continues ...
same points? oh!
same number of goals? oh!
direct comparison undecided? oh!
...
and so on, and if nothing can be chosen for a decision they finally flip a coin! 8)
eg UEFA_Euro_2008#Tiebreaking_criteria
Re: accurately rounding numbers for percentages by jbt (Chaplain) on Aug 02, 2009 at 23:28 UTC 
Could you store the numbers as numerator/denominator integers and then do integer arithmetic?
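One way to read this suggestion is the largest-remainder method (a technique swapped in here, not something spelled out in the post): work from the raw counts with integer division, then hand the leftover points to the entries with the biggest remainders. A rough sketch, where largest_remainder is a hypothetical name and the counts are made up:

```perl
use strict;
use warnings;

# Largest-remainder allocation from raw counts, using integer arithmetic.
# Returns whole percentages that sum to exactly 100.
sub largest_remainder {
    my @counts = @_;
    my $total = 0;
    $total += $_ for @counts;
    return () unless $total > 0;

    my (@pct, @rem);
    for my $c (@counts) {
        my $r = (100 * $c) % $total;           # integer remainder
        push @rem, $r;
        push @pct, (100 * $c - $r) / $total;   # exact integer quotient
    }

    my $left = 100;
    $left -= $_ for @pct;                      # points still to hand out

    # Largest remainders first; ties go to the earlier entry.
    my @order = sort { $rem[$b] <=> $rem[$a] || $a <=> $b } 0 .. $#counts;
    $pct[$_]++ for @order[0 .. $left - 1];
    return @pct;
}

my @pct = largest_remainder(368, 232, 86, 157, 157);   # hypothetical counts
print "@pct\n";
```

For these counts the result is 37 23 8 16 16. The tie-breaking rule (earlier entry wins) is an arbitrary choice for the sketch; any deterministic rule would do.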
Re: accurately rounding numbers for percentages by ysth (Canon) on Aug 03, 2009 at 00:20 UTC 
Re: accurately rounding numbers for percentages by scorpio17 (Monsignor) on Aug 03, 2009 at 13:48 UTC 
Let's say you've got 5 percentages. Sort them from high to low, then make the smallest one 100 - (sum of the 4 bigger ones). This forces them to add up the way you want, but it pushes all the round-off error into the smallest percentage. Another way is to make the largest value 100 - (sum of the 4 smallest). This pushes the error into the largest value. Neither way is "correct" in a strict mathematical sense, but I'm assuming that's not much of a priority for you anyway.
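Both variants can be sketched in a few lines. This assumes the inputs are already percentages summing to roughly 100; force_sum and its 'smallest'/'largest' argument are hypothetical names, and the sketch looks up the smallest or largest entry directly instead of sorting so the original order is kept.

```perl
use strict;
use warnings;

# Round every percentage, then overwrite one entry (the smallest or the
# largest) with 100 minus the sum of the other rounded entries.
sub force_sum {
    my ($target, @pct) = @_;          # $target is 'smallest' or 'largest'
    my @rounded = map { sprintf("%.0f", $_) } @pct;

    # Index of the entry that will absorb the round-off error.
    my $idx = 0;
    for my $i (1 .. $#pct) {
        if ($target eq 'smallest') { $idx = $i if $pct[$i] < $pct[$idx] }
        else                       { $idx = $i if $pct[$i] > $pct[$idx] }
    }

    my $others = 0;
    $others += $rounded[$_] for grep { $_ != $idx } 0 .. $#rounded;
    $rounded[$idx] = 100 - $others;
    return @rounded;
}

my @exact = (36.8, 23.2, 8.6, 15.7, 15.7);   # made-up percentages
print join(' ', force_sum('smallest', @exact)), "\n";
print join(' ', force_sum('largest',  @exact)), "\n";
```

With these values the first call gives 37 23 8 16 16 and the second gives 36 23 9 16 16: the same one-point error, just parked in a different entry.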
Re: accurately rounding numbers for percentages by ig (Vicar) on Aug 03, 2009 at 14:06 UTC 
You can force the quantized percentages to add to 100, but doing so will increase the quantization error of some individual values compared with plain rounding. It minimizes the aggregate error rather than the individual errors. While others have advocated minimizing the individual errors, there may be cases where minimizing the aggregate error is preferable.
The following example demonstrates one way the aggregate error can be minimized. The implementation is crude, not well tested and replete with print statements which may help you follow what it is doing.
use warnings;
use strict;
use Data::Dumper;
my @percentages = generate();
print "@percentages\n";
my @quantized = quantize(1000,@percentages);
print "Original percentages: @percentages\n";
print "Quantized percentages: @quantized\n";
my $sum;
$sum += $_ foreach(@quantized);
print "Sum of quantized percentages: $sum\n";
=head2 my @quantized = quantize($factor, @percentages);
The quantize() function takes a quantization factor and
an array of percentages which should add to 100%.
It returns an array of quantized percentages
which does add to 100%.
The percentages are quantized to multiples of (100/$factor).
The function minimizes the worst case error.
Two error functions are provided: one is the absolute error
(the difference between the original value and the quantized value)
and the other is the absolute relative error (the absolute error
divided by the value being quantized). There are many other
possibilities, depending on your needs.
=cut
sub quantize {
    my $quantum = 100 / shift;
    my $error = 0;
    my $sum = 0;
    # Each element of @x is [ original value, quantized value, error ].
    my @x = map {
        my $q = sprintf("%0.0f", $_/$quantum) * $quantum;
        my $d = $q - $_;
        $error += $d;
        $sum += $q;
        [ $_, $q, $d ]
    } @_;
    print Dumper(\@x);
    print "initial total error: $error\n";
    print "initial sum: $sum\n";
    # Move one value by one quantum at a time until the sum is back at 100.
    while(abs($sum - 100) > $quantum/2) {
        my $direction = ($sum > 100) ? 1 : -1 ;
        my $min_error = 10000;
        my $min_index = 0;
        print "errors of adjusted values: ";
        foreach my $i (0..(@x-1)) {
            my $e = abs($x[$i]->[2] - $quantum * $direction) / $x[$i]->[0]; # relative error
            #my $e = abs($x[$i]->[2] - $quantum * $direction); # absolute error
            print " $e";
            if($e < $min_error) {
                $min_error = $e;
                $min_index = $i;
                print "(i = $i)";
            }
        }
        print "\n";
        print "adjust $min_index: $x[$min_index]->[0], $x[$min_index]->[1] $x[$min_index]->[2]\n";
        $x[$min_index]->[1] -= $quantum * $direction;
        $x[$min_index]->[2] -= $quantum * $direction;
        print "\t$x[$min_index]->[1], $x[$min_index]->[2]\n";
        $sum -= $quantum * $direction;
    }
    return(map { $_->[1] } @x);
}
=head2 generate()
The generate() function generates a somewhat random
array of percentages that adds to 100%.
=cut
sub generate {
    my $sum = 0;
    my @percentages;
    foreach (1..20) {
        my $x = rand(50);
        if($sum + $x < 100) {
            push(@percentages, $x);
            $sum += $x;
        }
    }
    # The final entry is whatever is left, so the total is 100.
    push(@percentages, 100 - $sum);
    return(@percentages);
}
Re: accurately rounding numbers for percentages by ELISHEVA (Prior) on Aug 03, 2009 at 15:06 UTC 
ysth's node above has a link to a nice essay on fudging numbers so that they add up to 100. Apparently in the author's company, they fudge the numbers to add up to 100 so that the help desk isn't inundated with complaints about "mistakes" in the reports they publish. So there may be some situations where, reality aside, one may really need to make those numbers add up to 100!
The question then becomes how to do this so that one minimizes mistaken impressions. One's choice will depend a great deal on how one expects people to view the numbers. If one thinks that readers are making judgements based on absolute percentages then you will want to add your fudge factor to the largest numbers. Adding 1 to 1% doubles it whereas adding 1 to 98% is rather insignificant.
However, percentages are relative measures by nature. Thus one might also assume that readers are making judgements based on relative percentages more than absolute percentages. In that case, one might argue that fudge factors should be applied randomly to the percentages to avoid bias. I don't know which is best. I found several articles on subjective perceptions of statistics via Google, but most of them were from paid collections and would have required a trip to the university library. Unfortunately, I didn't have the time to look them up.
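A minimal sketch of the random-assignment idea follows. It is not the test-suite code described in this post: fudge_randomly is a hypothetical name, it takes ready-made percentages instead of a ($precision, $aHistogram) pair, and it assumes the inputs already sum to roughly 100.

```perl
use strict;
use warnings;
use List::Util qw(shuffle sum);

# Round each percentage, then spread the leftover (100 minus the rounded
# sum) one point at a time over randomly chosen entries.
sub fudge_randomly {
    my @rounded = map { sprintf("%.0f", $_) } @_;
    my $diff = 100 - sum(@rounded);      # negative if the rounded sum is over 100
    my $step = $diff < 0 ? -1 : 1;

    # When stepping down, skip entries already at 0 so nothing goes negative.
    my @candidates = grep { $step > 0 || $rounded[$_] > 0 } 0 .. $#rounded;
    my @order = shuffle(@candidates);

    $rounded[ $order[$_] ] += $step for 0 .. abs($diff) - 1;
    return @rounded;
}

my @fudged = fudge_randomly(36.8, 23.2, 8.6, 15.7, 15.7);   # made-up input
print "@fudged (sum: ", sum(@fudged), ")\n";
```

Which entry gets adjusted changes from run to run, but the sum is always 100 and no entry moves by more than one point from its rounded value.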
The article ysth linked to also had a nice sample of test data, so I decided to work up the case of random assignment of fudge factors along with a test suite based on Test::More.
The test suite is wrapped in a subroutine, runTests, to make it easier to test alternative algorithms. If you would like to try your own alternate algorithm against the test suite, pass a code reference. Alternate fudging routines should accept two parameters: ($precision, $aHistogram). $precision is the number of decimal digits in your total. For example, if $precision == 2 then your percentages must add up to 100.00. $aHistogram is a histogram whose numbers can add up to anything. The fudging subroutine is responsible for converting them to percentages.
Best, beth
Re: accurately rounding numbers for percentages by scorpio17 (Monsignor) on Aug 03, 2009 at 17:52 UTC 
I just had another idea: dynamically generate a pie chart using something like GD. Then you don't even have to show the actual numbers (a picture is worth a thousand words, etc.)

