### Top and bottom 10 percent elements of an array

 on Apr 29, 2010 at 04:35 UTC
sesemin has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I need to sort an array (ascending or descending, it does not matter), then take the top and bottom 10 percent of the data and replace the top values with A and the bottom values with B. The remaining values are replaced by "-". Then the data is put back in its original order, something like the following. The 10 percent is hypothetical; it can be any percentage.

```
@array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7)
@sortedarray = (20, 18, 13, 12, 9, 8, 7, 4, 3, 2)
After replacement (A, A, -, -, -, -, -, -, B, B)
@finalarray = (B, -, B, -, -, -, -, A, A, -)
```

I know how to sort by index, as in the following code, but I am wondering if you can help me learn how to do the replacement. Maybe the map function is the way to go.

```
#! perl -slw
use strict;

my @array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7);

# Indices of @array, ordered by descending value.
my @orderedIndeces = sort {
    $array[$b] <=> $array[$a]
} 0 .. $#array;

my $n          = scalar @array;
my $twentyperc = $n * 0.2;

# Walk the indices from largest to smallest value and overwrite in place.
for my $i (0 .. $#orderedIndeces) {
    if ($i < $twentyperc) {
        $array[ $orderedIndeces[$i] ] = "A";
        print "$array[$orderedIndeces[$i]]\t";
    }
    elsif ($i >= $n - $twentyperc) {
        $array[ $orderedIndeces[$i] ] = "B";
        print "$array[$orderedIndeces[$i]]\t";
    }
    else {
        $array[ $orderedIndeces[$i] ] = "-";
        print "$array[$orderedIndeces[$i]]\t";
    }
    print "\n";
}
```

Re: Top and bottom 10 percent elements of an array
by ikegami (Pope) on Apr 29, 2010 at 05:53 UTC

It's simpler if you sort the indexes instead of the values.

```
my $portion = 0.20;
my @array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7);

my $keep = int(@array * $portion);
my @sorted_idxs = sort { $array[$a] <=> $array[$b] } 0..$#array;

my @final = ('-') x @array;
$final[$sorted_idxs[$_]] = 'B' for 0..$keep-1;
$final[$sorted_idxs[$_]] = 'A' for -$keep..-1;

print("@final\n");
```

```
B - B - - - - A A -
```

Update: Fixed off-by-one error.
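
The same index-sort idea can be wrapped in a subroutine so that the percentage becomes a parameter. This is only a sketch: the name `mark_extremes` and its argument order are made up for illustration, and it assumes the portion is between 0 and 0.5 so the two groups cannot overlap.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): label the lowest $portion of
# @values with 'B', the highest $portion with 'A', and the rest with '-'.
# Assumes 0 <= $portion <= 0.5.
sub mark_extremes {
    my ($portion, @values) = @_;

    my $keep = int(@values * $portion);    # elements per group, rounded down

    # Sort positions by ascending value so the original order is never lost.
    my @sorted_idxs = sort { $values[$a] <=> $values[$b] } 0 .. $#values;

    my @final = ('-') x @values;
    $final[ $sorted_idxs[$_] ] = 'B' for 0 .. $keep - 1;
    $final[ $sorted_idxs[$_] ] = 'A' for -$keep .. -1;
    return @final;
}

my @array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7);
print join(' ', mark_extremes(0.10, @array)), "\n";    # B - - - - - - A - -
print join(' ', mark_extremes(0.20, @array)), "\n";    # B - B - - - - A A -
```

Because only positions are sorted, nothing has to be put back in order afterwards; the labels are written straight into their original slots.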

Re: Top and bottom 10 percent elements of an array
by nagalenoj (Friar) on Apr 29, 2010 at 05:23 UTC

This works for the given elements and gives the result you need. See the documentation for splice for details on how it is used here.

```
my @array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7);

my $percent = (scalar @array * 0.20);
my (@resultA, @resultB);

my @ordered = sort { $a <=> $b } @array;

@resultB = splice(@ordered, 0, $percent);
@resultA = splice(@ordered, (scalar @ordered - $percent));
print "@array", "\n";

for (my $i = 0; $i < scalar @array; $i++) {
    if ( grep { $array[$i] eq $_ } @resultA ) {
        $array[$i] = 'A';
    }
    elsif ( grep { $array[$i] eq $_ } @resultB ) {
        $array[$i] = 'B';
    }
    else {
        $array[$i] = '-';
    }
}

print "@array", "\n";
```

Thanks, easy to implement and easy to understand.
Re: Top and bottom 10 percent elements of an array
by codeacrobat (Chaplain) on Apr 29, 2010 at 06:20 UTC
How about a solution with map?
```
@array = (2, 4, 3, 8, 9, 12, 13, 20, 18, 7);

my $c = 0;
@sort_a_pos = sort { $a->[1] <=> $b->[1] } map { [ $c++ => $_ ] } @array;

$pct = @array / 10;

$_->[1] = "B" for @sort_a_pos[0 .. $pct];
$_->[1] = "-" for @sort_a_pos[$pct+1 .. $#array-$pct-1];
$_->[1] = "A" for @sort_a_pos[$#array-$pct .. $#array];

@finalarray = map { $_->[1] } sort { $a->[0] <=> $b->[0] } @sort_a_pos;

print "@finalarray";
```

print+qq(\L@{[ref\&@]}@{['@'x7^'!#2/"!4']});
```
$_->[1] = "B" for @sort_a_pos[0 .. $pct];
$_->[1] = "-" for @sort_a_pos[$pct+1 .. $#array-$pct-1];
$_->[1] = "A" for @sort_a_pos[$#array-$pct .. $#array];
```

should be

```
$_->[1] = "B" for @sort_a_pos[0 .. $pct-1];
$_->[1] = "-" for @sort_a_pos[$pct .. $#array-$pct-1];
$_->[1] = "A" for @sort_a_pos[$#array-$pct .. $#array];
```

You probably got confused (like me) by the OP's weird math of 10 * 10% = 2. The output he gave was for 20%.

Re: Top and bottom 10 percent elements of an array
by samarzone (Pilgrim) on Apr 29, 2010 at 05:48 UTC

Note that your logic may fail if there are duplicate entries in the array. You may get "A"s, "B"s or "-"s more/less than required percentage depending on your implementation.
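
To make the concern concrete, here is a small sketch comparing value matching with index-based labelling on a list that contains duplicates. The data and variable names are invented for illustration. Only the counts are compared, since which of several equal values gets labelled depends on the sort.

```perl
use strict;
use warnings;

# Made-up data with duplicates; 20% of 10 elements means 2 per group.
my @array = (5, 1, 1, 1, 9, 9, 3, 4, 6, 7);
my $keep  = int(@array * 0.20);

# Value matching: every element equal to one of the lowest $keep values
# is labelled, so all three 1s become 'B'.
my @ordered  = sort { $a <=> $b } @array;
my %low      = map { $_ => 1 } @ordered[ 0 .. $keep - 1 ];
my @by_value = map { $low{$_} ? 'B' : '-' } @array;

# Index-based labelling: exactly $keep positions are labelled.
my @idxs     = sort { $array[$a] <=> $array[$b] } 0 .. $#array;
my @by_index = ('-') x @array;
$by_index[ $idxs[$_] ] = 'B' for 0 .. $keep - 1;

# grep in scalar context returns the number of matches.
my $count_by_value = grep { $_ eq 'B' } @by_value;
my $count_by_index = grep { $_ eq 'B' } @by_index;
print "by value: $count_by_value, by index: $count_by_index\n";    # by value: 3, by index: 2
```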

Good point, thanks.

I will try the ikegami or codeacrobat solutions (below); they seem more reliable. I was also thinking that index sorting was the way to go, but I was not smart enough to finish the task.

Re: Top and bottom 10 percent elements of an array
by JavaFan (Canon) on Apr 29, 2010 at 07:46 UTC
"The 10 percent is hypothetical can be any percentage."
So what should the array look like if the percentage is 75?
or 100%
or 99%...

Good point. Also, the last two solutions do not work well if the number of array elements is odd. For instance, if we have 11 unique numbers, 5 should be A, 5 B, and 1 a dash. But both of them print 6 "A"s and 4 "B"s and no dash. I tried to play around with the ranges but could not resolve it. I think 50% is the maximum we would want, so that the population of numbers can be divided into two categories. Therefore, I am not worried about 75% or higher unless there is another use for this little script.
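
One possible way to handle both odd-sized arrays and percentages above 50% is to cap each group at half the array, rounded down, so the groups never overlap and an odd-sized array keeps one dash in the middle. This is an assumed policy, not something the thread settles on, and `group_size` is a hypothetical name.

```perl
use strict;
use warnings;

# Hypothetical helper: number of elements per group for $n elements,
# capped at int($n / 2) so the 'A' and 'B' groups cannot overlap.
sub group_size {
    my ($n, $portion) = @_;
    my $keep = int($n * $portion);
    my $max  = int($n / 2);
    return $keep > $max ? $max : $keep;
}

my @array = (1 .. 11);    # 11 unique values, matching the case described above
my @idxs  = sort { $array[$a] <=> $array[$b] } 0 .. $#array;

for my $portion (0.50, 0.75, 1.00) {
    my $keep = group_size(scalar @array, $portion);

    my @final = ('-') x @array;
    $final[ $idxs[$_] ] = 'B' for 0 .. $keep - 1;
    $final[ $idxs[$_] ] = 'A' for -$keep .. -1;
    print "$portion: @final\n";
}
```

With this cap, all three portions label five elements B, five A, and leave the middle element as a dash.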

Re: Top and bottom 10 percent elements of an array
by Limbic~Region (Chancellor) on Apr 29, 2010 at 18:08 UTC
sesemin,
Why do you need to do this? Is the data you presented representative of your actual data? I ask because sorting to find the top/bottom N is typically simpler but less efficient than, say, using a heap. This sounds like it might be a fun and interesting problem, but I really wouldn't want to spend time beyond the solution ikegami provided without knowing more.

Cheers - L~R

