P is for Practical PerlMonks

### Comment on

 Need Help??

Here is a "best ordering" optimizer that is just my previous trivial code with two short-circuits added to it.

First, if we've managed to come up with an ordering that costs 18.1 units already (our best ordering so far), and our current calculations only get through the first 9 filters and have already added up 18.2 units of cost, then there's no point permuting any of the remaining filters (nor measuring any further using that starting arrangement).

Second, if my next permutation is going to place a filter with a cost of 5 and a selectivity of .9 in front of a filter with a cost of 4 and a selectivity of .8, then I would just be wasting time to do anything with that permutation. So this code does some efficient checks to prevent at least a lot of cases of putting one filter in front of another filter that it is completely worse than (neither cost nor selectivity improved). Well, assuming the filters start out sorted reasonably, anyway.

This code starts out by naively sorting the filters by the product of cost and selectivity. That gets you pretty close to the ideal solution in the few cases I dreamed up so far. It also ensures that no filter starts out in front of a filter that is completely better than it, which is required for the second short-circuit above to be very effective.

I don't have time right now to explain this code or even test it more than the little bit that I have. NextPermuteNum() in this code is just a modification of the one from Algorithm::Loops.

The usage is to give a list of filter performance values on the command line. Each performance value is an integer cost followed by a decimal selectivity. So "4.9" indicates a filter that eliminates 10% of items (keeps 0.9) and costs 4 units (of CPU).

So a sample run might look like:

``` % filterOrder 1.9 1.8 2.9 2.8 3.9 3.8 4.9 4.8 30.1 20.2 10.3
1 .8, 1 .9, 2 .8, 2 .9, 3 .8, 3 .9, 30 .1, 10 .3, 4 .8, 4 .9, 20 .2,
0.002 19.00391 ( 0 1 2 3 4 5 6 7 8 9 10 )
0.100 12.88756 ( 0 1 2 7 4 8 3 10 5 6 9 )   9
1 .8, 1 .9, 2 .8, 10 .3, 3 .8, 4 .8, 2 .9, 20 .2, 3 .9, 30 .1, 4 .9,
3.750 seconds

Which shows the first stab at a solution and the final solution were:

```1 .8, 1 .9, 2 .8,  2 .9, 3 .8, 3 .9, 30 .1, 10 .3, 4 .8,  4 .9, 20 .2
|     |     |      \+3   |     \+3    \+3    /-3   /-3    \+1    /-3
|     |     |            |
|     |     |      /+3   |     /+3    \-3    /+3   \-3    \-3    \-1
1 .8, 1 .9, 2 .8, 10 .3, 3 .8, 4 .8,  2 .9, 20 .2, 3 .9, 30 .1,  4 .9

And it took only 0.1 seconds to find the optimal solution (but then took 3.75 seconds to verify that there was nothing better). An exhaustive search would have taken minutes without those short-circuits.

It still isn't practical for optimally ordering 30 or 40 filters; the run time would depend greatly on how many filters can be declared "completely better than" how many others but could easily run into the "millennia" range for such large lists.

It can probably be most useful for improving the weighting formula used for the initial sort once you have some typical numbers for your situation. \$cost*\$selectivity isn't a bad starting point, but I'm sure, especially given a limited pattern of values, one can quickly come up with something better with a bit of trial and error.

If the sort gets the first several filters in the right order, then an optimal solution can be had very quickly. A good ordering can be had immediately and better orderings can pop out as you let the thing run.

It may also be useful as something to further modify to get to something that can get even closer to the optimal total cost rather quickly. For example, if you identify a subset of the filters that are likely to be the best ones, then you can separately optimize just that subset first to get a better first part of the ordering.

Anyway, here is the code (which is not pretty nor modular but appears to work reasonably well in the few test cases I played with).

```#!/usr/bin/perl -w
use strict;

use Time::HiRes qw< time >;

\$| = 1;
my( @Costs, @Trims );
for(  @ARGV  ) {
my( \$cost, \$trim ) = /^(\d+)(\.\d+)\$/ or die "Invalid cost.trim: \$
+_\n";
push @Costs, \$cost;
push @Trims, \$trim;
}
my \$Start = time();
my @order = sort {
\$Trims[\$a]*\$Costs[\$a] <=> \$Trims[\$b]*\$Costs[\$b]
} 0..\$#Costs;
@Trims = @Trims[@order];
@Costs = @Costs[@order];
print " \$Costs[\$_] \$Trims[\$_],"
for  0..\$#Costs;
print \$/;
my @Win;
cost( \@Costs, \@Trims );
print \$/;
print " \$Costs[\$_] \$Trims[\$_],"
for  @Win;
printf "\n%.3f seconds\n", time() - \$Start;

sub NextPermuteNum(\@)
{
my( \$vals )= @_;
my \$last= \$#{\$vals};
return !1   if  \$last < 1;
while( 1 ) {
# Find last item not in reverse-sorted order:
my \$i= \$last-1;
\$i--   while  0 <= \$i  &&  \$vals->[\$i+1] <= \$vals->[\$i];
# If complete reverse sort, we are done!
if(  -1 == \$i  ) {
# Reset to starting/sorted order:
@\$vals= reverse @\$vals;
return !1;
}
# Re-sort the reversely-sorted tail of the list:
@{\$vals}[\$i+1..\$last]= reverse @{\$vals}[\$i+1..\$last]
if  \$vals->[\$last] < \$vals->[\$i+1];
# Find next item that will make us "greater":
my \$j= \$i+1;
\$j++
while   \$vals->[\$j] <= \$vals->[\$i];
do {
# Swap:
@{\$vals}[\$i,\$j]= @{\$vals}[\$j,\$i];
# If "greater" item isn't strictly worse than try it:
return 1
if  \$Costs[\$vals->[\$i]] < \$Costs[\$vals->[\$j]]
||  \$Trims[\$vals->[\$i]] < \$Trims[\$vals->[\$j]];
# Else, we picked a strictly worse one, so pick again
\$j++;
} while(  \$j <= \$last  );
# Force us to move back to at least \$i-1:
@{\$vals}[\$i..\$last] = sort { \$b <=> \$a } @{\$vals}[\$i..\$last];
}
}

sub cost {
my( \$costs, \$trims ) = @_;
my @order = 0 .. \$#\$costs;
my \$best = 0;
my \$max = 0;
do {
my \$size = 1;
my \$cost = 0;
my \$skip = @order;
for my \$i ( @order ) {
\$skip--;
last
if  \$best  &&  \$best < \$cost;
\$cost += \$size * \$costs->[\$i];
\$size *= \$trims->[\$i];
}
if(  ! \$best  ||  \$cost <= \$best  ) {
printf "\r%.3f %.5f ( %s )",
time() - \$Start, \$cost, "@order";
print \$/
if  ! \$best;
@Win = @order;
\$best = \$cost;
\$max = 0;
} elsif(  1 < \$skip  ) {
if(  \$max < \$skip  ) {
printf "%4d\b\b\b\b", \$skip;
\$max = \$skip;
}
@order[\$#order-\$skip+1..\$#order] = sort { \$b <=> \$a }
@order[\$#order-\$skip+1..\$#order];
}
} while(  NextPermuteNum(@order)  );
}

- tye

In reply to Re: Evolving a faster filter? (code) by tye
in thread Evolving a faster filter? by Ovid

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

• Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
• Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
• Read Where should I post X? if you're not absolutely sure you're posting in the right place.
• Posts may use any of the Perl Monks Approved HTML tags:
a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
• You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
 For: Use: & & < < > > [ [ ] ]
• Link using PerlMonks shortcuts! What shortcuts can I use for linking?

Create A New User
Chatterbox?
and all is quiet...

How do I use this? | Other CB clients
Other Users?
Others browsing the Monastery: (5)
As of 2017-08-21 23:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
Voting Booth?
Who is your favorite scientist and why?

Results (326 votes). Check out past polls.

Notices?