http://www.perlmonks.org?node_id=1011650

Ovid has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks,

The Code

I have some code which resembles the following:

sub remove_unwanted {
    my ( $self, @objects ) = @_;

    foreach my $filter ( $self->filters ) {
        @objects = $self->$filter(@objects);
        last unless @objects;
    }

    return @objects;
}

There might be thousands of @objects and tens of filters. Each filter returns zero or more objects. The number of filters is static (for a given software release), but the number of objects will always change (duh).

The Problem

This is some of the hottest code on a high-performance system spread across 75 servers. It needs to run fast, fast, fast. The order in which the filters run can impact this greatly: I've seen anywhere from 7% to 41% speedups just by choosing the order carefully. That means the slowest filters should usually run later and the filters which reject the most @objects should usually run sooner. But what about a slow filter which rejects many objects? Or a fast filter which rejects virtually nothing? Further, the criteria on which things get filtered are very complex and change rapidly, so we need to use real-world data to determine what's actually faster.
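To gather those real-world numbers, I've been toying with a wrapper along these lines. It's only a sketch: Time::HiRes is what I'd reach for to time each call, and %stats is a made-up accumulator, but it shows the per-filter cost and rejection counts I'd want to collect before deciding on an order.

use Time::HiRes qw(gettimeofday tv_interval);

my %stats;    # per filter: total seconds, objects in, objects out, calls

sub remove_unwanted_instrumented {
    my ( $self, @objects ) = @_;

    foreach my $filter ( $self->filters ) {
        my $before = [gettimeofday];
        my $in     = scalar @objects;

        @objects = $self->$filter(@objects);

        $stats{$filter}{seconds} += tv_interval($before);
        $stats{$filter}{in}      += $in;
        $stats{$filter}{out}     += scalar @objects;
        $stats{$filter}{calls}++;

        last unless @objects;
    }

    return @objects;
}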

What Solution?

At first glance, this really looked like a candidate for a genetic algorithm. One or two of our servers could run this code, instrumented to keep mutating the order of the filters until we find a "fast enough" ordering.

I don't want to go down that route if someone can suggest a much easier way. On the other hand, if you think a genetic algorithm is the way to go, I wouldn't mind seeing how you would approach it.
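For what it's worth, here is the rough shape I have in mind, so you can tell me whether I'm off base. This is only a sketch, not working code: score_order() is a hypothetical callback that runs the filters in the given order over a sample of real-world objects and returns the elapsed time (lower is better), the population and generation sizes are pulled out of thin air, and it's a bare-bones evolutionary loop (truncation selection plus swap mutation, no crossover).

use List::Util qw(shuffle);

sub evolve_filter_order {
    my ( $score_order, @filters ) = @_;
    my $pop_size    = 20;
    my $generations = 50;

    # start with random permutations of the filter list
    my @population = map { [ shuffle(@filters) ] } 1 .. $pop_size;

    for ( 1 .. $generations ) {
        # measure each candidate order once per generation, fastest first
        my @ranked = map  { $_->[1] }
                     sort { $a->[0] <=> $b->[0] }
                     map  { [ $score_order->($_), $_ ] } @population;

        # keep the fastest half, breed the rest from it by swap mutation
        my @survivors = @ranked[ 0 .. $pop_size / 2 - 1 ];
        my @children;
        while ( @survivors + @children < $pop_size ) {
            my @child = @{ $survivors[ rand @survivors ] };
            my ( $i, $j ) = ( int rand @child, int rand @child );
            @child[ $i, $j ] = @child[ $j, $i ];
            push @children, \@child;
        }
        @population = ( @survivors, @children );
    }

    # return the best order found
    my ($best) = map  { $_->[1] }
                 sort { $a->[0] <=> $b->[0] }
                 map  { [ $score_order->($_), $_ ] } @population;
    return @$best;
}

The production code would then simply run the filters in whatever order something like evolve_filter_order( \&score_order, $self->filters ) settles on. Is that a reasonable approach, or is there a much simpler way to get the same answer?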