http://www.perlmonks.org?node_id=585598

Over in Ultra compact set searching, I had a little function called "and_maker" which inspired two comments, so I figured I'd generalize it and give it its own node.

The general problem is: I have an operation I would like to perform on each element in a list.

It's very easy to state. It's very easy to think about. "I want to add up these 5 items." "I want to multiply these 7 things." "I want to xor these 83 elements." Like so many things, the devil's in the details. While you can hardwire your function to add up exactly 5 things, you may need to add up 6 next time. And maybe 4 after that.

The standard approach is to use an iterator. I'll show you a different way. Not necessarily better, but different.

A standard way to do this might be something like the following. Say you want to add up a list of elements:

sub multi_add { my $sum = 0; $sum += $_ foreach @_; return $sum; } my $result = multi_add(3,4,5);

What we're gonna do instead is dynamically generate a subroutine that only operates on the number of elements you have for this call. We'll be slick and cache the function so that subsequent calls give us back the same code snippet. Call the arbitrary_operator_maker with two arguments - the operator and the number of elements.

{ my %dispatch = (); sub arbitrary_operator_maker { my $operation = shift; my $num_elements = shift; my $dispatch_key = "$operation$num_elements"; return $dispatch{$dispatch_key} if $dispatch{$dispatch_key}; my $built = join $operation, map {'$_[' . $_ . ']'} (0..$num_eleme +nts - 1); return $dispatch{$dispatch_key} = eval "sub { $built }"; } }

You then create a new coderef for what you want to do:

my $add_3_things = arbitrary_operator_maker('+', 3); my $result = $add_3_things->(3,4,5); print "$result\n"; my $other_result = $add_3_things->(77,88,99); print "$other_result\n"; #etc.

The function generated is straightforward. If you looked at what it actually does (which you can see by dumping out the $built variable inside the arbitrary_operator_maker, it builds up your list of elements once for you, for repeated use.

my $add_3_things = arbitrary_operator_maker('+', 3); $add_3_things internally looks like: sub { $_[0] + $_[1] + $_[2] } my $xor_5_things = arbitrary_operator_maker('^', 5); $xor_5_things internally looks like: sub { $_[0] ^ $_[1] ^ $_[2] ^ $_[3] ^ $_[4] }

Okay, so what? Well, this newly created function doesn't need to iterate, it just performs the operation as many times as you need. Performance can be impressive, as I'd written up over here. But, in other testing I'd done, its scaling is twitchy.

There also is overhead with doing the lookup and dispatching a pre-created function. If you can cache yourself in advance, the benefits are impressive. But my caching routine may just suck.

my $op = '+'; my $and_maker_5 = arbitrary_operator_maker($op, 5); my @list_5 = (1..5); timethese(1000000, { 'and_maker 5 elements cached' => sub { $and_maker_5->(@list_5); }, 'and_maker 5 elements dispatched' => sub { my $and_maker_5 = arbitrary_operator_maker($op, 5); $and_maker_5->(@list_5); }, 'multi_add 5 elements' => sub { multi_add(@list_5); }, }); Benchmark: timing 1000000 iterations of and_maker 5 elements cached, a +nd_maker 5 elements dispatched, multi_add 5 elements... and_maker 5 elements cached: 2 wallclock secs ( 0.98 usr + 0.00 sys += 0.98 CPU) @ 1020408.16/s (n=1000000) and_maker 5 elements dispatched: 4 wallclock secs ( 2.95 usr + 0.01 +sys = 2.96 CPU) @ 337837.84/s (n=1000000) multi_add 5 elements: 3 wallclock secs ( 2.53 usr + 0.01 sys = 2.54 + CPU) @ 393700.79/s (n=1000000)

Basically, a technique like this would be useful if you're going to do the same operation many many times over the same number of arguments. If not, the overhead of building the function or even looking up a pre-cached function may cause performance to fall below that of an iterator. This version of the operator maker also does no bounds checking. So you can call $add_5_things with 3 elements or 97 of them. That may be a bad thing. YMMV.