perlmeditation
ambrus
<h3>
Introduction
</h3>
<p>
During a discussion on the cb<del>,</del><ins> with</ins> [jZed]<ins>, [bart]</ins>
mentioned that it would be useful to represent a chain of method calls, like the <code>->foo->bar</code> part of <code>$object->foo->bar;</code>, as one entity.
In this meditation, I show that this is possible in Perl.
<h3>
Goals
</h3>
<p>
Let's take this method call chain:
<code>
$object->method1(@args1)->method2(@args2)->method3(@args3)
</code>
We want an object <code>$chain</code> such that
<code>$object->$chain</code> does the same as the above method call.
<p>
For the sake of an easy interface, we'll construct this with the simple syntax
<code>
$chain = Object::MethodChain->method1(@args1)->method2(@args2)->method3(@args3);
</code>
<readmore>
This complicates our implementation a bit, because we have to use AUTOLOAD.
We could avoid AUTOLOAD and gain more flexibility if we used
a more complicated syntax such as
<code>
$chain = Object::MethodChain->new("method1", \@args1, "method2", \@args2, "method3", \@args3);
</code>
</readmore>
<h3>
Implementations
</h3>
<p>
I'll show multiple versions of the code.
<readmore>
<p>
The tricky part of the code is how to create an object
<code>$chain</code> dynamically in such a way that <code>$object->$chain</code> does something useful.
We could avoid this if we allowed another compromise in the interface, such as requiring <code>$chain->call_on($object)</code> instead of <code>$object->$chain</code>.
You might find this part trivial, but it is the real reason I wrote this meditation.
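<p>
For comparison, here is a minimal sketch of that simpler interface: the explicit constructor from the Goals section combined with a <code>call_on</code> method. It needs neither AUTOLOAD nor any trick with <code>$object->$chain</code>. The package name <code>Explicit::MethodChain</code> and the <code>Counter</code> test class are hypothetical names chosen only for illustration.
<code>
use strict;
use warnings;

{
    # Hypothetical package, named differently to avoid clashing with Object::MethodChain.
    package Explicit::MethodChain;

    # new("method1", \@args1, "method2", \@args2, ...); a missing
    # trailing argument list is treated as empty.
    sub new {
        my ($class, @spec) = @_;
        my @chain;
        while (@spec) {
            my $method = shift @spec;
            my $args   = shift @spec || [];
            push @chain, [$method, $args];
        }
        return bless \@chain, $class;
    }

    # Call every stored method in turn, each on the result of the previous call.
    sub call_on {
        my ($self, $r) = @_;
        for my $step (@$self) {
            my ($method, $args) = @$step;
            $r = $r->$method(@$args);
        }
        return $r;
    }
}

{
    # Hypothetical test class.
    package Counter;
    sub new   { bless { n => 0 }, shift }
    sub add   { $_[0]{n} += $_[1]; $_[0] }
    sub value { $_[0]{n} }
}

my $chain = Explicit::MethodChain->new("add", [2], "add", [3], "value");
print $chain->call_on(Counter->new), "\n";    # prints 5
</code>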
<p>
To solve this, let's see what the <code>$object->$method</code> syntax does.
If <code>$method</code> is a string without a double colon, then this is a real method call, which calls the method named <code>$method</code> from the class of <code>$object</code>, or from the class <code>$object</code> if <code>$object</code> is a string. (The same magic applies to filehandles as with normal <code>->method</code> calls.)
We don't know what class will be used, so we can't install a new method in it.
However, we can install a method in UNIVERSAL.
But there's another problem too: we want to be able to do
<code>
$chain = Object::MethodChain->method1(@args1)->method2(@args2);
</code>
so <code>Object::MethodChain->method1(@args1)</code> has to be an object, and therefore cannot be a plain string.
The way around this is overloading: use an object that stringifies magically.
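<p>
The two ingredients can be tried in isolation. The following sketch assumes a hypothetical <code>MethodName</code> class whose objects stringify to a method name, a hypothetical <code>Greeter</code> class, and a hypothetical method <code>__whoami__</code> installed in UNIVERSAL. It only demonstrates the mechanism, not the chain object itself.
<code>
use strict;
use warnings;

{
    # Hypothetical helper: an object that stringifies to a method name.
    package MethodName;
    use overload q[""] => sub { ${ $_[0] } };
    sub new { my ($class, $name) = @_; bless \$name, $class }
}
{
    # Hypothetical class to call methods on.
    package Greeter;
    sub hello { "hello from " . $_[0] }
}

# A method installed in UNIVERSAL can be called on any class or object.
sub UNIVERSAL::__whoami__ { "I am " . (ref($_[0]) || $_[0]) }

# The object is not a code reference, so it is stringified
# and the result is used as the method name.
my $hello  = MethodName->new("hello");
my $whoami = MethodName->new("__whoami__");
print Greeter->$hello, "\n";     # hello from Greeter
print Greeter->$whoami, "\n";    # I am Greeter
</code>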
<p>
I shall create only one method in UNIVERSAL, and use a global variable to pass the method chain to that method. This is because if we created a new method for every Object::MethodChain ever stringified, those methods could never be garbage collected. (Also, it would probably be slow because of the method cache, but I'm not sure about this.)
Thus, the stringification method will return a constant string, the name of this method, but it will store the data about the method chain in a global variable.
<p>
Here's the code. (Assume <code>use warnings; use strict;</code> here and in the rest of the writeup.)
<code>
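{
    package Object::MethodChain;
    # Stringifying a chain records it in $chain and yields the name
    # of the single method installed in UNIVERSAL.
    use overload q[""], "__Set_MethodChain__";
    # Every unknown method call appends one {method, args} entry to a new chain.
    AUTOLOAD {
        my $self = shift;
        my $class = ref($self) || $self;
        my @new = ref($self) ? @$self : ();
        our $AUTOLOAD =~ /.*::(.*)/s or
            die "error: invalid method name";
        push @new, {"method", $1, "args", \@_};
        bless \@new, $class;
    }
    our $chain;
    sub __Set_MethodChain__ {
        $chain = $_[0];
        "__Call_MethodChain__";
    }
    # Replays the chain stored in $chain, starting from the invocant.
    sub UNIVERSAL::__Call_MethodChain__ {
        my $r = $_[0];
        for my $pair (@$chain) {
            my($method, $args) = @$pair{"method", "args"};
            $r = $r->$method(@$args);
        }
        $r;
    }
    # Defined explicitly so that object destruction does not reach AUTOLOAD.
    DESTROY {
    }
}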
</code>
<p>
There's one more bit you should note:
we store <code>\@_</code> so that lvalue arguments to methods work.
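<p>
The difference between storing <code>\@_</code> and storing a copy can be seen in a small stand-alone sketch. The sub names <code>capture</code>, <code>capture_copy</code> and <code>set_first</code> are hypothetical and exist only for this demonstration.
<code>
use strict;
use warnings;

sub capture      { return \@_ }       # keeps the aliases to the caller's variables
sub capture_copy { return [@_] }      # copies the values instead
sub set_first    { $_[0] = "changed" }    # assigns through the alias in @_

my $v = "original";
my $aliased = capture($v);
set_first(@$aliased);                 # the element is still an alias of $v
print "$v\n";                         # prints "changed"

my $w = "original";
my $copied = capture_copy($w);
set_first(@$copied);                  # only the copy is modified
print "$w\n";                         # prints "original"
</code>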
<p>
Let's continue thinking about what <code>$method</code> could be
in a <code>$object->$method</code> call.
If <code>$method</code> is a string with double colons, or a code reference, or a glob, then <code>$object->$method</code> is equivalent to <code>&$method($object)</code>.
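<p>
A short sketch of the code reference and the fully qualified name cases follows. The class <code>Thing</code> and the sub <code>Helper::describe</code> are hypothetical; note that <code>Thing</code> does not inherit from <code>Helper</code>, yet the call still reaches the named sub.
<code>
use strict;
use warnings;

{
    # Hypothetical class with no methods besides the constructor.
    package Thing;
    sub new { bless {}, shift }
}

# A plain sub in an unrelated package.
sub Helper::describe { "Helper::describe got a " . ref($_[0]) }

my $obj  = Thing->new;
my $code = sub { "the code ref got a " . ref($_[0]) };
my $name = "Helper::describe";

print $obj->$code, "\n";    # the code ref got a Thing
print $obj->$name, "\n";    # Helper::describe got a Thing
</code>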
<p>
A code reference is a good way to go, as we can easily create one with whatever content we want, and it can be blessed too.
This has the disadvantage that you can't print the chain object with Data::Dumper (you can with the above).
<p>
This code is much shorter than the first one; in fact, it was my first version:
<code>
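{
    package Object::MethodChain;
    AUTOLOAD {
        my $s = shift;
        my $c = ref($s) || $s;
        # $p is the chain built so far, or the identity function for a new chain.
        my $p = ref($s) ? $s : sub { $_[0] };
        our $AUTOLOAD =~ /.*::(.*)/s or die;
        my $m = $1;
        my $a = \@_;
        # The new chain runs the previous one, then calls $m on its result.
        bless sub {
            &$p($_[0])->$m(@$a);
        }, $c;
    }
    DESTROY {
    }
}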
</code>
<p>
The above code does not use an array to store the methods and their arguments; instead, it creates a sequence of closures, each of which references the previous one through an enclosed variable.
<p>
If you don't understand this code, first consider the case when there's only one method in a chain. This call:
<code>
$chain = Object::MethodChain->foo;
</code>
will call <code>Object::MethodChain->AUTOLOAD</code> with <code>$Object::MethodChain::AUTOLOAD</code> set to <code>"Object::MethodChain::foo"</code>.
The AUTOLOAD function then sets <code>$m</code> to <code>"foo"</code>, <code>$p</code> to <code>sub { $_[0] }</code>, the identity function, and <code>$a</code> to a reference to an empty array. The (blessed) code reference it returns has these variables enclosed. The code is <code>sub { &$p($_[0])->$m(@$a); }</code>
which, with the variables substituted, becomes
<code>sub { &{sub { $_[0] }}($_[0])->"foo"(@$a); }</code>,
which is roughly equivalent to this: <code>
sub { $_[0]->foo; }
</code>
<p>
Then, if we call <code>$object->$chain</code>, as <code>$chain</code> is a sub, it calls <code>&$chain($object)</code> which does <code>$object->foo</code>.
<p>
Now you can probably find out yourself how the chaining case works: the variable <code>$p</code> stores the Object::MethodChain object for the methods before the last one.
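<p>
The same closure chaining can be written out by hand, without AUTOLOAD or blessing, which may make the structure easier to see. This is only a sketch: <code>extend_chain</code> and the <code>Counter</code> class are hypothetical names that are not part of the module.
<code>
use strict;
use warnings;

# Hypothetical helper: returns a closure that first runs the previous chain,
# then calls $method(@args) on whatever that returned.
sub extend_chain {
    my ($prev, $method, @args) = @_;
    return sub { $prev->($_[0])->$method(@args) };
}

{
    # Hypothetical test class.
    package Counter;
    sub new   { bless { n => 0 }, shift }
    sub add   { $_[0]{n} += $_[1]; $_[0] }
    sub value { $_[0]{n} }
}

my $chain = sub { $_[0] };                  # the empty chain: identity function
$chain = extend_chain($chain, "add", 2);    # each step encloses the previous closure
$chain = extend_chain($chain, "add", 3);
$chain = extend_chain($chain, "value");

print Counter->new->$chain, "\n";           # prints 5
</code>
Each call on the final closure descends through all the earlier closures before any method runs, which is why a very long chain leads to deep recursion.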
<p>
I show a more straightforward solution here, which stores the data in an array as in the first solution.
This also has the advantage that it doesn't recurse deeply if you use a
very long method chain.
(Actually, you could use a closure chain in the overloading solution too, but an array seemed more natural.)
<p>
For this, we need a way to get the array from such an object.
My solution to this is to pass a special argument to the sub.
We could just as well use some other way, like setting a global variable.
<code>
{
    package Object::MethodChain;
    # Unique reference used as a secret argument to ask a chain for its contents.
    our $open_sesame = [];
    AUTOLOAD {
        my $self = shift;
        my $class = ref($self) || $self;
        my @chain = ref($self) ? &$self($open_sesame) : ();
        our $AUTOLOAD =~ /.*::(.*)/s or
            die "error: invalid method name";
        push @chain, {"method", $1, "args", \@_};
        bless sub {
            if(ref($_[0]) && $open_sesame == $_[0]) {
                # Called with the secret argument: hand out the stored chain.
                @chain;
            } else {
                # Called as a method: replay the chain on the invocant.
                my $r = $_[0];
                for my $pair (@chain) {
                    my($method, $args) = @$pair{"method", "args"};
                    $r = $r->$method(@$args);
                }
                $r;
            }
        }, $class;
    }
    DESTROY {
    }
}
</code>
<p>
But let's get back to what I've said: <code>$method</code> could also be a fully qualified subname, or a glob.
I can't solve this with a glob, as that can't be blessed.
However, we can use an object that stringifies, through overloading, to a fully qualified name. That gives another solution that's almost the same as the first one:
<code>
{
    package Object::MethodChain;
    use overload q[""], "__Set_MethodChain__";
    AUTOLOAD {
        my $self = shift;
        my $class = ref($self) || $self;
        my @new = ref($self) ? @$self : ();
        our $AUTOLOAD =~ /.*::(.*)/s or
            die "error: invalid method name";
        push @new, {"method", $1, "args", \@_};
        bless \@new, $class;
    }
    our $chain;
    sub __Set_MethodChain__ {
        $chain = $_[0];
        "Object::MethodChain::__Call_MethodChain__";
    }
    sub __Call_MethodChain__ {
        my $r = $_[0];
        for my $pair (@$chain) {
            my($method, $args) = @$pair{"method", "args"};
            $r = $r->$method(@$args);
        }
        $r;
    }
    DESTROY {
    }
}
</code>
<p>
Note finally that you cannot use an object with sub dereferencing (<code>&{}</code>) overloaded, as <code>$object->$method</code> always calls the stringification overload of <code>$method</code> instead, and does not cope with a stringification function that returns a coderef.
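<p>
If you do have such an object, one way around the limitation (a sketch only, with the hypothetical classes <code>Callable</code> and <code>Thing</code>) is to extract the plain code reference explicitly with <code>\&$callable</code> and use that as the method:
<code>
use strict;
use warnings;

{
    # Hypothetical class whose objects can be dereferenced as subs.
    package Callable;
    use overload '&{}' => sub { $_[0]{code} };
    sub new { my ($class, $code) = @_; bless { code => $code }, $class }
}
{
    # Hypothetical class to call the method on.
    package Thing;
    sub new  { bless {}, shift }
    sub name { "thing" }
}

my $callable = Callable->new(sub { "called on " . $_[0]->name });

# \&$callable goes through the &{} overload and yields an ordinary code
# reference, which the method call syntax accepts directly.
my $code = \&$callable;
print Thing->new->$code, "\n";    # called on thing
</code>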
<h3>
Example
</h3>
This <code>Object::MethodChain</code> object can be called on an object, a package name bareword,
or a filehandle, just like any ordinary method.
Needless to say, MethodChain works even if a method returns something other than the original
object: the next method is then called on the returned object.
Here's an example of using MethodChain.
The following example is intended to show all of the above features,
and it also shows that you can call a method of Object::MethodChain indirectly
(see below <code>->$f</code> where <code>$f = "foo"</code>).
<p>
The example works with any of the implementations above.
Take the following definitions:
<code>
{
    package AnObj;
    sub new {
        bless [], $_[0];
    }
    sub foo {
        print "just ";
        $_[0];
    }
    sub bar {
        $_[1] = "ack";
        OtherObj->new("erl h");
    }
}
{
    package OtherObj;
    sub new {
        bless [$_[1]], $_[0];
    }
    sub baz {
        print "anot", $_[1], $_[0][0];
        "er,\n";
    }
}
</code>
<p>
Then the following chain method call prints a familiar message:
<code>
{
    my $f = "foo";
    my $n = AnObj->new->$f->bar(my $v)->baz("her p");
    print $v, $n;
}
</code>
<p>
We can substitute the chain with a single call of a method chain object, and the results are the same:
<code>
{
    my $f = "foo";
    my $c = Object::MethodChain->new->$f->bar(my $v)->baz("her p");
    my $n = AnObj->$c;
    print $v, $n;
}
</code>
<h3>
Limits
</h3>
You cannot use the <code>can</code> or <code>isa</code> UNIVERSAL methods on a method chain.
While you can use an indirect method call <code>Object::MethodChain->$f</code> if <code>$f</code>
is a name of a method (without double colons), you usually cannot use a fully qualified indirect
method call, i.e. when <code>$f</code> is a subname with package prefix or a sub reference.
<p>
As an interesting exception however, you can use an <code>Object::MethodChain</code> as a method
to call on another <code>Object::MethodChain</code> (or the class itself). For example this code
<code>
{
    my $f = "foo";
    my $m = Object::MethodChain->$f->bar(my $v);
    my $c = Object::MethodChain->new->$m->baz("her p");
    my $n = AnObj->$c;
    print $v, $n;
}
</code>
does the same as above. In this case, you get a simple flattened method chain.
<code>$c</code> from the above, dumped with Data::Dumper, is
<code>
bless( [
  {
    'args' => [],
    'method' => 'new'
  },
  {
    'args' => [],
    'method' => 'foo'
  },
  {
    'args' => [
      undef
    ],
    'method' => 'bar'
  },
  {
    'args' => [
      'her p'
    ],
    'method' => 'baz'
  }
], 'Object::MethodChain' );
</code>
<h3>
Thoughts
</h3>
I can't help noticing the similarity between this class and the <code>__</code> object
in the <code>Switch</code> package. It would probably even be possible to combine them.
</readmore>
<P>
Update: aggressively shortened with readmore tags.
<p>
Update: changed the details of the cb conversation I didn't remember correctly.
<p>
Update: there's a bug in three of the implementations of the method
chain (the ones that replay the chain from an array): a method chain call
does not propagate context, and always calls the last method in scalar context. See the
discussion in the [id://449751|replies].
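<p>
One possible way to make the array-based loop propagate context is sketched below. It is not necessarily the fix proposed in the replies. The helper <code>call_chain</code> and the class <code>Ctx</code> are hypothetical; the chain uses the same <code>{method, args}</code> layout as the implementations above. The idea is to run every call but the last in scalar context, and to return the last call directly so that it sees the caller's context.
<code>
use strict;
use warnings;

# Hypothetical stand-alone helper: replays @chain on $r.
sub call_chain {
    my ($r, @chain) = @_;
    my $last = pop @chain or return $r;    # empty chain: return the invocant
    for my $pair (@chain) {
        my ($method, $args) = @$pair{"method", "args"};
        $r = $r->$method(@$args);          # intermediate calls: scalar context
    }
    my ($method, $args) = @$last{"method", "args"};
    return $r->$method(@$args);            # last call: the caller's context
}

{
    # Hypothetical test class that reports the context it was called in.
    package Ctx;
    sub new  { bless {}, shift }
    sub self { $_[0] }
    sub ctx  { wantarray ? "list" : "scalar" }
}

my @chain = ({ method => "self", args => [] }, { method => "ctx", args => [] });
my @in_list   = call_chain(Ctx->new, @chain);
my $in_scalar = call_chain(Ctx->new, @chain);
print "$in_list[0] $in_scalar\n";          # prints "list scalar"
</code>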