Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Creating Nested Functions

by ikegami (Patriarch)
on Jul 10, 2008 at 01:42 UTC ( [id://696592]=perltutorial: print w/replies, xml ) Need Help??

Some Background

As you probably know, Perl will successfully compile nested named subroutines, but they probably won't behave as intended. Perl even tries to warn you:

use strict; use warnings; sub outer { my $foo; sub inner { $foo; } inner(); }
>perl -c test.pl Variable "$foo" will not stay shared at test.pl line 8. test.pl syntax OK

The fix is obviously to use an anonymous sub instead of a named one.

use strict; use warnings; sub outer { my $foo; my $inner = sub { $foo; }; $inner->(); }
>perl -c test.pl test.pl syntax OK

However, that fails spectacularly when recursion is introduced.

Problem 1 — Referencing the Wrong Variable

This problem is almost always easy to spot if use strict; is in effect.

use strict; use warnings; sub outer { my $inner = sub { $inner->() } }
Global symbol "$inner" requires explicit package name at test.pl line +6. test.pl had compilation errors.

Remember that a my only makes the declared symbol available in statements following the one containing the my. This can be fixed trivially by splitting the assignment into two statements.

use strict; use warnings; sub outer { my $inner; $inner = sub { $inner->() } }
>perl -c test.pl test.pl syntax OK

Problem 2 — Memory Leak

There's a subtle lesson we should have learned from the first problem: If the sub references the lexical (by capturing it), and that same lexical references the sub, then it's a cyclic structure that cannot be freed by Perl's garbage collecting mechanism.

# ReleaseTracker.pm # This "module" simply provides "o()", a # func that creates a simple object whose # sole purpose is to print "Destroyed" when # it is freed. The inner workings are not # relevant to this example. use strict; use warnings; package ReleaseTracker; BEGIN { our @EXPORT_OK = qw( o ); require Exporter; *import = \&Exporter::import; } sub new { return bless(\(my $o), $_[0]); } sub DESTROY { print("Destroyed\n"); } sub o { ReleaseTracker->new(); } 1;
use strict; use warnings; use ReleaseTracker qw( o ); sub outer { # Variables that are read by each recursive instance. my $var = shift @_; my $helper; $helper = sub { no warnings 'void'; "do something with $var and @_"; $helper->(@_); }; #$helper->(@_); } outer( o() ); END { print("On program exit:\n"); }
>perl test.pl On program exit: Destroyed <--- BAD!

$var is not being freed because the anonymous sub is not being freed.
The anonymous sub is not being freed because it both references and is referenced by $helper.

Let's illustrate:

################################################
## Before outer exits
##

   &outer
      |                       +=============+
      v              --+----->[ Reference   ]
+==============+    /  |      +=============+
[ outer's pad  ]   /   |      [ refcount: 2 ]    +=============+
+==============+  /    |      [ pointer:  ------>[ Object      ]
[ $var:    -------     |      +=============+    +=============+
[ $helper: -------     |                         [ refcount: 1 ]
+==============+  \    |                         +=============+
                   \   |
                    \  |      +=============+
                     -----+-->[ Reference   ]
                       |  |   +=============+
                       |  |   [ refcount: 2 ]    +=============+
+==============+       |  |   [ pointer:  ------>[ Helper Sub  ]
[ helper's pad ]       |  |   +=============+    +=============+
+==============+       |  |                      [ refcount: 1 ]
[ $var:    ------------+  |                      [ pad:      -----+
[ $helper: ---------------+                      +=============+  |
+==============+                                                  |
      ^                                                           |
      |                                                           |
      +-----------------------------------------------------------+


################################################
## After outer exits
##

                              +=============+
                       +----->[ Reference   ]
                       |      +=============+
    (outer still       |      [ refcount: 1 ]    +=============+
     exists, but       |      [ pointer:  ------>[ Object      ]
     it's not          |      +=============+    +=============+
     referencing       |                         [ refcount: 1 ]
     anything in       |                         +=============+
     this graph )      |
                       |      +=============+
                       |  +-->[ Reference   ]
                       |  |   +=============+
                       |  |   [ refcount: 1 ]    +=============+
+==============+       |  |   [ pointer:  ------>[ Helper Sub  ]
[ helper's pad ]       |  |   +=============+    +=============+
+==============+       |  |                      [ refcount: 1 ]
[ $var:    ------------+  |                      [ pad:      -----+
[ $helper: ---------------+                      +=============+  |
+==============+                                                  |
      ^                                                           |
      |                                                           |
      +-----------------------------------------------------------+

Nothing has a refcount of zero, so nothing can be freed.

Solution — Dynamic Scoping

The solution to both problems is the same: Don't use a lexical variable.

use strict; use warnings; use ReleaseTracker qw( o ); sub outer { # Variables that are read by each recursive instance. my $var = shift @_; local *helper = sub { no warnings 'void'; "do something with $var and @_"; helper(@_); }; #helper(@_); } outer( o() ); END { print("On program exit:\n"); }
>perl test.pl Destroyed On program exit: <-- good

Package variables aren't captured, so &helper's reference count isn't affected by the call in the inner function.

################################################
## Before outer exits
##


   &outer
      |
      v
+=================+
[ outer's pad     ]
+=================+        +=============+
[ $var: --------------+--->[ Reference   ]
+=================+   |    +=============+
                      |    [ refcount: 2 ]    +=============+
                      |    [ pointer:  ------>[ Object      ]
                      |    +=============+    +=============+
+=================+   |                       [ refcount: 1 ]
[ *helper{SCALAR} ]   |                       +=============+
+=================+   |    +=============+
[ pointer: --------------->[ Reference   ]
+=================+   |    +=============+
                      |    [ refcount: 1 ]    +=============+
                      |    [ pointer:  ------>[ Helper Sub  ]
+=================+   |    +=============+    +=============+
[ helper's pad    ]   |                       [ refcount: 1 ]
+=================+   |                       [ pad:      -----+
[ $var: ===-----------+                       +=============+  |
+=================+                                            |
      ^                                                        |
      |                                                        |
      +--------------------------------------------------------+


################################################
## After outer exits
## and local restores
## *helper{SCALAR}
##

                           +=============+
                      +--->[ Reference   ]
                      |    +=============+
    (outer still      |    [ refcount: 1 ]    +=============+
     exists, but      |    [ pointer:  ------>[ Object      ]
     it's not         |    +=============+    +=============+
     referencing      |                       [ refcount: 1 ]
     anything in      |                       +=============+
     this graph )     |
                      |    +=============+
                      |    [ Reference   ]
                      |    +=============+
                      |    [ refcount: 0 ]    +=============+
+=================+   |    [ pointer:  ------>[ Sub         ]
[ helper's pad    ]   |    +=============+    +=============+
+=================+   |                       [ refcount: 1 ]
[ $var: --------------+                       [ pad:      -----+
+=================+                           +=============+  |
      ^                                                        |
      |                                                        |
      +--------------------------------------------------------+

There is no cycle, so everything will be freed in turn, starting with the reference with a refcount of zero.

Alternative Solutions

I won't delve into these, so feel free to provide links which discuss these in more details. For each alternative solution, I'll post the equivalent to the solution I've already presented.

sub outer { ... local *helper = sub { ... helper(...); ... }; helper(@_); }

Y Combinator

ambrus pointed out that Y Combinator can also achieve this goal. This is the most worthwhile alternative.

sub outer { ... Y( sub { my $helper = shift @_; sub { ... $helper->(...); ... } } )->(@_); }

While it's a great tool to completely anonymize a recursive function, and while it might even be required in functional languages, it adds overhead in a situation where reducing overhead is important, and I think it's unnecessarily complex to solve the problem of nesting functions in Perl.

Weak Reference

One could weaken the reference to the inner function using Scalar::Util::weaken.

sub outer { ... my $helper; $helper = sub { ... $helper->(...); ... }; weaken($helper); $helper->(@_); }

However, the Weak Reference solution is much more complex than the Dynamic Scoping solution and has no advantage that I can see.

Summary

Using local *helper = sub {}; over my $helper = sub {}; not only provides a cleaner calling syntax, it can be used for recursive functions without accidentally referencing the wrong variable or causing a memory leak.

Replies are listed 'Best First'.
Re: Creating Nested Functions
by tilly (Archbishop) on Jul 10, 2008 at 21:46 UTC
    One problem with using dynamic scoping is that you've opened up the possibility that the same variable could be used somewhere else for a different purpose. That is the advantage to using lexicals.

    Another solution is to use ReleaseAction to do cleanup when you exit the sub.

    sub outer { my $helper; $helper = sub { ... }; my $delay_cleanup = on_release {$helper = undef}; $helper->(@_); }
    Effectively it does the same thing as the weaken solution, but the code is a little more explicit about what is happening, when. I'd personally lean towards this solution.

    And a final option. Larry thinks that using dynamic scoping on lexical variables is too confusing to allow. But he does allow it for data structures. Which leads to the convoluted:

    sub outer { my @helper; local $helper[0] = sub { ... }; $helper[0]->(@_); }
    I wouldn't suggest this solution for fear of the psychopathic maintenance programmer demanding to know why this works. But every year or two I wind up using this fact in a recursive function to detect and track down potential deep recursion bugs. For instance I'm recursing through a set of nodes and I'll do something like this:
    { my %in_node; sub something_recursive { my $node = shift; if ($in_node{$node->{name}}) { confess("Can't access $node->{name} while accessing $node->{name +}"); } local $in_node{$node->{name}} = 1; ... } }
Re: Creating Nested Functions
by ikegami (Patriarch) on Jul 10, 2008 at 07:46 UTC

    I was asked about the possibility of using our $inner; instead of my $inner;. It's doable.

    sub outer { ... local our $helper; $helper = sub { ... $helper->(...); ... }; $helper->(@_); }

    local still needs to be used to protect the current value of the variable and to clear the reference to the helper sub when outer exits.

    I don't see any advantage in localizing $helper instead of &helper.

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perltutorial [id://696592]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others scrutinizing the Monastery: (5)
As of 2024-03-19 11:10 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found