bt101 has asked for the wisdom of the Perl Monks concerning the following question:
Hi, I'm trying to understand variable scope. I'm used to C, where a variable created inside a subroutine is on the stack and disappears when the subroutine returns. However, in this Perl example, I create a variable inside a subroutine (a hash called %hsh). I then assign the address of this hash to a global variable and set some values inside the hash. When the subroutine ends, I would expect the global variable to point to something that has disappeared. However, the printout succeeds in printing the correct values from that hash. Why is this? I'm basically concerned that everything I'm creating inside these subroutines is not getting cleaned up.
my $gptr;
a1();
print "v1=" . $gptr->{'v1'} . "\n";
exit(0);
sub a1
{
my %hsh;
$gptr = \%hsh;
$gptr->{'v1'} = 10;
$gptr->{'v2'} = 20;
}
Re: Do subroutine variables get destroyed?
by Tanktalus (Canon) on May 08, 2016 at 04:50 UTC
Everything in perl is on the heap (*). Variables are created when necessary (my variables are created when the my statement executes), and cleaned up as soon as there are no references left to them.
This allows you to create new hashes, arrays, etc., in your sub, and then return references to them and have them still work.
Compared to C, this is basically perl calling malloc for you for each variable declaration, and then calling free automatically not when the variable goes out of scope, but when there are no more references to that variable.
(*) Not quite true. But close enough for our purposes.
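A minimal sketch of that point, assuming a hypothetical helper called make_counter (not from the original post): each call allocates a fresh hash, and the returned reference keeps it alive after the sub has returned.

```perl
use strict;
use warnings;

# make_counter is a hypothetical helper, used only for illustration.
sub make_counter {
    my %state = (count => 0);   # a fresh hash is allocated on every call
    return \%state;             # the returned reference keeps it alive
}

my $first  = make_counter();
my $second = make_counter();

$first->{count}++ for 1 .. 3;   # only the first hash changes

print "first=$first->{count} second=$second->{count}\n";
print "distinct hashes\n" if $first != $second;   # references compare by address
```

This should print first=3 second=0 followed by "distinct hashes": the two calls produced two independent hashes, neither of which vanished on return.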
Re: Do subroutine variables get destroyed?
by GrandFather (Saint) on May 08, 2016 at 07:06 UTC
I see where your C/C++ influence gets you a variable called gptr, but in Perl there are no pointers. A better name is gref. So, to elaborate on Tanktalus's reply a little: Perl variables are reference counted and garbage collected, so when you reference %hsh with $gref Perl bumps the reference count on %hsh. %hsh was created with a reference count of 1 and the count is decremented when %hsh goes out of scope. However by the time %hsh goes out of scope the reference count has been incremented due to the reference assignment to $gref so the storage remains in use until $gref either goes out of scope or is assigned a new value.
For the vast majority of Perl scripts you needn't be concerned about memory management and that significant bane of C, invalid memory accesses, essentially can't happen.
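One way to watch this happen is with a weak reference, which points at the hash without adding to its reference count. The following is only a sketch: $probe and the two $alive_... variables are names made up for this illustration, while $gref and a1 follow the original post.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $gref;     # strong reference, as in the original post
my $probe;    # weak reference (hypothetical name): does not keep the hash alive

sub a1 {
    my %hsh = (v1 => 10, v2 => 20);
    $gref  = \%hsh;     # the hash's reference count goes up
    $probe = \%hsh;
    weaken($probe);     # $probe no longer counts toward the total
    return;             # the name %hsh goes away; the hash itself does not
}

a1();
my $alive_after_call = defined $probe ? 1 : 0;

undef $gref;            # drop the last strong reference
my $alive_after_undef = defined $probe ? 1 : 0;

print "after a1(): $alive_after_call, after undef: $alive_after_undef\n";
```

This should print "after a1(): 1, after undef: 0". Perl sets a weak reference to undef when its referent is freed, so the second value shows the hash was released the moment $gref let go of it.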
Premature optimization is the root of all job security
To elaborate further on GrandFather's point that perl
uses a reference-counted
garbage collector,
note that with Perl you get "deterministic destructors" for free;
that is, in perl, you are guaranteed that an object is destroyed (and destructor called) immediately its reference count goes to zero.
BTW, deterministic destructors are a feature of the C++ RAII idiom
yet are problematic when using a tracing garbage collector,
such as that used by Java ... which is why Java has a "finally" clause
(see also Dispose pattern).
See also Finalizer (wikipedia).
You can use deterministic destructors to good effect in Perl with lexical file handles,
which are automatically closed when the file handle goes out of scope;
for example in:
sub fred {
# Lexical $fh is known from point of declaration (next line) to end of scope
open(my $fh, "<", "f.tmp") or die "open error f.tmp: $!";
# ... process file here
# We assume no references to $fh are created in and returned from this sub
# (if you did that reference count of $fh would not be zero on sub exit)
# ... die might be called ... (that's ok, can be caught via block eval)
# ... there can be multiple return statements ...
return;
}
Note that $fh is automatically closed immediately the function exits because
the $fh variable goes out of scope at end of function ... and when $fh goes out of scope,
its reference count goes to zero and its destructor
is automatically and immediately called to close the file handle.
No need for an explicit close.
(Update: this automatic close at end of scope does not check for errors though,
so an explicit close with error checking is advisable for file handles used for writing; see also autodie).
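The same mechanism can be seen with any object that has a DESTROY method. Here is a small sketch, assuming a made-up ScopeGuard class and a work() sub that are not part of the original post; the guard records the moment it is destroyed.

```perl
use strict;
use warnings;

package ScopeGuard;    # hypothetical class, for illustration only

sub new {
    my ($class, $name, $log) = @_;
    return bless { name => $name, log => $log }, $class;
}

# Called by Perl as soon as the object's reference count reaches zero.
sub DESTROY {
    my ($self) = @_;
    push @{ $self->{log} }, "destroyed $self->{name}";
}

package main;

my @log;

sub work {
    my $guard = ScopeGuard->new('guard', \@log);
    push @log, 'working';
    return;    # $guard goes out of scope here, so DESTROY runs right away
}

work();
push @log, 'after work';
print "$_\n" for @log;
```

The log should read working, destroyed guard, after work, in that order: the destructor ran at the end of work(), not at some later collection point.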
A drawback to reference counting, that tracing GCs solve,
is the dreaded circular reference problem.
For an example of how to deal with circular references in Perl,
see Eliminate circular reference memory leak using weaken (perlmaven).
See also Automatic Reference Counting and Weak reference and Circular reference (wikipedia).
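As a rough sketch of the problem and the usual fix (the Node class, make_pair sub and counter variables are invented for this example), two objects that point at each other are never freed unless one of the links is weakened with Scalar::Util's weaken:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken);

my $destroyed = 0;    # counts DESTROY calls

package Node;         # hypothetical class, for illustration only
sub new     { my ($class) = @_; return bless {}, $class }
sub DESTROY { $destroyed++ }

package main;

sub make_pair {
    my ($weak) = @_;
    my $parent = Node->new;
    my $child  = Node->new;
    $parent->{child} = $child;     # parent -> child
    $child->{parent} = $parent;    # child -> parent: a cycle
    weaken($child->{parent}) if $weak;    # break the cycle for counting purposes
    return;
}

make_pair(0);
my $leaked = $destroyed;             # nodes freed by the strong cycle
make_pair(1);
my $freed = $destroyed - $leaked;    # nodes freed by the weakened cycle

print "strong cycle freed $leaked nodes, weakened cycle freed $freed nodes\n";
```

This should report 0 nodes freed for the strong cycle and 2 for the weakened one. The leaked pair is only cleaned up when the interpreter shuts down.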
As to when malloc'ed memory is actually released back to the OS, see the answers to this question.
What I dislike more than both reference-counting and tracing garbage collectors is
Manual memory management, typical in C programs ...
which forces you to rely on static and dynamic code analysis tools, such as
Coverity, Valgrind, AddressSanitizer and many more, to keep the code clean.
See also the EXTERNAL TOOLS FOR DEBUGGING PERL
section of perlhack.
Updated 2022: Added paragraph about manual memory management.
Re: Do subroutine variables get destroyed?
by Marshall (Canon) on May 08, 2016 at 14:12 UTC
The first 2 posts are completely on target, re: reference counting.
As a demo, I re-wrote your a1 sub and added an a2 sub.
Rather than assigning the hash reference in the subroutine to the "global" variable,
I would recommend returning the hash reference from the sub as shown in
recoded sub a1.
As in C, it is possible to pass a "pointer" (the hash reference) to a sub, as
shown in sub a2. The hash created in sub a1 is not copied; sub a2 modifies the same structure in place. When "my $href" goes out of lexical scope and no other references remain, the hash's memory is recovered and reused by Perl.
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper; # a cool core module
# that dumps any structure
my $href = a1(); #hash reference returned from sub
print "v1=$href->{'v1'}\n"; #dot operator not needed
print Dumper $href;
a2($href);
print Dumper $href;
exit(0);
sub a1
{
my %hash;
$hash{v1} = 10;
$hash{v2} = 20;
return \%hash; # %hash memory will "live" due to
# reference counting
}
sub a2
{
my ($href) = @_;
$href -> {v1} = 30; #quotes are ok but not needed
$href -> {'v2'} = 40;
return;
}
__END__
v1=10
$VAR1 = {
'v2' => 20,
'v1' => 10
};
$VAR1 = {
'v2' => 40,
'v1' => 30
};
Thanks everyone, those are great answers.
I gather that if the reference variable is pointed to a new item, then the old item to which it pointed no longer has any references to it and it will be destroyed.
my $href = a1();
$href = a1();
The first call makes a hash and returns a reference to it.
The second call also makes a hash and returns a reference to it.
However after the second call, the reference to the first
hash is now "lost" as it was replaced by a reference to the
second hash that was generated. The memory for the first hash is then
recycled because its reference count is zero and there is no
way for the program to access that data anymore.
Of course in a "real" example, probably there are some parameters
to sub a1 so that it generates a different kind of hash on the
second call. One reason to do this might be in a GUI interface
where a1() winds up being say a "button factory". If the references
returned are kept in scope, say in an array, then each button is a
distinct thing.
For the most part, Perl memory management does the "right thing" under
the covers and is transparent to you. There are of course special
considerations with certain types of data structures and when making
truly huge structures in the sub.
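Both cases can be sketched with a small class that logs its own destruction. The Button class and the @events and @kept arrays below are made up for illustration and merely stand in for the "button factory" idea:

```perl
use strict;
use warnings;

my @events;

package Button;    # hypothetical class standing in for a GUI widget
sub new     { my ($class, $label) = @_; return bless { label => $label }, $class }
sub DESTROY { my ($self) = @_; push @events, "freed $self->{label}" }

package main;

my $ref = Button->new('first');
$ref = Button->new('second');    # 'first' has no references left: freed now
push @events, 'reassigned';

# References kept in an array stay alive, and each one is a distinct object.
my @kept = map { Button->new("kept$_") } 1 .. 2;
push @events, 'kept ' . scalar(@kept);

print "$_\n" for @events;
```

The output should be freed first, reassigned, kept 2. The first button is destroyed at the moment of reassignment, while the buttons held in @kept (and the second button) survive until the program ends.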
Re: Do subroutine variables get destroyed?
by AnomalousMonk (Archbishop) on May 08, 2016 at 15:54 UTC
If you have some kind of real-time, graphical memory resource display on your system, you can see the processes discussed above in action and play with them. (I'm using the memory "Performance" display of Windows 7 Task Manager, running Strawberry Perl 5.14.4.1.)
This example subroutine creates a lexical (my) array and increases its size to a fairly large value. (You may have to adjust the sizes of the following examples according to the memory resources of your system.) It then abandons the array and returns. The memory of the array, no longer referenced, is garbage collected. The for-loop calls the subroutine over and over, and you can see a small, constant blip in memory usage while the loop is running. (Be patient; the loop takes significant time to create and destroy all that stuff.) When the loop (and the program) finishes execution, system memory usage returns to its pre-loop, pre-program baseline value.
c:\@Work\Perl\monks>perl -wMstrict -le
"sub Sa {
my @ra;
$#ra = 100_000_000;
return;
}
;;
for (0 .. 100) {
Sa();
}
"
(Update: The Sa() subroutine was changed to add an explicit return statement to make it absolutely clear that the routine returns nada and the lexical @ra array inside it is no longer referenced.)
This example is almost the same, except that the subroutine returns a reference to the array created within it. The reference is assigned to a lexical scalar variable created within the scope of the for-loop, and the variable goes out of existence almost immediately because the end of the loop (and its scope) is reached. My system shows some difference in memory usage as compared to the Sa() subroutine example: peak usage is the same, but during the execution of the loop, there is some irregular variation between the peak value and the pre-execution, baseline memory usage value. I attribute this variation to the memory allocator rather than to Perl itself: Perl frees the array as soon as its reference count goes to zero, but when that freed memory is actually handed back to the operating system is determined by multiple factors and is not, IIRC, necessarily deterministic.
c:\@Work\Perl\monks>perl -wMstrict -le
"sub Sb {
my @ra;
$#ra = 100_000_000;
return \@ra;
}
for (0 .. 100) {
my $arrayref = Sb();
}
"
(This example also takes a noticeable time to run.)
The third example differs in that the reference returned by the subroutine is saved in the for-loop in an array outside the scope of the loop instead of being immediately thrown away. In this case, the memory allocated within the subroutine during its execution cannot be garbage-collected because it is still being referenced! A graphical display of memory should show a steadily increasing usage while the loop is running. Usage will be constant during the sleep period after the loop terminates. During this time, all references to memory allocated in the Sc() subroutine are still in existence in the @save_references array. Then all the references in the array are destroyed, and their referents are garbage-collected. A return to baseline memory usage is seen that continues to program termination (when memory usage continues at baseline).
c:\@Work\Perl\monks>perl -wMstrict -le
"sub Sc {
my @ra;
$#ra = 10_000_000;
return \@ra;
}
;;
my @save_references;
for my $i (0 .. 20) {
$save_references[$i] = Sc();
}
sleep 8;
;;
@save_references = ();
sleep 8;
"
Give a man a fish: <%-{-{-{-<
Re: Do subroutine variables get destroyed?
by ikegami (Patriarch) on May 09, 2016 at 17:35 UTC
Think of my as a dynamic memory allocator (like "new" in other languages). The allocated variable will be freed when there are no more references to it.
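A short sketch of that idea, with variable names chosen just for this example: a my inside a loop hands out a new scalar on each pass whenever the previous one is still referenced.

```perl
use strict;
use warnings;

my @refs;
for my $i (1 .. 3) {
    my $x = $i * 10;    # a new scalar each time round, because the
    push @refs, \$x;    # previous one is still referenced from @refs
}

my @values = map { $$_ } @refs;

my %seen;
$seen{$_}++ for @refs;  # references stringify to their address

print "values: @values\n";
print "distinct scalars: ", scalar(keys %seen), "\n";
```

This should print values: 10 20 30 and distinct scalars: 3. If my merely reused one stack slot, as a C local would, all three references would point at the same scalar.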