Re: Memory usage double expected
by LanX (Saint) on Oct 27, 2022 at 08:51 UTC
|
I reduced it to 1 GB and could still reproduce it.
But the effect disappeared after I changed the logic to avoid temporary data on the RHS.
use v5.12;
use warnings;
use Devel::Size qw(total_size);
my $x = 'a';
$x x= 2**30; # 1GiB
print total_size($x) . "\n";
sleep 60;
Seems like the allocated extra space for your 'a' x (2**32) wasn't released.
update
see also Re^2: Memory usage double expected (run-time)
| [reply] [d/l] [select] |
Re: Memory usage double expected
by Discipulus (Canon) on Oct 27, 2022 at 12:09 UTC
|
Hello,
what I see makes no sense to me.
This is perl 5, version 26, subversion 0 (v5.26.0) built for MSWin32-x64-multi-thread
With your program (slightly modified so I don't have to watch the Task Manager) I see the memory doubled:
use strict;
use Devel::Size qw(total_size);
my $x = 'a' x (2**30);
print "Devel::Size = ".human(total_size($x))."\n";
open my $cmd, qq(tasklist /NH /FI "PID eq $$"|) or die;
while (<$cmd>){ print qq(tasklist PID $$ = $1\n) if /(\S+\s\w{1,2}$)/}
sub human{
my $size = shift;
my @order= qw/Tb Gb Mb Kb byte/;
if($size<1024){return"$size byte"}
while ($size >= 1024){$size=$size/1024;pop @order;}
return sprintf("%4.2f %2s", $size, (pop @order));
}
__END__
Devel::Size = 1.00 Gb
tasklist PID 37288 = 2.104.612 K
But with this version of mine I see what everyone expects:
use strict;
use warnings;
use Devel::Size qw(total_size);
my $x;
foreach my $order ( qw(20 24 30 32) ){
$x = 'a' x ( 2 ** $order );
print "\n\nsize of scalar 2**$order\n";
print "Devel::Size = ".human(total_size($x))."\n";
open my $cmd, qq(tasklist /NH /FI "PID eq $$"|) or die;
while (<$cmd>){ print qq(tasklist PID $$ = $1\n) if /(\S+\s\w{1,2}$)/}
}
sub human{
my $size = shift;
my @order= qw/Tb Gb Mb Kb byte/;
if($size<1024){return"$size byte"}
while ($size >= 1024){$size=$size/1024;pop @order;}
return sprintf("%4.2f %2s", $size, (pop @order));
}
__END__
size of scalar 2**20
Devel::Size = 1.00 Mb
tasklist PID 19660 = 8.452 K
size of scalar 2**24
Devel::Size = 16.00 Mb
tasklist PID 19660 = 23.820 K
size of scalar 2**30
Devel::Size = 1.00 Gb
tasklist PID 19660 = 1.056.012 K
size of scalar 2**32
Devel::Size = 4.00 Gb
tasklist PID 19660 = 4.201.748 K
L*
There are no rules, there are no thumbs..
Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
| [reply] [d/l] [select] |
Re: Memory usage double expected
by bliako (Abbot) on Oct 27, 2022 at 06:51 UTC
|
If it's any help, Linux's pmap also reports 8615556K.
Edit:
Could it be because Perl (and others) re-allocates the size of $x to double the last requested size? Unlikely if total_size reports 4Gb I guess.
Try reading a large file instead of creating a huge scalar in $x.
| [reply] [d/l] [select] |
Re: Memory usage double expected
by NERDVANA (Priest) on Oct 29, 2022 at 05:04 UTC
|
Ok, here's a layman explanation:
Perl is free to do whatever it thinks might help your program run faster, usually at the expense of using more memory. You can't make any assumptions about the memory usage of a script. In this case, perl probably created a compile-time constant for that expression, which it then copies when you ask for its value.
If you are working with extremely large data, you can't toss it around in the usual manners without large spikes of memory usage. You need to consider something like File::Map and then pass around references to that scalar instead of passing the scalar around by value. If you describe more of your needs, we can suggest better techniques to avoid loading it all into memory at once. | [reply] |
Re: Memory usage double expected
by kcott (Archbishop) on Oct 28, 2022 at 06:43 UTC
|
G'day sectokia,
I have Perl v5.36.0 (via Perlbrew) running on Cygwin which is running on Win10.
Cygwin and Win10 were both updated in the last 24 hours; so, everything is up-to-date.
I ran your code and got the same results: Devel::Size & Devel::Peek showing 4GB; MSWin showing 8GB.
I changed the size from 2**32 (4GB) to 2**31 (2GB).
Devel::Size & Devel::Peek are now showing 2GB, as expected; however, MSWin is still showing 8GB.
In both cases, all of the 8GB was freed when the Perl code completed.
I don't want to jump to any conclusions:
I'm sure there would be differences between PerlbrewPerl-Cygwin-MSWin and StrawberryPerl-MSWin.
However, it does seem to me that issues are possibly related more to MSWin memory management than Perl itself.
I'd suggest rerunning your code with a variety of sizes;
checking Devel::Size, Devel::Peek & MSWin memory values for each.
| [reply] [d/l] [select] |
Re: Memory usage double expected -- further questions
by Discipulus (Canon) on Oct 28, 2022 at 07:55 UTC
|
# WRONG behaviour, as it doubles the memory
# first code of my previous post, very similar to the OP one
my $x = 'a' x (2**30);
# RIGHT behaviour, it does NOT double memory used
# second code posted above
my $x;
foreach my $order ( qw(20 24 30 32) ){
$x = 'a' x ( 2 ** $order );
...
# RIGHT behaviour, even with my $x declared inside the foreach loop
foreach my $order ( qw(20 24 30 32) ){
my $x = 'a' x ( 2 ** $order );
...
In addition, with every perl I have at the moment (Strawberry Portable: 5.26.0 5.22.3 5.24.2 5.26.2) I observe the same behaviours for the two programs above, i.e. doubled and not doubled; I also read that Linux users experience the same. So it must be something really bound to Perl itself, and I'd like to know why and how to prevent it: a doubled memory footprint is not such a great feature to have :)
| [reply] [d/l] [select] |
|
C:\tmp>perl -MO=Deparse -e "$_ =10; my $x = 'a' x 10"
$_ = 10;
my $x = 'aaaaaaaaaa';
-e syntax OK
C:\tmp>perl -MO=Deparse -e "$_ =10; my $x = 'a' x $_"
$_ = 10;
my $x = 'a' x $_;
-e syntax OK
C:\tmp>
| [reply] [d/l] |
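The same check can be scripted instead of run from the command line. This is only a sketch: it uses the core B::Deparse module's coderef2text method, the variant names and the is_folded helper are made up for illustration, and a small repeat count (2**3) stands in for the huge one so the folded literal is easy to spot.

```perl
use strict;
use warnings;
use B::Deparse ();

my $deparse = B::Deparse->new;

# Three ways of writing the same assignment (names are illustrative only).
my %variant = (
    literal  => sub { my $x = 'a' x (2**3) },               # both operands constant
    variable => sub { my $n = 3; my $x = 'a' x (2**$n) },   # count known only at run time
    deref    => sub { my $x = 'a' x (2**${\ 3}) },          # the ${\...} trick from this thread
);

# True if the deparsed code already contains the expanded 8-character literal,
# i.e. the compiler built the string as a constant.
sub is_folded {
    my ($code) = @_;
    return $deparse->coderef2text($code) =~ /'a{8}'/ ? 1 : 0;
}

for my $name (sort keys %variant) {
    printf "%-8s %s\n", $name,
        is_folded($variant{$name}) ? 'folded at compile time' : 'built at run time';
}
```

If the explanation in this thread is right, only the all-constant variant should be reported as folded, which is the case where the extra copy shows up.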
Re: Memory usage double expected
by bliako (Abbot) on Oct 28, 2022 at 08:22 UTC
|
print "pid:$$\n";
my $x = 'monks' x (3);
sleep 60;
Then use gcore <pid> to dump the memory of that process to a file (called core.<pid>) and find occurrences of "monks" with either strings core.<pid> | grep monks or hexdump -C core.<pid> | grep monks.
True to the explanation the Anonymous Monk offered above, first:
print "$$\n";
my $n = 3;
$x = 'monks' x ($n);
sleep 60;
monksmonksmonks
monks
Contrast with:
print "$$\n";
$x = 'monks' x (3);
sleep 60;
monksmonksmonks
monksmonksmonks
monksmon<
bw, bliako | [reply] [d/l] [select] |
Re: Memory usage double expected
by Anonymous Monk on Oct 27, 2022 at 07:14 UTC
|
| [reply] [d/l] |
|
> Try $x = 'a' x (2**${\32});
Yes, forcing the RHS to be created at run time works.
edit
Dave_The_M will know better why the space of an inline constant wasn't released.
I suppose it's just an overlooked optimization.
| [reply] [d/l] |
|
sub foo {
my $var = "X" x 10_000_000;
}
foo();
Tim explains that:
- The buffer for the $var lexical is preserved (for next time you call the subroutine).
- The compiler says hey "X" is a constant and 10_000_000 is a constant, so I'm gonna build you a 10 MB constant! And I'm going to keep it here to one side so when you call this subroutine again I can just copy it in for you! (aka Constant folding).
With recursion, things get much worse (sorry, I couldn't bring myself to watch that part :).
Though Tim's Devel::SizeMe module might be useful, AFAIK it is up for adoption and not being actively developed ATM.
I also keep a list of Memory Tools References
(from this list, Mini-Tutorial: Perl's Memory Management by ikegami is definitely worth reading).
| [reply] [d/l] [select] |
Re: Memory usage double expected
by harangzsolt33 (Deacon) on Oct 27, 2022 at 19:58 UTC
|
I have noticed the SAME behavior in TinyPerl 5.8 running on Windows XP. I reserve a large amount of memory, let's say 20 MB, using the 'x' operator. I want a string that is 20 million bytes long and filled with the letter 'A' all the way. So I do this: my $VAR = 'A' x 20000000; # And boom! It uses 40 MB of memory. I used a memory viewer to look into the TINYPERL.EXE application to see what's going on. I thought it would be filled with 00 41 00 41 00 41 because it might store the letters as Unicode, reserving two bytes for each character. But nope! That's not what happens. Perl literally creates twice as many letter 'A's in memory as I asked for!
Someone explained it this way: since Perl sees that both the letter 'A' and the 20_000_000 are constants, it creates a backup copy in memory in order to use it later... Nah, that's not true, because you can replace the 20 million with a variable, read the number from STDIN, and whatever you punch in, it still fills up twice as many bytes with letter 'A's, which makes no sense.
I have played around with this a little bit and discovered that if you use the vec() function, you can initialize a string without wasting memory. vec($A, 19999999, 8) = 0; will give you a string that is exactly 20 million bytes long, filled with zero bytes. Now, if you do vec($A, 19999999, 8) = 65; it will still pad the string with zero bytes and insert a letter 'A' at the end. I would like to know if there's a way to tell the vec() function to use some other character for padding. It always uses the zero byte as padding. So, to fill up the string with letter 'A's, I would probably create a for loop and repeat the following 5 million times: vec($A, $PTR++, 32) = 0x41414141; That'll give you a string with 20 million letter 'A's. But if you want to write 4 gigs of letter 'A's, that'll take quite a while! lol
If anybody knows a shortcut to initialize a string with letter 'A's QUICKLY and without using the 'x' operator, please, do tell me!!! | [reply] |
|
$A =~ y/\0/A/;
That works for me.
I ran a few tests and that seems to take about five times longer than "$A = 'A' x 20_000_000".
Run your own benchmarks, but I think you'd be better off with the 'x' operator.
| [reply] [d/l] [select] |
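For anyone who wants to try the vec() plus y/// approach discussed above, here is a minimal sketch. The helper name fill_with_A is made up for illustration; it fills the caller's scalar in place through $_[0] so that no large return value is created. It only handles the single byte 'A', since tr/// does not interpolate variables.

```perl
use strict;
use warnings;

# Hypothetical helper, not from the thread: fill the caller's scalar
# (aliased as $_[0]) with $len copies of the byte 'A'.
sub fill_with_A {
    my $len = $_[1];
    $_[0] = '';
    return 0 if $len < 1;
    vec($_[0], $len - 1, 8) = 0;   # extend the string to $len zero bytes
    $_[0] =~ tr/\0/A/;             # rewrite every zero byte to 'A' in place
    return length $_[0];
}

my $buf;
fill_with_A($buf, 20_000_000);
print length($buf), "\n";
```

As noted above, this is likely slower than the 'x' operator, so it is worth benchmarking before adopting it.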
|
my $VAR = makeA(20000000);
sub makeA { return 'A' x (shift) }
This would also fix it:
my $VAR = make20m('A');
sub make20m { return (shift) x 20000000 }
The conclusion seems to be that in perl the constant itself occupies memory, and then the variable gets its own copy of the constant.
| [reply] [d/l] [select] |
Re: Memory usage double expected
by harangzsolt33 (Deacon) on Oct 27, 2022 at 20:13 UTC
|
What's more, if you create a sub that creates a big string, and the sub uses that string as the return value, then two copies of that string will live in memory EVEN THOUGH you created it with the vec() function. Once you're outside the sub, you can only access one copy, and once you undef it outside the sub, it deletes only one copy! This is really weird behavior. I don't understand it at all. But again, I have played around... and I have noticed that if I pass the variable as an argument, then only one copy of the variable exists in memory:
#!/usr/bin/perl -w
use strict;
use warnings;
sub myfunc_1
{
my $A = '';
vec($A, 20000000, 8) = 0;
return $A;
}
sub myfunc_2
{
vec($_[0], 100000000, 8) = 0;
return 0;
}
$b = <STDIN>;
my $BIGSTRING = '';
myfunc_2($BIGSTRING); # memory usage is normal.
$b = <STDIN>;
undef $BIGSTRING;
# Pff! Gone from memory! TINYPERL.EXE memory
# usage visibly shrinks in Windows Task Manager.
$b = <STDIN>;
my $STRING2 = myfunc_1(); # memory usage is double!
$b = <STDIN>;
undef $STRING2; # Deletes one copy only
$b = <STDIN>;
CONCLUSION: If you write a sub that reads something, let's say, you write a sub called ReadFile() you don't want to return the contents of the file as the return value of the sub, because then two copies of the data will exist in memory. You pass the buffer as an argument to the sub, and then the sub fills it up using
$_[1] = 'CONTENT';
Perhaps this is why sysread() also works the same way; instead of returning the bytes that were read from the file, it expects the buffer to be passed to it as an argument. The first argument is the file handle, the second is the buffer and the third is the number of bytes to read. And it returns the number of bytes that were read instead of the actual bytes!
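A rough sketch of that buffer-as-argument pattern, for illustration only. The name read_file_into and its interface are assumptions, not something from this thread; it uses the built-in read() with an offset so the data lands directly in the caller's scalar, and it returns the byte count the way sysread() does.

```perl
use strict;
use warnings;

# Hypothetical helper: read the whole file at $_[0] into the caller's
# buffer (aliased as $_[1]). Returns the number of bytes read, or
# nothing on failure. The content itself is never a return value.
sub read_file_into {
    my $path = $_[0];
    open my $fh, '<:raw', $path or return;
    my $size = -s $fh;
    $_[1] = '';
    my $total = 0;
    while ($total < $size) {
        # Append at offset $total inside the caller's scalar.
        my $got = read($fh, $_[1], $size - $total, $total);
        return unless defined $got;
        last if $got == 0;
        $total += $got;
    }
    close $fh;
    return $total;
}

# Usage: try to read this script itself (silently skipped if $0 is not a file).
my $content;
my $bytes = read_file_into($0, $content);
print "read $bytes bytes\n" if defined $bytes;
```

Whether this really avoids a second copy on a given perl build is worth confirming with a memory viewer, as was done above.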
ALSO NOTE: Even if your sub does not spell out "return $BIGSTRING;" it still returns the multi-megabyte string if that was the return value of the last statement in the sub. And even if you do not use the return value of the sub, it still gets stuck in memory!!!
#!/usr/bin/perl -w
use strict;
use warnings;
sub MemoryEaterFunc
{
my $A = '';
vec($A, 20000000, 8) = 0; # Create a large string
$A .= $_[0]; # Add to it. Do something.
# Evaluate the last statement,
# and that is the return value of the sub! even if
# you don't write return $A; it still returns it.
# If you want the function to not return the big string,
# then put 'return 0;' at the end of the sub.
}
$a = <STDIN>;
MemoryEaterFunc(2); # Eats memory and doesn't release it
$a = <STDIN>;
You say, "All right fine! I will return zero and then see what happens!"
Unfortunately, the MemoryEaterFunc() will still gobble up 20 megabytes of RAM even if it returns zero, and you do not use its return value. To actually make sure that it doesn't gobble up memory, we need to undef the variable $A before exiting the sub!
Interestingly, if you do not undef $A and you use that same sub repeatedly, it doesn't gobble up an additional 20 MB EACH time. It only uses 20 MB total. Period. (In a way, this proves that Perl uses heap memory instead of stack.)
MemoryEaterFunc(2);
MemoryEaterFunc(2);
MemoryEaterFunc(2);
MemoryEaterFunc(2);
# At this point, memory usage is 20 MB.
$b = <STDIN>;
Okay, so let's fix this thing so it doesn't waste memory anymore:
sub MemoryEaterFunc_FIXED
{
my $A = '';
vec($A, 20000000, 8) = 0; # Create a large string
$A .= $_[0]; # Add to it. Do something.
undef $A;
}
Now the memory eater doesn't eat memory anymore. It uses 20 MB of memory INSIDE the sub, but then once it gets done, it no longer holds onto that memory. | [reply] [d/l] [select] |
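To see the effect of undef without watching Task Manager, one option is to ask perl how large the scalar's string buffer is. The sketch below uses the core B module (svref_2object and the LEN method of B::PV); the helper name buffer_len is made up. It only shows perl's own view of the buffer, not what the operating system reports for the process.

```perl
use strict;
use warnings;
use B ();

# Allocated size (in bytes) of the string buffer behind a scalar,
# given a reference to it. LEN is a B::PV method from the core B module.
sub buffer_len { return B::svref_2object($_[0])->LEN }

my $s = '';
vec($s, 999_999, 8) = 0;       # a string of 1,000,000 zero bytes
my $before = buffer_len(\$s);

undef $s;                      # release the buffer, as in the fixed sub
my $after = buffer_len(\$s);

print "allocated before undef: $before\n";
print "allocated after undef:  $after\n";
```

The same helper could be pointed at other scalars in the examples above to compare what perl has allocated with what the OS shows.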