http://www.perlmonks.org?node_id=672622

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I read that threads::shared variables are not really shared between threads; but rather, each thread has its own copy of the data, and the variables are tied to some "black magic" which propagates changes between threads. If that's true, then I'm very confused by my empirical results. My results seem to indicate the data is actually shared (i.e. not duplicated).

This simple program stores 100 1MB strings (100MB in total) in an array shared between 10 threads. If each thread held its own copy of the data, that should come out to about 1GB in total. Instead, the program reports a VmSize of only 214MB:
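    use threads;
    use threads::shared;

    my @a : shared;

    # Each thread sleeps while the main thread fills @a, then reports its length.
    sub foo {
        sleep 5;
        print @a . " elements in \@a\n";
    }

    my @threads = map { threads->new(\&foo) } 1 .. 10;

    # Push 100 strings of 1MB each onto the shared array.
    my $s = " " x 1e6;
    push @a, $s for 1 .. 100;

    # Report virtual memory size while the threads are still alive (Linux only).
    system "grep VmSize /proc/$$/status";

    # Wait for the threads so they get to print before the program exits.
    $_->join for @threads;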
So it looks like that 100MB of shared data really is shared! The remaining 114MB works out to roughly 11MB of fixed overhead per thread (114MB / 10 threads), which is probably why it reported 214MB instead of 100MB.

I also tried adjusting the size of the array and of the strings in it, using ints instead of strings, changing the number of threads, and populating the array collaboratively from within the threads. None of that made any difference: the VmSize was always perfectly consistent with truly shared data (plus 11MB of fixed overhead per thread).
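
For reference, here is a minimal sketch of what the "populated collaboratively" variant might look like. It is illustrative only, not the exact script behind the results above: the sizes and the names fill, $THREADS, $PER_THREAD and $STR_BYTES are assumptions, and it reads /proc directly, so it is Linux-only.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my @a : shared;

# Hypothetical sizes; the exact values tried are not given above.
my $THREADS    = 10;     # number of worker threads
my $PER_THREAD = 10;     # strings pushed by each thread
my $STR_BYTES  = 1e6;    # roughly 1MB per string

# Each worker pushes its share of strings, then stays alive briefly
# so the main thread can measure memory while all threads still exist.
sub fill {
    my $s = " " x $STR_BYTES;
    for (1 .. $PER_THREAD) {
        lock(@a);        # released at the end of each loop iteration
        push @a, $s;
    }
    sleep 2;
    return;
}

my @workers = map { threads->create(\&fill) } 1 .. $THREADS;

# Poll until every worker has finished pushing.
select(undef, undef, undef, 0.1) while @a < $THREADS * $PER_THREAD;

print scalar(@a), " elements in \@a\n";

# Print the VmSize line from /proc (assumes a Linux-style /proc filesystem).
if (open my $fh, '<', "/proc/$$/status") {
    print grep { /^VmSize/ } <$fh>;
    close $fh;
}
else {
    warn "cannot read /proc/$$/status: $!\n";
}

$_->join for @workers;
```

The measurement is taken before the joins on purpose, so that any per-thread overhead is still included in the reported VmSize.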

Has Perl recently made its threads::shared data truly shared, or am I doing something wrong here? Thanks for any insight!

Highly confused,

Damon Hastings