Solaris + UltraSparc T2 + Threads: Avoid LCK's

by gulden (Monk)
on Apr 23, 2009 at 09:25 UTC ( [id://759489] )

gulden has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks,

I do not understand this threading behavior, and I want to prevent the threads from spending their time in the LOCK state.

I ran two tests whose only difference is that one of them has an inner loop. I want to understand why the test without the inner loop (Test1) keeps the CPU at 100% in the USR state, while the test with the inner loop (Test2) has its threads spending much of their time locked (see the LCK column in the prstat output).

Test1:

#!/opt/coolstack/bin/perl
use strict;
use threads ('yield', 'stack_size' => 64*4096, 'exit' => 'threads_only', 'stringify');

my $nloaders = 64;
#---------------------------------------------------------------
my @thrs_loaders;
for (1..$nloaders) {
    print "START LOAD $_ \n";
    my ($thr) = threads->create(\&load, $_);
    push @thrs_loaders, $thr;
}
$_->join for @thrs_loaders;
print "STOP: " . localtime() . "\n";
exit;
#------------------------------------------------------------------------------
sub load {
    my $id = shift;
    my $tmp;
    for (1..7235617) {
        int( rand(10) );
    }
    print "$id>LOAD EXIT\n";
}
PRSTAT for Test1:
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  24   3   0 perl/43
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  28   2   0 perl/34
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  26   3   0 perl/63
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  26   3   0 perl/62
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  26   3   0 perl/61
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  23   3   0 perl/51
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  29   3   0 perl/58
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  35   3   0 perl/57
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  35   3   0 perl/56
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  35   3   0 perl/45
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  33   2   0 perl/31
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  35   2   0 perl/19
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  37   3   0 perl/50
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  36   3   0 perl/41
 19883 mgarcia  100 0.0 0.0 0.0 0.0 0.0 0.0 0.0   0  37   2   0 perl/36
Test2:

#------------------------------------------------------------------------------
sub load {
    my $id = shift;
    my $tmp;
    for (1..7235617) {
        for (1..100000) {
            $tmp = $_;
        }
        int( rand(10) );
    }
    print "$id>LOAD EXIT\n";
}
PRSTAT for Test2:
   PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
 19894 mgarcia   60 0.1 0.0 0.0 0.0  40 0.0 0.0  22  20  36   0 perl/19
 19894 mgarcia   58 0.1 0.0 0.0 0.0  42 0.0 0.0  19  25  48   0 perl/56
 19894 mgarcia   57 0.0 0.0 0.0 0.0  43 0.0 0.0  21  20  41   0 perl/32
 19894 mgarcia   56 0.1 0.0 0.0 0.0  44 0.0 0.0  19  18  31   0 perl/23
 19894 mgarcia   50 0.1 0.0 0.0 0.0  50 0.0 0.0  19  20  52   0 perl/25
 19894 mgarcia   50 0.1 0.0 0.0 0.0  50 0.0 0.0  18  20  54   0 perl/33
 19894 mgarcia   34 0.1 0.0 0.0 0.0  66 0.0 0.0  17  12  47   0 perl/34
 19894 mgarcia   32 0.1 0.0 0.0 0.0  68 0.0 0.0  19  13  53   0 perl/10
 19894 mgarcia   30 0.1 0.0 0.0 0.0  70 0.0 0.0  22  14  43   0 perl/15
 19894 mgarcia   25 0.0 0.0 0.0 0.0  75 0.0 0.0  21  14  36   0 perl/65
 19894 mgarcia   23 0.1 0.0 0.0 0.0  77 0.0 0.0  20  12  42   0 perl/4
 19894 mgarcia   22 0.1 0.0 0.0 0.0  77 0.0 0.0  21  10  47   0 perl/9
 19894 mgarcia   18 0.1 0.0 0.0 0.0  81 0.0 0.0  24  12  63   0 perl/22
 19894 mgarcia   18 0.1 0.0 0.0 0.0  82 0.0 0.0  20  11  51   0 perl/55

Re: Solaris + UltraSparc T2 + Threads: Avoid LCK's
by BrowserUk (Patriarch) on Apr 23, 2009 at 11:19 UTC

    I'm not sure if any of this is helpful, as I'm on a different platform and what I'm seeing may not be applicable. When I run this version of your code here:

    #! perl -sw
    use 5.010;
    use strict;
    use threads;

    $_->join for map {
        async {
            for ( 1 .. 7e6 ) {
                int rand 10;
            }
        };
    } 1 .. 64;

    say "Test 1 complete";
    <STDIN>;

    $_->join for map {
        async {
            my $tmp;
            for ( 1 .. 7e6 ) {
                for ( 1 .. 1e5 ) {
                    $tmp = $_;
                }
                int rand 10;
            }
        };
    } 1 .. 64;

    I do perceive a very slight drop in cpu activity between the first batch and the second.

    The first nails the cpu counter at (or very close to) 75% (100% of 3 of my 4 cpus--I have something running on the other cpu). What minor wobbles occur go no lower than 74.8%, and it occasionally shows slightly greater than 75%--though that is probably just rounding.

    The second again gets close to 75%, but the wobbles are larger. It occasionally drops as far as 73% and rarely gets as high as 74.8%. So something is preventing the second from maxing the cpu. I do not see any signs of locks, though I may not be instrumenting the right things; I need to look further into the available measurements--the list is very long. But in any case, the drop here is nowhere near as dramatic as your instrumentation appears to show on your platform!

    Following the Anonymous Monk's lead, I replaced both implicit loop counters with lexicals in the second batch:

    for my $i ( 1 .. 7e6 ) {
        for my $j ( 1 .. 1e5 ) {
            $tmp = $j;
        }
        int rand 10;
    }

    And the drop in cpu utilization "went away", which does tend to indicate that it is somehow related to the use of the global $_. Whilst that is global, it isn't shared (I believe it is cloned on a per-interpreter basis), so it shouldn't require internal locking. But globals do carry a small penalty over lexicals--regardless of the use of threads--so whether that has anything to do with it I am unsure. I can't see why it would, but I have considerable difficulty following the internals in areas that relate to cloning.
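
    That global-versus-lexical penalty is easy to measure in isolation, single-threaded, with the core Benchmark module. A minimal sketch (illustrative only; the loop bodies are assumptions, not the test harness used above):

    use strict;
    use warnings;
    use Benchmark qw(cmpthese);

    my $tmp;
    cmpthese( -3, {
        # implicit loop variable: each iteration aliases the global $_
        implicit_global => sub { for ( 1 .. 100_000 )       { $tmp = $_ } },
        # lexical loop variable: lives in the sub's scratchpad
        lexical_counter => sub { for my $j ( 1 .. 100_000 ) { $tmp = $j } },
    } );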

    I'll try to look into it in more detail, but that will take a while.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Solaris + UltraSparc T2 + Threads: Avoid LCK's
by roboticus (Chancellor) on Apr 23, 2009 at 13:27 UTC
    gulden:

    I've done plenty of multi-threaded code in C++, but never in Perl, so take this with a grain of salt.

    I suspect that RMGir is right: since $tmp seems to be shared amongst all the threads, perl may be wrapping reads/writes to it with a mutex, and you may just have a lot of mutex lock contention on it. If I understand perl well enough (and I likely don't), then I suspect that moving the my $tmp into the body of the first for loop may resolve the problem: that way, each thread should(?) have its own copy of $tmp.
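
    A minimal sketch of that suggested change (my reading of the suggestion; whether it actually alters the prstat picture is exactly what would need testing):

    sub load {
        my $id = shift;
        for (1..7235617) {
            my $tmp;                 # declared inside the outer loop body
            for (1..100000) {
                $tmp = $_;
            }
            int( rand(10) );
        }
        print "$id>LOAD EXIT\n";
    }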

    ...roboticus
Re: Solaris + UltraSparc T2 + Threads: Avoid LCK's
by RMGir (Prior) on Apr 23, 2009 at 11:22 UTC
    Edit: Ignore all of this, it looks like BrowserUK figured it out just above...

    I don't have a sufficiently recent threaded perl build around to test this theory, but I'd look at that assignment to $tmp in the inner loop in test2.

    What happens if you just copy the int(rand(10)) into the inner loop without an assignment?
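
    That is, something along these lines (a sketch of the variant being suggested, untested here):

    sub load {
        my $id = shift;
        for (1..7235617) {
            for (1..100000) {
                int( rand(10) );     # rand in the inner loop, no assignment to $tmp
            }
        }
        print "$id>LOAD EXIT\n";
    }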

    Of course, if this turns out to be the answer, WHY an assignment to a thread-local variable would cause that is a different question...


    Mike
Re: Solaris + UltraSparc T2 + Threads: Avoid LCK's
by Anonymous Monk on Apr 23, 2009 at 09:44 UTC
    Maybe $_ in the inner loop is causing locking? Test with
    for (1..7235617){ for my $i (1..100000){ $tmp = $_; } }
      The locks are not caused by the inner loop!!! I get the same behaviour when I put your code into Test1...

      The fix itself is the less important part; what I really want to understand is the reason for this behaviour.

Re: Solaris + UltraSparc T2 + Threads: Avoid LCK's
by gulden (Monk) on Apr 23, 2009 at 17:32 UTC
    The problem is in the rand() function: if I remove that call, in either case the locks go away.

    This must be related to the way the rand function is implemented ....
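
    One quick way to see which C-level generator this perl's rand() is built on is to ask the standard Config module (a sketch, assuming a stock build; the example output values are only typical possibilities):

    use strict;
    use warnings;
    use Config;

    # The C functions behind rand()/srand(), and how many random bits rand() provides.
    print "randfunc : $Config{randfunc}\n";   # e.g. 'drand48' or 'random'
    print "seedfunc : $Config{seedfunc}\n";   # e.g. 'srand48' or 'srandom'
    print "randbits : $Config{randbits}\n";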

      Hmm. That doesn't make sense. You have calls to rand in both examples.

      The only difference is the frequency with which it is called: several thousand times more frequently in the first, apparently non-locking, example than in the second, apparently locking, one. If rand were responsible for the locks, then the first example should be the one to display the symptoms of them, not the second.

      Methinks you are misinterpreting your results.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        Ok, I had messed up the test :D, I didn't realize that... sorry

        The problem is related to the rand function: whenever I use it, the LCKs appear... When I don't use it, the CPU reaches 100%.

        Can we assume that we must avoid the rand function whenever we use threads?
      It's not completely crazy to think that getting random numbers might involve a lock, but it's going to depend on the system, and I don't think that's what's happening on the system gulden is using.

      On Solaris, my perl 5.8.8 gets random numbers by opening /dev/urandom and reading a value. I don't think that's likely to involve locks.

      But one could certainly imagine a (less than ideal) pseudo-random number generator implementation that uses locks to protect shared state.
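
      For illustration, such a lock-protected generator might look like this in Perl (a hypothetical sketch of the pattern being imagined; this is not how perl's rand is actually implemented):

      use strict;
      use warnings;
      use threads;
      use threads::shared;

      # A naive Park-Miller generator whose state is shared across threads and
      # guarded by a lock, so every call serializes on $seed.
      my $seed : shared = 42;

      sub locked_rand {
          my ($max) = @_;
          lock($seed);   # contention here would show up as LCK time in prstat -m
          $seed = ( $seed * 16807 ) % 2147483647;
          return int( $max * $seed / 2147483647 );
      }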


      Mike
        But one could certainly imagine a (less than ideal) pseudo-random number generator implementation that uses locks to protect shared state.

        I agree.

        What is crazy is blaming rand when, in the data he posted, the code that ran rand in tight loops on 64 threads showed no locking symptoms, but the version that interspersed each call to rand with 100k assignments did.

        A conservative test shows the former making ~700,000 calls to rand per second, the latter 105/sec. If either was going to suffer lock contention, you'd expect it to be the former.
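
        One way such a back-of-the-envelope call rate could be measured (a sketch only; not necessarily the "conservative test" referred to above):

        use strict;
        use warnings;
        use Time::HiRes qw(time);

        # Time a burst of rand calls in a tight loop and report the per-second rate.
        my $n     = 1_000_000;
        my $start = time;
        int rand 10 for 1 .. $n;
        printf "~%.0f rand calls/sec\n", $n / ( time() - $start );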


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
