PerlMonks  

Re^4: Recursive locks:killer application. Do they have one? (mu)

by BrowserUk (Pope)
on Feb 02, 2012 at 23:55 UTC (#951560)


in reply to Re^3: Recursive locks:killer application. Do they have one? (mu)
in thread Recursive locks:killer application. Do they have one?

I find it hard to imagine how "need to count" can have more than the most trivial of impacts on the efficiency of a mutex.

See for yourself. It's not just time but also space efficiency.

Here is perl's current implementation of recursive locking:

    typedef struct {
        perl_mutex mutex;
        PerlInterpreter *owner;
        I32 locks;
        perl_cond cond;
    } recursive_lock_t;

    void
    recursive_lock_acquire(pTHX_ recursive_lock_t *lock, char *file, int line)
    {
        assert(aTHX);
        MUTEX_LOCK(&lock->mutex);
        if (lock->owner == aTHX) {
            lock->locks++;
        }
        else {
            while (lock->owner) {
                COND_WAIT(&lock->cond,&lock->mutex);
            }
            lock->locks = 1;
            lock->owner = aTHX;
        }
        MUTEX_UNLOCK(&lock->mutex);
        SAVEDESTRUCTOR_X(recursive_lock_release,lock);
    }

And that lot -- a mutex, an owner, a locks count and a condition variable -- is built on top of this lot:

    typedef union
    {
      struct
      {
        int __lock;
        unsigned int __futex;
        __extension__ unsigned long long int __total_seq;
        __extension__ unsigned long long int __wakeup_seq;
        __extension__ unsigned long long int __woken_seq;
        void *__mutex;
        unsigned int __nwaiters;
        unsigned int __broadcast_seq;
      } __data;
      char __size[__SIZEOF_PTHREAD_COND_T];
      __extension__ long long int __align;
    } pthread_cond_t;

And this:

    typedef union
    {
      struct __pthread_mutex_s
      {
        int __lock;
        unsigned int __count;
        int __owner;
    #if __WORDSIZE == 64
        unsigned int __nusers;
    #endif
        /* KIND must stay at this position in the structure to maintain
           binary compatibility.  */
        int __kind;
    #if __WORDSIZE == 64
        int __spins;
        __pthread_list_t __list;
    # define __PTHREAD_MUTEX_HAVE_PREV 1
    #else
        unsigned int __nusers;
        __extension__ union
        {
          int __spins;
          __pthread_slist_t __list;
        };
    #endif
      } __data;
      char __size[__SIZEOF_PTHREAD_MUTEX_T];
      long int __align;
    } pthread_mutex_t;

Which, when you realise that a non-recursive lock can be built atop a single bit, starts to look just a little indulgent.
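
To put rough numbers on the space argument, here is a minimal sketch that measures a struct shaped like the one above. `recursive_lock_sketch_t` and the helper functions are hypothetical names made up for this illustration; `void *` stands in for `PerlInterpreter *` and `int32_t` for `I32`, so the figures are only indicative of what a real perl build would show.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for perl's recursive_lock_t: same four members,
   with void* replacing PerlInterpreter* and int32_t replacing I32. */
typedef struct {
    pthread_mutex_t mutex;
    void           *owner;
    int32_t         locks;
    pthread_cond_t  cond;
} recursive_lock_sketch_t;

/* Bytes taken by the recursion counter alone. */
size_t count_field_bytes(void)
{
    return sizeof(((recursive_lock_sketch_t *)0)->locks);
}

/* Bytes taken by the two pthreads primitives the lock is built on. */
size_t primitive_bytes(void)
{
    return sizeof(pthread_mutex_t) + sizeof(pthread_cond_t);
}

/* Bytes taken by the whole recursive lock, padding included. */
size_t recursive_lock_bytes(void)
{
    return sizeof(recursive_lock_sketch_t);
}

/* Print the three figures for the platform this is compiled on. */
void report_lock_sizes(void)
{
    printf("counter: %zu, mutex+cond: %zu, whole lock: %zu bytes\n",
           count_field_bytes(), primitive_bytes(), recursive_lock_bytes());
}
```

Calling `report_lock_sizes()` on the target platform gives the actual sizes there; they vary with word size and libc version, so no figures are quoted here.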


With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?


Re^5: Recursive locks:killer application. Do they have one? (counting overhead)
by tye (Cardinal) on Feb 03, 2012 at 04:35 UTC

    Yes, counting requires a tiny bit of space for the counter. Obviously. Even if that doubled the space overhead of the mutex, it would be quite the rare situation when that would matter to me.

    The overhead of the counting indeed looks pretty trivial to me. We are only talking about the following:

    ...
    I32 locks;
    ...
    lock->locks++;
    ...
    lock->locks = 1;
    ...

    The other "overhead" you show is so locking can block, so Perl can have a single interface over multiple implementations of blocking locking on multiple operating systems, so Linux can implement a portable blocking locking interface over the current choice of kernel implementation, so Linux can detect deadlock loops, so Linux users can select different types of "wake order" behavior, so somebody can adjust "spin" to reduce context switches in certain scenarios, etc.

    Which, when you realise that a non-recursive lock can be built atop a single bit, starts to look just a little indulgent.

    I don't see how one can implement blocking locking using a single bit. I certainly see some overhead that could be eliminated for the particular environment you are looking at. I suspect such would require more 'ifdef' work and thus create more complexity at other layers (assuming that dropping support for other platforms is not allowed) for the sake of conditionally reducing some run-time complexity in some environments. It is possible that such might even have a noticeable benefit in such environments.

    Getting rid of the counting there isn't going to make much difference. But surely you are instead implementing a new alternative, so not bothering to implement counting seems worth considering. I don't even see any advantage to exposing an API that would interfere with deciding to add support for counting at some later date (which means not bothering to implement it up front is probably wise).
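
As a rough illustration of how small the counting layer is, here is a sketch of an owner-plus-count lock written directly against pthreads. It mirrors the shape of the acquire routine quoted earlier in the thread, but `counted_lock_t` and its functions are hypothetical names, and the release routine is an assumption: perl's own `recursive_lock_release` is not shown in this thread.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Hypothetical counted (recursive) lock layered on a plain mutex and condvar. */
typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    pthread_t       owner;   /* valid only while owned is nonzero */
    int             owned;   /* nonzero while some thread holds the lock */
    int             locks;   /* recursion depth of the current owner */
} counted_lock_t;

void counted_lock_init(counted_lock_t *l)
{
    pthread_mutex_init(&l->mutex, NULL);
    pthread_cond_init(&l->cond, NULL);
    l->owned = 0;
    l->locks = 0;
}

void counted_lock_acquire(counted_lock_t *l)
{
    pthread_mutex_lock(&l->mutex);
    if (l->owned && pthread_equal(l->owner, pthread_self())) {
        l->locks++;                     /* re-entry: this is all "counting" costs */
    } else {
        while (l->owned)                /* block until the current owner lets go */
            pthread_cond_wait(&l->cond, &l->mutex);
        l->owner = pthread_self();
        l->owned = 1;
        l->locks = 1;
    }
    pthread_mutex_unlock(&l->mutex);
}

/* Assumes the caller is the current owner. Returns the remaining depth. */
int counted_lock_release(counted_lock_t *l)
{
    int depth;
    pthread_mutex_lock(&l->mutex);
    depth = --l->locks;
    if (depth == 0) {                   /* outermost release: wake one waiter */
        l->owned = 0;
        pthread_cond_signal(&l->cond);
    }
    pthread_mutex_unlock(&l->mutex);
    return depth;
}
```

Dropping the `locks` field and the re-entry branch leaves a non-recursive blocking lock with the same mutex and condition variable, which is why the counter accounts for so little of the total.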

    - tye        

      I don't see how one can implement blocking locking using a single bit.

      Here you go:

      #include <windows.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <time.h>
      #include <process.h>

      typedef struct { void *protected; int loops; } args;

      /* Bit 0 of the shared word is the lock. Retry (sleeping 1ms) until we set it. */
      void lock( void *protected ) {
          while( _interlockedbittestandset64( (__int64*)protected, 0 ) ) {
              Sleep( 1 );
          }
      }

      /* Clear bit 0 to release the lock. */
      void unlock( void *protected ) {
          _interlockedbittestandreset64( (__int64*)protected, 0 );
      }

      void worker( void *arg ) {
          args *a = (args*)arg;
          int i = 0;
          for( i=0; i < a->loops; ++i ) {
              lock( a->protected );
              /* Adding 2 leaves the lock bit (bit 0) untouched. */
              *( (__int64*)a->protected ) += 2;
              unlock( a->protected );
          }
          return;
      }

      int main( int argc, char **argv ) {
          int i = 0, nThreads = 4;
          clock_t start, finish;
          double elapsed;
          uintptr_t threads[32];
          /* 64 bits wide, to match the 64-bit interlocked intrinsics used above. */
          __int64 shared = 0;
          args a = { (void *)&shared, 1000000 };
          if( argc > 1 ) nThreads = atol( argv[1] );
          if( argc > 2 ) a.loops = atol( argv[2] );
          if( nThreads > 32 ) nThreads = 32;   /* threads[] holds at most 32 handles */
          printf( "threads:%d loops:%d\n", nThreads, a.loops );
          start = clock();
          for( i=0; i < nThreads; ++i )
              threads[ i ] = _beginthread( &worker, 0, &a );
          WaitForMultipleObjects( nThreads, (HANDLE*)threads, 1, INFINITE );
          finish = clock();
          elapsed = (double)(finish - start) / CLOCKS_PER_SEC;
          printf( "count: %lld time:%.6f\n", shared, elapsed );
          return 0;
      }

      And a run with 32 threads all contending to add 2 to a shared integer 1 million times each:

      C:\test\lockfree>bitlock 32 1000000
      threads:32 loops:1000000
      count: 64000000 time:1.332000

      And implemented using the simplest primitive possible -- one that will be available in some form on any modern processor.
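
For readers not on Windows, the same idea can be sketched with C11 atomics and pthreads. This is an approximation rather than a port: it keeps the lock bit and the counter in separate variables, yields instead of sleeping for 1ms, and all of the names (`bit_lock`, `bit_unlock`, `run_contended`) are hypothetical.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_uint lock_word;   /* only bit 0 is used, as the lock */
static long shared_count;       /* protected by the lock bit */

/* atomic_fetch_or returns the previous value: if bit 0 was already set,
   another thread holds the lock, so yield and try again. */
void bit_lock(atomic_uint *word)
{
    while (atomic_fetch_or_explicit(word, 1u, memory_order_acquire) & 1u)
        sched_yield();
}

/* Clear bit 0 to release the lock. */
void bit_unlock(atomic_uint *word)
{
    atomic_fetch_and_explicit(word, ~1u, memory_order_release);
}

static void *worker(void *arg)
{
    long loops = *(long *)arg;
    for (long i = 0; i < loops; ++i) {
        bit_lock(&lock_word);
        shared_count += 2;
        bit_unlock(&lock_word);
    }
    return NULL;
}

/* Run up to 32 threads, each adding 2 to the shared counter `loops` times,
   and return the final counter value. */
long run_contended(int nthreads, long loops)
{
    pthread_t t[32];
    int started = 0;
    if (nthreads > 32) nthreads = 32;
    shared_count = 0;
    atomic_store(&lock_word, 0u);
    for (int i = 0; i < nthreads; ++i)
        if (pthread_create(&t[started], NULL, worker, &loops) == 0)
            ++started;
    for (int i = 0; i < started; ++i)
        pthread_join(t[i], NULL);
    return shared_count;
}
```

A call such as `run_contended(32, 1000000)` corresponds to the run shown above. Timings will differ from the Windows figures, since yielding and sleeping behave quite differently under contention.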

      The other "overhead" you show is so locking can block, so Perl can have a single interface over multiple implementations of blocking locking on multiple operating systems, so Linux can implement a portable blocking locking interface over the current choice of kernel implementation, so Linux can detect deadlock loops, so Linux users can select different types of "wake order" behavior, so somebody can adjust "spin" to reduce context switches in certain scenarios, etc.

      And therein lies the rub. Perl implements its own recursive locking in terms of pthreads 0.1 primitives. But those "spec'd" pthreads primitives have long since been superseded on every modern *nix system by vastly more efficient, effective and flexible primitives -- e.g. futexes -- which already have recursive capabilities.

      And then on other platforms -- i.e. Windows -- the pthreads 0.1 primitives are clumsily emulated using the oldest, least effective OS primitives.

      Everyone, everywhere is getting big, slow, clumsy emulations of a defunct standard instead of being able to use the modern, efficient, effective mechanisms that have evolved since the pthreads api was frozen in stone.

      And all those "so Linux users can" and "so somebody can" are pie-in-the-sky what-ifs and maybes that can never happen for perl users anywhere. Typical lowest-common-denominator stuff.
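
For comparison, POSIX threads also define a recursive mutex type, and as far as I can tell the `__count` and `__owner` fields in the glibc struct quoted earlier exist to support it. The sketch below shows one way to request that type instead of layering a counter on top. `recursive_mutex_init` and `lock_nested` are hypothetical helper names, and whether perl could rely on this across all of its platforms is not something this example settles.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Initialise *m as a recursive mutex. Returns 0 on success, else an errno value. */
int recursive_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock *m `depth` times from the calling thread, then release every level
   that was taken. Returns 0 if all the nested locks succeeded. */
int lock_nested(pthread_mutex_t *m, int depth)
{
    int rc = 0, taken = 0;
    while (taken < depth && (rc = pthread_mutex_lock(m)) == 0)
        ++taken;
    while (taken-- > 0)
        pthread_mutex_unlock(m);
    return rc;
}
```

With a default (non-recursive) mutex the second `pthread_mutex_lock` from the same thread would deadlock or be undefined; with the recursive type the library keeps the count itself.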



        I guess you can call that blocking locking. I call it a spin lock. And it likely requires a bus lock which can be a bad choice if you want high concurrency.

        Nice to see you actually get to the point on efficiency. Sounds like stuff that is worth patching. (Of course, most of that has pretty much nothing to do with counting.)

        Peace.

        - tye        

Re^5: Recursive locks:killer application. Do they have one? (mu)
by sundialsvc4 (Monsignor) on Feb 03, 2012 at 14:44 UTC

    I think you hit the nail on the head with this point:   (emphasis mine)

    you have a class that mostly just deals with the bits that need to be under a specific mutex. So the code to be run under the mutex is kept very small and cohesive by being its own class that just concentrates on doing the locking right.

    The very simplest atomic mechanisms are the most preferable to me; ones that make no attempt whatever to do the right thing for me.   If I need to protect a block such that more than one body of code can be in it, I know how to build that.   If I need to allow a single actor on the stage to grab more than one claim to it at a time, I know how to build that, too.   But I am also going to build other forms of rules and error-detection into that same mechanism such that, if the software ever does something that I did not intend for it to do and certainly did not think that it was capable of doing, the software itself will tell me that it has failed.   In my designs, I want to build those things ... and I can count on the fingers of one hand the number of times in more than thirty years that I have ever had the need to do so.
