Beefy Boxes and Bandwidth Generously Provided by pair Networks
Keep It Simple, Stupid

Re^8: Recursive locks:killer application. Do they have one? (How high?)

by BrowserUk (Pope)
on Feb 04, 2012 at 14:42 UTC ( #951809=note: print w/replies, xml ) Need Help??

in reply to Re^7: Recursive locks:killer application. Do they have one? (bitlock)
in thread Recursive locks:killer application. Do they have one?

I guess you can call that blocking locking. I call it a spin lock.

I take it that when you said "blocking lock", you really meant a 'waitable lock'.

Which are required for some uses, but not by any means all. And they suffer from the need for kernel involvement.

And it likely requires a bus lock which can be a bad choice if you want high concurrency.

How high is "high concurrency"?

Here are the results of runs ranging from 1 to 4096 contending threads performing 100 million additions between them:

C:\test\lockfree>bitlock 1 102400000 threads: 1 loops: 102400000 count: 204800000 time: 4.071000 proce +ssor cycles: 9748471986 C:\test\lockfree>bitlock 4 102400000 threads: 4 loops: 25600000 count: 204800000 time: 4.093000 proce +ssor cycles: 9788765607 C:\test\lockfree>bitlock 16 102400000 threads: 16 loops: 6400000 count: 204800000 time: 4.123000 proce +ssor cycles: 9859762107 C:\test\lockfree>bitlock 64 102400000 threads: 64 loops: 1600000 count: 204800000 time: 4.240000 proce +ssor cycles: 10210001571 C:\test\lockfree>bitlock 256 102400000 threads: 256 loops: 400000 count: 204800000 time: 5.077000 proce +ssor cycles: 11839214313 C:\test\lockfree>bitlock 1024 102400000 threads: 1024 loops: 100000 count: 204800000 time: 5.564000 proce +ssor cycles: 23777289990 C:\test\lockfree>bitlock 4096 102400000 threads: 4096 loops: 25000 count: 204800000 time: 8.286000 proce +ssor cycles: 55413048789

I think that whether you consider clock time or cpu cycles, that scales pretty damn well, regardless of any bus locks involved.

For contrast, here are a couple of runs of the same code, same count, but using a waitable (OS) mutex:

C:\test\lockfree>mutex2 1 102400000 threads: 1 loops: 102400000 count: 204800000 time:73.739000 proce +ssor cycles: 176641186086 C:\test\lockfree>mutex2 4 102400000 threads: 4 loops: 25600000 count: 204800000 time:631.779000 proc +essor cycles: 1247492094759

Now given the shear scale of the lack of scalability here I hope you'll forgive me for not finishing that series. So to get an indication at higher contention levels, I'll drop the count by an order of magnitude:

C:\test\lockfree>mutex2 1 10240000 threads: 1 loops: 10240000 count: 20480000 time: 7.412000 proce +ssor cycles: 17760824082 C:\test\lockfree>mutex2 4 10240000 threads: 4 loops: 2560000 count: 20480000 time:63.428000 proce +ssor cycles: 126683354358 C:\test\lockfree>mutex2 16 10240000 threads: 16 loops: 640000 count: 20480000 time:61.973000 proce +ssor cycles: 122496755463 C:\test\lockfree>mutex2 64 10240000 threads: 64 loops: 160000 count: 20480000 time:61.520000 proce +ssor cycles: 121312774089 C:\test\lockfree>mutex2 256 10240000 threads: 256 loops: 40000 count: 20480000 time:63.874000 proce +ssor cycles: 127362344526 C:\test\lockfree>mutex2 1024 10240000 threads: 1024 loops: 10000 count: 20480000 time:73.238000 proce +ssor cycles: 148526579529 C:\test\lockfree>mutex2 4096 10240000 threads: 4096 loops: 2500 count: 20480000 time:86.104000 proce +ssor cycles: 178245340731

Whilst the scaling doesn't look too bad -- at least once you get over the hump of having some contention -- it is certainly no better than the spinloop. And not only is it doing 1/10 the work, it is taking 10 times longer to do it.

But the real kicker is the cpu usage. Doing 1/10th the work using the waitable lock requires 3 times the clock cycles of the spinlock. So much for the efficiency of not spinning.

Ultimately, whether the spinlock provokes bus locks or not is irrelevant, because they are still vastly cheaper than ring 3-0-3 transitions.

I'll post the code for the above if asked.

With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

The start of some sanity?

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://951809]
[Corion]: So Hyperboria is an envrypted, decentralized IPv6 mesh network. I'm interested and want to look how good coverage is in my area. But the API for querying Hyperboria is only reachable over Hyperboria itself. Thanks but no thanks.

How do I use this? | Other CB clients
Other Users?
Others wandering the Monastery: (5)
As of 2017-06-23 07:21 GMT
Find Nodes?
    Voting Booth?
    How many monitors do you use while coding?

    Results (536 votes). Check out past polls.