PerlMonks  

OT: Solaris expertise?

by BrowserUk (Pope)
on Sep 28, 2011 at 00:53 UTC ( #928213=perlquestion )
BrowserUk has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone make sense of the following extract from a paper (pdf):

Compare and swap (casx): This instruction swaps the contents of one memory position allocated in the L2 data cache with the value of a register. This means that this instruction always accesses a memory location in L2 cache.

How (on Solaris, possibly only in assembler?) do you allocate a piece of memory such that "this instruction always accesses a memory location in L2 cache"?

That is:

  1. How do you allocate memory "in the L2 cache"?

    There is no further explanation of this in the paper. They do however mention a pointer chasing arrangement to produce consistent L2 cache misses, which makes sense and suggests they know what they are talking about.

  2. How do you prevent it getting promoted to the L1 cache the first time it is accessed?

    Their very purpose in using the instruction is to benefit from the low-impact, high-latency behaviour of an L1 cache miss. They further go on to say:

    Its latency is 39 cycles in T1 and between 20 and 30 cycles in T2 (in our experiments it takes almost always about 30 cycles). This instruction does not excessively stress the processor structures that could be used by the active thread. In fact, casx only uses one entry of the shared LSU structure that connects the core to the interconnection network. Moreover, the memory space requirements of using this instruction are very low since all the spin-locks can access the same memory position.

    Which makes it unlikely that the above is a slip of their tongues or otherwise a misinterpretation of their meaning.
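The pointer-chasing arrangement mentioned under item 1 can be sketched roughly as follows. This is only an illustration of the general technique, not the paper's code: the names `build_chain` and `chase` are hypothetical, and the chain is built with Sattolo's shuffle so that every slot lies on one cycle and each load depends on the previous one. To actually force misses at a given cache level, the array would need to be larger than that cache, and ideally each slot would sit on its own cache line.

```c
#include <stddef.h>
#include <stdlib.h>

/* Build one random cycle over n slots (n >= 1) using Sattolo's algorithm.
   next[i] holds the index of the slot to visit after slot i. */
void build_chain(size_t *next, size_t n, unsigned seed)
{
    srand(seed);
    for (size_t i = 0; i < n; i++)
        next[i] = i;
    for (size_t i = n - 1; i > 0; i--) {
        size_t j = (size_t)rand() % i;   /* j < i keeps everything in a single cycle */
        size_t tmp = next[i];
        next[i] = next[j];
        next[j] = tmp;
    }
}

/* Follow the chain for `steps` dependent loads and return the final index.
   Each load's address comes from the previous load, so they cannot overlap. */
size_t chase(const size_t *next, size_t start, size_t steps)
{
    size_t p = start;
    while (steps--)
        p = next[p];
    return p;
}
```

Timing a long `chase` over a large array and dividing by the step count gives an estimate of the per-load latency.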

I'm trying to work out how to apply their work on an Intel processor. The Perl link is another attempt at making efficient shared memory available from Perl.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: OT: Solaris expertise?
by Corion (Pope) on Sep 28, 2011 at 07:34 UTC

    This is less Solaris and more (Ultra)SPARC architecture, and that paper discusses some "OS" I've never heard of, "NetraDPS". I found some UltraSparc CPU documentation, but it is somewhat vague on the memory accesses issued by a CASX (or CASXA) instruction.

    My vague interpretation of D.2.5.3 is that a CASX instruction will fetch the appropriate page into L2 cache if it is not already there, and then perform the exchange only in the L2 cache, and not force an immediate write-back to the main RAM. This somewhat matches the error behaviour from 16.9.1.7, where a conflict between ECC corrections and CASX instructions on writeback may occur.

      My vague interpretation of D.2.5.3 is that a CASX instruction will fetch the appropriate page into L2 cache if it is not already there, and then perform the exchange only in the L2 cache, and not force an immediate write-back to the main RAM.

      Thank you for the link and your interpretation. It still took a while for the penny to drop, but it has now.

      On the T1 & T2, the L1 (instruction & data) caches are per core, while the L2 cache is shared between all cores. Compare & swap instructions are specifically designed for inter-thread & inter-core signalling, so they have to be coherent at the L2 cache.

      The L2 cache coherency requirement is what causes their high latency; the fact that they only affect a single L2 cache line explains their low impact. Together these make them perfect for the task of reducing spin-lock 'burn'.

      Now to seek out the equivalent X64 instruction :)
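For reference, a minimal compare-and-swap spin-lock in portable C11 might look like the sketch below. It is a plain CAS lock, not the paper's algorithm; the names `cas_lock`, `cas_trylock`, `cas_acquire` and `cas_unlock` are hypothetical. On x86-64, GCC and Clang typically compile the 64-bit `atomic_compare_exchange_strong` to a `LOCK CMPXCHG`, which is the closest counterpart to casx that I know of.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical lock type: 0 = free, 1 = held. */
typedef struct { _Atomic uint64_t word; } cas_lock;

/* One attempt: atomically replace 0 with 1. Returns 1 on success, 0 otherwise. */
int cas_trylock(cas_lock *l)
{
    uint64_t expected = 0;
    return atomic_compare_exchange_strong(&l->word, &expected, 1);
}

/* Spin until the lock is obtained. Every failed attempt is itself an
   atomic read-modify-write on the lock's cache line. */
void cas_acquire(cas_lock *l)
{
    while (!cas_trylock(l))
        ;
}

/* Release the lock with a sequentially consistent store. */
void cas_unlock(cas_lock *l)
{
    atomic_store(&l->word, 0);
}
```

Whether the latency and resource usage of the x86-64 instruction match what the paper reports for casx on the T1 and T2 would have to be measured.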


Re: OT: Solaris expertise?
by cnd (Novice) on Feb 20, 2012 at 10:57 UTC
    L2 cache is managed by the hardware inside the CPU - everything goes through here, so you don't (can't) "allocate" in it. However, if you take care to keep your code efficiently small, then the L2 cache will end up containing your entire codeset, and thus run faster.

    It looks like the casx instruction is giving ASM programmers some means to directly use the L2 cache, outside of the normal CPU hardware cache management.

    If "casx" isn't an Intel instruction, then forget about using it on that processor!

    None of this seems relevant to perl shared memory though - where does perl come into this???

      "where does perl come into this???"

      It doesn't have to. Threads marked 'OT' or 'Off topic' often have relevance to computing issues loosely associated with perl, while not being perl specific.

      If "casx" isn't an intel instruction, then forget about using it on that processor!

      All modern SMP & multi-core processors have 'atomic compare and swap' instructions. (IBM introduced them on its CPUs back in the early 1970s.)

      On Intel & AMD x64 processors they are called variously: CMPXCHG8B & CMPXCHG16B (amongst other variations).

      The question was purely about the particular semantics of the Solaris version as that was used by the researchers of the paper I was reading. It was important for me to understand those semantics so that I could work out whether the x64 equivalents were compatible with their algorithms. They are (kinda).
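As a rough illustration of the semantics being compared, here is a sketch of a 64-bit compare-and-swap written with C11 atomics. The helper name `cas64` is hypothetical. It returns the previous contents of the location whether or not the swap happened, which is, as far as I understand it, the shape that both casx and the CMPXCHG family expose.

```c
#include <stdatomic.h>
#include <stdint.h>

/* If *addr == expected, store desired into *addr.
   In both cases return the value *addr held before the operation. */
uint64_t cas64(_Atomic uint64_t *addr, uint64_t expected, uint64_t desired)
{
    /* On failure, C11 writes the current value of *addr into `expected`;
       on success `expected` already equals the old value. */
    atomic_compare_exchange_strong(addr, &expected, desired);
    return expected;
}
```

A caller can tell whether the swap took place by comparing the returned value with the `expected` it passed in.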

      None of this seems relevant to perl shared memory though - where does perl come into this???

      It is. Or rather it will be soon.


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'

      The start of some sanity?
