Re^6: RFC: Implicit Parallelization Pragma

by BrowserUk (Patriarch)
on Jan 27, 2005 at 17:14 UTC ( #425630=note )

in reply to Re^5: RFC: Implicit Parallelization Pragma
in thread RFC: Implicit Parallelization Pragma

I agree. Fibers are an interesting, and potentially useful, architectural feature of Win32, but they do not address the OP's topic directly.

There are two types of parallelism that need to be addressed.

  • Vector parallelism.

    This is the kind of parallelism that is already fairly well handled by dedicated vector processors (as the big iron guys call them) and by GPUs and DSPs: a single, identical, often CISC, opcode is performed on a large number of datapoints in parallel.

    This kind is quite easily dealt with in hardware. The depth of the parallelisation is basically controlled by how much silicon you are prepared to dedicate to the operation.

    A generalisation of this would be to allow a whole sequence of opcodes (an entire subroutine or block of code) to be applied to a large number of datapoints simultaneously. Again, the datapoints would be loaded into "special memory", hardwired to operate each opcode on all of those registers in parallel.

    The criteria for what constitutes a suitable block of code (i.e. reentrant code without side effects or dependencies beyond its parameters, and probably with a single result from each call) would be a software (compiler or interpreter) decision. It would also require good optimisers (whether compiler or interpreter) to make best use of such parallelism.

  • Sync point parallelism.

    This is where different sections of the source code, usually sequential, can be run in parallel until a point is reached where they share a common dependency. Although analogous to the pipelining that many modern processors do, to get the best benefit it needs to be done at a macro (source code or function point) level rather than at the micro (a few sequential opcodes) level, as is done with pipelining.

    I think (within the limitations of my sparse knowledge of silicon) that pipelining has gone just about as far as it can go. The economics of branch-point prediction, and the costs of flushing the pipeline when the BPP goes wrong, severely limit the effectiveness of pipelining beyond a certain, rather low, limit.

    In order for best use of sync point parallelism to be made, OSs will need to radically alter the scheduling schemes they use. The current round-robin within priority groups, with starvation priority promotion, and processes (or threads as currently implemented) as atomic entities, will have to be replaced by a much finer grained mechanism. And compilers and interpreters will have to get a lot cleverer to make good use of it.

    In effect, the processor pool will be driven by a queue of 'units of code' that need processing, overseen by a 'macro-processor'. Each individual process will constitute a stream of VHL (very high level) opcodes. The macro-processor will enqueue these, interleaving the VHL opcodes from different processes according to priorities etc. The pool of processors will pull the next available unit of code off of the central queue and execute it (without regard to what process it belongs to), and then go back and grab the next available. Reentrancy will be paramount, as will a capabilities-based security mechanism.

    An interesting, and site-topical, side effect is that interpreted code will probably carry a much smaller penalty relative to compiled code, as the compilers will need to produce streams or groups of small, self-contained units of code.

    Once code is compiled/interpreted into these self-contained units, it then becomes possible to transmit these units to external processors, or pools of cooperating machines, to achieve massively parallel operations across peer groups of LAN/net connected machines.

    It's a hard concept to describe in words, and my attempts at an ASCII art diagram left much to be desired. There are probably no web references I can give either as it is very much a prediction of where I personally think things will go, rather than a recounting of any individual piece I have read.

    A sort of mental mish-mash, reading between the lines of everything I have read.
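The central-queue idea can be sketched roughly in software. What follows is only an illustration under loud assumptions, not a description of any real OS or hardware: Python threads stand in for the pool of processors (and in CPython they will not actually run CPU-bound units simultaneously), a `queue.PriorityQueue` stands in for the macro-processor's queue, and the names `run_units`, `worker`, `square` and `total` are all made up for the example. Each unit is a reentrant function with no side effects beyond its result, tagged with the process it came from.

```python
import queue
import threading

SENTINEL_PRIORITY = 10**9  # hypothetical: sorts after every real unit


def worker(units, results, lock):
    """One 'processor': pull the next available unit and run it,
    without regard to which process the unit belongs to."""
    while True:
        _priority, _seq, unit = units.get()
        if unit is None:              # sentinel: no more work
            units.task_done()
            break
        process_id, name, func, args = unit
        value = func(*args)           # unit depends only on its parameters
        with lock:
            results[(process_id, name)] = value
        units.task_done()


def run_units(unit_specs, n_workers=4):
    """Enqueue units from several processes, interleaved by priority,
    and return their results once all of them have finished."""
    units = queue.PriorityQueue()
    results = {}
    lock = threading.Lock()
    # seq is unique, so the unit tuples themselves are never compared.
    for seq, (priority, process_id, name, func, args) in enumerate(unit_specs):
        units.put((priority, seq, (process_id, name, func, args)))
    workers = [
        threading.Thread(target=worker, args=(units, results, lock))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    units.join()                      # the sync point: wait for every unit
    for i in range(n_workers):        # then tell each worker to stop
        units.put((SENTINEL_PRIORITY, len(unit_specs) + i, None))
    for w in workers:
        w.join()
    return results


def square(x):
    return x * x


def total(xs):
    return sum(xs)


specs = [
    (1, "procA", "sq3", square, (3,)),
    (1, "procB", "sum", total, ([1, 2, 3],)),
    (0, "procA", "sq4", square, (4,)),
]
partial = run_units(specs)
# Code after the sync point may depend on both of procA's units.
combined = partial[("procA", "sq3")] + partial[("procA", "sq4")]
```

The point of the sketch is only the shape: the workers never ask which process a unit came from, and the sole place anything waits is the sync point where dependent code needs the results.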

Just a notion :)

Examine what is said, not who speaks.
Silence betokens consent.
Love the truth but pardon error.

Re^7: RFC: Implicit Parallelization Pragma
by Anonymous Monk on Jan 27, 2005 at 19:03 UTC

    Indeed, your vision of the sync-point parallelism is probably what the ideas in the original node need as a viable CPU platform.

    I'd love to see the assembly code for that sort of piece-wise programming. This is where Smalltalkish uber-OOP could actually be used at the hardware level.

      I'd love to explore these ideas further.

      However, in-depth discussion is apparently beyond the remit of this site. If you would like to take this discussion further at another place, you could grab yourself an id so that we can make private contact and arrange for that discussion to take place elsewhere.

      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
      Indeed, your vision of the sync-point parallelism is probably what the ideas in the original node need as a viable CPU platform.

      Do Anonymonks check for follow-ups?

      With regard to viable CPU platforms--did you see Monday's announcement?

      A little more (speculative) information. It comes to pass :)

      Note the diagram "Distributed processing with cells" and the mention of "software cells".

      Boy, would I like to get my mitts on one of those, along with a properly threaded OS.

      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.
