Programming thread pools and synchronisers and stuff feels very similar to programming low level functions in C - it focusses on what I want the computer to do, rather than telling it what I want to get.
I utterly, totally, completely agree with that statement. Which is why I am so pleased to see that Perl6 has recognised the issues, has them covered, and already has them working!
For (one possible flavour of) the underpinnings, that is not (yet) the case.
So I'm definitely in the "future is parallel, but probably not threaded" camp.
There is a whole class of programs, somewhat characterised by the algorithms for which an entire class of specialist processor units (vector processors) has been built, that can benefit enormously from the multiple CPUs/cores that are now becoming commonplace, but that require (or at least benefit enormously from) shared state, and cannot be performed as easily or as efficiently using processes (or clusters) as they can using threads.
Whilst these algorithms are often described as being for scientific work, and more recently for graphical work (games & audio, for which graphics and sound cards carry specialist GPUs & DSPs), those same and similar algorithms can also be used for much more mundane and everyday computing tasks once you have the processing power to utilise them.
Three (recent, real-world) examples that have turned up or been referenced here at PM:
All three can be tackled using multiple processes (and by implication clusters), but all three have the need for, or can greatly benefit from, a feedback loop of status information and/or intermediate results to the parent execution context controlling the spawning.
This can be achieved with processes via bi-directional IPC, but if done through pipes, sockets or message queues, the one-to-many/many-to-one reads and writes require both the parent and child to use non-blocking IO. That means every process also has to become a state machine. For clusters, the communication further involves the network, and with it bandwidth, latency and topology issues.
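To make the "state machine" point concrete, here is a rough sketch of what the parent side tends to look like. It is in Python rather than Perl, and it makes several assumptions that are mine, not part of any of the examples above: the function name `collect_status` is hypothetical, status messages are newline-terminated text, and plain sockets stand in for whatever pipes or sockets the children would really write to.

```python
import selectors
import socket


def collect_status(channels, expected):
    """Gather newline-terminated status messages from several channels.

    channels: dict mapping a child name to a connected socket (assumed).
    expected: number of complete messages to wait for.
    """
    sel = selectors.DefaultSelector()
    buffers = {}                      # per-child partial input: the "state"
    for name, sock in channels.items():
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ, name)
        buffers[name] = b""

    messages = []
    open_channels = len(channels)
    while len(messages) < expected and open_channels > 0:
        events = sel.select(timeout=1.0)
        if not events:                # nothing arrived in time: give up
            break
        for key, _ in events:
            name = key.data
            data = key.fileobj.recv(4096)
            if not data:              # child closed its end
                sel.unregister(key.fileobj)
                open_channels -= 1
                continue
            buffers[name] += data
            # A read may hold half a message, or several messages at once.
            while b"\n" in buffers[name]:
                line, buffers[name] = buffers[name].split(b"\n", 1)
                messages.append((name, line.decode()))
    sel.close()
    return messages
```

Even in this stripped-down form the parent has to track a buffer per child, cope with partial and coalesced reads, and notice closed channels. The children need the mirror image of this if the parent ever talks back.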
This can be achieved through the file system, but that requires semaphores and/or file locking and non-blocking IO. Again, every process has to become a state machine. For clusters, the files will at some point be remote and so the networking issues are again a factor.
A threaded solution needs locks, but in every other way is simpler, easier to debug and faster.
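For comparison, a minimal sketch of the same kind of feedback loop done with threads and shared state. Again this is Python, not Perl, and the function name `run_workers`, the `status` dictionary and the chunk-summing job are all illustrative assumptions. It shows the shape of the code only; CPython's global interpreter lock means it says nothing about speed.

```python
import threading


def run_workers(chunks):
    """Sum each chunk in its own thread, reporting into shared state."""
    lock = threading.Lock()
    status = {"done": 0, "partials": {}}   # shared with every worker

    def worker(idx, chunk):
        total = sum(chunk)                 # the real work, no lock held
        with lock:                         # the only synchronisation needed
            status["partials"][idx] = total
            status["done"] += 1

    threads = [
        threading.Thread(target=worker, args=(i, c))
        for i, c in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    # The parent reads the intermediate results directly from shared state.
    with lock:
        return status["done"], sum(status["partials"].values())
```

There is no message framing, no per-child buffer and no non-blocking IO here; the lock is the whole of the coordination.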
For that class of parallelisation problems for which shared state, or child-to-parent communication, is either required or beneficial, no other solution comes even close to being as simple or efficient as threading.
Once you remove the need for the application programmer to worry about locking, even at the penalty of some extra delays when using read-only references and the minor cost of a condition test on every access to shared data, the advantages of threading far outweigh those of the alternatives.
I'm as yet undecided whether Software Transactional Memory (STM) is the right solution to taking locking out of the hands of the application programmer. Having done a fair amount of DB programming, including designing and writing the infrastructure for a unique, widely distributed, multi-transport, DB query mechanism, I've encountered the problems that transactions bring with them.
If the granularity of the transactions is set too big, you kill your performance by blocking concurrent access and the costs of intermediate storage for rollback can get too high to manage.
Set the granularity of the transactions too small, and the costs of rollback when it happens become prohibitive, and/or you risk allowing the utilisation of out-of-date or rolled-back data.
STM doesn't have the same issues of protocol and transmission latency usually involved with DB accesses, so this may be a non-issue, but there is still enough doubt in my mind to cause me some concern.
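The rollback cost can be seen in a toy optimistic transaction. To be clear, this is not how any real STM implementation works; it is a simplified Python sketch in which the class `VersionedCell` and its `atomically` method are names I have made up. A transaction takes a snapshot, does its work without holding a lock, and commits only if nobody else committed in the meantime; otherwise the work is thrown away and redone.

```python
import threading


class VersionedCell:
    """A toy transactional variable: a value plus a version counter."""

    def __init__(self, value):
        self._value = value
        self._version = 0
        self._commit_lock = threading.Lock()  # hidden from the caller
        self.retries = 0                      # how often work was discarded

    def read(self):
        with self._commit_lock:
            return self._value

    def atomically(self, update):
        """Apply update(old) -> new, retrying if another commit got in first."""
        while True:
            with self._commit_lock:
                seen_value, seen_version = self._value, self._version
            new_value = update(seen_value)    # done outside any lock
            with self._commit_lock:
                if self._version == seen_version:
                    self._value = new_value
                    self._version += 1
                    return new_value
                self.retries += 1             # conflict: roll back and redo
```

The caller never touches a lock, which is the attraction. The `retries` counter is where the granularity question shows up: the more work `update` does per transaction, the more is wasted on each conflict.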
But done right, STM is one of several promising mechanisms for taking the "big issue" with shared state out of the hands of the application programmer, and for clearing the way for simple, safe and ubiquitous threading.