|Perl: the Markov chain saw|
RFC: Implicit Parallelization Pragmaby hardburn (Abbot)
|on Jan 25, 2005 at 16:47 UTC||Need Help??|
A while back, I left a node about concurrency in Perl. I ended that meditation with a question about how to improve concurrency in Perl.
This node recommends a pragma which gives certain guarantees to perl (note lowercase) that should allow implicit parallelization with minimal modification to existing code.
Implicit Parallelization in Functional Languages
Here's a quick-and-dirty introduction to functional languages, and how they can be parallelized by the implementation without modifying any code: Imagine a LISP function that takes an argument x and returns the result of 2 * x + 1. We can implement this as:
When you call this function, the implementation evaluates the inner-most expression first, each expression being reduced until the final result is computed:
With me so far? This should be familer to anyone who has worked through the first chapter of SICP.
Now we define another function that takes an argument x and computes (2 * x) + (3 * x) + 1:
And then we call it, with the implementation replacing the variable as before:
Stop right there. Do we evaluate (* 2 3) or (* 3 3) first? It doesn't matter, as long as the expression contains no side effects. One interesting effect of this is that the implementation can choose to execute these expressions in parallel without any additional help from the programmer.
(Of course, in this simple example, practical issues like communication time between CPUs are likely to make the parallel execution take longer. However, the same concepts can be applied to much more complex expressions.)
Implementing in Perl
Now consider this Perl map statement, which takes an array of strings and returns the length of each of those strings:
This statement can be safely parallelized. Each execution of the function does not depend on the next one. The implementation could parallelize this statement on its own, effectively breaking it from O(n) to O(n/p), where p is the number of concurrent threads doing the processing.
The problem is that Perl provides no way to guarantee that a given piece of code has no side effects. If perl tries to parallelize a statement that has side effects, then the result is non-deterministic. Programmers don't like non-determinism.
Thus, we need a way to tell perl that we guarantee that there are no side effects in an expression (and it's on our head if we break that promise). That's where the threadsafe pragma would come in. This pragma would be lexically scoped:
This means the programmer has fined-grained control over what constructs are marked safe to parellelize. This is important so that safe and unsafe expressions can be in the same source code, and is also useful when there is no practical benefit to parallelization (as in the LISP example above).
Multi-core processors are becoming increasingly popular. Within a few years, even $300 computers at Best Buy will likely have such a beast. Languages and Programmers must start equipping themselves now to make use of these systems.
"There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.