I want to pick up on the serialisation point.
In my work, many tasks take the form "read, transform, transform, transform, write" on quite large datasets. The read and write are always I/O-bound, but the transforms can be CPU-bound. I'd like to run the subtasks in parallel to speed up execution, but it isn't worth forking processes for the transformations: the cost of serialising and deserialising the data between processes is too high.
What would help is a threading model like fork, except that instead of communicating over standard I/O channels, the threads/processes could expose native data structures for direct reading and/or writing by their siblings.
Is there any facility like this in existence?
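To make the question concrete, here is a rough sketch of the kind of sharing I mean, using Python's standard `multiprocessing.shared_memory` module (Python 3.8+) as a stand-in. It assumes a POSIX system where the "fork" start method is available, and the names `double_in_place` and `run_pipeline` are made up for illustration. It only covers a flat buffer of doubles, not arbitrary native data structures, so it is a partial illustration rather than the facility I'm asking about.

```python
import multiprocessing
from multiprocessing import shared_memory

ITEM_SIZE = 8  # bytes per C double, matching the "d" format used below


def double_in_place(shm_name, count):
    """Hypothetical transform step: doubles each value directly in shared memory."""
    shm = shared_memory.SharedMemory(name=shm_name)  # attach to the existing block by name
    try:
        # View the raw bytes as doubles; no pickling or copying takes place.
        with shm.buf[:count * ITEM_SIZE] as raw, raw.cast("d") as view:
            for i in range(count):
                view[i] *= 2.0
    finally:
        shm.close()


def run_pipeline(values):
    """Hypothetical driver: load values, let a sibling process transform them, read back.

    Assumes `values` is a non-empty sequence of floats.
    """
    count = len(values)
    shm = shared_memory.SharedMemory(create=True, size=count * ITEM_SIZE)
    try:
        # "Read" step: place the data in the shared block.
        with shm.buf[:count * ITEM_SIZE] as raw, raw.cast("d") as view:
            for i, value in enumerate(values):
                view[i] = value

        # "Transform" step: a forked sibling works on the same memory.
        ctx = multiprocessing.get_context("fork")  # assumption: POSIX only
        worker = ctx.Process(target=double_in_place, args=(shm.name, count))
        worker.start()
        worker.join()

        # "Write" step: the parent sees the sibling's changes without deserialising.
        with shm.buf[:count * ITEM_SIZE] as raw, raw.cast("d") as view:
            return list(view)
    finally:
        shm.close()
        shm.unlink()  # release the block once nobody needs it


if __name__ == "__main__":
    print(run_pipeline([1.0, 2.0, 3.0]))  # prints [2.0, 4.0, 6.0]
```

Only the block's name and the element count cross the process boundary; the data itself stays put. The catch is that everything has to be laid out as a fixed-format buffer, which is exactly the limitation I'd like to get away from.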