Re^5: Reliable asynchronous processing
by BrowserUk (Pope)
on Jul 10, 2005 at 06:08 UTC
The difference in Perl's ithreads between 5.8.1 and 5.8.7 fixes a ton of memory leaks and crashes, but they constitute no paradigm shift.... Things will continue to improve for 5.10.x, but I don't see it being likely that we'll get COW or real shared data.
I also lurk and I agree that neither of these is likely to happen. Where I disagree is with the idea that either of these would help.
COW
This is only an advantage if most of the items in COW data segments are never written to.
Rather than everything being copied when a thread is spawned, it would either require:
COW is not magical. To quote wikipedia:
One hazard of COW in these contexts arises in multithreaded code, where the additional locking required for objects in different threads to safely share the same representation can easily outweigh the benefits of the approach.
Let's assume that COW was implemented using OS-level MMU detection. Sounds efficient, and for single-threaded processes, it is.
But now think about the implications for a threaded process. When a section of memory in a base thread is marked as COW, the MMU will have to find equivalently sized pieces of virtual address space within that process's memory map and reserve them, for every thread that can potentially write to that memory!
And remember, even if that virtual address space is never written to, none of the virtual address space can be reused for anything else until the reserving thread goes away.
You have 1 thread with 1 MB of 'shared' data. You create 10 more. You mark that 1 MB as COW. Each of the other threads has to have a 1 MB chunk of virtual memory reserved for it in case it should write to it. All those 11 1 MB chunks come out of the one process's memory map. Even if they are never written to, the 'potential space' still has to exist, and the OS still has to allocate and manage that space.
Now think about what happens when two threads start to write to their shared data chunks. Say they both decide to write to a single, different bit within the same page. At the point when they decide to write, the OS has to suspend the process (not thread) in order for it to locate a physical page of ram to back the virtual page that is written. And it has to do this for both threads. That is, the entire process gets suspended every time any thread writes to a piece of COW ram. And each time it is suspended, a 4k (or whatever) page of real ram has to be allocated and mapped into the process's virtual address space.
Even user-level read accesses frequently cause internals-level write accesses!
And that page is an arbitrary chunk of ram. It could cross the allocation of a Perl data structure. That is, the first part of the 4k could be the tail of a Perl array; the middle could be a few SVs; the end, the start of a large hash, or a string, or coderef. And this would happen the first time any bit is written: Any one of the flags in the SV headers for example, or when the IV or NV fields are updated to reflect the numerical value of the SV because, although the user code didn't actually write to the SV, it read it in a numerical context, and so Perl will write to the IV/NV field.
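To make the "read that writes" point concrete, here is a minimal sketch using the core B module to look at a scalar's flags before and after it is read in numeric context. The helper name has_cached_iv is mine, purely for illustration; the exact flags involved are an internals detail and may differ between Perl versions.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: true if the scalar behind $ref currently has
# the public IOK flag set, i.e. a valid cached integer value.
sub has_cached_iv {
    my ($ref) = @_;
    return ( B::svref_2object($ref)->FLAGS & B::SVf_IOK() ) ? 1 : 0;
}

my $s      = "42";                  # created as a string
my $before = has_cached_iv(\$s);
my $sum    = $s + 0;                # user code only *reads* $s, in numeric context
my $after  = has_cached_iv(\$s);

print "IOK before read: $before, after read: $after\n";
```

The user code never assigns to $s, yet the SV changes under the hood. Under a page-level COW scheme that is already enough to dirty the page the SV lives on.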
Perl will have to manage and coordinate these. I.e., whenever one thread causes an internal update to its copy of an SV, Perl will have to update all the other copies to reflect the change.
Even taking a reference will mean that every copy of the SV will need to have its reference count updated.
Contrast the ongoing costs of this piecemeal copying of pages of memory (often cutting across internals level data structures), with the one-off cost of the up-front copying of just those internals-level items that need to be copied!
Now think about the implications for locking. If you have two threads taking references to the same SV, Perl will have to ensure that the accesses to the reference count field are synchronised. Note: No thread has 'written' to the data, but it still had to be copied--on every thread.
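A small single-threaded sketch of the reference count point, again using the core B module. The helper name refcount is an assumption of mine. Only the change in the count is reported, because the absolute numbers include the temporary references created by the measurement itself.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: reference count of the scalar behind $ref.
# The value includes the references made while calling this helper,
# so only differences between two calls are meaningful.
sub refcount {
    my ($ref) = @_;
    return B::svref_2object($ref)->REFCNT;
}

my $data   = "never modified by user code";
my $before = refcount(\$data);
my $alias  = \$data;                # user code merely takes a reference
my $after  = refcount(\$data);
my $delta  = $after - $before;

print "reference count grew by $delta\n";
```

No user-level write to $data happened, but a field inside its SV header did change. That field is exactly the kind of thing that would need synchronising if several threads could see the same SV.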
As any write to a data structure, even those caused by user level reads to apparently non-shared user-level items of data, will cause the page that data item is in to be COWed, effectively, every data item must be treated as if it was shared.
Garbage collection and memory management
Now think about the implications for the garbage collector and memory management. Perl manages memory internally in terms of its own (fat) data structures. These cannot be easily aligned to 4k page boundaries, so every write will cause a multiplication of copies of not just the written-to data item itself, but also any other (possibly partial) data items that share the page with the written-to item. And there will be one copy for every thread that can potentially share that item. This makes for an n^2 problem when trying to delete an item from memory. You don't just delete in one place, but have to inspect every thread's memory looking for copies. And reference counting would need to synchronise the reference counts across all those copies.
COW does not work for threads, because after an item is COWed, all copies need to be maintained, in synchronisation, in perpetuity!
The upshot is, that whilst COW works for processes, it only does so because once something is copied, the copies become independent entities in separate processes, and they never need to be coordinated or synchronised.
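For contrast, a minimal sketch of the process case, assuming a Unix-like system where fork is a real fork (and typically backed by OS-level COW, which the script itself cannot observe). The pipe is only there so the parent can report what the child saw.

```perl
use strict;
use warnings;

my @data = (1 .. 5);

pipe(my $reader, my $writer) or die "pipe failed: $!";
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # Child: this write touches only the child's copy of @data.
    close $reader;
    $data[0] = 99;
    print {$writer} "$data[0]\n";
    close $writer;
    exit 0;
}

# Parent: read back what the child saw, then inspect our own copy.
close $writer;
chomp(my $child_saw = <$reader>);
close $reader;
waitpid($pid, 0);
my $parent_sees = $data[0];

print "child wrote $child_saw, parent still sees $parent_sees\n";
```

After the fork nobody has to reconcile the two copies of @data. They simply diverge, which is what lets COW be cheap for processes.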
Real shared data
This was tried (5005Threads), and was rapidly abandoned, and for very good reasons. If all data is shared, all the problems that apply to shared data under the COW model, affect everything, shared or not. And all of the overheads currently affecting only declared-shared data, then affect all data.
It means that every access, internals initiated and user initiated, to every piece of data--and remember that Perl code is data--will need to perform locking and synchronisation. This includes reference counting, tainting, IV/NV updates, deletions, extensions etc. Even stuff like blessing (and all other forms of magic), studying, the Boyer-Moore stuff, lvalue refs, pos, shared keys, autovivification, lazy delete, scalar range iterators, glob iterators, m//g iterators, hash iterators, IO iterators.
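Two of the items in that list are easy to observe from plain Perl. This is a sketch only: it shows that each and m//g keep their iteration state on the data being read, which is the state that would need locking if the data were visible to more than one thread.

```perl
use strict;
use warnings;

# Hash iterator: each() "reads" the hash but advances state stored in the hash.
my %h = (a => 1, b => 2, c => 3);
my ($first)  = each %h;
my ($second) = each %h;     # a different key, so the hash remembered where it was
keys %h;                    # void-context keys resets the stored iterator
my ($again)  = each %h;     # back to the first key

# m//g iterator: a scalar-context match records its position on the string.
my $str     = "aXbXc";
my $matched = $str =~ /X/g;
my $pos_after_match = pos($str);

print "each: $first, $second, then $again after reset\n";
print "pos after first /X/g match: $pos_after_match\n";
```

Neither operation assigns anything at the user level, yet both leave state behind on the data they read.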
See Perlguts Illustrated and take a look at the number of internal fields within Perl's datastructures that would all need to be maintained in synchronisation, to get a feel for the complexity of this task.
Every time Perl accessed any scalar, array, array item, array length, hash, hash keys etc. etc., from any thread, it would need to handle the possibility that that item was also being accessed from another thread. Even in a single-threaded application, it would--at minimum--need to check whether threads were enabled.
The costs and complexities involved in making shared data "really shared", make the idea a non-starter. Making all data shared is even less practical.
The bottom line for Perl5 is that the implementation of threading is unlikely to change in any significant way.
But that does not prevent the existing implementation, if stable, from being useful. It already is. It does require that you use a particular style of coding in order to make good use of threads. Most of that style involves trying to prevent Perl from implicitly sharing stuff that you don't want shared.
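As one illustration of how ithreads treat data today, here is a minimal sketch, assuming a perl built with ithreads (check with perl -V:useithreads). The variable names are made up for the example. Ordinary variables are copied into each new thread, and only data explicitly marked :shared is visible across threads and needs locking.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $private = 0;            # every new thread gets its own copy of this
my $shared :shared = 0;     # explicitly shared between all threads

my @workers = map {
    threads->create(sub {
        $private++;                         # changes this thread's copy only
        { lock($shared); $shared++; }       # visible everywhere, so it must be locked
        return $private;
    });
} 1 .. 4;

my @seen = map { $_->join } @workers;

print "main: private=$private shared=$shared; thread copies saw: @seen\n";
```

The less data that exists when a thread is created, the less there is to copy, and the fewer variables that are marked :shared, the less locking is needed. This is only one reading of the coding style, not a complete recipe.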
But until people start using threads in a wide variety of situations, the techniques needed to use them successfully will not evolve.
Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" may be good enough for the now, and perfection may be unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.