Re^4: Reliable asynchronous processing

by cmeyer (Pilgrim)
on Jul 10, 2005 at 01:19 UTC (#473719)

in reply to Re^3: Reliable asynchronous processing
in thread Reliable asynchronous processing

I don't mean to be incendiary. I am interested in Perl's ithreads, and wish that they were a viable option. I played around with threads in 5.6.x, was quite excited at the prospect of them becoming better with 5.8.x, and played some more. I've read (as a lurker on p5p) the messages from people like Arthur, Liz and Stas working to make threads useful.

However, I feel that Liz's post is still very appropriate, and that's why I bring it up. Perl's ithreads have had many fixes applied to them, but the problems pointed out in that post (all data structures are copied, shared variables take up extra memory) are still true.

codon's update says that he needs to cache a large amount of data in memory. Having long-lived threads limits the CPU hit incurred by copying that cache. But the lack of either COW or truly shared memory limits the number of threads that can be running simultaneously to (ramsize - os overhead) / (cache + thread overhead). This is a big hit for memory-intensive applications.
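
As a rough illustration of that bound, here is a minimal sketch in Python (used only for the arithmetic). The function name and all of the figures are made-up assumptions, not measurements from codon's application:

```python
MB = 1024 * 1024

def max_threads(ram_bytes, os_overhead_bytes, cache_bytes, thread_overhead_bytes):
    """Upper bound on concurrent threads when every thread holds its own copy of the cache."""
    usable = ram_bytes - os_overhead_bytes
    per_thread = cache_bytes + thread_overhead_bytes
    if usable <= 0 or per_thread <= 0:
        return 0
    # Only whole threads count, so round down.
    return usable // per_thread

# Hypothetical machine: 2 GB RAM, 512 MB taken by the OS,
# a 200 MB cache and 10 MB of other per-thread overhead.
limit = max_threads(2048 * MB, 512 * MB, 200 * MB, 10 * MB)
print(limit)  # 7
```

With these assumed numbers the cache dominates the per-thread cost, so doubling the RAM would only roughly double the thread count.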

"Basing your judgment on Liz's post, is like saying 'Perl can't do structured data', based on the docs for Perl4--it is wildly out of date."
That's quite an exaggeration. The difference in Perl data structures between 4 and 5 is dimensional. The difference in Perl's ithreads between 5.8.1 and 5.8.7 fixes a ton of memory leaks and crashes, but they constitute no paradigm shift.

Judging by recent talk on p5p, I'd guess that Liz's future speculations remain as good a prediction as ever (don't expect better threads, as far as memory usage goes, until ponie/parrot/p6). Things will continue to improve for 5.10.x, but I don't see it being likely that we'll get COW or real shared data.



Re^5: Reliable asynchronous processing
by BrowserUk (Pope) on Jul 10, 2005 at 06:08 UTC
    "The difference in Perl's ithreads between 5.8.1 and 5.8.7 fixes a ton of memory leaks and crashes, but they constitute no paradigm shift.... Things will continue to improve for 5.10.x, but I don't see it being likely that we'll get COW or real shared data."

    I also lurk and I agree that neither of these is likely to happen. However, where I disagree is that either of these would help.


    COW is only an advantage if most of the items in COW data segments are never written to.

    Rather than everything being copied when a thread is spawned, it would require either:

    • All memory in threaded builds to carry the overhead of 'write detection'--shared or not.

      This overhead would exist even when threads were not used.

    • All shared items to be marked as such at the point of thread spawn, and each block to be copied whenever any item within that block was written to.

    COW is not magical. To quote wikipedia:

    One hazard of COW in these contexts arises in multithreaded code, where the additional locking required for objects in different threads to safely share the same representation can easily outweigh the benefits of the approach.

    MMU COW

    Let's assume that COW was implemented using OS-level MMU detection. That sounds efficient, and for single-threaded processes, it is.

    But now think about the implications for a threaded process. When a section of memory in a base thread is marked as COW, the MMU will have to find equivalent-sized pieces of virtual address space within that process's memory map and reserve them, for every thread that can potentially write to that memory!

    And remember, even if that virtual address space is never written to, none of the virtual address space can be reused for anything else until the reserving thread goes away.

    You have 1 thread with 1 MB of 'shared' data. You create 10 more. You mark that 1 MB as COW. Each of the other threads has to have a 1 MB chunk of virtual memory reserved for it in case it should write to it. All those 11 1 MB chunks come out of the one process's memory map. Even if they are never written to, the 'potential space' still has to exist, and the OS still has to allocate and manage that space.

    Now think about what happens when two threads start to write to their shared data chunks. Say they both decide to write to a single, different bit within the same page. At the point when they decide to write, the OS has to suspend the process (not thread) in order to locate a physical page of RAM to back the virtual page that is written. And it has to do this for both threads. That is, the entire process gets suspended every time any thread writes to a piece of COW RAM. And each time it is suspended, a 4k (or whatever) page of real RAM has to be allocated and mapped into the process's virtual address space.

    Even user-level read accesses frequently cause internal level write accesses!

    And that page is an arbitrary chunk of RAM. It could cut across the allocation of a Perl data structure. That is, the first part of the 4k could be the tail of a Perl array; the middle could be a few SVs; the end, the start of a large hash, or a string, or a coderef. And this would happen the first time any bit is written: any one of the flags in the SV headers, for example, or when the IV or NV fields are updated to reflect the numerical value of the SV because, although the user code didn't actually write to the SV, it read it in a numerical context, and so Perl will write to the IV/NV field.
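
    To make the page-granularity point concrete, here is a toy model in Python. It is not how Perl or any OS implements COW; the CowView class, the 4096-byte page size and the ten simulated threads are all assumptions for illustration. It only shows that a one-byte write costs a whole-page copy for each writer:

```python
PAGE_SIZE = 4096  # assumed page size in bytes

class CowView:
    """One thread's view of memory shared copy-on-write at page granularity (toy model)."""

    def __init__(self, shared_pages):
        self.shared = shared_pages  # list of bytearray pages, never modified through a view
        self.private = {}           # page index -> this view's private copy

    def read(self, addr):
        page, offset = divmod(addr, PAGE_SIZE)
        return self.private.get(page, self.shared[page])[offset]

    def write(self, addr, value):
        page, offset = divmod(addr, PAGE_SIZE)
        if page not in self.private:
            # First write to this page: the whole page is copied, not just one byte.
            self.private[page] = bytearray(self.shared[page])
        self.private[page][offset] = value

    def copied_bytes(self):
        return len(self.private) * PAGE_SIZE

# 1 MB of "shared" data seen by ten simulated threads.
shared = [bytearray(PAGE_SIZE) for _ in range(256)]
views = [CowView(shared) for _ in range(10)]

# Each thread changes a single, different byte within the same page.
for i, view in enumerate(views):
    view.write(i, 1)

total = sum(view.copied_bytes() for view in views)
print(total)  # 40960 bytes copied for 10 bytes actually changed
```

    In this model a read never copies anything; the argument above is that in Perl a user-level read can still trigger an internal write, which would take the write path here.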

    Perl will have to manage and coordinate these. I.e., whenever one thread causes an internal update to its copy of an SV, Perl will have to update all the other copies to reflect the change.

    Even taking a reference will mean that every copy of the SV needs to have its reference count updated.

    Contrast the ongoing costs of this piecemeal copying of pages of memory (often cutting across internals level data structures), with the one-off cost of the up-front copying of just those internals-level items that need to be copied!


    Now think about the implications for locking. If you have two threads taking references to the same SV, Perl will have to ensure that the accesses to the reference count field are synchronised. Note: No thread has 'written' to the data, but it still had to be copied--on every thread.
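
    The synchronisation cost can be sketched with a toy shared counter in Python. SharedScalar and take_ref are hypothetical names, not Perl internals; the point is only that every reference taken from any thread has to pass through a lock, even though no thread changes the value itself:

```python
import threading

class SharedScalar:
    """Toy stand-in for a scalar whose reference count is shared between threads."""

    def __init__(self, value):
        self.value = value
        self.refcount = 1
        self._lock = threading.Lock()

    def take_ref(self):
        # The value is only read, but the bookkeeping field is written,
        # so the update has to be serialised across threads.
        with self._lock:
            self.refcount += 1
        return self

def worker(scalar, count):
    for _ in range(count):
        scalar.take_ref()

sv = SharedScalar(42)
threads = [threading.Thread(target=worker, args=(sv, 10000)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sv.refcount)  # 20001
```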

    Because any write to a data structure, even one caused by a user-level read of an apparently non-shared item of data, will cause the page containing that item to be COWed, every data item must effectively be treated as if it were shared.

    Garbage collection and memory management

    Now think about the implications for the garbage collector and memory management. Perl manages memory internally in terms of its own (fat) data structures. These cannot be easily aligned to 4 KB page boundaries, so every write will cause a multiplication of copies of not just the written-to data item itself, but also any other (possibly partial) data items that share the page with the written-to item. And there will be one copy for every thread that can potentially share that item. This makes for an n^2 problem when trying to delete an item from memory. You don't just delete in one place, but have to inspect every thread's memory looking for copies. And reference counting would need to synchronise the reference counts across all those copies.

    COW does not work for threads, because after an item is COWed, all copies need to be maintained, in synchronisation, in perpetuity!

    The upshot is that whilst COW works for processes, it only does so because once something is copied, the copies become independent entities in separate processes, and they never need to be coordinated or synchronised.

    Real shared data

    This was tried (5005Threads), and was rapidly abandoned, for very good reasons. If all data is shared, all the problems that apply to shared data under the COW model affect everything, shared or not. And all of the overheads currently affecting only declared-shared data then affect all data.

    It means that every access, internals-initiated and user-initiated, to every piece of data--and remember that Perl code is data--will need to perform locking and synchronisation. This includes reference counting, tainting, IV/NV updates, deletions, extensions etc. Even stuff like blessing (and all other forms of magic), studying, the Boyer-Moore stuff, lvalue refs, pos, shared keys, autovivification, lazy delete, scalar range iterators, glob iterators, m//g iterators, hash iterators, IO iterators.

    See Perlguts Illustrated and take a look at the number of internal fields within Perl's data structures that would all need to be maintained in synchronisation, to get a feel for the complexity of this task.

    Every time Perl accessed any scalar, array, array item, array length, hash, hash keys etc., from any thread, it would need to handle the possibility that that item was also being accessed from another thread. Even in a single-threaded application, it would--at minimum--need to check whether threads were enabled.

    The costs and complexities involved in making shared data "really shared" make the idea a non-starter. Making all data shared is even less practical.

    The bottom line for Perl5 is that the implementation of threading is unlikely to change in any significant way.

    But that does not preclude the existing implementation, if stable, from being useful. It already is. It does require that you use a particular style of coding in order to make good use of threads. Most of the techniques involve trying to prevent Perl from implicitly sharing stuff that you don't want shared.

    But until people start using threads in a wide variety of situations, the techniques needed to use them successfully will not evolve.

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.
