PerlMonks  

Re: Reliable asynchronous processing

by BrowserUk (Patriarch)
on Jul 07, 2005 at 22:34 UTC ([id://473266])


in reply to Reliable asynchronous processing

Perl threads are not ready for Prime Time.

What does that mean? Who said it? And how will you know unless you try it?

Does it get easier? Of course, the devil is in the details, but then, you didn't give us any.

#! perl -slw
use strict;
use threads;
use Thread::Queue;
use Data::Dumper;

our $QMAX ||= 1000;
our $TMAX ||= 3;
our $N    ||= 1000000;

my $Q = new Thread::Queue;

sub thread {
    my $tid = threads->self->tid;
    for( 1 .. $N ) {
        $Q->enqueue( join ':', $tid, int rand( 10 ) );
        select undef, undef, undef, 0.01 while $Q->pending > $QMAX;
    }
    $Q->enqueue( undef );
}

threads->new( \&thread )->detach for 1 .. $TMAX;

my %collate;
for ( 1 .. $TMAX ) {
    while( my $data = $Q->dequeue() ) {
        my( $src, $value ) = split ':', $data;
        $collate{ $src }{ $value }++;
    }
}

print Dumper \%collate;

Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
The "good enough" maybe good enough for the now, and perfection maybe unobtainable, but that should not preclude us from striving for perfection, when time, circumstance or desire allow.

Replies are listed 'Best First'.
Re^2: Reliable asynchronous processing
by cmeyer (Pilgrim) on Jul 09, 2005 at 00:53 UTC
    Perl threads are not ready for Prime Time.
    What does that mean? Who said it? And how will you know unless you try it?

    threads::shared says it:

    BUGS

    bless is not supported on shared references. In the current version, bless will only bless the thread local reference and the blessing will not propagate to the other threads. This is expected to be implemented in a future version of Perl.

    Does not support splice on arrays!

    Taking references to the elements of shared arrays and hashes does not autovivify the elements, and neither does slicing a shared array/hash over non-existent indices/keys autovivify the elements.

    share() allows you to share $hashref->{key} without giving any error message. But the $hashref->{key} is not shared, causing the error "locking can only be used on shared values" to occur when you attempt to lock $hasref->{key}.

    DBI says it:

    Threads and Thread Safety (...) Using DBI with perl threads is not yet recommended for production environments. For more information see <http://www.perlmonks.org/index.pl?node_id=288022>

    liz says it on PerlMonks: 288022 (the node referenced by DBI above).

    rt.perl.org says it.

    Considering those references, I wouldn't feel comfortable using Perl ithreads in an application designed to accommodate high transaction volumes. I understand that there have been a lot of fixes to ithreads since 5.8.1 (current at the time of liz's post), but there's still no COW. That alone makes it very undesirable.

    -Colin.

    WHITEPAGES.COM | INC

      Let's take those one at a time.

      • Liz's node says: "Perl's ithreads are not light.".

        That's true, but then they could not be so.

        It's like saying that trucks weigh more than cars. It's true, but that doesn't stop trucks being useful or usable. It just says that you shouldn't pretend you are driving a car when you're driving a truck.

        Many of the other issues Liz raises are true limitations of iThreads, and indeed, she provides several modules that can be used to make many of these problems less apparent.

        But as you rightly point out, Liz's node was written at the time of 5.8.1, which, as those of us who have been following along know, was the very worst build for threads problems ever. It was worse than its predecessor and was very rapidly superseded by 5.8.2, which went a long way to fixing many of the problems it introduced.

        Basing your judgment on Liz's post, is like saying "Perl can't do structured data", based on the docs for Perl4--it is wildly out of date.

        Since then, there have been several more builds, each of which has cleaned up outstanding bugs further until, in my estimation, Perl's threads are stable. That is, they (mostly) comply with their "specification".

        Note: That does not by any means say:

        1. They are perfect--nothing ever is!
        2. They are the easiest thing in the world to use--their specification unfortunately guarantees that they cannot be!
        3. That they are guaranteed bug-free--but nor is anything else in Perl.

        Many of the other issues that Liz raised in that article are non-issues when you stop expecting threads to act like forked processes.

        Threads and forks are different.

        Many of the issues Liz raises come directly from the fact that ithreads have been designed to work in a fork-like manner, i.e. duplicating all existing code and data at the point of thread creation.

        Indeed, if this were not done, many of the issues with ithreads--including the perceived need for COW--would disappear!

        If threads->create( <coderef>, ... ); simply created a new thread running a new interpreter running the coderef supplied, and left the programmer to decide what needed to be loaded into that interpreter and shared with that new interpreter, most of the issues would not exist.

        With the greatest respect to Liz, her Things you need to know before programming Perl ithreads has completely undone ithreads because it continues to be used as the de rigueur reason for not using threads, which means no-one uses them, which means the issues with them never get addressed and little or nothing happens by way of improvement. It's a vicious circle that leads to the next issue.

      • Using DBI with perl threads is not yet recommended for production environments. For more information see <http://www.perlmonks.org/index.pl?node_id=288022>

        Notice how it references the same, out-of-date information.

        Indeed, in my (admittedly limited) experience, there is no problem using DBI in conjunction with iThreads provided you use DBI from one thread only.

        The same is true of many other things like Tk.

        And whilst that may seem like a major restriction, in practice, using one thread to take care of your DBI interactions and another thread to maintain your user interface, whether a GUI or HTTP, is a very sane way to structure your application. In fact, even if both Tk and DBI could guarantee thread-safety on all platforms and with all DBs, and all DB interface libraries--which is unlikely to ever be true--I'd still recommend using separate threads for each anyway.

        And that brings me to:

      • rt.perl.org

        If you look at those outstanding tickets, several of them, including the first half dozen on the list, are issues to do with 5005threads, and have literally nothing to do with ithreads.

        It is also possible to produce a list of outstanding bugs for many other areas of Perl, but that isn't stopping those features from being used in production.

      • Finally, there are the limitations described in threads::shared.

        Most of these restrictions could be lifted.

        And if there was more demand for them, they quite probably would have been lifted by now. The skills of the p5p guys are certainly up to the task as most of them are not that complicated, but without demand, there is little incentive for the work to happen.
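        As an illustration of working within the threads::shared restriction quoted above (share() on $hashref->{key} not really sharing it), here is a minimal sketch. It assumes a perl built with ithreads; the names %stats, hits and worker are made up for the example. The nested hash is created as a shared reference with &share({}) before it is stored, so locking it behaves as expected.

        ```perl
        use strict;
        use warnings;
        use threads;
        use threads::shared;

        # A shared top-level hash. Nested containers are NOT shared
        # automatically; each one has to be created as a shared reference
        # before it is stored (storing a plain {} here is an error).
        my %stats : shared;
        $stats{hits} = &share( {} );

        sub worker {
            my $id   = shift;
            my $hits = $stats{hits};     # reference to the shared inner hash
            for ( 1 .. 1000 ) {
                lock( %$hits );          # legal because the inner hash is shared
                $hits->{ "t$id" }++;
            }
        }

        my @workers = map { threads->create( \&worker, $_ ) } 1 .. 3;
        $_->join for @workers;

        print "$_ => $stats{hits}{$_}\n" for sort keys %{ $stats{hits} };
        ```

        Each worker only ever touches its own key, so the lock is there to show that locking the nested hash works rather than because this particular loop strictly needs it.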

      Again, none of that means that iThreads is the perfect API or that I wouldn't like to change things--but then there are many other things in Perl5 that are less than perfect: IO state in globals, the object model, syntax inconsistencies etc. None of these things prevents Perl being usable in production environments to good effect. It simply means that you have to work within and around them.

      I think iThreads and their restrictions are the same. Work with them and they can greatly simplify many programming problems that are awkward, messy, non-intuitive and a maintenance headache to deal with using the alternatives.
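      The "use DBI from one thread only" arrangement described above might look something like the sketch below. It is only an outline under stated assumptions: fake_query is a hypothetical stand-in for real DBI calls (no database is touched), and the queue layout (one request queue, one reply queue per client) is just one possible design.

      ```perl
      use strict;
      use warnings;
      use threads;
      use Thread::Queue;

      my $CLIENTS = 3;

      # One request queue into the owner thread, one reply queue per client.
      my $requests = Thread::Queue->new;
      my @replies  = map { Thread::Queue->new } 1 .. $CLIENTS;

      # Hypothetical stand-in for a database call. A real program would
      # connect with DBI inside owner() and run its queries there.
      sub fake_query { my $key = shift; return uc $key }

      # The only thread that ever touches the "database".
      sub owner {
          while ( defined( my $req = $requests->dequeue ) ) {
              my ( $client, $key ) = split /:/, $req, 2;
              $replies[$client]->enqueue( fake_query($key) );
          }
      }

      # Clients send "id:key" strings and block until their answer arrives.
      sub client {
          my $id = shift;
          $requests->enqueue("$id:row$id");
          return $replies[$id]->dequeue;
      }

      my $db      = threads->create( \&owner );
      my @clients = map { threads->create( \&client, $_ ) } 0 .. $CLIENTS - 1;
      my @answers = map { $_->join } @clients;

      $requests->enqueue(undef);    # undef tells the owner to stop
      $db->join;

      print "$_\n" for @answers;
      ```

      Requests are passed as plain strings so that nothing but scalars ever crosses a thread boundary.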



        I don't mean to be incendiary. I am interested in Perl's ithreads, and wish that they were a viable option. I've played around with threads in 5.6.x, was quite excited at the prospect of them becoming better with 5.8.x and played some more. I've read (as a lurker on p5p) the messages from people like Arthur, Liz and Stas working to make threads useful.

        However, I feel that Liz's post is still very appropriate, and that's why I bring it up. Perl's ithreads have had many fixes applied to them, but the problems pointed out in that post (all data structures are copied, shared variables take up extra memory) are still true.

        codon's update says that he needs to cache a large amount of data in memory. Having long-lived threads limits the CPU hit incurred by copying that cache. But the lack of either COW or truly shared memory limits the number of threads that can be running simultaneously to (ramsize - os overhead) / (cache + thread overhead). This is a big hit for memory-intensive applications.
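        To put rough numbers on that bound, here is a small worked example. Every figure in it is hypothetical and chosen only to show the arithmetic; none of them comes from codon's application.

        ```perl
        use strict;
        use warnings;

        # Illustrative figures only (all hypothetical), in megabytes.
        my $ram_mb             = 4096;    # physical memory
        my $os_overhead_mb     = 512;     # OS and everything else on the box
        my $cache_mb           = 300;     # per-thread copy of the cache
        my $thread_overhead_mb = 20;      # cloned interpreter per thread

        # (ramsize - os overhead) / (cache + thread overhead), rounded down
        my $max_threads = int(
            ( $ram_mb - $os_overhead_mb ) / ( $cache_mb + $thread_overhead_mb )
        );

        print "at most $max_threads threads fit in memory\n";    # 3584 / 320 -> 11
        ```

        With these made-up figures the machine tops out at 11 threads, and the per-thread cache copy accounts for nearly all of each thread's footprint.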

        Basing your judgment on Liz's post, is like saying "Perl can't do structured data", based on the docs for Perl4--it is wildly out of date.
        That's quite an exaggeration. The difference in Perl data structures between 4 and 5 is dimensional. The difference in Perl's ithreads between 5.8.1 and 5.8.7 fixes a ton of memory leaks and crashes, but they constitute no paradigm shift.

        By recent talk on p5p, I'd guess that Liz's future speculations remain as good a prediction as ever (don't expect better threads, as far as memory usage goes, until ponie/parrot/p6). Things will continue to improve for 5.10.x, but I don't see it being likely that we'll get COW or real shared data.

        -Colin.

