
what's the history behind perl not having "real" threads

by perl-diddler (Hermit)
on Feb 25, 2013 at 07:36 UTC ( #1020469=perlquestion )
perl-diddler has asked for the wisdom of the Perl Monks concerning the following question:

First I need to define what I mean... because people might point at ithreads and say it has thread support.

But I refer to the manpage 'perl' that describes perl5 (when it was new) as having "lightweight" threads.

Lightweight means they are less than a separate process (to some level). At the very least, though, it would mean that they are NOT completely separate copies of the perl interpreter, which is what we have in the current implementation (perlthrtut):

In Perl interpreter threads, each 'thread' runs in its own Perl interpreter and any data sharing between threads must be explicit.
More clearly it is stated that:
Perl Threads are Not X Threads for all values of X.

It sounds like there was an earlier version of threads in 5.005, that is referred to as having problems... was it that version that tried to use 'lightweight' threads?

Was the problem that it was too hard to make the core interpreter thread-safe? Or was it something else?

I think about this only because sometimes I look at code and think about how this or that could be run in parallel, but if it has to run in a separate process, the overhead of setting up and switching into that process would negate any benefit from such low-level optimizations.

Sure, there are many applications where the overhead of creating separate processes is far outweighed by the work done in each process -- so that doesn't mean the concept is worthless or fatally flawed, just that the "bar" for them to be useful is higher -- especially combined with the inability to share anonymous data between them.

I mean, if you write object oriented code and want a class to provide an object for you, how are you going to specify that object such that it can be shared without jumping through hoops and restrictions?

Maybe it's a trade off between perl's flexibility and power (and ability for data to be interchangeably made program and vice-versa) that creates extra pitfalls in this area?

Just wondering if anyone knew the history of this area?




Replies are listed 'Best First'.
Re: what's the history behind perl not having "real" threads
by dave_the_m (Prior) on Feb 25, 2013 at 12:24 UTC
    Perl has had two threading implementations: "5.005 threads" and "interpreter threads". They both use the OS's underlying threading facilities; they differ in whether perl data structures are shared by default.

    5.005 threads (introduced with perl 5.005) by default shared all data and data structures. This turned out to be almost impossible to make thread-safe, since almost any perl-level "read" operation can actually end up modifying an SV (scalar value). For example:

    my $x = 1;
    print $x;   # whoops, $x has been modified: converted from int to string
    $y = \$x;   # whoops, $x has been modified: its ref count has increased

    To get this to work right would involve locking before just about any operation. So that threads model was abandoned.
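The point about "reads" that write can be observed from Perl itself. The following is only a minimal sketch, not anything from the original post: it uses the core `B` module to peek at a scalar's internal flag bits and reference count, and the helper names `sv_flags` and `sv_refcnt` are made up for illustration. The exact flag values vary between perl versions, so the sketch only checks that they change.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helpers: inspect the SV behind a reference.
sub sv_flags  { B::svref_2object($_[0])->FLAGS }    # raw type/flag bits
sub sv_refcnt { B::svref_2object($_[0])->REFCNT }   # current reference count

my $x = 1;

# A "read": stringifying $x (as print would) caches the string form in the SV.
my $flags_before = sv_flags(\$x);
my $as_string    = "$x";
my $flags_after  = sv_flags(\$x);

# Another "read": taking a reference bumps the reference count of $x.
my $refs_before = sv_refcnt(\$x);
my $y           = \$x;
my $refs_after  = sv_refcnt(\$x);

printf "flags changed by stringification: %s\n",
    ($flags_before != $flags_after ? "yes" : "no");
printf "refcount grew after taking a reference: %s\n",
    ($refs_after > $refs_before ? "yes" : "no");
```

With every scalar shared between threads, each of these hidden writes would be a potential race, which is why locking would have been needed around almost every operation.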

    There was a separate effort to allow fork emulation under Windows (which doesn't support fork()). This worked by collecting all perl's state into a single interpreter struct and allocating all SVs from per-interpreter pools. When fork() was called, the interpreter and all its SVs etc etc would be copied, and a new thread created which ran using that new data. So each "process" (actually just a thread) had its complete own copy of everything and could run independently without affecting any other threads; no (or very little) locking required. This first appeared with 5.6.

    Then someone had the idea of exposing this interface at the perl level (rather than just via fork() under Windows). Thus was born the threads module, which did a similar thing to the fork (cloned the current state), but started the new thread with fresh code rather than running from the same point as the caller (a la fork()). Someone also added threads::shared, which, via a mechanism similar to tying, allowed data structures to be shared across threads. These came out with 5.8.
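A small sketch of what that model looks like from user code, assuming a perl built with ithreads support (`use threads` fails otherwise). The variable names are illustrative only. An ordinary variable is cloned into the new thread, so changes made there never reach the parent, while a variable marked `:shared` via threads::shared is visible to both.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my $plain = 0;              # cloned: each thread gets its own private copy
my $shared :shared = 0;     # explicitly shared between threads

my $thr = threads->create(sub {
    $plain++;               # modifies only this thread's copy
    lock($shared);          # lock is held until the end of this block
    $shared++;              # visible to the parent thread
    return $plain;          # hand the thread-local value back via join
});

my $plain_in_thread = $thr->join;

print "parent \$plain=$plain, thread \$plain=$plain_in_thread, \$shared=$shared\n";
```

The explicit `:shared` marking is the "data sharing must be explicit" part of the model: nothing is shared unless the code says so.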


      dave_the_m wrote:
      5.005 threads (introduced with perl 5.005) by default shared all data and data structures. This turned out to be almost impossible to make thread-safe, since almost any perl-level "read" operation can actually end up modifying an SV (scalar value).
      Is this required by the language, or, evolving from your examples:
      my $x = 1;
      print $x;           # don't care if modified!
      our $package_x;
      print $package_x;   # Now I care!
      I wouldn't see a simple 'my' var as needing sharing, unless you take a reference to it..., package vars might be ideal for something like a Fortran COMMON section, if I remember what I'm talking about... i.e. GLOBAL vars/package (that would be shared).

      But if I print $x, does it have to modify "$x"? Why not leave it alone and have print modify a copy -- it's not like it is being stored somewhere that a shared implementation might expect to be able to access its 'mutated form' ;-).

      As for your 3rd line, referring to the ref count, that's definitely something the interpreter would need to track, but it wouldn't be hard to implement on x86: as long as the counter is arch-word (32/64-bit) aligned, an inc/dec operation is atomic.

      The thing that is annoying about the current model is, from my understanding, the limitation on having to pre-declare something as shared or not -- which would, it seems, preclude using it with object oriented programming where specific objects could have global state (and need locking in the presence of multiple writers) -- but not multiple readers.

      But the good news, as I understand you to be saying, is that the current code uses native OS threads -- it's just that they don't share much [if any] data... that's slightly better than I thought it might be, given that under linux today, with a fork-exec, you can choose multiple levels of sharing, and code segments of compiled programs can automatically share the same code memory (presuming they weren't built statically).

      Thanks for the info....

        Is this required by the language
        There's nothing in the language that precludes a 5.005-style threading implementation. The difficulty was in retrospectively trying to make the existing implementation thread-safe, where it had never been designed for that possibility. This is one of the (many) reasons why it was concluded that a complete from-the-ground-up rewrite of the perl interpreter was required, i.e. perl6.

        The main drawbacks of the ithreads model are: that cloning the existing interpreter when creating a new thread is slow; that it uses lots of memory, since the new interpreter doesn't make any use of the OS facilities that a fork() would, of sharing memory by default with copy-on-write pages; and that having shared variables is slow, clunky and memory-heavy.


Re:Perl has real threads.
by BrowserUk (Pope) on Feb 25, 2013 at 09:59 UTC

    Perl does have real threads. The rest is irrelevant history not worth digging over.

    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
