http://www.perlmonks.org?node_id=867637


in reply to Re^11: Utter FUD!
in thread is ||= threadsafe?

By common definition of threads,

In C yes. How about Erlang? Or Haskell? Or Java? Or.... Why should Perl alone have to copy C?

iThreads are not simple Kernel threads. They could not be so in an interpreted language with fat variables.

But neither are they new processes--fork's defining characteristic--so any allusion to that is simplistic, inaccurate and deliberately misleading.

You might wanna get off of those coat-tails....


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re^13: Utter FUD!
by ikegami (Patriarch) on Oct 28, 2010 at 00:03 UTC

    Java, yes. C++, yes. Modula-2, yes. I don't know Erlang and Haskell.

      Erlang "threads" are also called Erlang "processes" and they neither have shared data/state nor wholesale copies of data. I don't often see them referred to as just "threads", probably because they aren't much like C, nor C++, nor Java, nor Modula-2, nor Visual BASIC, nor Unix kernel threads nor Win32 native threads (nor are they much like iThreads nor fork).

      You can create and destroy Erlang "processes" similar to what one might be used to with threads in a procedural language (since Erlang is functional but also is all about interfacing with things in the Real World).

      Haskell has more emphasis on the "functional" (the meaning from computing theory, not plain English) and less on the "interfacing" and so "threads" there tend to be more just purely useful for using more CPUs at once in hopes of finishing sooner and you don't so much control the "threads" as you declare where the compiler is allowed to make use of threads for you.

      Purely functional programs don't have a traditional flow, traditional "blocking" operations, nor variables much less shared variables, so traditional threads just don't really apply much.

      A lot of times you'll see things called "threads" with some qualifier(s) and then described as "not the same as 'vanilla' threads" and then called "light-weight processes". Unfortunately, you can't call iThreads "light-weight processes" since in some significant ways they weigh more than vanilla processes.

      So, iThreads are actually more like fork than like any of these things that are sometimes called "threads" in other languages. And the things that aren't pretty much exactly like C threads and Unix kernel threads don't tend to get called just plain "threads" much, IME.

      Though, their emulation of the lion's share of work done by fork() (copy of the majority of the process, not the myriad bookkeeping bits like setting the program counter or assigning PIDs, etc.) is significantly less efficient than fork().

      iThreads' copying used 10x more CPU in the trivial case. It was trivial to create some data and make iThreads' copying use 100x the CPU of fork(). Even when I pessimized for fork() by modifying all of the initial data so it all got unshared, iThreads still used just over 70x more CPU.
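
      A rough way to see the shape of that comparison, as a sketch only: the Python below times fork() (copy-on-write, Unix only) against an eager deep copy of the same data. The deep copy is an assumed stand-in for a clone-everything thread start, not iThreads itself, and the helper names time_fork and time_eager_copy are made up for illustration. It makes no attempt to reproduce the 10x/100x/70x figures above.

```python
import copy
import os
import time


def time_fork():
    """Wall-clock seconds to fork a child that exits at once, then reap it."""
    start = time.perf_counter()
    pid = os.fork()
    if pid == 0:
        os._exit(0)  # child: leave immediately without touching any data
    os.waitpid(pid, 0)  # parent: wait for the child so no zombie is left
    return time.perf_counter() - start


def time_eager_copy(data):
    """Wall-clock seconds to copy every element of data up front."""
    start = time.perf_counter()
    copy.deepcopy(data)
    return time.perf_counter() - start


if __name__ == "__main__":
    data = {i: [str(i)] * 10 for i in range(100_000)}
    print("fork:      ", time_fork())
    print("eager copy:", time_eager_copy(data))
```

      The fork() time should stay roughly flat as the data grows, while the eager copy grows with the amount of data; the exact ratio depends on the machine and the data.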

      Comparing memory usage is not trivial so I didn't try to come up with any numbers to compare that.

      But that only applies to (part of) why I don't use iThreads in Unix. I look forward to trying to use iThreads again under Windows.

      The name iThreads has probably discouraged use of the technology. I find many eschew threads, often in a rather stark "threads vs fork" mindset. Well, iThreads have more in common with multi-tasking via fork than with traditional multi-tasking via threads, so an ardent "fork not threads!" stancer should well consider iThreads, certainly before threads.

      I tend to focus more on the details of communication between the parts (solid interfaces lead to solid systems) and so don't tend to reach for the convenient "share a few variables willy-nilly" framework. But iThreads have advantages and can be used effectively even in Unix (yes, you usually need to be aware of their disadvantages; for example, don't spawn a new thread for each little task).

      - tye        

        Erlang "threads" are also called Erlang "processes" and they neither have shared data/state nor wholesale copies of data.

        Scant, simplistic and largely inaccurate since the release of the R11B in 2006. Rather more inaccurate since the release of R13B.

        Erlang "processes" and Erlang "threads" are entirely different beasts. Indeed, there is no such concept as an Erlang thread as such. And, like Java green threads, (and Coros) Erlang processes were (and still are, but I'll get back to that), entirely user-space entities and as such are neither processes nor threads in the conventional (OS) sense.

        However, circa 2004, the lack of SMP scalability was recognised as a significant limitation, and development was started to address that, which culminated in the R11* releases of the VM. The approach taken was to start one (kernel) thread per core, feeding off a single shared (note that word) queue (and that one also). Each thread is a separate interpreter that takes messages off of the shared queue and executes them until they either a) finish; b) block; c) error.
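
        The general shape of "one worker per core feeding off a single shared queue" can be sketched as below. This is a Python illustration of the pattern only, not the Erlang VM's code; run_pool and its parameters are hypothetical names, the "block" case is not modelled, and Python's GIL means these workers do not run bytecode in parallel.

```python
import os
import queue
import threading


def run_pool(jobs, workers=None):
    """Run callables on a fixed pool of threads fed by one shared queue."""
    workers = workers or os.cpu_count() or 1
    shared = queue.Queue()  # the single shared queue; it locks internally
    results = {}
    results_lock = threading.Lock()

    def worker():
        while True:
            item = shared.get()
            if item is None:  # sentinel: no more work for this worker
                break
            job_id, fn = item
            try:
                value = ("ok", fn())  # a) the job finished
            except Exception as exc:  # c) the job errored; keep serving
                value = ("error", repr(exc))
            with results_lock:
                results[job_id] = value

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for job_id, fn in enumerate(jobs):
        shared.put((job_id, fn))
    for _ in threads:
        shared.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return [results[i] for i in range(len(jobs))]
```

        Every worker contends for the same queue lock on each get(), which is the drag on performance that a single shared queue brings with it.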

        Now it was quickly realised that the shared queue (and the associated locking) was a significant drag on performance, so having got it working, they set about improving the performance. To this end, they developed the R13B VM which uses separate queues for each interpreter, thus avoiding (some) of the lock contention. To achieve this, they had to add "process migration logic". That is Erlang "processes" not OS processes. And "migration logic", means moving "processes" to other queues if the current queue has more than some pre-configured maximum number of "runnable processes" (Again; Erlang "processes", not OS processes!).
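
        As a toy model of that migration rule (per-scheduler run queues, with work moved off any queue holding more than a configured maximum), something like the following would do. It is a deliberately simplified Python sketch with made-up names (migrate, max_runnable); the real VM's balancing logic is more involved and is not reproduced here.

```python
from collections import deque


def migrate(run_queues, max_runnable):
    """Move entries from queues longer than max_runnable to the shortest queue.

    Returns the number of entries moved.
    """
    moved = 0
    for q in run_queues:
        while len(q) > max_runnable:
            target = min(run_queues, key=len)
            if len(target) >= max_runnable:
                return moved  # every queue is at or over the limit
            target.append(q.pop())  # take from the tail of the long queue
            moved += 1
    return moved


if __name__ == "__main__":
    queues = [deque(range(10)), deque(), deque([100])]
    migrate(queues, max_runnable=4)
    print([len(q) for q in queues])
```

        Each scheduler then works from its own queue and only touches another queue's lock when a migration actually happens.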

        Now back to your "no wholesale copying of data". As Erlang is a functional language--with immutable variables--every time you send a message to a "process" that causes it to (for example) append a character to a string; or push to an array; or add, change or remove a key/value pair to a hash; or add, remove or (say) reverse the order of elements within a list; it (at least notionally) copies the entire data structure.

        Of course, we know that in reality such copying is impractical in the real world, and like (for example) Haskell, that notional immutability is enforced at the language level, but is done by "smoke&mirrors" at the implementation level. So, Erlang's "message queues" are basically, simply linked-lists of heap-allocated memory structures (as might be used in C (I wonder what language Erlang is implemented in?)). In other words--shared state at the OS level.
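
        One common form of that kind of "smoke&mirrors" is structural sharing: an "updated" immutable list is a new head cell pointing at the old, untouched list, so nothing is copied. The sketch below shows the generic technique in Python with hypothetical names (Cons, push, to_list); it makes no claim about how the Erlang VM lays out its data.

```python
from typing import NamedTuple


class Cons(NamedTuple):
    """One immutable list cell; tail is another Cons, or None for the end."""
    head: object
    tail: object


def push(lst, value):
    """Return a new list with value in front; lst itself is left unchanged."""
    return Cons(value, lst)


def to_list(lst):
    """Flatten a Cons chain into an ordinary Python list, head first."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out


if __name__ == "__main__":
    base = push(push(None, 1), 2)
    a = push(base, 3)
    b = push(base, 4)
    print(to_list(a), to_list(b), a.tail is b.tail)
```

        Both a and b look like independent lists to the program, yet they share every cell of base in memory.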

        And, should you doubt any of this, please download and read: this pdf

        Now, does any of that sound familiar?

        One thread per core. Queue(s) to facilitate communications. The absence of direct access to shared state. Internal locking.

        Does that sound anything like the iThreads model I've been talking about?

        I chose Erlang as one of my examples, because I happen to have made something of a study of it.

        So, iThreads are actually more like fork than like any of these things that are sometimes called "threads" in other languages.

        Congratulations on dropping the phrase "fork emulation". Threading in Erlang is quite different from threading in C. Why should threading in Perl have to be the same?

        And doesn't the above (or the pdf, if you bothered) sound a lot like the very type of thread-pool + queues mechanism I (amongst others) have been advocating here for years?

        I tend to focus more on the details of communication between the parts (solid interfaces lead to solid systems) and so don't tend to reach for the convenient "share a few variables willy-nilly" framework. But iThreads have advantages and can be used effectively even in Unix

        If I didn't know better, I'd suggest that we might be singing from the same song sheet--though perhaps with different accents.


      Java, yes. C++, yes. Modula-2, yes

      That's like picking Robin, Maurice & Barry to represent all the musicians in the world.

        Not so. Whoever they are, I doubt they've produced most of the music in the world, nor do they represent the formal definition of music.