PerlMonks  

Re: Parrot, threads & fears for the future.

by merlyn (Sage)
on Oct 23, 2006 at 12:09 UTC ( [id://580008] )


in reply to Parrot, threads & fears for the future.

I stopped reading here:
At the software level, very little existing software makes use of threading.
Shall I grep the Linux and Darwin and BSD user-level source programs that contain instances of fork or system or popen for you? I think "very little" should be changed to "lots of".

Don't equate hardware threading support with the software crappy "thread" thing. In unix, thread is spelled "f-o-r-k".

-- Randal L. Schwartz, Perl hacker
Be sure to read my standard disclaimer if this is a reply.



Replies are listed 'Best First'.
Re^2: Parrot, threads & fears for the future.
by Corion (Pope) on Oct 23, 2006 at 12:16 UTC

    You stopped reading too soon, then. There is much parallelism which could be exploited by having (say) Perl6 constructs be natively parallel, for example the hyperoperators, and grep/map.

    Also, having real, painless userspace "threads" (i.e. cooperative multitasking) would be a huge benefit too.

    Your reaction is exactly the point that BrowserUk is trying to make, or at least how I interpret it - actually and transparently using and supporting threads is an important asset for a programming language. Having to manually implement IPC - as you must if you want to keep using fork and still reap the benefits of parallelism - is a pattern, and hence a weakness in the language.

      You can only transparently parallelize map{} if the executed block is side-effect-free. Perl is far too dynamic for the compiler/optimizer to check that safely, and you might very well want side-effects in your map{}s, so you'd have to have two map{}s: a parallelizable one and a guaranteed-to-be-serial one. It's one thing to parallelize a purely functional, side-effect-free language, and another to parallelize something that may do whatever it bloody wishes. Everything comes at a price, even freedom.

        Sure - but I'm not talking about doing that for Perl5, but postulating it for Perl6, where such stuff is still possible.

        What if there was a pragma for enabling parallelization? It could be use parallel, in the spirit of use integer.

        my @list = ('aaaa' .. 'zzzz');

        use parallel 'map';    # Just for maps
        my @ordlist = map ord, @list;

        # Defaults only
        # Probably something like any, all, map, grep, and maybe even sort
        use parallel;
        # Sort by last letter in parallel
        @list = sort { ($a =~ /\w+(\w)/) <=> ($b =~ /\w+(\w)/) } @list;

        # This subroutine can operate in parallel
        use parallel 'foreach';
        foo($_) foreach @list;

        no parallel;           # disable everywhere else
Re^2: Parrot, threads & fears for the future.
by shmem (Chancellor) on Oct 23, 2006 at 12:20 UTC
    In unix, thread is spelled "f-o-r-k".

    That isn't true even after stripping "f-o-r-k" of the flame-inducing hyphens, and you know that. fork() means fully cloning a process, complete with environment, file handles, a slot in the process table and all, with no further sharing of resources between the two fully-fledged processes save the mechanisms suitable for any inter-process communication (IPC), whereas threads... well, you know. Or you don't. Sad you stopped reading there; it might have been an interesting read further down.

    --shmem

    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      No, son. Maybe I've been around a bit longer than you, but we were accomplishing everything that people are trying to do with threads with simple forks some 20 years ago. And the coding was much simpler, easier to debug, easier to maintain, easier to prove correct.

      If the answer is "threads", you asked the wrong question. Threads suck. Event-driven programs and processes are a much cleaner model, with a far better defect ratio.

      And a forked program cluster will use your two, four, and 20-way core boxes just fine. No need to introduce the bizarre complexity of threads.

      Using threads is like using globals... sure, you can do it, but it requires a lot more care. Using fork is like having everything local to its code and data, which is what you want in a large system anyway.

      -- Randal L. Schwartz, Perl hacker
      Be sure to read my standard disclaimer if this is a reply.

        Son? Hmm, well. I just happen to be some silly 11 days older and sillier than you {g}.

        You may have been in this business a bit longer than I (I came to computers off studying architecture around 1989), but I haven't been asleep in the meantime. I also remember the times when a good word processor on the Atari ST (RIP) was worth ~250 kBytes on disk; I had my first perl experience with v4.36, and was glad and proud of my simple coding and delighted with Gurusamy's port to the Atari.

        That's for that, Father ;-)

        But as for threads - setting the usefulness of threads apart, your statement "threads in unix are spelled fork" is blatantly wrong. Threads were invented for a reason which surely didn't escape you. Which reason? There.

        Do threads solve the issues they address, and at what price? Here we might start a discussion, not just saying "threads suck" ("No, they don't suck." - "Yes, they do!" - "NO!") and throwing in false statements in an inflammatory way.

        While I also never had the need to use threads - I did all I had to do with fork or an event-driven model (Event back then, then POE), so I'm included in that "we were accomplishing" - I also have to say that "Using threads is like using globals" is misleading; more appropriate would be "using threads is like using our()".

        I agree with you saying "Using fork is like having everything be local to its code and data", which is fine as long as you want it exactly that way and never ever need shared variables. If you need shared data, you have the option of delegating the data to an external instance, or of sharing it somehow via classical IPC mechanisms - shared memory segments, message queues, pipes and such - but you basically need the same synchronisation mechanisms as with threads (in come semaphores), and sharing complex data structures is IMHO some hells more difficult using IPC.

        --shmem

        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
Re^2: Parrot, threads & fears for the future.
by BrowserUk (Pope) on Oct 23, 2006 at 12:17 UTC
    I stopped reading here: ... In unix, thread is spelled "f-o-r-k".

    Had you read on, you might have learnt something.

    TWINUNIX - The world is not unix!

    Fork is not threading.

    Let's see you utilise multiple processes to evaluate

    @array1 >>+=<< @array2;

    On multiple processors concurrently.

    Is that sand in your hair?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      I wonder: how large must the arrays be for this to be faster with threads than in sequence? Say you split this into four - OS-level - threads (because if it's just threads within the same process, you gain nothing at all), hoping it gets evaluated on four CPUs (or cores) in parallel. But that means three out of the four threads will be queued by the OS, and will have to wait their turn to get a slice of CPU time.

      Now, if the process is only halfway through its current time slice, and has to give up its timeslice because it needs to wait for the other three threads to join - now, that would be a shame.

      I'm not saying that built-in support for threading is a bad idea. But I do think it's a bad idea to do things in parallel without the programmer's explicit permission. And it ain't going to be easy, as you need to cooperate with the OS, and Perl has to run on many OSes.

        I look at thread management much like I now look at memory management. Of course, Perl sacrifices a lot of performance to reference counting, but it buys a lot of flexibility, convenience and security by using automatic memory management instead of leaving the programmer to handle the memory, even if that could be more efficient.

        Let me speculate a little.

        Imagine a VM-level 'thread' was not a whole interpreter running on a single kernel thread, but instead a set of VM registers, a VM 'stack' pointer and a VM program counter - and capable of running on any interpreter in the process.

        Now imagine that the VM queried the OS at startup for the number of hyperthreads/cores/CPUs (N), and started one kernel thread running a separate interpreter on each. And these interpreters all 'listened' to a single, central queue of (pointers to) threads.

        When a threadable opcode is reached, if the runtime size of the data to be manipulated is large enough to warrant it, the opcode splits the dataset into N chunks and queues N copies of itself plus the partitioned dataset onto the thread queue ('scatter'), prefixed by a 'gather' opcode. Each available processor pulls a copy off, executes on its partition of the dataset and stores the results.

        The first interpreter to finish will pull the 'gather' opcode. It decrements an embedded count and waits upon an embedded semaphore. As each of the interpreters finishes and comes back for the next piece of work, it does the same, until the last-finishing interpreter decrements the count to zero, freeing the semaphore and clearing the 'gather' opcode. That frees up all the interpreters to pull the next opcode off the queue and continue. Each waiting interpreter is in a wait state, allowing the other interpreters to continue, consuming no timeslices, preemptively multitasked by the OS.

        The array has been processed, in parallel, by as many threads as are available. No user-space locking required. No threads needed to be started, nor torn down. Additional opcodes produced by the threaded code are pushed onto the queue (stack-wise).

        The same VM thread (register set) forms the basis of user-space threading for cooperative multi-tasking, as in Java, Ruby and many more. The same thread structure is used for both forms of threading.

        Just a thought (or five:).


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
