PerlMonks  

Parrot, threads & fears for the future.

by BrowserUk (Pope)
on Oct 23, 2006 at 12:04 UTC ( #580004=perlmeditation )

The future is threaded.

At the commodity hardware level, this is already the case, thanks to Intel Hyperthreading; dual-core processors from Intel, AMD and others; AMP, SMP & NUMA; and dual, quad & 8-way motherboards. In the future it will be more so, e.g. the Cell processor, and quad-core technology from Intel and AMD coming in 2007.

At the software level, very little existing software makes use of threading. There are several reasons for this:

  1. Software needs to be written from the ground up with threading in mind in order to properly benefit from it.
  2. Retro-fitting threading to existing applications is rarely effective because threading exacerbates the effect of every bad programming practice.
    • Re-entrancy issues cannot be glossed over, whether in the application code, language runtime or OS.
    • Global data-structures become even more vulnerable.
    • Tightly coupled code causes low granularity. Low granularity can make threading expensive.
    • Memory management becomes paramount.
      • Stop the world GC will have a disastrous effect upon performance.
      • Monolithic heap management will suck the life out of efficiency.
  3. Existing code that can benefit from threading, often uses event driven and/or state machine techniques to approximate those benefits, but code designed to utilise those techniques is usually structured such that it does not lend itself to conversion to threading.
  4. Existing code that already achieves parallelism through forking, that could also benefit from shared state, is often difficult to adapt to threading due to the assumptions that can be safely made when using forking that no longer hold true with threading.

Redeveloping existing, successful applications that could benefit from threading is often slow to happen. Again, for a variety of reasons:

  1. Mid-life, ground-up, redesign of an existing application is always a major undertaking, even where there are clear benefits to doing so.
  2. In many environments threading is seen as hard.
  3. In many environments, there is a lack of both understanding and the skills required to implement threading well.

Even when developing new applications that could obviously benefit from threading, it is overlooked, ignored or explicitly ruled out. Again there are a variety of reasons why this happens, most of which are already covered above.

Much of the resistance/reluctance to utilise threading can be attributed to a single factor--there are few if any good tools available.

  • Most languages--that the majority of people use--either do not support threading at all, or support it as an afterthought. And then only at the lowest level.
  • There are few good abstractions of shared state. If every program still had to deal with disk-bound data by directly manipulating physical blocks and freespace chains, very few programs that manipulate file-based data would exist--that's most programs in existence. Today's ubiquitous hierarchical filesystems make it seem as if there was never any other way, and that tends to imply that they are perfect. But we also have RDBMSs, which are most definitely not hierarchical, and not file-based (although they often live on hierarchical filesystems).
  • Most of today's programming tools--compilers, interpreters, editors, debuggers, runtime libraries etc.--are the latest evolutions of the same tools that go back years; in many cases, decades.

And like any other type of existing application, adaptation to threading is difficult, resisted, expensive and risky. Redesign from the ground up with threading in mind implies throwing away thousands or millions of development hours invested in thoroughly tried and tested tools and libraries.

Could that possibly be worth the trouble?

What if an existing, popular, powerful and flexible language was already being redesigned from the ground up?

If that language was already looking to support simple syntax and intuitive semantics for Distributed Operations?

if( any( @list ) == constant ) ... @list1 >> += << @list2

And those DistOps inherently lent themselves to being run concurrently on multiple hyperthreads/cores/CPUs?
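
To make the idea concrete, here is a rough sketch of how an element-wise `@list1 >> += << @list2` could be split across workers. It is written in Python rather than Perl 6, and the name `hyper_add_inplace` and the `workers` parameter are made up for illustration; it is not how Pugs or Parrot implement hyperoperators. Note also that CPython's global interpreter lock means this shows the partitioning, not a real speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def hyper_add_inplace(dst, src, workers=4):
    """Element-wise dst[i] += src[i], split into independent index ranges."""
    if len(dst) != len(src):
        raise ValueError("lists must have the same length")
    n = len(dst)
    if n == 0:
        return dst
    step = max(1, -(-n // workers))  # ceiling division

    def add_slice(start):
        # Each worker touches only its own index range, so no locking is needed.
        for i in range(start, min(start + step, n)):
            dst[i] += src[i]

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every slice and re-raises any worker exception.
        list(pool.map(add_slice, range(0, n, step)))
    return dst


list1 = [1, 2, 3, 4, 5, 6, 7]
list2 = [10, 20, 30, 40, 50, 60, 70]
hyper_add_inplace(list1, list2)
print(list1)  # [11, 22, 33, 44, 55, 66, 77]
```

The point is that both lists live in one address space: the workers update `dst` in place and nothing has to be serialised or copied back.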

What if the entire tool-chain to support that new language was also already being redesigned from the ground up?

Doesn't it make sense to write those tools with threading not just "in mind", but as a high priority?

Will those tools be all they could be, if their architects "Don't do threads"? If the implementors "do not see the need for threads"?

Indeed, does it bode well for the future of those tools if the implementors do not use the languages that those tools are to support, and don't have the slightest feel for what will drive the needs and uses of those languages in the future?

Does the

  • complete absence of a threads.pdd from the specification;
  • that the term "threads" appears only 35 times in the entire documentation set;
  • that the "failed" ithreads model, so widely denigrated and despised, is being nearly exactly replicated for the underpinnings of the new language;

inspire you with confidence?

What about userspace threading? Many languages provide this, and many programmers find that its determinism and light weight lend themselves to many things that they want to do. It's not a replacement for kernel threads, but if threads != interpreter, then providing primitives to allow cooperative, user threading to run within preemptive kernel threading becomes not just possible but almost trivial, whether this is provided at the VM level or the language level. Trying to hack user space threading into a language when thread == interpreter becomes a completely different ballgame.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Re: Parrot, threads & fears for the future.
by merlyn (Sage) on Oct 23, 2006 at 12:09 UTC
    I stopped reading here:
    At the software level, very little existing software makes use of threading.
    Shall I grep the Linux and Darwin and BSD user-level source programs that contain instances of fork or system or popen for you? I think "very little" should be changed to "lots of".

    Don't equate hardware threading support with the crappy software "thread" thing. In unix, thread is spelled "f-o-r-k".

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      You stopped reading too soon, then. There is much parallelism which could be exploited by having (say) Perl6 constructs be natively parallel, for example the hyperoperators, and grep/map.

      Also, having real, painless userspace "threads" (i.e. cooperative multitasking) would be a huge benefit too.

      Your reaction is exactly the point that BrowserUk is trying to make, or at least how I interpret it - actually and transparently using and supporting threads is an important asset for a programming language. Having to manually implement IPC, as you must if you want to keep on using fork and still reap the benefits of parallelism, is a pattern and hence a weakness in the language.

        You can only transparently parallelize map{} if the executed block is side-effect-free. Perl is far too dynamic for the compiler/optimizer to safely check that, and you might very well want to have side-effects in your map{}s, so you'd have to have two map{}s: a parallelisable one and a guaranteed-to-be-serial one. It's one thing to parallelize a purely functional, side-effect free language and another to parallelize something that may do whatever it bloody wishes. Everything comes at a price, even freedom.
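
A small sketch of that distinction, in Python rather than Perl and with made-up names (`square`, `make_running_total`): a pure block gives the same answer whether it is mapped serially or by a pool of workers, while a block with side effects gives a different answer as soon as the order of the calls changes, which is exactly what a parallel map cannot promise to preserve.

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    # Pure: the result depends only on the argument.
    return x * x


def make_running_total():
    # Impure: each call depends on every call made before it.
    total = 0

    def add(x):
        nonlocal total
        total += x
        return total

    return add


data = [1, 2, 3, 4, 5]

serial_pure = list(map(square, data))
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel_pure = list(pool.map(square, data))

# Same impure block, two call orders; results are put back in input order.
forward = list(map(make_running_total(), data))
backward = list(map(make_running_total(), reversed(data)))[::-1]

print(serial_pure == parallel_pure)  # True
print(forward)                       # [1, 3, 6, 10, 15]
print(backward)                      # [15, 14, 12, 9, 5]
```

The reversed order here just stands in for "some other schedule"; a real parallel map could produce any interleaving.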

      I stopped reading here: ... In unix, thread is spelled "f-o-r-k".

      Had you read on, you might have learnt something.

      twinunix - The world is not unix!

      Fork is not threading.

      Let's see you utilise multiple processes to evaluate

      @array1 >> += << @array2;

      On multiple processors concurrently.

      Is that sand in your hair?


        I wonder, how large must the arrays be for this to be faster in threads than in sequence? Say you split this into four - OS level - threads (because if it's just threads within the same process, you gain nothing at all), hoping it gets evaluated on four CPUs (or cores) in parallel. But that means three out of the four threads will be queued by the OS, and will have to wait their turn to get a slice of CPU time.

        Now, if the process is only halfway through its current time slice, and has to give up its timeslice because it needs to wait for the other three threads to join, now, that would be a shame.

        I'm not saying that built-in support for threading is a bad idea. But I do think it's a bad idea to do things in parallel without the programmer's explicit permission. And it ain't going to be easy, as you need to cooperate with the OS, and Perl has to run on many OSes.

      In unix, thread is spelled "f-o-r-k".

      That isn't true even stripping "f-o-r-k" of the flame inducing hyphens, and you know that. fork() means fully cloning a process, proper with environment, file handles, slot in the process table and all mumbo, with no further sharing of resources between the two fully-fledged processes save the mechanisms suitable for any inter-communication of processes (IPC), whereas threads... well, you know. Or you don't. Sad you stopped reading there, it might have been an interesting read further down.

      --shmem

      _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                    /\_¯/(q    /
      ----------------------------  \__(m.====·.(_("always off the crowd"))."·
      ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
        No, son. Maybe I've been around a bit longer than you, but we were accomplishing everything that people are trying to do with threads with simple forks some 20 years ago. And the coding was much simpler, easier to debug, easier to maintain, easier to prove correct.

        If the answer is "threads", you asked the wrong question. Threads suck. Event-driven programs and processes are a much cleaner model, with a far better defect ratio.

        And a forked program cluster will use your two, four, and 20-way core boxes just fine. No need to introduce the bizarre complexity of threads.

        Using threads is like using globals... sure you can do it, but it requires a lot more care. Using fork is like having everything be local to its code and data, which is what you want in a large system anyway.

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.

Re: Parrot, threads & fears for the future.
by audreyt (Hermit) on Oct 23, 2006 at 14:08 UTC
    What if, the current non-Parrot Perl 6 implementation -- i.e. the one that has arrays and hashes and subroutines and hyperops and junctions and can natively use CPAN modules -- already implements SMP parallelism for hyper/junctions, exactly as you described?

    http://pugs.blogs.com/pugs/2006/10/smp_paralleliza.html

    What if the runtime system for this parallelization already supports huge numbers of "async{}" operations, with its preemptive lightweight concurrency, that scales linearly with the number of CPUs?

    http://pugs.blogs.com/pugs/2006/10/more_smp_parall.html

    What if the said implementation has comprehensive support for Software Transactional Memory, with lockless contention resolution, rollback via "defer", choice of fallback via "maybe", and will soon provide invariants via "ALWAYS{...}"?

    http://svn.openfoundry.org/pugs/examples/concurrency/stm-contend.pl

    What if the syntax of such STM primitives, as designed by TimToady++, is supported by the semantics outlined by lizm++, one of the most knowledgeable people on concurrent Perl?

    http://svn.openfoundry.org/pugs/docs/Perl6/Spec/Concurrency.pod

    What if the community behind the runtime system is actively working on GPU co-processors, native support for SSE2 for numerical operations (already implemented for x86_64), and Cell/Grid parallelism?

    http://haskell.org/haskellwiki/GHC/Data_Parallel_Haskell

    What if the Intel folks I've met last week, who are actively profiling and tuning their multicore CPUs to work with GHC's pure computation and concurrent computation notions, are delighted that Perl 6 can take advantage of those optimizations natively?

    Does that not inspire you, if not with confidence, at least curiosity and excitement? :-)

      Yea verily!

      Of course what this says is that specifying something by committee is fraught with difficulty, and rarely will the documentation match reality until we are at the end of the process. Whilst BrowserUK does make the valid point that the spec documents don't yet reflect much of the work in the developer community, it's nice to hear that those of us in the non *nix world don't have to worry about merlyn's often somewhat myopic view of the world of computing. We might not like *doze, but we do have to deal with it.

      jdtoronto

      This is great news. However, I think BrowserUk's point is more that Parrot is doomed. I'm not sure why he chose to never say "Parrot" except in the title.

      So audreyt's reply rather reinforces at least that part of BrowserUk's thesis. Parrot appears doomed. Though it'd be interesting to hear responses from those who know what is going on with Parrot as to whether this particular design mistake is as characterized by BrowserUk.

      - tye        

      Yes AudreyT, it does inspire me. With curiosity, and excitement--and confidence.

      I've been keeping loose tabs on your progress ever since your first post in the Perl6 fora received its initially rather brusque reception.

      Mr. Wall's ability to prevaricate, stall and change his mind--until he gets it right--and the support of his team; your vision; your ability to follow through on that vision; and your ability to inspire others to go with you: all of these combine to give me great confidence that not only will Perl6 be real--it'll be bloody amazing.

      My doubts, and this thread, are solely aimed at Parrot.

      Is Parrot important given Pugs? Pugs is still somewhat slow, though the latest build seems to be a great deal faster than previous builds. Can Pugs ever hope to achieve acceptable production performance?

      As far as I am aware, the intention is still to have a Perl6 compiler, written in Perl6, that can compile itself, and will (primarily?) target Parrot. My doubts centre on whether, given what I knew of the Parrot development up to ten months ago, it could ever hope to match the abilities you've already achieved with Pugs?

      Of course, you've leveraged a great deal of high quality development that has taken (10+?) years of some of the brightest minds in academia to get to where it is now. That's a pretty big heft up over where the Parrot guys started from.

      But the significant thing is that the guys at Glasgow see beyond the world of Unix, even though that's where they live. See the complementary merits of forks and threads. See the benefits of concurrency, and know that to achieve it, you have to build it in from the ground up--not tack it on afterwards.

      As another responder in this thread has shown very clearly, not everyone in the Perl community is so enlightened. Many actively deny that threads have any merit whatsoever. And up to 10 months ago, very few if any in the Parrot community were any better.

      Yet another responder, now a part of the Parrot team, has pointed out that my knowledge of Parrot is out of date--something I pointed out myself elsewhere here recently. Still, on the basis of what particle has said, it seems (to me at least) that threading has not been tackled early enough in the project to really ensure that it can be fully integrated. At best it might require substantial rewrites of existing code to achieve that integration. At worst, it'll end up being a 'tack on' solution.

      As always in our few brief interactions, let me take the opportunity to thank you for your amazing work and inspiration.


        Thanks for your kind words. *blush*

        The concept of One True Official Perl 6 is, well, officially gone. Much like other standard-based languages, anything that passes the conformance test suite is Officially Perl 6.

        http://perlcabal.org/syn/S01.html#Project_Plan

        Pugs's interpreted core will always focus on semantics rather than speed. In that regard it may one day be comparable to Perl 5, but not fundamentally faster than it. However, much as GHC has two cores (interpreted GHCi and natively generated C--), so will Pugs provide an ahead-of-time compilation mode that supports runtime modification through dynamic linking.

        The AOT mode scheduled for the next release is based on a new Meta Object model that can generate fast, native embeddings in both of Pugs's runcores, namely GHC and Perl5. That will get Pugs performance comparable to compiled Haskell code on the GHC runcore, and make it as fast as we can in the Perl5 runcore, probably with help from new opcodes hacked into Perl 5.12. Both will likely be faster than Perl 5 currently is.

        http://nothingmuch.woobling.org/MO/
        http://perlcabal.org/~cmarcelo/moh/

        Pugs does not aim to be a cross-language bytecode interpreter that supports both static and dynamic method dispatch. CLR3, Java6, Rhino and plenty of other less mature VMs (Parrot, StrongTalk, HLVM, IO, YARV, PyPy) already do them pretty well, and all we have to do is to write a backend to generate bytecode for those platforms, in order to use their libraries.

Re: Parrot, threads & fears for the future.
by particle (Vicar) on Oct 23, 2006 at 15:09 UTC

    Does the

    • complete absence of a threads.pdd from the specification;
    • that the term "threads" appears only 35 times in the entire documentation set;
    • that the "failed" ithreads model, so widely denigrated and despised, is being nearly exactly replicated for the underpinnings of the new language;

    inspire you with confidence?

    the above statements are incorrect. parrot has a threads design document. see http://www.parrotcode.org/docs/pdd/ for the list of design docs, and note threads (pdd25) is there, and is clearly labeled as a draft. it is not completely absent.

    parrot's threading spec has not yet been redesigned, since originally conceived. as it was conceived some time ago, it's bound to look out-of-date, especially as when it was conceived, perl6 looked a lot more like perl5, so it's natural that the model looked a lot like ithreads.

    parrot work is in progress--currently the exception model is on the top of the design list, followed by threads and i/o (however, i'm not sure of the specific order of those two.) allison's last commit (two weeks ago) to the threads doc has a log that reads, "Partial update of Threads PDD with collected wisdom from prior discussion."

    as an aside, for those of you who have not seen me around here lately, i've been devoting my free time to work on parrot rather than to hang out here. i'm responsible for the parrot test suite, and am having a blast and learning a great deal by working with patrick michaud on the perl6 compiler as well. i wish you'd all join the parrot-porters@perl.org mailing list and contribute to any discussion where you feel it appropriate. parrot has made major progress in some respects (it has fully specified and implemented namespace and lexical implementations to name two,) but its success depends almost solely on the contributions of volunteers. if you feel you have something to add to the components of parrot which are still in draft status (like threads,) please don't hold back.

    ~Particle *accelerates*

      Things have moved on. The pdds in my (dormant) Parrot/docs/pdds directory run from 00 through 18.

      It's good to hear that threading is being actively thought about. As I've expressed elsewhere, I fear that any implementation of threads that works well will have considerable impact upon existing code that was not written with threading in mind. Or it will be a 'tack on' solution.

      I also fear that using the POSIX pthreads api as the basis for the design of the threading support will severely limit the functionality of threading on all platforms that have a richer set of primitives and/or higher level encapsulations. This includes not only Win32, but also many flavours of *nix, which have non-POSIX extensions to the basic pthreads apis. And many different, incompatible extensions to boot.

      I think that a richer, virtualised API--not specific to any single existing platform's set--is required to allow access to the full range of threading API's on all platforms. Actually, I think a VM should virtualise all OS interactions, including and especially memory management. And that this should be done way down low in the architecture, so that the vast majority of the VM's infrastructural code has no direct access to the OS or (C) runtime libraries whatsoever. I see that as being the only way that it could ever hope to make full use of the strengths and variations in native OS/runtime facilities on all target platforms.

      And that does not mean simply hiding the basic POSIX apis behind sets of macros as is mostly the case with Perl5. Whilst I agree that POSIX is the nearest thing to a cross-platform OS specification available, it is sadly out of date even on modern *nix platforms. As such it is a 'least common denominator' solution. On non-*nix platforms it completely hamstrings the use of native non-POSIX facilities.

      Eg. On Win32: the pervasive resource security model; domain based networking; filesharing & range locking; sparse, compressed, encrypted and indexed file attributes; Overlapped IO; much of threading; all of fibres; ... and much, much more; all have to be 'tacked on' as 'after market' extensions (if they can be accessed at all), and so fight with the native Perl facilities instead of complementing them.

      To my way of thinking, a truly cross-platform VM needs to be a superset of all platform facilities. And when specified facilities are not available, then either fall back to lesser facilities where possible, or report errors when not.


Re: Parrot, threads & fears for the future.
by chromatic (Archbishop) on Oct 23, 2006 at 15:55 UTC

    Thank you for posting this.

    I have a question about the failed ithreads experiment though. I thought the biggest problems with ithreads were that:

    • They don't use native threads; the entire program is still a single process. (I don't understand the Windows threading model, but I'm pretty sure what I said is true on Unix-like platforms.)
    • There's still way too much global state in Perl 5.
    • Cloning an interpreter and the global state is hideously expensive.

    Those all seem like implementation details, where the most important point of an ithreads model is default shared nothing -- which has tremendous advantages with regard to locking semantics.

    What am I missing?

    Update: audreyt and Liz are right; I read the threads source code, rather than grepping through the *.c files for Pthreads calls.

      Hmm. Having all builtin Array/Hash/Scalar structures be implicitly transactional and locklessly shared, then allow explicit non-shared cloned state and channels on top of that, seems to me to be easier to scale and reason about, than the other way around.

      Also, native OS threads are still in a single process in Unix, and Perl 5 does use 1:1 mapping from Perl threads to native OS threads on Unix, where pthreads is available (see thread.h). Which is also expensive, as 1:1 mapping is only necessary if you do blocking system calls. So I'm not sure your first point (that the entire program only uses one pthread) holds...

      They don't use native threads; the entire program still uses only one native thread. (I don't understand the Windows threading model, but I'm pretty sure what I said is true on Unix-like platforms.)
      This is not true generally. On unixen, the local thread implementation is used for providing threads, but each thread has its own interpreter and its own copy of the data structures.

      There's still way too much global state in Perl 5.
      I don't think this is necessarily true in the core Perl modules. It's third party XS based modules, with their own ways of accessing perl's internal structures, that cause problems in many cases.

      Cloning an interpreter and the global state is hideously expensive.
      Indeed. All stashes and their contents, as well as all live lexical values, are copied. The only thing that, as I understand it, is not copied is the optrees. Shared variables are even worse, because they not only exist in a thread and the threads started from that thread, they also exist in the "shared variables" thread (which is a hidden thread used as a safe haven for shared variables).

      Liz

Re: Parrot, threads & fears for the future.
by liz (Monsignor) on Oct 23, 2006 at 17:29 UTC
    The future is threaded.
    Maybe on a hardware level. But on a software level, I would say:

    Threads Are Primitive And Should Die!

    Why would you need to use threads? To do things in parallel? Why not let the computer figure that out for most cases (generally any action on more than one element without side effects).

    This is probably a broken analogy, but I see threads as wheels on a car. Do you have to turn each front wheel separately? No, you have something like a steering wheel or a joystick. Do you have to think about the fact that, when turning left, the left wheel has to turn a little more than the right wheel? No. The car does that for you.

    I think the focus on threads (and fork) is what is so past century. We need new ways to think about using computer systems that are no longer capable of only doing one thing at a time. And rid ourselves of ingrained thought patterns.

    Software transactional memory is such a way, and Pugs already supports it.

    Combined with Continuations, which are already supported by Parrot, this is a very powerful base for language definition and implementation.

    Personally, I welcome the lack of a threads.pdd. Ever really tried to program more than trivially threaded programs in Perl or C? Or tried to debug them? Get all of the locking right? Eradicate all possible race conditions and deadlocks? For me, it is the nearest thing to hell for a programmer I can imagine. That's why I heartily support the death of threads in HLL's!

    Liz

      That's why I heartily support the death of threads in HLL's!

      Funnily enough, I agree. I've been banging on about that for a couple of years. I think I even recall having a conversation with you about it by email 2 or more years ago.

      Parrot is not a high level language.

      If the HLL Perl6, is to provide for user transparent threading semantics; and if Perl6 is to run atop the Parrot VM; then Parrot has to do threads. And be written from the ground up to do threads well.


Re: Parrot, threads & fears for the future.
by tilly (Archbishop) on Oct 23, 2006 at 18:11 UTC
    You lost me at, The future is threaded.

    High performance work tends to move parallel into clusters, and it has been known for years now that forked processes are easier to scale for clusters than multi-threaded processes. The reasons have nothing to do with Unix versus Windows, and everything to do with minimizing necessary interactions between parallel jobs. (And moreover, letting the OS know that interactions will be minimized.)

    Naive parallelism also has a great future. Think "webserver". Performance is just fine with unthreaded code, it is easier to manage development, and you can get as much parallelism as you need by running lots of concurrent processes.

    I'm sure that there is also a great future for threaded programs. However my current take is that that future will tend to be either very specialized code, or else for native GUI applications.

    Now if Perl 6 wants to be all things to all people, it probably should include support for threaded programming. But even if it has great support for that (and Audrey's implementation may), I'd be willing to bet that the multi-threaded part of Perl 6 will not be used in most Perl.

      You lost me at, The future is threaded.

      Indeed; there's a reason the shared-nothing architecture tends to beat shared-everything in high-volume web serving, for example. Now the future may be parallel, but threaded?

      Clusters are an expensive and complex, workaday solution to the memory limits imposed by 32-bit processors and 32-bit processes.

      64-bit processors are (theoretically) capable of addressing 16 million Terabytes (16 Exabytes), and already routinely offer 4 and 16 Terabyte process address spaces. Add to that multiple cores and multiple array processors in a single core and you have the potential to do away with the latency, bandwidth restrictions and topology bottlenecks of networked clusters.

      Not to mention the need to partition datasets into many files and constantly shuffle data on and off disk, and between machines.

      Once you can address entire huge datasets through the simple expedient of opening them as a memory mapped file, great chunks of the processing time simply disappear. All that is needed then to fully utilise the multiple processors is a few threads mostly processing independent sections of data (memory), but with threads' unique ability to share data and state directly without serialising it through high latency channels.

      The future is threaded.
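
A rough sketch of the memory-mapped approach described above, in Python rather than Perl; `count_byte` and its `workers` parameter are names invented for the example, and the temporary file only stands in for a huge dataset. Each thread scans its own section of one shared mapping. The target is a single byte, so a match can never straddle a section boundary.

```python
import mmap
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def count_byte(path, target, workers=4):
    """Count occurrences of a single byte in a file via one shared mapping."""
    size = os.path.getsize(path)
    if size == 0:
        return 0  # mmap cannot map an empty file
    with open(path, "rb") as fh, \
            mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
        step = -(-size // workers)  # ceiling division

        def count_range(start):
            # All threads read the same mapping; the slice being scanned
            # is the only data that gets copied.
            return view[start:min(start + step, size)].count(target)

        with ThreadPoolExecutor(max_workers=workers) as pool:
            return sum(pool.map(count_range, range(0, size, step)))


# Build a small stand-in dataset, scan it, then clean up.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a\nbb\nccc\n" * 1000)
    path = tmp.name
try:
    newlines = count_byte(path, b"\n")
finally:
    os.remove(path)

print(newlines)  # 3000
```

There is no partitioning into files and no shuffling between processes: the sections are just index ranges over the same memory.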


        Sorry, but that's just silly.

        It is a basic economic fact that price per performance for commodity hardware is far, far cheaper than for big servers. Clusters are a way for businesses to take advantage of this to get the performance and reliability they want at a much better price point.

        That 64-bit versus 32-bit is irrelevant can be trivially demonstrated. Big 64-bit servers are old news; the big Unix vendors went through that transition a decade ago. (I don't know when IBM's mainframes went through it, but I think it was earlier than that.) Yet in the last decade big iron not only did not replace clusters, it actually lost ground to them. Why? Because clusters are a lot cheaper.

        Now I'm not denying that big machines offer performance advantages over clusters. You have correctly identified some of those advantages. And I grant that there are plenty of problems that can only be done on a big machine. If you have one of those problems, then you absolutely must swallow the price tag and buy big iron. But if you can get away with it, you're strongly advised to get a cluster.

        Most problems do not have to run on a huge machine. Clusters are far cheaper than equivalent performance on a big machine. Neither fact seems likely to change in the foreseeable future. As long as they remain true, clusters are going to remain with us.

        I want to pick up on the serialisation point.

        In my work, lots of tasks are of the form "read, transform, transform, transform, write" on quite large datasets. The read and write are always I/O bound, but the transforms can be cpu bound. I'd like to run the subtasks in parallel, to speed up execution, but it isn't worth forking processes for the transformations - the cost of serialisation/deserialisation is too high.

        What would help is a threading model like fork, except instead of standard i/o channels the threads/processes would be able to expose native datastructures for direct read and/or write by their siblings.

        Is there any facility like this in existence?
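One facility that comes close in Perl 5 is a pipeline of threads joined by `Thread::Queue`. The sketch below assumes a perl built with ithreads. The stage bodies (uppercasing, collecting into an array) are placeholders, not anyone's real workload. It is not a full answer to the question: `Thread::Queue` copies each item into shared storage, so the hand-off is cheaper than serialising through a pipe but is still not direct access to a sibling's native data structures.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $raw    = Thread::Queue->new;   # reader      -> transformer
my $cooked = Thread::Queue->new;   # transformer -> writer

# Transform stage: placeholder for a CPU-bound step.
my $transformer = threads->create( sub {
    while ( defined( my $rec = $raw->dequeue ) ) {
        $cooked->enqueue( uc $rec );
    }
    $cooked->enqueue(undef);       # pass the end-of-data marker on
} );

# Write stage: a real one would write to disk; this one just collects.
my $writer = threads->create( { context => 'list' }, sub {
    my @out;
    while ( defined( my $rec = $cooked->dequeue ) ) {
        push @out, $rec;
    }
    return @out;
} );

# Read stage runs in the main thread.
$raw->enqueue($_) for qw( alpha beta gamma );
$raw->enqueue(undef);              # undef marks end of data

$transformer->join;
my @written = $writer->join;

print "@written\n";
```

With one thread per stage the records stay in order; adding more transformer threads would need one end marker per worker and would give up ordering.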

Re: Parrot, threads & fears for the future.
by perrin (Chancellor) on Oct 23, 2006 at 19:28 UTC

      Sorry, but anyone who thinks that programming state machines with iddy-biddy bits of code, global state and

      for my $counter( GLOBALSTATE->{LOOPCOUNT} .. 1000 ) {
          IsSomethingElseReadyToRunYet();
          ## do a bit
          IsSomethingElseReadyToRunYet();
          ## Do a bit more
          OhGodSomethingElseIsReadyToRun() and do{
              GLOBALSTATE->{LOOPCOUNT} = $counter; # 3!
              last;
          };
          ## Stuff we might get around to doing next time through...maybe
      }

      is easier than remembering to lock() a variable, is perverse :)

      This is humour and no reflection on you, Matts or anyone else. I think state machines suck.
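For comparison, "remembering to lock() a variable" looks roughly like the sketch below. It assumes a perl built with ithreads; the counter and the worker count are made up for illustration.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Only variables marked :shared are visible to all threads.
my $counter :shared = 0;

my @workers = map {
    threads->create( sub {
        for ( 1 .. 1000 ) {
            lock($counter);    # released when this loop body exits
            $counter++;
        }
    } );
} 1 .. 4;

$_->join for @workers;

print "counter = $counter\n";
```

The lock is advisory and scoped to the enclosing block, so there is no explicit unlock to forget. What can still be forgotten is the `lock()` call itself.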


        I think non-blocking I/O is pretty hard, as evidenced by the fact that POE is still not very widely used. It can provide impressive results in the hands of skilled programmers though. It was the secret sauce that allowed the original (pre-web) AOL to scale on the hardware that was available back then.
        I think state machines suck

        Heh, I just can't resist pointing out my favourite quote by Alan Cox, which pretty much answers the OP as well:

        A computer is a state machine. Threads are for people who can't program state machines.

        (Cue people screaming that he's talking about OS threads, not userland threads, yadda, yadda ;-)


        All dogma is stupid.
Re: Parrot, threads & fears for the future.
by jepri (Parson) on Oct 24, 2006 at 16:38 UTC
    While I was reading this thread, I had a sudden, blinding vision of the future.

    Programmers will start long, boring flamewars claiming that people who don't know how to use thread() aren't real programmers and are using toy languages, in much the same way C programmers have been annoying Perl programmers over the lack of malloc() and pointer arithmetic. The discussion threads will continue as various camps make their points, occasionally interrupted by somebody claiming inanely that when they were programming 50 years ago, nobody needed threads, and everybody built their own computers too. This state of affairs will continue until someone releases a language that is good and fun to program in and contains enough cool parallelising tools that everyone is happy. It might even be perl7.

    Don't believe me? Scroll up.

    For my part, it's enough that I use a language where I can indicate that certain sections can be parallel and then the compiler can choose to use threads, forks or clusters. This language is not perl, incidentally.

    ___________________
    Jeremy
    I didn't believe in evil until I dated it.

        I didn't want to turn this into a language thread, but since you ask... it's Scheme.

        Now, Perl5 and threading... that's not a good combination. Perl programmers habitually use a lot of side effects. Any time someone writes code like $x =~ s/a/o/ the parallelism goes away (maybe). To write a parallel program, every part of it has to be written with parallelism in mind. That sounds fine, except that most CPAN modules aren't written like that, so each one has to be inspected by hand.
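To make the side-effect point concrete, here is a small sketch contrasting the in-place substitution with a side-effect-free one. The `/r` modifier used for the second form arrived in Perl 5.14, after this thread was written; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $x = "banana";

# Side-effect free: /r returns the changed copy and leaves $x alone.
my $pure       = $x =~ s/a/o/r;
my $after_pure = $x;    # still the original text

# In place: $x itself is rewritten, which any other code
# holding a reference to $x would observe.
$x =~ s/a/o/;

print "pure = $pure, after_pure = $after_pure, x = $x\n";
```

Only the first form is safe to reorder or run alongside other readers of `$x` without further analysis.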

        And as someone noted above: even reading a perl variable can change it, and that's not counting what happens if someone hands you a list of Tie:: variables or objects or something like that...
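The "reading a variable can change it" point can be seen with the core `B` module. This is a sketch under the assumption that the integer flags are a fair proxy for "the scalar was modified internally"; `has_int_flag` is a made-up helper name.

```perl
use strict;
use warnings;
use B ();

# True if the scalar currently carries a cached integer value
# (public or private integer flag set).
sub has_int_flag {
    my ($ref) = @_;
    my $flags = B::svref_2object($ref)->FLAGS;
    return ( $flags & ( B::SVf_IOK() | B::SVp_IOK() ) ) ? 1 : 0;
}

my $x = "42";                       # starts life as a plain string

my $before = has_int_flag( \$x );
my $y      = $x + 0;                # only *reads* $x, in numeric context
my $after  = has_int_flag( \$x );

print "int flag before: $before, after: $after\n";
```

The read caches the numeric value inside `$x`, which is the kind of hidden write that makes unsynchronised sharing between threads risky.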

        In Scheme, all I have to do is signal that a part of my code has no side effects and then my macros can do things like parallelise all the arguments to each function call*. I don't have to think about it any further.

        *I'm not saying that's a good idea, I'm just saying I can do it...

        ___________________
        Jeremy
        I didn't believe in evil until I dated it.

      So, which is it?

      Erlang, Ocaml, Mozart/Oz, Clean, Rudy, D, Occam, Concurrent Haskell, Concurrent C... I have (and try to use after some fashion) all but the last three.

      And funnily enough, the existence of all those is (another) pretty good argument for why it is important to speak up now for good support of threading in Perl6 and its underpinnings.

      I don't see any flame war (with the exception of one named monk and one anonymonk--who could well be the same person.)

      I see an important debate to counter the "thread is spelt F-O-R-K" mentality that is so pervasive in the Perl, Parrot & Unix communities that are playing such a big part in shaping the future of my preferred language. Which, incidentally, happens to also be the subject and reason for existence of this place.


        I haven't really had a good look at Perl6 yet so I don't want to make any comments as to how well it addresses threading.

        I was commenting that it's not the threads-forks-clusters that make a difference, it's being able to hint to the compiler that some code *could* be parallelised, and having a language that provides some high-level constructs when I want to force something to be in a separate thread.

        Programming thread pools and synchronisers and stuff feels very similar to programming low level functions in C - it focusses on what I want the computer to do, rather than telling it what I want to get. So I'm definitely in the "future is parallel, but probably not threaded" camp.

        ___________________
        Jeremy
        I didn't believe in evil until I dated it.

        I see an important debate to counter the "thread is spelt F-O-R-K" mentality that is so pervasive in the Perl, Parrot & Unix communities that are playing such a big part in shaping the future of my preferred language.

        "Do not try and bend the fork. That's impossible. Instead... only try to realize the truth."
        "What truth?"
        "There is no fork."
        "There is no fork?"
        "Then you'll see, that it is not the fork that bends, it is only yourself." --Allison

Re: Parrot, threads & fears for the future.
by aufflick (Deacon) on Oct 25, 2006 at 05:53 UTC
    People who are interested in this might be interested in this blog entry:

    Futures, Continuations, Closures and Oh My!

    It looks at optimising a bytecode vm (in this case smalltalk) to automatically parallelise code that makes use of futures. Very neat ideas.
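For readers unfamiliar with the term, a future is a placeholder for a result that is still being computed. A rough Perl 5 approximation is sketched below, using a thread as the future and assuming a perl built with ithreads. `slow_square` is a made-up function, and this is not the automatic parallelisation the blog entry describes.

```perl
use strict;
use warnings;
use threads;

# Hypothetical slow computation, used only for illustration.
sub slow_square {
    my ($n) = @_;
    return $n * $n;
}

# Start the work now; $future stands in for a value not yet available.
my $future = async { slow_square(12) };

# The caller is free to get on with something else in the meantime.
my $other = 1 + 1;

# Asking for the value blocks until the computation has finished.
my $value = $future->join;

print "value = $value, other = $other\n";
```

The appeal of the model is that the synchronisation point is the place where the value is used, rather than an explicit lock or queue.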

      I was sure I'd read that before somewhere. So I went off on a scavenge hunt of code and PM threads and my reams of random thoughts and jottings trying to remember where and when. That eventually led me through some of my experiments with Erlang and then back to Re^18: Why is the execution order of subexpressions undefined?, and indeed that whole (nauseatingly frustrating) thread Why is the execution order of subexpressions undefined?.

      The whole motivation behind that thread was an attempt to make the case (that I still firmly believe to be true), that if Perl(6) had a defined execution order, it would greatly increase the potential for fine grained, interpreter induced, parallel execution. Even down to parallelising the clauses within an individual statement or line of code.

      Having tracked down that Erlang reference I posted, and followed a few of the links from it, although I hadn't tracked my way back to your reference, I was all ready (had started typing:), to claim to have seen it back then (circa. April 2004).

      It was only then that I noticed the date! (less than a month ago.)

      So, maybe I saw something which the author also saw that inspired his blog entry, or maybe it was simply the similarity between its title and Insight needed: Perl, C, Gtk2, Audio, Threads, IPC & Performance - Oh My! that has sat in my Personal Nodelet for several months that triggered the feelings of déjà vu. Either way, thanks for the link, it was an interesting read.

      On the subject of continuations and the continuation passing style: that is another of my fears regarding Parrot. That blog entry suggests that the cost of CPS is a single stack frame and very low, light and fast.

      However, other stuff I read when trying to get to grips with the meaning of CPS and the implications of its use within Parrot when combined with Parrot's

      • register-based architecture (and register spill files);
      • very large instruction (opcode) set;
      • exception mechanism based around stacking continuation snapshots.

      All this combined to give me the impression that Parrot is going to generate a huge runtime stack requirement. Albeit that the stack will be implemented as some kind of linked list in the heap. This impression was somewhat confirmed when someone (possibly Dan?) posted a description of what happened when they tried to run a fairly large, computer generated PAR program.

      Unfortunately, I never did understand enough of the Parrot source code, much of which was still temporary and was changing, evolving and being re-written in huge part on a daily and weekly basis back then, for me to properly make the transition from understanding to being able to confirm and formally describe those fears.

      The impression remains, but things have moved on too far for me to hope to catch up now. It was another of those gut feelings that led to my agreement with tilly elsewhere, that Parrot was unlikely to produce something that would run Perl6 efficiently.

      Maybe in the pre-compiling world of GHC, with its huge depth of code analysis and very advanced compile time optimisations, the costs of using CPS snapshots to implement exception mechanisms mean that large swaths of intermediate snapshots between points in the HLL code that raise exceptions and points in the code that catch those exceptions can be optimised away at compile time so reducing the peak stack frame snapshot costs?

      Maybe, I simply misunderstand the implications of what needs to be stored when you combine exceptions, closures, nested lexical stashes, nested namespaces and CPS?


      People who are interested in this might be interested in this blog entry...
      I like the futures model of concurrency. I haven't decided yet if it will make it into Parrot. This blog post is an interesting twist on the idea. Thanks. --Allison
Re: Parrot, threads & fears for the future.
by Limbic~Region (Chancellor) on Oct 31, 2006 at 00:40 UTC
    BrowserUk,
    I would like to point you to a Google Groups message but apparently the parrot-porters alias to perl6.internals is not getting archived. In any case, Allison just posted the following to the list:

    I've finished a first pass through PDD 25 on threading/concurrency. It's largely a collection of prior thinking on the subject. Before I start kicking it into a more structured form, I'd like to do an initial round of discussion. This is your chance to mention anything you hoped or expected from Parrot's concurrency models. How do you plan to use concurrency, and in what contexts? What's your favorite concurrency model and why should we consider using it? How integral a role should the new STM play in Parrot's concurrency? Etc.

    I've changed the name of the PDD from "Threads" to "Concurrency" because: a) the notion of "threads" seems to have taken on mythical proportions, so this is a symbolic step toward practicality, and b) Parrot has more than one concurrency model, and this PDD will be the overview of all of them, plus the ways the various models interact.

    I intend to point her to this thread in case she does not already know about it.

    Cheers - L~R

      Unfortunately, it hasn't shown up here yet either.


        BrowserUk,
        Apparently it doesn't go there until after it has been hashed out. I really don't follow Parrot development that closely these days. In any case, you can find it here.

        Cheers - L~R

Re: Parrot, threads & fears for the future (Addendum).
by BrowserUk (Pope) on Mar 19, 2008 at 06:15 UTC

    If anyone is still in any doubt that the future is indeed threaded, then read this & this. (Updated link for some that don't require subscription.)

    And do not let your distaste for one or more of the companies involved allow you to be dissuaded of the importance of their endeavour. They did not get to be "industry giants" by spending large sums of money on a whim.

    Now, if only the potential of Perl 6 & Parrot as a player in this brave new world could be demonstrated, maybe they would consider investing in its development. It's a nice thought anyway.


Reaped: Re: Parrot, threads & fears for the future.
by NodeReaper (Curate) on Nov 18, 2011 at 03:03 UTC
