
Why Coro?

by xiaoyafeng (Chaplain)
on Jul 28, 2010 at 03:09 UTC ( #851650=perlquestion )
xiaoyafeng has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I intend to write a multi-threaded program in Perl. In general, it will run 3000 threads in parallel, up to a maximum of 12000.

I thought ithreads should be the best choice because, firstly, it is the official module and is distributed with perl, and secondly, I've read posts on perlmonks claiming that ithreads is reliable and fast. But after searching Google and CPAN, I got confused. Many Perl gurus (like Audrey Tang, the WebGUI folks, etc.) recommend Coro instead of ithreads.

So my questions are

  • Is Coro really good and reliable?
  • If the answer to the first question is yes, what is its advantage compared with ithreads?
  • If the second question is also clear, why isn't Coro included in the official Perl distribution as a replacement for ithreads?

I hope the monks can enlighten me. TIA!





I am trying to improve my English skills; if you see a mistake, please feel free to reply or /msg me a correction

Re: Why Coro?
by BrowserUk (Pope) on Jul 28, 2010 at 04:11 UTC

    The biggest problem with Coro is the bullshit factor.

    Coro - the only real threads in perl

    Bullshit! See last element below.

    Unlike the so-called "Perl threads" (which are not actually real threads but only the windows process emulation (see section of same name for more details) ported to unix, and as such act as processes)

    As is this.

    They are "actually real threads": real, kernel-scheduled, core-concurrent execution contexts. (One use of which, on Windows, is to emulate fork.)

    And the segregated address space, despite the limitations, allows Perl threads to avoid the main "problem" with threads in other languages: that of unintentional sharing of thread-scoped data.
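    A minimal sketch of what that segregation means in practice (nothing is shared between ithreads unless you explicitly ask for it via threads::shared):

        use strict;
        use warnings;
        use threads;
        use threads::shared;

        my $private = 'unshared';    # each thread gets its own copy of this
        my $shared : shared = 0;     # sharing has to be requested explicitly

        my @kids = map {
            threads->create(sub {
                $private = "changed by thread " . threads->tid();  # invisible to other threads
                lock($shared);
                $shared++;                                         # visible to all threads
            });
        } 1 .. 4;

        $_->join for @kids;

        print "private is still: $private\n";   # unchanged in the parent
        print "shared counter:   $shared\n";    # 4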

    A parallel matrix multiplication benchmark runs over 300 times faster on a single core than perl's pseudo-threads on a quad core using all four cores.

    As is this.

    See Re^6: If I am tied to a db and I join a thread, program chrashes for why the benchmark is a totally broken piece of marketing hype that serves little purpose beyond forwarding an agenda.

    In this module, a thread is defined as "callchain + lexical variables + some package variables + C stack),

    As is this.

    Contrast that definition of "thread" with that from wikipedia: In computer science, a thread of execution is the smallest unit of processing that can be scheduled by an operating system.

    My emphasis.

    And the first three words do not justify this flagrant mis-definition. It's like calling a stately home (or the White House) an aeroplane, because they both have wings.

    Coro's events don't scale across cores. And cores are now both ubiquitous and the only way to maintain Moore's Law going forward.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      Not to argue with your point, but I think if Coro's author had written "fiber" instead of "thread", there would be much less confusion now.

        Yes. That would be a perfect description of Coro's coroutines. Though "coroutines" is better, if for no other reason than it is a term that isn't associated with MS, as fibres (wrongly) are. It's also an older and very well understood term in CS circles.

        But I think that is to ignore the political aspect of the Coro pod. Anyone who includes vitriol like this in a module's documentation is obviously too far gone for rationality.

        A great many people seem to be confused about ithreads (for example, Chip Salzenberg called me unintelligent, incapable, stupid and gullible, while in the same mail making rather confused statements about perl ithreads (for example, that memory or files would be shared), showing his lack of understanding of this area - if it is hard to understand for Chip, it is probably not obvious to everybody).

        Especially as what Chip Salzenberg said is correct. At least as far as memory and file handles are concerned; they are shared within a single process space. Access is controlled and limited only at the language level; not the kernel or processor level. As for his remarks about the author, I don't know him so I couldn't make comment; but I suspect that anyone with even a cursory understanding of ithreads has long since drawn their own conclusions.

        The only lack of understanding regarding ithreads is clearly demonstrated by the author. Though I suspect his "confusion" is, at least in part, political rather than genuine.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why Coro?
by BrowserUk (Pope) on Jul 28, 2010 at 04:30 UTC

    To answer your second question.

    The main advantage Coro has is lighter-weight execution contexts.

    For your stated requirements, 3000 to 12000 threads, unless you are running on some massively parallel hardware, your problem must be IO-bound rather than CPU-bound. Even on one of AMD's 48-core motherboards, running 3000/12000 CPU-bound threads would involve so much context switching that you would waste a huge portion of your overall clock cycles.

    For IO-bound applications where you have (say) large numbers of concurrent TCP conversations that spend most of their time waiting for responses, Coro is probably your better bet.
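    As a rough sketch of that style -- assuming Coro and Coro::Socket are installed, and assuming each device speaks a simple line-based protocol on a made-up port -- each conversation gets its own cheap coroutine and yields to the others whenever it has to wait on its socket:

        use strict;
        use warnings;
        use Coro;           # lightweight coroutines
        use Coro::Socket;   # Coro-aware drop-in for IO::Socket::INET

        my @hosts = map { "10.0.0.$_" } 1 .. 100;   # made-up device addresses

        my @conversations = map {
            my $host = $_;
            async {
                # Blocking-looking IO: while this coroutine waits on the socket,
                # the scheduler runs the other conversations.
                my $sock = Coro::Socket->new(
                    PeerHost => $host,
                    PeerPort => 9000,        # made-up port
                    Timeout  => 30,
                ) or return;
                print $sock "READ\r\n";      # made-up request
                my $reply = <$sock>;
                # ... store $reply somewhere
            };
        } @hosts;

        $_->join for @conversations;   # wait for every conversation to finish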

    Though if you have multiple cores, your application will only be able to use one of them at a time, which may prove restrictive.

    Perhaps the best solution would be to run a Coro scheduler instance inside each of multiple threads--one per core--and split your conversations between them. This still has the limitation that only a single conversation within a given Coro instance can run at any one time, but it allows concurrency between conversations in different groups.

    It is also unclear whether the underlying libcoro library is thread-safe. If it is, ithreads will have no problem running those multiple instances, but it is a big if.

    I'd love to see a high-level description of the task that requires 3000 to 12000 threads.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Many thanks! I'll try my best to describe the task as clearly as I can, as soon as I can ;)

      There are 3000 devices (which will grow to 12000 in the near future) in a big area such as NY (just presume). ;) Every minute, each device stores an 8-digit decimal number (like 00004000) in its register. It always keeps only the newest value, which means the old value will be overwritten a minute later.

      So I intend to write a multi-method program running on the server side to get the data. But after reading your post, I now think I can just run the program on several machines and store everything in one database. Is that a better idea?

      Besides, the ways the devices connect to the server might be diverse: TCP, serial, modem, etc.

        s/sever/several/
        Every minute, each device stores an 8-digit decimal number (like 00004000) in its register.

        Can you clarify the definitions of "device" (type?), and "register" in this context?

        Specifically, will there be custom software running on each of these devices that receives the TCP requests, reads the "register" and replies with the value read?

        Or are the registers on the devices accessible directly via a TCP request? (Perhaps via firmware?)

        If the former is the case, then you could more simply have that custom software read the register and send it to the DB directly. The read and send cycle could be synchronised to the local hardware, and the DB & TCP take care of queuing the data at the DB end. Very simple and reliable.
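        A minimal sketch of that device-side read-and-send cycle, assuming DBI is available; read_register(), the DSN, the table and the device id are all made up for illustration:

            use strict;
            use warnings;
            use DBI;

            # Hypothetical: however the local firmware exposes the register.
            sub read_register { return sprintf '%08d', int rand 1e8 }   # placeholder value

            my $dbh = DBI->connect(
                'dbi:mysql:database=readings;host=db.example.com',      # made-up DSN
                'user', 'secret',
                { RaiseError => 1, AutoCommit => 1 },
            );
            my $sth = $dbh->prepare(
                'INSERT INTO readings (device_id, value, taken_at) VALUES (?, ?, NOW())'
            );

            while (1) {
                $sth->execute('dev-0001', read_register());   # made-up device id
                sleep 60;                                     # once per minute, matching the register
            }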

        If the latter, the problem of whether Coro alone is suitable comes down to whether a single thread can dispatch & retrieve, with context switches, to 12000 machines in 60 seconds. 12000/60 = 200. Assuming permanent connections, that's 200 writes, 200 reads and 400 context switches per second. On a Gigabit LAN maybe you'd be okay, but I think you'd be pushing your luck across a WAN or the internet. With 8 or 12 cores, it would probably be okay. Though I'd want more information about latencies to convince myself of that.

        Multiple Coro instances (processes), whether running on one or more multi-core machines, sounds like a possible solution. Though you still have the problem that if all the devices allocated to a particular instance decide to respond at the same instant, they will all be serialised through one core/machine, whilst all the others (potentially) stand idle. Ensuring that you poll every device once per minute reliably would require extensive testing.

        The upside is that you should be able to determine the maximum number of devices a single instance can reliably support during testing, and then split your devices into suitably sized groups. As the number of devices ramps up, you just start more processes. The downside is that you have to carry the overheads of running many more instances than are required for the average case, in order to allow for the occasional confluence spike.
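        A sketch of that splitting, using plain fork for brevity (each child could equally well start its own Coro scheduler over its slice); the group size and device addresses here are made up:

            use strict;
            use warnings;

            my @devices   = map { sprintf '10.0.%d.%d', int($_ / 250), $_ % 250 } 0 .. 11999;  # made up
            my $per_group = 400;    # whatever testing shows one instance can reliably support

            my @pids;
            while (my @group = splice @devices, 0, $per_group) {
                defined(my $pid = fork) or die "fork failed: $!";
                if ($pid == 0) {
                    poll_group(@group);    # each child runs its own polling loop over its slice
                    exit 0;
                }
                push @pids, $pid;
            }
            waitpid $_, 0 for @pids;

            sub poll_group {
                my @group = @_;
                # placeholder: connect to each device in @group, read its register
                # once per minute, and push the value to the database.
            }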

        If you have the opportunity to have the devices post their data to the DB directly rather than polling, you should definitely opt for that. But reading between the lines I suspect that you are talking about SNMP devices?

        If these are SNMP requests, then you should also look closely at the non-blocking request protocols in Net::SNMP. It may be all you need.
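        For example, a sketch of the non-blocking style from Net::SNMP, with a made-up OID, community string and device addresses, collecting all the replies from one event loop:

            use strict;
            use warnings;
            use Net::SNMP;

            my $oid   = '1.3.6.1.4.1.99999.1.1.0';       # made-up OID for the register
            my @hosts = map { "10.0.0.$_" } 1 .. 100;    # made-up device addresses

            for my $host (@hosts) {
                my ($session, $error) = Net::SNMP->session(
                    -hostname    => $host,
                    -community   => 'public',
                    -nonblocking => 1,
                );
                if (!defined $session) { warn "$host: $error\n"; next }

                $session->get_request(
                    -varbindlist => [ $oid ],
                    -callback    => sub {
                        my ($s) = @_;
                        if (defined $s->var_bind_list) {
                            my $value = $s->var_bind_list->{$oid};
                            print "$host => $value\n";   # or queue it for the database
                        }
                        else {
                            warn "$host: ", $s->error, "\n";
                        }
                    },
                );
            }

            snmp_dispatcher();   # run the event loop; the queued requests proceed concurrently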


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Why Coro?
by sundialsvc4 (Monsignor) on Jul 30, 2010 at 00:01 UTC

    If you attempt to run “3,000 threads,” let alone 12,000 ...

    "You're dead, Jim ..."Bones

    The number of threads should be determined by the number of requests that you can actually process “at the same time” on your hardware. It should be a variable number, and it should be fairly small.

    A small number of threads can very efficiently serve a large number of devices, on the presumption that “not every device will be sending data to us at the same instant.” There should be these thread pools:

    1. A very small number of threads (perhaps only one ...) that gathers the incoming requests as they arrive, and places them onto a queue.
    2. A somewhat larger pool of threads that services the queue. (They also note when the last request arrived from each device. Perhaps there is a “watchdog” thread that periodically looks for dead birds.)
    3. If necessary, a third small set of threads that sends acknowledgment responses back to the devices, unless the first-pool threads can also handle this duty.

    You can easily see how this works, and how it will be easily tunable. We need to gather requests with an adequate level of latency, and to know if a device is dead, so that's what the first (and third) pools do. Then, we need to be sure that the requests can be processed effectively once received, without bottlenecks, and this is what the second group does. Because of the presence of the queue, nothing will get out of hand.
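    A skeletal sketch of that queue-and-pool shape, using threads and Thread::Queue; the worker count and the dummy readings are placeholders, and the real gatherer would of course read from the network rather than invent values:

        use strict;
        use warnings;
        use threads;
        use Thread::Queue;

        my $queue     = Thread::Queue->new;
        my $N_WORKERS = 10;    # sized to the hardware, not to the number of devices

        # Pool 2: a modest pool of worker threads servicing the queue.
        my @workers = map {
            threads->create(sub {
                while (defined(my $job = $queue->dequeue)) {
                    my ($device, $value) = split /\|/, $job;
                    # ... validate the reading, store it, note the arrival time, etc.
                }
            });
        } 1 .. $N_WORKERS;

        # Pool 1: the gatherer -- here just simulated with dummy readings.
        for my $n (1 .. 1000) {
            $queue->enqueue(sprintf 'device-%04d|%08d', $n, int rand 1e8);
        }

        $queue->enqueue(undef) for 1 .. $N_WORKERS;   # one "stop" marker per worker
        $_->join for @workers;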

    Also note that there are many CPAN packages which are already built to implement this sort of thing, because it is a very common scenario. (Heck, it dates all the way back to IBM's “CICS” product for the earliest mainframes.) Never redo a thing that has already been done... and it is very easy to find yourself doing exactly that.
