PerlMonks  

Re^11: Your main event may be another's side-show. (Coro)

by tye (Sage)
on Oct 22, 2010 at 01:01 UTC


in reply to Re^10: Your main event may be another's side-show. (Coro)
in thread Your main event may be another's side-show.

Fiddling around trying to intersperse your linear algorithms with sufficient cede points to ensure that they all get a fair shout at the cpu

Well, there you misunderstand Coro. No such fiddling is required. You are thinking of your prior experience with cooperative multi-tasking operating systems, not of using cooperative multi-tasking within a process that is part of a modern operating system. And Coro provides a way to do asynchronous handling of blocking operations (mainly I/O) that is even less disruptive to the code than using things like select.
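For what it's worth, the style tye describes can be sketched in a few lines. This is an illustrative sketch only, assuming the CPAN Coro and Coro::Socket modules are installed; the host names and the bare HTTP/1.0 request are made up for the example:

```perl
use strict;
use warnings;
use Coro;          # CPAN: cooperative threads for Perl
use Coro::Socket;  # CPAN: IO::Socket::INET work-alike whose I/O cedes

# The handler is ordinary linear code. Whenever a read or write would
# block, Coro::Socket suspends only this coroutine and the scheduler
# runs another ready one -- no hand-placed cede() calls in the logic.
sub fetch {
    my ($host) = @_;
    my $sock = Coro::Socket->new( PeerHost => $host, PeerPort => 80 )
        or die "connect $host: $!";
    print $sock "GET / HTTP/1.0\r\nHost: $host\r\n\r\n";
    local $/;                  # slurp mode
    my $response = <$sock>;    # blocks this coroutine only
    return length $response;
}

my @handlers = map { my $h = $_; async { fetch($h) } }
               qw(example.com example.org);
printf "%d bytes\n", $_->join for @handlers;
```

The point is that `fetch` reads like plain single-threaded blocking code; the ceding happens inside the I/O layer, not in the application logic.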

The systems I'm talking about are large. The code for them starts out in the 10's of thousands of lines and I don't have permission to post that code.

The problems with process overhead were not about "running 4 or maybe 8" instances of the Perl interpreter. They came from having banks of dozens of computers, each dedicated to running many dozens of instances of the Perl interpreter, and then having too many processes idle for too long. These large computers either ran out of memory or requests backed up (the systems were properly configured to prevent running out of memory, but that just meant we ran out of available Perl instances).

Yes, we could probably reduce the number of servers required per unit of work by writing in assembler. But we don't want to do serious development of server software in assembler, we don't want to hire a much larger number of assembly programmers to replace a much smaller number of Perl programmers, and we certainly don't want to wait much longer for each feature to be ready to deploy (worse still, to put up with the much higher bug density such low-level coding would likely produce).

The point of worrying about process overhead is that it became a significant portion of the resources being used, could increase dramatically with quite tiny changes in response time from external services (such as a database), and left the servers' CPUs mostly idle because the memory consumed by process overhead would sometimes swamp all other resource requirements several-fold.

If you can't understand that without a piece of code for you to run and so choose to assume that it is overblown raving, I don't really care.

The process memory overhead scaled as "number of requests * duration of request". Since, during typical operation, most things happened in small fractions of seconds, the "duration" multiplier was not much of a problem.

When the simplest of temporary problems leads to requests to external services averaging 0.8 seconds instead, that shouldn't be a big deal. But it becomes one when it means that each process, with the cache of information that makes it so efficient, goes from spending a small percentage of its life waiting to spending most of its life waiting.

Suddenly the number of required processes is multiplied by 10x or 30x and yet nothing serious is wrong. If the per-process overhead weren't killing things, customer requests would simply be handled in 1.8 seconds instead of in 1.0 second.
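tye's arithmetic here is Little's law: resident handlers ≈ arrival rate × time each request stays in the system. With made-up but plausible numbers (500 requests/second, 0.02s of CPU per request), a jump in external wait from 0.01s to 0.8s multiplies the handler count by roughly 27 even though the CPU work is unchanged:

```perl
use strict;
use warnings;

# Little's law: concurrent handlers needed ~= arrival rate * residence time.
# All numbers below are illustrative, not tye's actual figures.
my $rate = 500;    # requests per second arriving at one box (assumed)
my $cpu  = 0.02;   # seconds of actual CPU work per request (assumed)

for my $wait (0.01, 0.8) {   # seconds spent waiting on external services
    my $handlers = $rate * ( $wait + $cpu );
    printf "wait %.2fs -> ~%.0f resident handlers\n", $wait, $handlers;
}
# ~15 handlers at 0.01s wait, ~410 at 0.8s wait: a ~27x jump.
# With fork/iThreads each handler costs a full interpreter's memory;
# with Coro each extra handler costs only a small coroutine stack.
```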

The data specific to a given request is tiny compared to the per-process overhead of the interpreter and the data cached hither and yon at multiple layers in the code.

iThreads manages to share the memory for executable code (including in shared libraries) and for compiled Perl code (the op-node tree or whatever people want to call it). iThreads doesn't share the CPU cycles used to build that cached data. fork() shares everything that iThreads does. fork() also (to some extent) shares read-only data instantiated in the parent (but not well enough) and also shares the CPU used to populate it.

Coro shares all of the above but also shares the cached data perfectly, shares the Perl interpreter itself, and only needs a relatively small per-stack set of data that isn't shared. The delta memory requirement per request is a tiny fraction of that required by a large server app using fork() or emulated fork() (iThreads). Creating a new request handler is also a tiny fraction of the work with Coro.
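A sketch of the sharing claim (assuming the CPAN Coro module; `%cache`, `expensive_load`, and the worker count are invented for illustration): every coroutine lives in the one interpreter, so cached data is shared by simple reference and the expensive build happens exactly once:

```perl
use strict;
use warnings;
use Coro;   # CPAN module

my %cache;
my $loads = 0;
sub expensive_load { $loads++; return { built => 1 } }  # stands in for real cache warming

my @workers = map {
    async {
        # One interpreter, one copy: all coroutines see the same %cache.
        # The first one to get here pays the cost; the rest reuse it.
        $cache{config} //= expensive_load();
    };
} 1 .. 1000;
$_->join for @workers;

printf "cache built %d time(s) for %d handlers\n", $loads, scalar @workers;
```

With fork() each child would rebuild (or copy-on-write away from) that cache per process; with iThreads the data is cloned into each thread; here there is genuinely one copy.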

So Coro means that all of the resources nearly scale relative to the number of requests per unit time, instead of having to scale the lion's share of memory required by "number of requests per unit time" * "length of time a request takes to finish" (plus avoiding the overhead of spinning up more handlers to join the pool).

The point is not relative consumption of different approaches but how the resource requirements scale.

comms. servers excluded

I'm not sure what qualifies under that rubric for you. I would think a SIP router would, but it isn't a case where Coro is a big win (because nobody implements a SIP router using a single thread of execution per call). But the vast majority of server code I deal with, covering a fairly wide variety of functions, might, because most of it can significantly benefit from a coroutines approach.

If that escape clause means that you are only interested in purely computation-bound operations, then, yes, I don't find scaling those to be nearly as interesting a problem, so have fun with that. It is also almost never the problem I'm facing at work.

- tye        


Re^12: Your main event may be another's side-show. (Coro)
by BrowserUk (Patriarch) on Oct 22, 2010 at 02:12 UTC
    Well, there you misunderstand Coro. No such fiddling is required. You are thinking of your prior experience with cooperative multi-tasking operating systems, not of using cooperative multi-tasking within a process that is part of a modern operating system.

    And there it is. Magic bullet claims, and excuses for why you can't demonstrate it.

    A coroutine program that never yields (Coro code that doesn't cede) is not cooperative-anything nor multi-anything. It's a single-tasking process, and you (well, I at least) don't need coroutines to write those!

    The moment you put two coroutines into the same program (and the clue that this is the norm is the "Co" part--you can't have co-operation with only a single routine), they have to yield periodically, otherwise only one of them ever does anything.
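    This point is easy to demonstrate for purely CPU-bound coroutines (again assuming the CPAN Coro module; the loop bodies are trivial placeholders). Remove the cede calls below and the first coroutine runs to completion before the second ever starts:

```perl
use strict;
use warnings;
use Coro;   # CPAN module

my @order;
my $first  = async { for my $i (1 .. 3) { push @order, "A$i"; cede } };
my $second = async { for my $i (1 .. 3) { push @order, "B$i"; cede } };
$_->join for $first, $second;

print "@order\n";   # interleaved (A1 B1 A2 ...) only because of cede
```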

    I didn't ask you to post megabytes of your work code. But if you have the time to write the above post, you certainly have time to code a simple demonstration of the basic control and dataflows. At least you would if you used threads, but maybe Coro code is so complicated it really would take lots of effort?

    But that is the history of this debate. Always ready with the words, but never the code.

    If that escape clause means that you are only interested in purely computation-bound operations,

    No, it doesn't. It means that I recognise that there are some applications for which threads are not the best option. And large fan-out, autonomous communications servers are one such application. But I also recognise that only a small percentage of applications fit that scenario.

    And the large majority of applications that come up here involve a mix of IO-bound and cpu-bound tasks, which threading accommodates easily where event-driven frameworks don't. So, for your average punter here seeking to hive off a little cpu-bound processing whilst remaining responsive to other things, or seeking to cut his runtime by utilising his multiple cores for cpu-intensive algorithms on a large dataset, threads are a far simpler option than the often suggested (but never demo'd) event-driven framework behemoths.

    Why are you, and many like you, so scared of comparing like with like?


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      And there it is. Magic bullet claims

      *sigh* You misunderstand again.

      Why are you, and many like you, so scared of comparing like with like?

      Yes, I'm quite terrified. *plonk*

      Have fun.

      - tye        

        *sigh* You misunderstand again.

        If I misunderstand--and I don't think I do--it's because your descriptions (despite their verbosity(*)) are lacking.

        But then, we all know the best description of code, is the code itself.

        (*) If you left out the attempts at snide put-downs, you'd perhaps make a better fist of your descriptions.

        PS. This is how quick it is to knock up a PoC.


