Re: The problem with "The Problem with Threads"

by sundialsvc4 (Monsignor)
on Jul 18, 2014 at 15:55 UTC (#1094223)


in reply to The problem with "The Problem with Threads"

Good thoughts.   Very.

The most common misuse of threads that I see is the “flaming arrows strategy.”   For each request, light another flaming arrow (a thread), and shoot it into the air to land where and how it may, and then die.   This is sure to overwhelm almost any scheduler, which has no real way to distinguish one unit of work from another.   It also burdens the system with the setup and teardown of a thread or process for every request, which is often expensive.

A far better (and far more scalable) approach is to do what’s done in, say, any restaurant:   have a pool of workers who shift their attention among a greater number of active orders, which flow station by station through a flexible and well-defined lifecycle.   The workers can be generalists, or specialists, or sometimes both, and the allocation can be adjusted at any time.   A system built this way from threads or processes is very aware of exactly what business problem it is constructed to solve.   When a unit of work is obliged to wait, a worker is not.   Instead, the order is briefly “parked.”   Workers live to a ripe old age.   If a single work-unit needs to go to several stations at once (burger, fries, milkshake), it might be opportunistically serviced by three workers simultaneously.   Commitments regarding service level are engineered into the system, so that you can say (and measure) that “95% of the time, all orders will be prepared and served to the customer within 3 minutes.”   Concurrency is used to address the problem, but there is not a one-to-one correspondence between workers and work.   There is a dedicated management role, separate from any worker, which is constantly “riding the faders” to keep everything in balance.
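A minimal sketch of the pooled approach described above, in Python purely for illustration. The names `run_kitchen`, `pending` and `cook-N` are hypothetical, and the "work" done per order is a placeholder; the point is only that a small, fixed set of long-lived workers drains a larger stream of orders.

```python
import queue
import threading


def run_kitchen(orders, num_workers=3):
    """Process every order with a fixed pool of long-lived workers.

    Hypothetical illustration: returns (order, worker_name) pairs so the
    caller can see which worker handled which order.
    """
    pending = queue.Queue()        # orders wait here, workers do not
    done = []
    done_lock = threading.Lock()

    def worker():
        while True:
            order = pending.get()
            if order is None:      # sentinel: the shift is over
                pending.task_done()
                return
            # Placeholder for the real work on this order.
            with done_lock:
                done.append((order, threading.current_thread().name))
            pending.task_done()

    workers = [threading.Thread(target=worker, name=f"cook-{i}")
               for i in range(num_workers)]
    for w in workers:
        w.start()
    for order in orders:           # many orders, few workers
        pending.put(order)
    for _ in workers:              # one sentinel per worker
        pending.put(None)
    for w in workers:
        w.join()
    return done
```

Calling `run_kitchen(range(20), num_workers=3)` handles twenty orders without ever having more than three worker threads alive, which is the "no one-to-one correspondence between workers and work" idea in its simplest form.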

There are plenty of good workload-management systems, including some that are designed to share work across a computing cluster, and others that are designed to adapt themselves to changing workload mixes and resource-constraint pressures.   The goal is to find the ever-changing “sweet spot” in which a maximum number of units of work are being processed, in the least amount of time, without creating traffic jams at any point (including the OS scheduler itself).   The principles used are simply taken from the real world of human and industrial processes.


Re^2: The problem with "The Problem with Threads"
by BrowserUk (Pope) on Jul 20, 2014 at 06:36 UTC

    Almost every sentence in that diatribe is wrong.

    I can't be bothered to explain why any more cos you'll only regurgitate it back to me in the wrong context a week or two from now when you've forgotten where you read it and what it meant.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

      Well, that would be just a collective waste of time, don’t you think?

      Threading is an important and useful technique, and yet it is misused by people who equate “thread” with “unit of work.”   You cannot control (much ...) how units of work are introduced into your system, but you can – and must – control how the system goes about doing them.   Any mechanism must have a governor, and a throttle.
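One way to picture a governor, sketched in Python under stated assumptions: the `Governor` class and `demo` function are hypothetical names, and the twelve threads in `demo` merely stand in for arrivals that the system cannot control. However many requests show up, the semaphore admits only `limit` of them at a time and the rest wait their turn.

```python
import threading
import time


class Governor:
    """Admit at most `limit` units of work at a time; the rest wait."""

    def __init__(self, limit):
        self._slots = threading.BoundedSemaphore(limit)
        self._lock = threading.Lock()
        self.active = 0            # jobs currently running
        self.peak = 0              # highest value `active` ever reached

    def run(self, job, *args):
        with self._slots:          # blocks while `limit` jobs are in flight
            with self._lock:
                self.active += 1
                self.peak = max(self.peak, self.active)
            try:
                return job(*args)
            finally:
                with self._lock:
                    self.active -= 1


def demo(requests=12, limit=3):
    """Fire `requests` arrivals at once and report the peak concurrency."""
    gov = Governor(limit)
    threads = [threading.Thread(target=gov.run, args=(time.sleep, 0.01))
               for _ in range(requests)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return gov.peak
```

The value returned by `demo` never exceeds `limit`, regardless of how many requests arrive together.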

      The best example I encountered of this, decades ago, was with an engineering computer at our college.   This machine could do a mechanical-engineering analysis job in about a minute and a half, if one person at a time was doing it, and if absolutely nothing else was going on.   With three such interactive sessions, the time jumped to five minutes.   With four, nineteen.   With six, it took five hours.   With nine, thirteen hours.   (And all of this assuming that the machine was never doing anything else, which was not a valid assumption.)   A classroom full of engineering students could not do their homework, nor could anyone else do anything at all.   Even as IBM salivated at the thought of up-selling to a much bigger box, a very simple solution was found:   run the program in a batch system that never tried to run more than three of these jobs at one time.   Also, set tuning rules in the batch monitor to represent (and enforce) a service-level commitment with regard to this (dedicated) class of job.   Problem solved.   The performance had exhibited the classic, elbow-shaped “hit the wall” curve indicative of thrashing, and the solution was to constrain the workload to stay back from that elbow.   We could commit to a “less than five minutes” promise, and keep it.   IBM never got to sell us more hardware, and a few years later it was all replaced with a VAX.
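Using only the timings quoted above, the arithmetic behind the three-job limit can be sketched as follows. The function name `batched_minutes` is hypothetical, and the model assumes (the post does not say this) that groups of three run back to back and that a partly filled final group is charged the full five minutes.

```python
import math

# Elapsed minutes reported in the post, keyed by simultaneous sessions.
OBSERVED_MINUTES = {1: 1.5, 3: 5, 4: 19, 6: 5 * 60, 9: 13 * 60}


def batched_minutes(jobs, cap=3):
    """Minutes to clear `jobs` when at most `cap` run at any one time.

    Assumption: each group of up to `cap` jobs takes the time the post
    reports for `cap` simultaneous sessions, and groups run back to back.
    """
    return math.ceil(jobs / cap) * OBSERVED_MINUTES[cap]
```

Under those assumptions, nine jobs run three at a time clear in `batched_minutes(9)` minutes, that is three groups of five minutes, against the `OBSERVED_MINUTES[9]` figure of thirteen hours when all nine run at once.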

      If you look back through the archives here, or at any forum, you will find frequent questions from people who are trying to run “large” work ... district-wide reports, say ... directly from a web page.   Even when the CGI timeout is set to “never time out,” the lack of a governor or a throttle causes this design to topple over in production.   Any system is doomed to try to do whatever it is asked to do, even when it can’t.
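A hedged sketch of the alternative, in Python: the web request only enqueues the report and returns at once, and a bounded backlog lets the system refuse work it cannot take on. `ReportQueue`, `submit` and `next_job` are hypothetical names, not part of any CGI or web framework, and the background worker that would call `next_job` is left out.

```python
import itertools
import queue


class ReportQueue:
    """Bounded backlog of report jobs sitting between web page and worker."""

    def __init__(self, max_backlog):
        self._jobs = queue.Queue(maxsize=max_backlog)
        self._ids = itertools.count(1)

    def submit(self, report_name):
        """Called from the web handler; never blocks.

        Returns a job id the page can poll later, or None when the
        backlog is full and the request has to be turned away.
        """
        job_id = next(self._ids)
        try:
            self._jobs.put_nowait((job_id, report_name))
        except queue.Full:
            return None
        return job_id

    def next_job(self):
        """Called by a background worker; returns None when idle."""
        try:
            return self._jobs.get_nowait()
        except queue.Empty:
            return None
```

With `max_backlog=2`, a third `submit` call returns `None` until a worker has taken a job off the queue, so the page can say "try again later" instead of toppling over.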

      My points are valid, and they don’t dispute your interesting and thorough essay, which by the way I upvoted.

        The best example I encountered of this, decades ago

        That sums up your "knowledge"! Outdated, misunderstood, regurgitated, parrot learnt garbage.


