
Re: Fork vs pThreads

by sundialsvc4 (Abbot)
on Oct 21, 2013 at 14:43 UTC ( #1059125=note )

in reply to Fork vs pThreads

What BrowserUK is saying about “only 4 at a time” is anything but “an aside.”   It’s the key to the whole thing.

Consider what you see going on every day in a fast-food joint.   There’s a certain number of workers, and all of them are working on a queue of incoming food orders.   If 1,000 orders suddenly come pouring in, then the queues will get very long, but the kitchen won’t get overcrowded.   The number of workers in the kitchen, and each of their assigned tasks, is set to maximize throughput, which means that all the workers are working as fast as they can and that they are not competing with one another for resources.   The restaurant doesn’t lose the ability to do its job ... it just takes (predictably!) longer.   (And they can tell you, within a reasonably accurate time-window, just how long it will take.)

The loss of throughput, furthermore, isn’t linear:   whatever the ruling-constraint actually is, the degradation is sharply non-linear.   If you plot the average completion-time as the y-axis on a graph, where the “number of simultaneous processes” is x, the resulting curve has an elbow-shape:   it gradually gets worse, then, !!wham!! it “hits the wall” and never recovers.   If you plot “number of seconds required to complete 1,000 requests” as the y-axis, the lesson becomes even clearer.   You will finish the work-load faster (indeed, you will complete the work at all ...) by limiting the number of simultaneous workers, whether they be processes or threads.

The number-one resource of contention is always:   virtual memory.   “It’s the paging that gets ya,” and we have a special word for what happens:   “thrashing.”   But any ruling-constraint can cause congestive collapse, with similarly catastrophic results.
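The “fixed number of workers, unbounded queue of orders” idea above can be sketched in Perl with plain fork() and waitpid(). This is a minimal illustration, not code from the thread: `$MAX_WORKERS`, `@jobs`, and `process_job()` are made-up names, and you would cap the pool at roughly your core count.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $MAX_WORKERS = 4;          # the size of the "kitchen"
my @jobs        = (1 .. 20);  # the incoming "orders"
my $running     = 0;

for my $job (@jobs) {
    # Kitchen full?  Block until one child (worker) finishes.
    if ($running >= $MAX_WORKERS) {
        waitpid(-1, 0);
        $running--;
    }
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        process_job($job);    # child handles exactly one order, then leaves
        exit 0;
    }
    $running++;
}

# Reap the stragglers so no zombies are left behind.
while ($running > 0) {
    waitpid(-1, 0);
    $running--;
}

sub process_job {
    my ($job) = @_;
    # real work would go here
}
```

No matter how many jobs arrive, at most `$MAX_WORKERS` children ever exist at once; the rest of the work simply waits its turn in `@jobs`.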

Re^2: Fork vs pThreads
by ThelmaJay (Novice) on Oct 21, 2013 at 15:20 UTC

    Thank you so much for your explanation :) It means a lot to me. So just to see if I understood.

    By launching 50, it means that each of the 4 cores ends up with a queue of roughly 12 streams, and only 4 run simultaneously.

    Because the queue is long, and gets longer because I'm always adding more in each while cycle, the available throughput, CPU and memory get smaller, causing a bottleneck.

    Did I get it? :)

    So the fact that one stream is bigger than another does not have an impact?

      To abuse the fast food analogy where employees are threads, starting a new thread also involves going through HR paperwork before the new thread can do their task. (You really want the task to be more than making a single burger for customer #42 before retiring too)

      Your quad-core restaurant requires a bit of time for one employee to save all their tools away before someone else can change context and use one of the four stations.

      And once you run out of physical ram/floor space for threads to stand in, then you've got to use a bus to swap employees in and out which is horrifyingly slow.

        Thank You :)

      Actually ... no!   :-)

      As you undoubtedly know, a “process” (or thread ...) has absolutely nothing to do with “a core.”   It is (so much for your, ahem, too-thin attempt at humor at my expense ...) just “an employee.”   This employee (no matter what core(s) (s)he happens to get dispatched upon) “finds work to do, and does it, and in so doing remains just as busy as (s)he possibly can be.”   As do all the other employees in the grease-shack.

      Thanks to the existence of a queue, of a “to-do list,” our intrepid employee will never become overwhelmed, no matter how many tour-buses full of hungry folks show up in the drive-thru.   And this is the aforementioned “key.”   There are only so many square feet of floor-space in the kitchen, therefore only so many burgers that can be cooked at a time.   This will never change, no matter how many burgers are ordered.   “The optimal burger-throughput,” for this particular restaurant, is therefore a constant ... and the same thing is true of your computing facility.

      So, to serve however many customers there may be in the least amount of time, pay attention only to the conditions in the kitchen ... not the lobby.   Do not over-commit the kitchen.   Instead, parcel out the workload (whatever it is ...) in such a way as to maintain full utilization of the physical resources, but nothing more.   Yes, customers will have to wait, but they are accustomed to that, as long as the wait-time feels predictable.   (Furthermore, it is necessary(!) for them to wait, if they are all to be served in the least amount of time.)

      Do not allow the employees to compete with each other.   Do not over-commit the deep-fat fryer.   Do not permit the order-completion time to become, “due to resource contention,” longer than it would be if the restaurant were nearly empty.   If it should come to that, keep the hungry-folks outside.   Do not let them through the doorway unless you know that you can serve them consistently.
