in reply to The problem with "The Problem with Threads"
Good thoughts. Very.
The most common misuse of threads that I see is the “flaming arrows strategy.” For each request, light another flaming arrow (a thread), and shoot it into the air to land where and how it may and then die. This is sure to overwhelm almost any scheduler, which also has no real way to distinguish one unit of work from another. It also burdens the system with the setup and teardown of threads or processes, which is often expensive.
A far better (and far more scalable) approach is to do what’s done in, say, any restaurant: have a pool of workers who shift their attention among a greater number of active orders, which flow station by station through a flexible and well-defined lifespan. The workers can be generalists, or specialists, or sometimes both, and the allocation can be adjusted at any time. A system built this way using threads/processes is very aware of exactly what business problem it is constructed to solve. When a unit of work is obliged to wait, a worker is not. Instead, the order is briefly “parked.” Workers live to a ripe old age. If a single work-unit needs to go to several stations at once (burger, fries, milkshake), it might be opportunistically serviced by three workers simultaneously. Commitments regarding service level are engineered into the system, so that you can say (and measure) that “95% of the time, all orders will be prepared and served to the customer within 3 minutes.” Concurrency is used to address the problem, but there is not a one-to-one correspondence between workers and work. There is a dedicated management role, separate from any worker, which is constantly “riding the faders” to keep everything in balance.
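To make the restaurant picture a bit more concrete, here is a rough Python sketch of it, not a production design. It assumes the standard-library ThreadPoolExecutor as the worker pool. The station names, the 10 ms sleep standing in for real work, and the helper names (prepare, serve_orders, percentile) are all made up for illustration. Orders that no worker is free to handle yet simply sit "parked" in the executor's internal queue.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stations; each order needs something from all three.
STATIONS = ("burger", "fries", "milkshake")


def prepare(order_id, station):
    """One station's share of one order. The sleep stands in for real work."""
    time.sleep(0.01)
    # Report when this piece finished and which worker thread handled it.
    return time.monotonic(), threading.current_thread().name


def serve_orders(num_orders, num_workers=4):
    """Push num_orders through a fixed pool of long-lived workers.

    Returns (latencies, workers_seen): per-order latency in seconds,
    and the set of worker thread names that did any work.
    """
    latencies = {}
    workers_seen = set()
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        submitted = {}
        for order_id in range(num_orders):
            start = time.monotonic()
            # Fan one order out to every station; whichever workers are
            # free pick the pieces up, possibly at the same time.
            futures = [pool.submit(prepare, order_id, s) for s in STATIONS]
            submitted[order_id] = (start, futures)
        for order_id, (start, futures) in submitted.items():
            results = [f.result() for f in futures]
            finished = max(t for t, _ in results)  # order is done when its last piece is
            latencies[order_id] = finished - start
            workers_seen.update(name for _, name in results)
    return latencies, workers_seen


def percentile(values, pct):
    """Nearest-rank percentile; pct is an integer from 1 to 100."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


if __name__ == "__main__":
    latencies, workers = serve_orders(num_orders=20, num_workers=4)
    p95 = percentile(list(latencies.values()), 95)
    print(f"{len(latencies)} orders, {len(workers)} workers, p95 = {p95:.3f}s")
```

Twenty orders (sixty pieces of work) go through at most four workers, and the 95th-percentile figure is the kind of number a service-level commitment would be checked against. What this sketch leaves out is the management role: nothing here adjusts num_workers or reassigns workers while the system is running.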
There are plenty of good workload-management systems, including some that are designed to share work across a computing cluster, as well as those that are designed to adapt themselves to changing workload mixes and resource constraints. The goal is to find the ever-changing “sweet spot” in which a maximum number of units of work are being processed, in the least amount of time, without creating traffic jams at any point (including the OS scheduler itself). The principles used are simply taken from the real world of human and industrial processes.