Adding one more thought (in case I missed it) to BrowserUK’s excellent recommendations here ... (++)x2 ... it is also a very good idea to create a pool of worker threads whose sole purpose is to wait for an incoming message on a shared queue that they all read, and then to carry out that request. (Usually, having done so, a worker writes a record to an outbound queue so that the result can be sent back to the client, either by the “reader” thread or by another one.)
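For anyone who wants to see the shape of it, here is a minimal sketch of such a pool. It is in Python rather than Perl purely for illustration, and every name in it (`NUM_WORKERS`, `STOP`, `handle`, `run_pool`) is made up for the example; the "request" is just a number to be squared, standing in for whatever the real work is.

```python
import queue
import threading

NUM_WORKERS = 4      # hypothetical pool size, chosen up front
STOP = object()      # sentinel that tells a worker to exit


def handle(request):
    # Placeholder for the real work; here a request is just a number.
    return request * request


def worker(inbound, outbound):
    # Every worker reads the same inbound queue and writes to the same outbound queue.
    while True:
        request = inbound.get()          # blocks until a request arrives
        if request is STOP:
            break
        outbound.put((request, handle(request)))


def run_pool(requests, num_workers=NUM_WORKERS):
    inbound, outbound = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbound, outbound))
               for _ in range(num_workers)]
    for t in threads:
        t.start()
    for r in requests:
        inbound.put(r)
    for _ in threads:                    # one sentinel per worker
        inbound.put(STOP)
    for t in threads:
        t.join()
    # All workers have finished, so the outbound queue can be drained safely.
    results = []
    while not outbound.empty():
        results.append(outbound.get())
    return results
```

Calling `run_pool(range(10))` hands back ten `(request, result)` pairs in whatever order the workers happened to finish. In a real server the outbound queue would be drained by the "reader" thread (or another one) while the workers keep running, rather than after they have all stopped.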
The number of threads is fixed, and is independent of the number of requests outstanding at any one time, so the system never attempts to bite off more than it can chew ... any requests beyond what the workers can handle just have to wait in line for a bit, but they do so cheaply. The overhead of setting up and tearing down a process or thread is also greatly reduced, because the workers are created once and then reused. You can pretty much set your watch by how many requests per second such a system will be able to churn through, no matter whether its waiting room is full or empty.
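The "never bites off more than it can chew" part is easy to check for yourself. Here is a small sketch (Python again, for illustration only; `measure_peak` and the 10 ms sleep are invented for the example) that throws a pile of requests at a fixed pool and records the largest number that were ever being worked on at the same moment.

```python
import queue
import threading
import time


def measure_peak(num_requests, num_workers):
    """Return the highest number of requests that were in progress at once."""
    inbound = queue.Queue()
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def worker():
        while True:
            item = inbound.get()
            if item is None:             # None is the stop signal here
                break
            with lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(0.01)             # stand-in for the real work
            with lock:
                state["active"] -= 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for i in range(num_requests):        # everything else waits in the queue
        inbound.put(i)
    for _ in threads:
        inbound.put(None)
    for t in threads:
        t.join()
    return state["peak"]
```

However many requests are queued, `measure_peak(50, 3)` can never return more than 3, because only three threads exist to pick work up. The other requests sit in the queue, which costs next to nothing.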