Truer words have never been spoken than what BrowserUK just said: the key is to limit the number of threads that are active at any one time, and to separate this from the
amount of work that they have to do. Even if you receive, say, 1,000 start messages in the space of one millisecond, you must not therefore attempt to launch 1,000 threads. Instead, some of those requests will briefly have to wait their turn.
You should have a pool of “worker threads,” of some configurable size that will not exceed this system’s capacity. All of those threads, however many there may be, should be waiting for a start message to arrive on a common, thread-safe queue. Meanwhile, your main thread, instead of starting a new thread each time such a message arrives, should simply post the message to that queue. When a message arrives for a worker, it carries out the work, then waits again. (The main thread is the only “writer.” All of the workers are “readers.”) If any worker is waiting, it will get the message without delay. If all of the workers are busy, the message will briefly sit in the queue until some worker can receive it. Either way, the total number of threads that the operating system is being asked to support — and to provide memory for — will never grow beyond the prescribed limit, no matter how full the queue momentarily gets.
And there’s already a lot of CPAN code out there to help you, so be sure not to re-invent the wheel here: Thread::Pool, Thread::Queue, Thread::Signal, and so on.
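Here is a minimal sketch of that arrangement using the core threads, threads::shared, and Thread::Queue modules. The pool size, the job count, and the shared counter standing in for “the work” are all illustrative choices, not part of any particular API:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Queue;

my $POOL_SIZE = 4;                       # configurable; never exceeded
my $queue     = Thread::Queue->new;      # the common, thread-safe queue
my $handled :shared = 0;                 # stand-in for the real work

# Spawn the fixed pool of workers up front; each blocks in dequeue().
my @workers = map {
    threads->create(sub {
        while (defined(my $job = $queue->dequeue)) {
            # ... service request $job here, then loop back and wait ...
            { lock $handled; $handled++ }
        }
    });
} 1 .. $POOL_SIZE;

# Main thread: the only writer. It posts messages; it never spawns.
$queue->enqueue($_) for 1 .. 20;

# Shutdown: end() makes dequeue() return undef in every worker
# (Thread::Queue 3.01+; older versions can enqueue one undef per worker).
$queue->end;
$_->join for @workers;

print "handled $handled jobs with $POOL_SIZE threads\n";
```

However many messages pile up, only $POOL_SIZE operating-system threads ever exist; the backlog lives cheaply in the queue instead.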
You should, of course, design the threads so that they dispose of all storage used to service their latest request before they go back to sleep waiting for another request to arrive. If you take care to do that, Perl’s very clever memory manager should automagically take care of the rest. Yes, the memory size of the parent process may grow to be large, but it should not
grow without bound, and it should not leak.
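The easiest way to get that disposal for free is to hold each request’s state in lexicals scoped to a single pass of the worker loop: when they go out of scope, their reference counts drop and the memory is reclaimed before the thread blocks again. A sketch, with a made-up do_work handler and job list for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical per-request handler: everything it allocates lives in
# lexicals, so it is all released when the sub returns.
sub do_work {
    my ($job) = @_;
    my %scratch = (job => $job, buffer => 'x' x 4096);   # per-request storage
    return "job $job used " . length($scratch{buffer}) . " bytes";
    # %scratch goes out of scope here; its memory is reclaimed
}

# Stand-in for one worker's loop: one iteration per dequeued job.
for my $job (1 .. 3) {
    my $reply = do_work($job);   # $reply is freed at the end of each pass
    print "$reply\n";
}
```

Nothing allocated for one request survives into the next iteration, so the process size reflects the work in flight, not the work already done.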