...the OS has to allow the individual process to manage its own threads in a very lightweight way.
Win32 already has this ability; it calls them fibers.
As well as being ideal for creating pools of "units of execution", they lend themselves to various other useful programming techniques.
For example, you can make any two pieces of code act as coroutines almost trivially.
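To make the coroutine point concrete, here is a rough sketch. It is Python rather than Perl, and it uses generators as a stand-in for fibers, not the Win32 fiber API itself. The names `producer`, `consumer` and `run` are made up for illustration. The idea it shows is the same, though: each routine runs until it voluntarily hands control to the other, and picks up exactly where it left off when it is resumed.

```python
# Illustrative only: Python generators stand in for fibers here.
# Each generator is a "unit of execution" that runs until it
# voluntarily yields, much as a fiber runs until it switches away.

def producer(items):
    """Hand one item at a time to whoever resumes us."""
    for item in items:
        yield item          # suspend here; resume on the next request


def consumer(log):
    """Receive items one at a time, keeping local state between resumes."""
    count = 0
    while True:
        item = yield        # suspend until an item is sent in
        count += 1
        log.append(f"{count}: {item}")


def run(items):
    """Drive the two routines, switching between them explicitly."""
    log = []
    sink = consumer(log)
    next(sink)              # advance the consumer to its first yield
    for item in producer(items):
        sink.send(item)     # switch to the consumer, then back again
    sink.close()            # tell the consumer there is nothing more
    return log


if __name__ == "__main__":
    print(run(["a", "b", "c"]))
```

Note that `consumer` keeps its `count` across switches without any locking, because only one routine is ever running at a time. That is the property that makes this style so cheap compared with preemptive threads.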
What is currently lacking is a suitable Perlish API that would allow application programmers to make use of them. It might be possible to layer a Win32-only API on top of threads, but it would be so much better if, in typical Perlish fashion, Perl provided a threads API that abstracted away the underlying implementation of any particular flavour and was flexible enough to be extended for things like fibers on those platforms that support them.
That would be far better than the current model, which follows one platform's outdated and restrictive model to the letter and attempts to force-fit it onto all other platforms and implementations.