This doesn't sound like a Win32 problem.
Perl isn't thread safe. Perl expects each interpreter and all of its bits to be run by only one thread, and each thread gets its own memory pool. For efficiency, which memory pool to use is determined by what the current thread is. Something is using the wrong pool, or creating Perl data items in one thread and then using them in another.
Clone::clone() is written in XS, and XS very often produces something much, much more fragile than the same thing written in plain Perl. It appears that Clone::clone() has no reason for the extreme measure of using XS other than "speed!".
Storable::dclone() is also written in XS but appears to have taken extra steps to handle Perl threads.
So the speed difference you are seeing (which you didn't bother to describe at all, which leads me to guess "not much") may just be the cost of handling threads correctly.
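If you want a real number instead of my guess, something like the following would measure it. This is only a sketch: the sample structure is made up, so swap in data shaped like what you actually clone, and it assumes Clone is installed from CPAN (Storable and Benchmark ship with Perl).

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use Storable qw(dclone);
use Clone ();    # not a core module; assumed to be installed from CPAN

# Hypothetical sample data; replace with a structure like your real one.
my $data = {
    numbers => [ 1 .. 100 ],
    lookup  => { map { $_ => [ $_, "$_" ] } 1 .. 100 },
};

# Run each deep copy for at least 1 CPU second and print a comparison table.
cmpthese( -1, {
    'Clone::clone'     => sub { my $copy = Clone::clone($data) },
    'Storable::dclone' => sub { my $copy = dclone($data) },
} );
```

Run it with the same perl that Apache2 embeds, since a threaded and a non-threaded build may give different ratios.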
Since using Storable::dclone() works, it would appear that Apache2 knows how to compartmentalize Perl interpreters to just one thread each. So it might also work to tweak how Apache2 is built so that the embedded Perl is told not to care about threads. I wouldn't be too surprised to find that the Apache group hadn't considered using a no-threads Perl in a multi-threaded environment. But doing so might cause Clone::clone() to work and might also make the embedded perls run just a bit faster to boot.
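Before rebuilding anything, it is worth checking how the perl in question was built. A minimal sketch using the core Config module follows; it reports on whichever perl runs it, so to learn about the embedded one you would need to run it inside the server (or against the perl that Apache2 was linked with, which I am assuming you can identify).

```perl
use strict;
use warnings;
use Config;

# useithreads is set when this perl was built with interpreter threads.
my $threaded = $Config{useithreads} ? 1 : 0;

# usemultiplicity is set when several interpreters may coexist in one process.
my $multi = $Config{usemultiplicity} ? 1 : 0;

print "ithreads:     ", ( $threaded ? "yes" : "no" ), "\n";
print "multiplicity: ", ( $multi    ? "yes" : "no" ), "\n";
```

From the command line, `perl -V:useithreads` reports the same setting.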