Ah. This makes it very clear. Thanks (and ++!!)
So, while I don't understand what could cause the behavior that you observe (let's hope someone else looks more closely at this thread and can explain it), there might be some way to write another perl script that automates the recovery procedure you've been doing manually -- but I'm only guessing, because my exposure to process-control details on windows is nil.
But consider... if you make the outputs to the log file "atomic" (add the extra overhead to open/write/close each time you append a message to the log, so some other process has a chance to read it while this test script is running), you might be able to run a separate script that loops on "check the log file; sleep". Make sure the problem script (which is trying to launch the flakey executable) logs when it's about to start a process (including the iteration number or some other distinct id), as well as when the open call returns.
Now, the separate monitor script could figure out when the problem script is hanging; it knows the pids of the jobs that have been opened successfully (they're in the log file), and now it just needs to find a pid in the windows process table that is associated with the flakey executable, but isn't in the log file. The monitor script kills that "outlier" pid (could even append to the log file to report that), and the problem/test script would move on.
As I said, I'm only guessing that something like this would work -- I don't know how you would actually implement it. And of course it doesn't really answer you initial question or solve the real problem. But if it works as intended, you could at least move on towards whatever your "true" objective may be.