I think I found the answer to the "malformed UTF-8" errors. As written it probably only works for those using latin1 encoding, but it can likely be adjusted to whatever encoding you use. Change the line which opens the file from
open (FH,"<", $_);
to
open (FH,"<:encoding(latin1)", $_);
It works for me in stopping the errors. I can't say how the regex search behaves in all its complexity (a really complex regex may need some special handling), but so far, so good. :-)
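For anyone not on latin1, here is a rough sketch of how the encoding could be made a parameter instead of being hard-coded. The helper names open_with_encoding and count_matches are made up for illustration and are not part of the original script; the latin1 default is an assumption that just mirrors the fix above. It needs Perl 5.10 or later for the //= operator.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): open a file for reading
# through an encoding layer so lines are decoded before the regex sees them.
sub open_with_encoding {
    my ($path, $encoding) = @_;
    $encoding //= 'latin1';    # assumed default, same as the fix above
    open(my $fh, "<:encoding($encoding)", $path)
        or die "Cannot open $path: $!";
    return $fh;
}

# Hypothetical helper: count the lines in a file that match a regex.
sub count_matches {
    my ($path, $regex, $encoding) = @_;
    my $fh = open_with_encoding($path, $encoding);
    my $count = 0;
    while (my $line = <$fh>) {
        $count++ if $line =~ $regex;
    }
    close $fh;
    return $count;
}
```

If your files are UTF-8, calling count_matches($file, qr/pattern/, 'UTF-8') should be the equivalent. I have only tried latin1 myself, so treat the rest as untested.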
Here is a complete patched snippet. I added another feature to skip search directories whose names begin with the digit 1. (The code is commented out, but you may activate it.) I store a lot of HTML files for my categories in 1DOCS, and I prefer not to delve into them on recursive searches. Modify to fit your needs.