There's a lot of stuff here and I can't answer it all as I get ready for work so I'll rush through this.
The speed penalty is most at compile time. There's actually not much of a runtime speed penalty if code without this module attempts the same level of validation. If you don't mind the compile time hit, you're probably OK. In a persistent environment (such as mod_perl) you may never notice. Caveat: I haven't benchmarked this, so take what I say with a grain of salt. I was looking for programmer efficiency rather than CPU efficiency.
It should have no impact on caller because internally it uses goto to subvert the call stack. In this respect, it's even better than some hand-rolled code. However, I didn't write tests for this. I should do that. And can you give other introspective examples you'd like to see tested?
As for the code snippet you tested, yes it will handle that, if you use 'strict' mode.
In 'strict' mode, it considers the types of variables. That would actually work in 'loose' mode, but only because each subroutine has a different number of variables. Naturally, that would be more bug-prone. I wonder if I should have made 'strict' the default instead of 'loose'?
I used ref instead of Scalar::Util on the "simplest thing that could possibly work" principle. If you can give me a clear code snippet showing why ref is inferior (or point me to a resource.) I'll happily change it. (Update: Of course, I seem to recall some of the issues you mention. Hmm, I don't think I have a choice but to change it.)
The problem with more than one package per file is a combination of my code and how Filter::Simple works. It's very important that I know my calling package when setting argument lists with the subroutines so I determine this in &import. However, Filter::Simple keeps scanning through the code and cheerfully skips past package declarations, thus meaning I could alter subs in the wrong package. I thought about trying to parse out the the declarations and it didn't seem too hard, but Perl has so many odd corners that I thought there would be a good chance of missing something. As a result, I opted to keep it simple for my initial release.
I don't use prototypes because they're useless with methods and I wanted to limit the differences between using functions and methods. Right now they behave almost identically. What I didn't want was having to constantly respond to the following bug report:
And I didn't use attributes because even though I knew I could get something like signatures working, the real problem I wanted to transparently solve was signature-based multi-method dispatch. I don't know if that's possible with using attributes and a relatively straightforward syntax.