I guess I was confused that Test::Harness couldn't see the correct number of valid tests either, but it must parse some of the end report that Test::More generates, instead of counting the tests itself.
The reason T::H gives an error is that when T::B figures out it has a bad test count, it exits with a non-zero value, which T::H then picks up as suspicious (which is why you got the "Test returned status 1" error from T::H).
This makes it easier for other things like Aegis to figure out automatically which test script borked.
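
If you want to see the exit-status behaviour for yourself, here is a minimal sketch. It assumes a Unix-like shell and a perl with Test::More installed; the helper name run_short_test is made up for the example. It writes a throwaway test script that plans 3 tests but only runs 2, runs it, and reports the exit status that a harness would see. The exact non-zero value and the wording of the diagnostic seem to depend on the Test::Builder version, so the sketch only checks for "non-zero".

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: run a test script whose plan does not match
# the number of tests actually executed, and return its exit status
# together with everything it printed.
sub run_short_test {
    # Throwaway script: plans 3 tests, runs only 2.
    my ($fh, $script) = tempfile(SUFFIX => '.t', UNLINK => 1);
    print {$fh} <<'END_TEST';
use Test::More tests => 3;
ok(1, 'first');
ok(1, 'second');
# The third planned test never runs, so the count is wrong.
END_TEST
    close $fh or die "close failed: $!";

    # Run it with the same perl binary; fold STDERR into the captured
    # output because the diagnostics are printed there.
    my $output = `"$^X" "$script" 2>&1`;
    my $status = $? >> 8;    # high byte of $? is the child's exit value

    return ($status, $output);
}

my ($status, $output) = run_short_test();
print $output;
print "exit status: $status\n";
print $status
    ? "non-zero exit, so a harness would flag this script\n"
    : "zero exit\n";
```

Both ok() calls pass here, so the non-zero status comes purely from the plan mismatch, which is the same signal T::H is reacting to.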