I think the original question confuses tests with contracts. If a user of the class requires a 3-digit format, that requirement must be specified in a contract (types, guards, pre-conditions, etc.), and it doesn't matter whether the user is a subclass or not. If switching to a 4-digit format violates the requirement, that is a contract failure, not a test failure.
If the test suite were copied into the subclass, the subclass would incorrectly report a test failure. Someone may "fix" the superclass format instead of fixing the interface mismatch.
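Here is a minimal sketch of what I mean by putting the requirement in a contract rather than in a copied test. All of the names (`ThreeDigitSource`, `FourDigitSource`, `print_label`, `ContractError`) are made up for illustration; the point is only that the user states its own pre-condition at the interface.

```python
class ContractError(Exception):
    """Raised when a caller's stated requirement is violated."""


class ThreeDigitSource:
    # Hypothetical provider that emits the old 3-digit format.
    def code(self) -> str:
        return "123"


class FourDigitSource(ThreeDigitSource):
    # Hypothetical provider that switched to a 4-digit format.
    def code(self) -> str:
        return "1234"


def print_label(source: ThreeDigitSource) -> str:
    """A user of the class that genuinely needs exactly 3 digits."""
    code = source.code()
    # Pre-condition owned by the user, not by the provider's test suite.
    if not (len(code) == 3 and code.isdigit()):
        raise ContractError(f"expected 3 digits, got {code!r}")
    return f"ID-{code}"
```

With this in place, `print_label(FourDigitSource())` raises `ContractError` at the boundary where the mismatch actually lives, and nobody is tempted to change the provider's tests.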
Ideally, I'd want the subclass to inherit all the superclass tests. That way I can run the superclass tests on an instance of the subclass. If there are any failures, that might mean the inheritance is incorrect. (Is a circle an ellipse?) It might also mean the superclass tests are not polymorphic. Either way there's a bug to fix.
The nasty dilemma is that a system can be correct and unsafe at the same time.
On the subject of tire testing...
Vehicle manufacturer tire testing is actually a very poor analogy. Those tests are more like security tests -- if all of your design assumptions are violated, does the system fail gracefully?
Vehicle crash testing is an extreme example. The software analogy would be to introduce standardized hardware failures and then verify the failures do not cause data loss.
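A toy version of that kind of crash test, assuming a write-to-temp-then-rename save routine. The `fail_before_commit` flag is a made-up injection point standing in for a power cut or disk error; real fault injection would happen below the application.

```python
import os
import tempfile


def save_atomically(path, data, fail_before_commit=False):
    """Write data to a temp file, then rename it over the target."""
    tmp_path = path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    if fail_before_commit:
        # Injected fault: the "hardware" dies before the rename.
        raise OSError("injected hardware failure")
    os.replace(tmp_path, path)


def survives_injected_failure():
    """Return True if the old record is intact after a failed save."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "record.txt")
        save_atomically(path, "old")
        try:
            save_atomically(path, "new", fail_before_commit=True)
        except OSError:
            pass
        with open(path) as f:
            return f.read() == "old"
```

The check is not that the save succeeds. It is that the failure leaves the previous data readable.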
Specification tests (your levels 1 through 5) are really just purchasing formalities. Did we get what we paid for? Programmers worry almost exclusively about this. Is my superclass ripping me off?
The most common tests that manufacturers run are design verification and performance measurement. Unlike most software, mechanical objects are quite unpredictable. Manufacturers run tests to see if what they think should happen really does happen. Programmers never do this kind of testing. Gee, I wonder if 1+1 still works if I put it before a while statement?
Lastly, manufacturing a verified design is not easy either -- tests are required there too. These are similar to specification tests and mostly interesting to accountants. 50% failure of a dirt cheap process might be better than 1% failure of an expensive process.
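The accountant's arithmetic is just cost per good unit. The prices below are hypothetical numbers picked to make the point, not data from anywhere.

```python
def cost_per_good_unit(cost_per_attempt, failure_rate):
    """Expected spend to get one passing unit out of the process."""
    if not 0 <= failure_rate < 1:
        raise ValueError("failure_rate must be in [0, 1)")
    return cost_per_attempt / (1 - failure_rate)


# Hypothetical: $1 per attempt with 50% failures.
cheap = cost_per_good_unit(1.00, 0.50)
# Hypothetical: $5 per attempt with 1% failures.
expensive = cost_per_good_unit(5.00, 0.01)
```

Under those assumed prices the cheap process costs 2.00 per good unit and the expensive one a little over 5.00, so the 50% failure rate wins.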