http://www.perlmonks.org?node_id=295093


in reply to Re: Re: Inheriting Tests and Other Test Design Issues
in thread Inheriting Tests and Other Test Design Issues

I think the original question confuses tests with contracts. A user's requirement for a 3-digit format -- it doesn't matter whether that user is a subclass or not -- must be specified in a contract (types, guards, pre-conditions, etc.). If switching to a 4-digit format violates that requirement, this is not a test failure but a contract failure.

If the test suite were copied into the subclass, the subclass would incorrectly report a test failure. Someone may "fix" the superclass format instead of fixing the interface mismatch.
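A minimal sketch of the distinction, assuming two hypothetical classes (Counter3 and Counter4) and a hypothetical consumer print_label; none of these names come from the original question. The consumer states its 3-digit requirement as a guard, so the mismatch shows up as a contract failure at the interface rather than as a failing superclass test:

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only.
{
    package Counter3;
    sub new  { my ($class, $n) = @_; return bless { n => $n }, $class }
    sub code { my ($self) = @_; return sprintf '%03d', $self->{n} }
}
{
    package Counter4;
    our @ISA = ('Counter3');
    # The subclass switches to a 4-digit format.
    sub code { my ($self) = @_; return sprintf '%04d', $self->{n} }
}

package main;

# The consumer states its requirement as a pre-condition guard
# instead of relying on somebody's test suite to imply it.
sub print_label {
    my ($counter) = @_;
    my $id = $counter->code;
    die "contract violation: expected 3 digits, got '$id'\n"
        unless $id =~ /\A\d{3}\z/;
    return "ID-$id";
}

print print_label(Counter3->new(7)), "\n";    # prints "ID-007"

my $ok = eval { print_label(Counter4->new(7)); 1 };
print "rejected: $@" unless $ok;
# prints "rejected: contract violation: expected 3 digits, got '0007'"
```

Both classes are "correct" by their own tests here; only the guard knows that the consumer cannot take four digits.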

Ideally, I'd want the subclass to inherit all the superclass tests. That way I can run the superclass tests on an instance of the subclass. If there are any failures, that might mean the inheritance is incorrect. (Is a circle an ellipse?) It might also mean the superclass tests are not polymorphic. Either way there's a bug to fix.
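As a rough sketch of what I mean, assuming hypothetical Ellipse and Circle classes and a hypothetical ellipse_failures routine: the superclass checks take a class name, so the same checks can be run against an instance of the subclass.

```perl
use strict;
use warnings;

# Hypothetical classes, for illustration only.
{
    package Ellipse;
    sub new {
        my ($class, %arg) = @_;
        return bless { rx => $arg{rx}, ry => $arg{ry} }, $class;
    }
    sub stretch_x { my ($self, $k) = @_; $self->{rx} *= $k; return $self }
    sub x_radius  { $_[0]{rx} }
    sub y_radius  { $_[0]{ry} }
}
{
    package Circle;
    our @ISA = ('Ellipse');
    # A circle keeps both radii equal, so it stretches both.
    sub stretch_x {
        my ($self, $k) = @_;
        $self->{rx} *= $k;
        $self->{ry} *= $k;
        return $self;
    }
}

package main;

# Superclass checks written polymorphically: they take a class name
# and return the descriptions of any checks that failed.
sub ellipse_failures {
    my ($class) = @_;
    my $shape = $class->new(rx => 2, ry => 2);
    $shape->stretch_x(3);
    my @failures;
    push @failures, 'stretch_x scales the x radius'
        unless $shape->x_radius == 6;
    push @failures, 'stretch_x leaves the y radius alone'
        unless $shape->y_radius == 2;
    return @failures;
}

for my $class ('Ellipse', 'Circle') {
    my @failures = ellipse_failures($class);
    print "$class: ", (@failures ? "FAILED (@failures)" : 'ok'), "\n";
}
# prints:
#   Ellipse: ok
#   Circle: FAILED (stretch_x leaves the y radius alone)
```

The Circle failure does not say which side is wrong. Either the inheritance is a mistake or the second check assumes too much about Ellipse, and that ambiguity is exactly the bug worth finding.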

The nasty dilemma is that a system can be correct and unsafe at the same time.


On the subject of tire testing...

Vehicle manufacturer tire testing is actually a very poor analogy. Those tests are more like security tests -- if all of your design assumptions are violated, does the system fail gracefully?

Vehicle crash testing is an extreme example. The software analogy would be to introduce standardized hardware failures and then verify the failures do not cause data loss.

Specification tests (your levels 1 through 5) are really just purchasing formalities. Did we get what we paid for? Programmers worry almost exclusively about this. Is my superclass ripping me off?

The most common tests that manufacturers run are design verification and performance measurement. Unlike most software, mechanical objects are quite unpredictable. Manufacturers run tests to see if what they think should happen really does happen. Programmers never do this kind of testing. Gee, I wonder if 1+1 still works if I put it before a while statement?

Lastly, manufacturing a verified design is not easy either -- tests are required there too. These are similar to specification tests and mostly interesting to accountants. 50% failure of a dirt cheap process might be better than 1% failure of an expensive process.

Re: Tests (Safety) vs Contracts (Correctness)
by BrowserUk (Patriarch) on Sep 29, 2003 at 21:24 UTC
    Ideally, I'd want the subclass to inherit all the superclass tests.

    This is a source of confusion, in my mind at least, though dws touched on it earlier also.

    There are two "chains of inheritance", for want of a better term, involved in this discussion. We have the superclass & subclass chain, and we have the superclass tests and the possibility of the tests for the subclass 'inheriting' those within the superclass.

    My view, based on my accumulated wisdom -- I use the term loosely -- from exposure to the various different methods I've used, is that using inheritance in the test chain, regardless of whether this is formal, language based inheritance, or cruder mechanisms by which the tests for the superclass would be run as a part of the testing cycle of the subclass, is bad practice.

    My reasoning is, as I cited above:

    1. Duplication.

      If the superclass has many subclasses, re-running the superclass tests for every subclass achieves nothing.

    2. Targeting.

      The testing performed at each of the five levels I cited serves a different purpose. Not only does mixing them cause duplication, it can also compromise the validity of the testing.

      • Unit tests are best written internally to the code they test. Only by having sight of the implementation can one ensure that all the paths are covered.
      • Functional verification specifically should not have sight of the code. It should be testing against the contract/specification, not the specific implementation.

        I strongly disagree that specification testing is only a bean counting exercise. It is integral to the loose-coupling philosophy that allows independence between development teams working on separate subsystems of an overall system. Ensuring and maintaining this loose coupling is the most important step in achieving cost effective and attributable development in large systems, which leads to the third reason.

    3. Binding.

      Whilst tests are, of their very nature, inextricably tightly bound to the code they test, subclasses should be as loosely bound to their superclasses as possible.

      The theoretical reasons for loose binding are very well documented, though the practical manifestations of them are less well defined or recognised.

      By creating an inheritance hierarchy in test suites that mirrors the inheritance hierarchy of the production code, you are creating an indirect tight binding between subclasses and their superclass.

      If the need arises to swap out the current implementation of the superclass for an alternate, and good OO practice has been followed in the construction of the subclasses so that they are loosely bound, then the replacement should require no changes in the subclasses.

      However, if there is an indirect close-coupling between the subclasses' tests and those of the superclass, then replacing the superclass will require replacing those tests, with the knock-on effect of requiring changes to the tests in the otherwise unaffected subclasses.

      Although the replacement superclass may perform the same function as the one it replaced, it is quite likely that it will use a different implementation. Its unit test suite will therefore have to be different and anything inheriting from those tests will also be affected.

      Whilst it would be theoretically possible to maintain loose-coupling in the test hierarchy, doing so would require significant design effort in the test suite. It's all code, and it's all possible, but the tests are there to support the main code. Once it becomes a project all of its own, it saps resources and you end up with the crazy possibility that a replacement superclass (production code) could be rejected on the grounds that its test suite (non-production code) didn't fit with the "Test suite model".

      That would really be a case of the tail wagging the dog.

      And if anyone thinks that this level of stupidity couldn't happen, I have a couple of very long stories to show that it can and does. The biggest problem of software development in large organisations is keeping the focus on the production code and away from ancillary areas.

      Whilst testing, and source code maintenance and backups and documentation and media production and many other areas of the full picture are very important, they must support the development process, not drive or control it.


    On the tyres thing: Most analogies don't stand up to deep scrutiny.

    I'll hold my hand up here and say that I got a long way into arguing with your crash-test scenario -- or rather the purpose of crash testing vehicles -- before throwing it away and "moving on" :)

    abigail used it to make the point that sub-systems aren't islands and there are interactions between them.

    I used it to make the point that he was correct, but that different testing is required at different stages.

    By their very nature, analogies tend to be over-simplified, but discussing the true nature of the system used in the analogy is fruitless. If the analogy helps in making the point being broached, it served its purpose. If it didn't, move on.


    Examine what is said, not who speaks.
    "Efficiency is intelligent laziness." -David Dunham
    "When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
    If I understand your problem, I can solve it! Of course, the same can be said for you.

      I wonder if placing tests outside the main code base is the best approach. Perl encourages this with the default MakeMaker machinery, but whenever I really *need* a test result, a built-in test I can call from the debugger (or insert into production code as a guard) would be better. Most of my tests sprout in the main code and then are transplanted into a test suite. Why do this extra work to make the tests less useful?

      This is what I was thinking when I wrote about inheriting tests. I meant a subclass should inherit tests, not that the external subclass test suite should inherit tests. It would be useful to run superclass tests on instances of the subclass in a "real world" environment.
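      A minimal sketch of a built-in test, assuming a hypothetical Tally class, a hypothetical DoubleTally subclass, and a self_test method name of my own choosing. The test lives in the class, so the subclass inherits it and it can be called on any live instance:

      ```perl
      use strict;
      use warnings;

      # Hypothetical classes, for illustration only.
      {
          package Tally;
          sub new   { my ($class) = @_; return bless { count => 0 }, $class }
          sub add   { my ($self) = @_; $self->{count}++; return $self }
          sub count { $_[0]{count} }

          # Built-in test: subclasses inherit it along with everything else.
          # Returns a list of failure messages; an empty list means all passed.
          sub self_test {
              my ($self) = @_;
              my $class  = ref $self;
              my $fresh  = $class->new;
              my @failures;
              push @failures, 'a new tally should start at 0'
                  unless $fresh->count == 0;
              $fresh->add;
              push @failures, 'add should raise the count by 1'
                  unless $fresh->count == 1;
              return @failures;
          }
      }
      {
          package DoubleTally;
          our @ISA = ('Tally');
          # Overridden behaviour that breaks a superclass expectation.
          sub add { my ($self) = @_; $self->{count} += 2; return $self }
      }

      package main;

      for my $obj (Tally->new, DoubleTally->new) {
          my @failures = $obj->self_test;
          print ref($obj), ': ', (@failures ? join('; ', @failures) : 'ok'), "\n";
      }
      # prints:
      #   Tally: ok
      #   DoubleTally: add should raise the count by 1
      ```

      From the debugger, x $obj->self_test dumps the failure list for whatever object is in hand, and production code can use the same call as a guard, for example by dying if the list is non-empty.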

      Sorry for the confusion. I should have known better as I read the whole thread, including dws' clarification, before replying. You've packed a lot of material into your reply, so I'll try to keep the same organization as you.

      1. Duplication.

        I don't share your optimism that running the superclass tests on subclasses achieves nothing. It would at least catch bless $copy, 'Fubar' errors in the copy constructor... ;)

        Different subclasses can violate the invariants of the parent in different ways. This is similar to smoke testing of perl on different architectures. The majority of smoke testing runs the same code with the same results. It's the differences that are important, but we don't know the differences until after the tests are run.
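        To make the copy constructor case concrete, here is a small sketch. The Fubar::Sub subclass, the clone method and the clone_keeps_class check are hypothetical names for illustration. The superclass check passes on Fubar itself and only fails when it is re-run on the subclass:

        ```perl
        use strict;
        use warnings;

        # Hypothetical classes, for illustration only.
        {
            package Fubar;
            sub new { my ($class, %args) = @_; return bless { %args }, $class }

            # Bug: the class name is hard-coded, so a copy of a subclass
            # instance silently becomes a plain Fubar.
            sub clone {
                my ($self) = @_;
                my $copy = { %$self };
                return bless $copy, 'Fubar';
            }
        }
        {
            package Fubar::Sub;
            our @ISA = ('Fubar');
        }

        package main;

        # Superclass check, parameterised by class name so it can be
        # run unchanged against any subclass.
        sub clone_keeps_class {
            my ($class) = @_;
            my $orig = $class->new(id => 1);
            return ref($orig->clone) eq $class;
        }

        for my $class ('Fubar', 'Fubar::Sub') {
            print "$class: ",
                (clone_keeps_class($class) ? 'ok' : 'FAILED: clone lost its class'),
                "\n";
        }
        # prints:
        #   Fubar: ok
        #   Fubar::Sub: FAILED: clone lost its class
        ```

        The usual fix is bless $copy, ref $self, but nothing in the superclass's own test run would ever prompt anyone to make it.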

      2. Targeting.

        I shouldn't have been so flip. Most software testing is bean counting as you call it. Measuring 100% code coverage in a unit test is the purest form of bean counting. Running a regression test to verify a change is bean counting too. Neither of these is really "testing" anything other than the system's internal consistency. Isn't that exactly what accounting checks?

        Your whitebox and blackbox testing strike me as different schools of accounting, not as having different purposes. This testing seems to be a substitute for formal methods. (Whether formal methods will ever work for software is an entirely different question.)

        Examples of testing with different purpose are security, performance and usability.

      3. Binding.

        It's not clear to me why loosely coupled classes would have tightly coupled test suites, but I don't doubt it happened. I am surprised to hear you don't think external factors often influence implementation. Rejecting new code due to its test suite is more rational than rejecting it due to its politics -- I wish it were more common too...

        Close coupling is a problem. Test suites that are artificially forced to be loosely coupled in an otherwise closely coupled system may be worse.

      You undoubtedly have accumulated wisdom -- probably more than me. I think I've mostly just accumulated disgust with the status quo.


      Yep, the tyres thing. If the analogy was primarily connected with a hierarchy of tests and interacting systems, I'd have no trouble with it. Unfortunately it's easy to look at that analogy and jump to an incorrect understanding. Safety testing and specification testing are not points on the same continuum.

      "This steak is like a turd." That could mean the steak has a lovely chocolate brown color -- but the reader most likely would not come to that conclusion.

        "This steak is like a turd." That could mean the steak has a lovely chocolate brown color -- but the reader most likely would not come to that conclusion.

        ...I keep wiping the tears from my eyes, but they just keep coming:)

