Re: Re: Inheriting Tests and Other Test Design Issues

by BrowserUk (Patriarch)
on Sep 28, 2003 at 01:54 UTC (#294700)

in reply to Re: Inheriting Tests and Other Test Design Issues
in thread Inheriting Tests and Other Test Design Issues

Yes, but shouldn't you have tests checking that your superclass didn't change its specification? Shouldn't that be very important tests?

Yes. You should have a test suite for validating the functionality of your superclass, but not as a part of the unit tests of a subclass!

If you were producing half a dozen subclasses of a superclass, does it make sense to test the superclass functionality in the unit tests of all of them? What if it's a dozen, or two dozen?

The owners/ authors/ suppliers of the superclass should maintain and use a Functional Verification suite for that class. You, as the user/ purchaser of the class should have an acceptance test suite that verifies the functionality.

If logic and good relations prevail, then these may be the same test suite, though that's neither essential nor always possible. Either way, this type of testing should only be run when the superclass is upgraded, not when testing a change to a subclass of it.

Tight coupling between unit tests and the unit is essential, but equally essential is that units (modules/ classes) be loosely coupled, and that includes their testing. A unit test failure should directly indicate a failure in that unit, and not upstream from it. Upstream failures should have been detected before the upstream code is accepted.

No, because your module isn't providing substr or multiplication functionality. It isn't subclassing the functionality of the Perl runtime environment.

Nor is my module implementing the functionality of the superclass -- it's just using it, just as it uses the perl runtime, C runtime, OS system calls -- or passing it through.

Unit tests of my unit's functionality should automatically show up any failures in the superclass, where they affect that functionality.

Any tests aimed solely at verifying the functionality of the superclass are either

  • duplicating testing already performed there,
  • or testing the effectiveness of the functional verification or acceptance tests.

This should not be necessary, and is undesirable, as all it does is lengthen the development/maintenance cycle and ultimately increase costs. Duplicated testing doesn't improve anything. If you don't have faith in the testing of the superclass, put the effort into improving it, not duplicating it.

Test thoroughly, but only test once! Or rather, in one place.

I've got some great (but long) horror stories of corporate testing methods, and how more isn't better if it's more tests of the same thing.

But if you have a class that's subclassing a class that provides number formatting, then your class is providing number formatting.

If the number formatting is used internally by my module, then my unit testing should show up any disparities between the specified and actual returns I get from it without resorting to tests specifically aimed at testing the superclass.

If my module is passing the superclass functionality through without overriding it, then any testing I did of that would either be testing the inheritance mechanisms -- which would be like testing substr -- or it would be duplicating testing that is (or should be) already performed by the superclass's unit tests or by my acceptance testing of the superclass.
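
A minimal sketch of this point, written in Python purely for illustration (the thread concerns Perl; `NumberFormatter`, `PriceFormatter` and `run_subclass_tests` are hypothetical names, not anything from the original discussion). The subclass's unit test exercises only the method the subclass adds, yet it would still fail if the inherited `format` stopped meeting its spec:

```python
import unittest


class NumberFormatter:
    """Hypothetical superclass: formats integers with thousands separators."""

    def format(self, value: int) -> str:
        return f"{value:,}"


class PriceFormatter(NumberFormatter):
    """Hypothetical subclass: adds one method, inherits format() untouched."""

    def price(self, value: int) -> str:
        # Uses the inherited formatter internally.
        return "$" + self.format(value)


class PriceFormatterTest(unittest.TestCase):
    """Unit tests for the subclass cover only what the subclass adds."""

    def test_price_adds_currency_symbol(self):
        # If format() broke its spec, this assertion would fail as well,
        # because price() depends on it. No test here targets format() alone.
        self.assertEqual(PriceFormatter().price(1234567), "$1,234,567")


def run_subclass_tests() -> unittest.TestResult:
    """Run only the subclass's own unit tests and return the result."""
    result = unittest.TestResult()
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PriceFormatterTest)
    suite.run(result)
    return result
```

Tests aimed solely at `NumberFormatter.format` would live in that class's own suite, not in this one.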

That that happens through inheritance isn't something the users need to know (encapsulating of the implementation).

Agreed, but that doesn't enter into the argument. The users of my module (and by proxy the superclass) should have their own FV or Acceptance tests for my module. If we can agree to share that between us, well and good, but if the superclass has a bug or failure to meet spec. that isn't detected by my unit testing or my users' acceptance testing of my module, then either it is irrelevant or those tests are flawed and should be improved.

It's not a case that anything should go untested, it's just that there is no benefit in testing stuff twice. Placing tests in the right place not only minimises the amount of testing done, and the costs involved in doing it, it also means that testing times are shorter, which encourages them to be used more frequently at each particular level, which improves overall throughput and quality.

It's the same if you buy a Ford. You'd expect that Ford checks that the tires stay on the wheels when going 100km/h, and don't assume that they delegate that to Goodyear (or whatever brand they use). Goodyear does its test, but Ford should do as well.

Nice (or perhaps, pertinent) analogy :)

Goodyear should test the construction of their tyres [1] and ensure they live up to their specified rating SR/VR/HR etc. [2]

The wheel manufacturer (Ford or 3rd party) should ensure that their wheels correctly retain standard tyres on the rim under all 'normal' circumstances -- not just the particular tyre chosen as standard equipment on one particular model that uses that wheel. [3]

Ford should

  • Pick (design) a suitable wheel for the vehicle. [4]
  • Pick a suitably rated tyre for their vehicle/ wheel combination, given its performance characteristics, weight, probable modes of use etc. [4]
  • Test all three (plus all other standard equipment) in combination. [5]

During this latter testing, it should not be necessary for Ford to perform lamination tests on the plies, or durability tests on the radial reinforcing, or test the compound for longevity or wet weather grip etc. This should all have been covered by the unit testing and be certified by the manufacturer's rating.

In software terms, the tests described above fit roughly into these categories.

[1] Unit testing.
[2] Functional Verification.
[3] Integration Testing.
[4] Acceptance testing.
[5] Systems testing.

Examine what is said, not who speaks.
"Efficiency is intelligent laziness." -David Dunham
"When I'm working on a problem, I never think about beauty. I think only how to solve the problem. But when I have finished, if the solution is not beautiful, I know it is wrong." -Richard Buckminster Fuller
If I understand your problem, I can solve it! Of course, the same can be said for you.

Tests (Safety) vs Contracts (Correctness)
by blssu (Pilgrim) on Sep 29, 2003 at 19:46 UTC

    I think the original question confuses tests with contracts. A 3-digit format requirement of the user -- it doesn't matter if it's a subclass or not -- must be specified in a contract (types, guards, pre-conditions, etc.). If the requirement is violated by switching to a 4-digit format this is not a test failure, but a contract failure.

    If the test suite were copied into the subclass, the subclass would incorrectly report a test failure. Someone may "fix" the superclass format instead of fixing the interface mismatch.
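
The distinction between a contract and a test can be sketched as follows, in Python for illustration only. All names here (`ContractError`, `format_code`, `require_three_digits`) are hypothetical; the sketch assumes the requirement is a fixed-width digit string, as in the 3-digit versus 4-digit example above. The guard sits at the interface of the code that needs three digits, so a change of format upstream surfaces as a contract violation at the point of use rather than as a failing copy of the superclass's tests:

```python
class ContractError(Exception):
    """Raised when a caller's stated requirement is not met (hypothetical)."""


def format_code(n: int, digits: int) -> str:
    """Stand-in for the superclass formatter; the width is decided upstream."""
    return str(n).zfill(digits)


def require_three_digits(text: str) -> str:
    """Guard placed at the interface by the code that needs 3 digits."""
    if len(text) != 3 or not text.isdigit():
        raise ContractError(f"expected a 3-digit string, got {text!r}")
    return text
```

With a 3-digit formatter the guard passes the value through unchanged; if the formatter switches to 4 digits, the guard raises `ContractError`, pointing at the interface mismatch rather than at the formatter.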

    Ideally, I'd want the subclass to inherit all the superclass tests. That way I can run the superclass tests on an instance of the subclass. If there are any failures, that might mean the inheritance is incorrect. (Is a circle an ellipse?) It might also mean the superclass tests are not polymorphic. Either way there's a bug to fix.

    The nasty dilemma is that a system can be correct and unsafe at the same time.

    On the subject of tire testing...

    Vehicle manufacturer tire testing is actually a very poor analogy. Those tests are more like security tests -- if all of your design assumptions are violated, does the system fail gracefully?

    Vehicle crash testing is an extreme example. The software analogy would be to introduce standardized hardware failures and then verify the failures do not cause data loss.

    Specification tests (your levels 1 through 5) are really just purchasing formalities. Did we get what we paid for? Programmers worry almost exclusively about this. Is my superclass ripping me off?

    The most common tests that manufacturers run are design verification and performance measurement. Unlike most software, mechanical objects are quite unpredictable. Manufacturers run tests to see if what they think should happen really does happen. Programmers never do this kind of testing. Gee, I wonder if 1+1 still works if I put it before a while statement?

    Lastly, manufacturing a verified design is not easy either -- tests are required there too. These are similar to specification tests and mostly interesting to accountants. 50% failure of a dirt cheap process might be better than 1% failure of an expensive process.

      Ideally, I'd want the subclass to inherit all the superclass tests.

      This is a source of confusion, in my mind at least, though dws touched on it earlier also.

      There are two "chains of inheritance", for want of a better term, involved in this discussion. We have the superclass & subclass chain, and we have the superclass tests and the possibility of the tests for the subclass 'inheriting' those of the superclass.

      My view, based on my accumulated wisdom -- I use the term loosely -- from exposure to the various different methods I've used, is that using inheritance in the test chain, regardless of whether this is formal, language based inheritance, or cruder mechanisms by which the tests for the superclass would be run as a part of the testing cycle of the subclass, is bad practice.

      My reasoning is, as I cited above

      1. Duplication.

        If the superclass has many subclasses, re-running the superclass tests for every subclass achieves nothing.

      2. Targeting.

        The testing performed at each of the five levels I cited serves a different purpose. Not only does mixing them cause duplication, it can also compromise the validity of the testing.

        • Unit tests are best written internally to the code they test. Only by having sight of the implementation can one ensure that all the paths are covered.
        • Functional verification specifically should not have sight of the code. It should be testing against the contract/specification, not the specific implementation.

          I strongly disagree that specification testing is only a bean counting exercise. It is integral to the loose-coupling philosophy that allows independence between development teams working on separate subsystems of an overall system. Ensuring and maintaining this loose coupling is the most important step in achieving cost effective and attributable development in large systems, which leads to the third reason.

      3. Binding.

        Whilst tests are, of their very nature, inextricably tightly bound to the code they test, subclasses should be as loosely bound to their superclasses as possible.

        The theoretical reasons for loose binding are very well documented, though the practical manifestations of them are less well defined or recognised.

        By creating an inheritance hierarchy in test suites that mirrors the inheritance hierarchy of the production code, you are creating an indirect tight binding between subclasses and their superclass.

        If the need arises to swap out the current implementation of the superclass for an alternative, then provided good OO practice has been followed in the construction of the subclasses and they are loosely bound, the replacement should require no changes in the subclasses.

        However, if there is an indirect close coupling between the subclasses' tests and those of the superclass, then replacing the superclass will require replacing those tests, with the knock-on effect of requiring changes to the tests in the otherwise unaffected subclasses.

        Although the replacement superclass may perform the same function as the one it replaced, it is quite likely that it will use a different implementation. Its unit test suite will therefore have to be different, and anything inheriting from those tests will also be affected.

        Whilst it would be theoretically possible to maintain loose coupling in the test hierarchy, doing so would require significant design effort in the test suite. It's all code, and it's all possible, but the tests are there to support the main code. Once the test suite becomes a project all of its own, it saps resources and you end up with the crazy possibility that a replacement superclass (production code) could be rejected on the grounds that its test suite (non-production code) didn't fit with the "Test suite model".

        That would really be a case of the tail wagging the dog.

        And if anyone thinks that this level of stupidity couldn't happen, I have a couple of very long stories to show that it can and does. The biggest problem of software development in large organisations is keeping the focus on the production code and away from ancillary areas.

        Whilst testing, and source code maintenance and backups and documentation and media production and many other areas of the full picture are very important, they must support the development process, not drive or control it.

      On the tyres thing: Most analogies don't stand up to deep scrutiny.

      I'll hold my hand up here and say that I got a long way into arguing with your crash-test scenario -- or rather the purpose of crash testing vehicles -- before throwing it away and "moving on" :)

      abigail used it to make the point that sub-systems aren't islands and there are interactions between them.

      I used it to make the point that he was correct, but that different testing is required at different stages.

      By their very nature, analogies tend to be over-simplified, but discussing the true nature of the system used in the analogy is fruitless. If the analogy helps in making the point being broached, it has served its purpose. If it didn't, move on.

        I wonder if placing tests outside the main code base is the best approach. Perl encourages this with the default MakeMaker machinery, but whenever I really *need* a test result, a built-in test I can call from the debugger (or insert into production code as a guard) would be better. Most of my tests sprout in the main code and then are transplanted into a test suite. Why do this extra work to make the tests less useful?

        This is what I was thinking when I wrote about inheriting tests. I meant a subclass should inherit tests, not that the external subclass test suite should inherit tests. It would be useful to run superclass tests on instances of the subclass in a "real world" environment.
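
One way to picture a test that lives in the class itself, and is therefore inherited by subclasses, is sketched below in Python for illustration. `Stack`, `CountingStack` and `self_test` are hypothetical names chosen for the sketch; the check can be called from a debugger or used as a guard in running code, on an instance of either class:

```python
class Stack:
    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def self_test(self) -> bool:
        """Built-in check: a pushed item comes straight back and the
        existing contents are left as they were."""
        before = list(self._items)
        marker = object()
        self.push(marker)
        return self.pop() is marker and self._items == before


class CountingStack(Stack):
    """Subclass that overrides push(); it inherits self_test() unchanged."""

    def __init__(self):
        super().__init__()
        self.pushes = 0

    def push(self, item):
        self.pushes += 1
        super().push(item)
```

Calling `self_test()` on a `CountingStack` exercises the overridden `push` in place. Note that in this sketch the check itself bumps the `pushes` counter, which is the kind of side effect a built-in test has to account for.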

        Sorry for the confusion. I should have known better as I read the whole thread, including dws' clarification, before replying. You've packed a lot of material into your reply, so I'll try to keep the same organization as you.

        1. Duplication.

          I don't understand your optimism for achieving nothing by running the superclass tests on subclasses. It would at least catch bless $copy, 'Fubar' errors in the copy constructor... ;)

          Different subclasses can violate the invariants of the parent in different ways. This is similar to smoke testing of perl on different architectures. The majority of smoke testing runs the same code with the same results. It's the differences that are important, but we don't know the differences until after the tests are run.

        2. Targeting.

          I shouldn't have been so flip. Most software testing is bean counting as you call it. Measuring 100% code coverage in a unit test is the purest form of bean counting. Running a regression test to verify a change is bean counting too. Neither of these are really "testing" anything other than the system's internal consistency. Isn't that exactly what accounting checks?

          Your whitebox vs blackbox distinction strikes me as two schools of accounting, not as testing with different purposes. This testing seems to be a substitute for formal methods. (Whether formal methods will ever work for software is an entirely different question.)

          Examples of testing with different purpose are security, performance and usability.

        3. Binding.

          It's not clear to me why loosely coupled classes would have tightly coupled test suites, but I don't doubt it happened. I am surprised to hear you don't think external factors often influence implementation. Rejecting new code due to its test suite is more rational than rejecting it due to its politics -- I wish it were more common too...

          Close coupling is a problem. Test suites that are artificially forced to be loosely coupled in an otherwise closely coupled system may be worse.
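
The bless $copy, 'Fubar' error mentioned under Duplication can be sketched in Python, for illustration only, as a copy method that hard-codes its class. `Fubar.copy`, `SubFubar` and `check_copy_preserves_class` are hypothetical names for the sketch. The check passes on the superclass and only fails when it is run against a subclass instance:

```python
class Fubar:
    def __init__(self, value):
        self.value = value

    def copy(self):
        # Bug: the class is hard-coded, the analogue of
        # `bless $copy, 'Fubar'` in a Perl copy constructor.
        return Fubar(self.value)


class SubFubar(Fubar):
    pass


def check_copy_preserves_class(obj) -> bool:
    """A superclass-level check that can be run on any instance."""
    return type(obj.copy()) is type(obj)
```

The check returns `True` for `Fubar(1)` and `False` for `SubFubar(1)`. In this sketch the fix would be to build the copy with `type(self)(self.value)`, which corresponds to blessing into `ref $self` in Perl.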

        You undoubtedly have accumulated wisdom -- probably more than me. I think I've mostly just accumulated disgust with the status quo.

        Yep, the tyres thing. If the analogy was primarily connected with a hierarchy of tests and interacting systems, I'd have no trouble with it. Unfortunately it's easy to look at that analogy and jump to an incorrect understanding. Safety testing and specification testing are not points on the same continuum.

        "This steak is like a turd." That could mean the steak has a lovely chocolate brown color -- but the reader most likely would not come to that conclusion.
