http://www.perlmonks.org?node_id=517662


in reply to Re^6: Self Testing Modules
in thread Self Testing Modules

We rightly reject repetitious, c&p code in applications and modules in favour of once and once only. Why tolerate it in test code?

I'm in favor of reducing duplication in test code. It's sort of why Schwern and I built Test::Builder.

Remember though, copy and paste code is only a problem if and when it becomes a maintenance burden. If it is, someone should refactor it -- and perhaps create a new test library either for the project or, if possible, for the CPAN in general.

If it's not getting in the way though, what's the problem?

Re^8: Self Testing Modules
by BrowserUk (Patriarch) on Dec 19, 2005 at 15:12 UTC

    This isn't really a response to the above post, but more a follow-up to your previous questions about "my problem" with the Test::* modules, slightly prompted by your statement above:

    I'm in favor of reducing duplication in test code.

    I think a good part of my problem is that I don't see what I would consider proper segregation between the types (or purposes) of tests within typical test suites that utilise the Test::* framework.

    A few examples, which probably won't be well organised. I may borrow specific examples from particular modules, but that's not a critique of those modules or their authors, just an attempt to ground the examples with some reality.

    To start with, let's imagine we are writing an OO module.

    • Starting with the tests that get shipped with modules and run pre and post installation.

      What should these do?

      I'll assert that these should be minimal. And that most modules test too much and the wrong things at this time.

      Taking these two test types in reverse order

      1. Post-installation tests.

        My assertion is that they should do just three things.

        1. Check that the installation has succeeded:

          The module can be found and loaded along with all its compile-time dependencies.

          To this extent, just using the module in the normal way would, IMO, be better than use_ok(). The standard error texts report a lot of very useful information that is simply discarded by the institutional wrappers.

        2. Check that an instance of the object can be instantiated given known good instantiation parameters.

          The purpose is to test the installation--not unit or integration test the module. See later.

        3. Check that the instance is correctly destroyed.

          Any destructor code gets called correctly when an instance goes out of scope with no remaining references.

          I'm not sure that this really deserves to be here, or how one could check it in Perl, but it is typical to test this at this time.

        And that should be it. (A minimal sketch of such a test appears below, after this list.)

      2. Pre-installation tests.

        The only things that should be run at this point are what I would class as simple environmental checks. These might include such things as:

        • The version of Perl available.
        • The availability and versioning of dependent modules.
        • The type and version of the OS if this is important.

      What should these not do?

      I assert, pretty much anything else. That is, they should not test

      1. the module's parameter handling.
      2. nor its error/exception handling.
      3. nor its functionality.
      4. nor any of the above for its dependent modules.
      5. nor any of the above for Perl's built-ins.

      Items 1 through 3 are all tests that should be run and passed prior to the module being shipped--by the author. It doesn't make sense to re-run these tests at installation time, unless we are checking that electronic pixies haven't come along and changed the code since it was shipped.

      That's slightly complicated by the fact that authors will rarely be in a position to validate their modules in all the environments in which they will run, but Perl's own build-time test suite should (and does) test most factors that vary across different OSs and compilers. If the Perl installation has passed its test suite, then there is little purpose in re-checking these things for every module that is installed.

      For example, if a module has a dependency upon LWP to fetch URLs from the Internet, there is no reason to run tests that exercise LWP on every installation. LWP is core, and has its own test suite. Modules that use LWP should only need to ensure that it is available, accessible and of a sufficiently high version for their needs. Everything beyond that is duplication. The module's system test should go out to a live Internet site and fetch data in order to do an end-to-end test, but that should only need to be run by the authors prior to shipping.

      I don't have a problem with shipping the full test suite with the module. In the event of failures that cannot be explained through other means, then having the user run the test suite in their environment and feed the results back to the author makes good sense. But running the test suite on every installation doesn't.
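
      To make the post-installation case concrete, here is a minimal sketch of the kind of test I have in mind (as flagged in the list above). My::Module, its new() parameters, and the instance_count() helper are hypothetical stand-ins:

          use strict;
          use warnings;
          use Test::More tests => 2;

          # 1. Plain use, not use_ok(): a load failure prints Perl's normal,
          #    detailed "Can't locate ..." diagnostics.
          use My::Module;

          # 2. Instantiation from known-good parameters.
          my $obj = My::Module->new( foo => 1 );
          ok( defined $obj, 'instance created from known-good parameters' );

          # 3. Destruction: let an instance fall out of scope and check that
          #    the destructor ran (checked here via a hypothetical counter).
          {
              my $tmp = My::Module->new( foo => 1 );
          }
          is( My::Module->instance_count, 1, 'destructor ran when instance left scope' );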

    Different test phases each have a different target audience

    And that brings me back to my main problem with the Test::* framework. Its main function appears to be to capture, collate and present, in potted form, the results of running the entire test suite; but who is this aimed at?

    As a module user, I am only concerned with "Did it install correctly?".

    As a module developer, I am only concerned with "Did my last change work?", "Did it break anything else?", if no to either, "Where did the failure occur?"--and preferably "Take me there!".

    And to my mind, the summary statistics and other output presented by Test::Harness serve neither target audience. They are 'more than I needed to know' as a module user, and 'not what I need to know' as a developer.

    About the only people they serve directly are groups like the Phalanx project (please stop using the cutesy 'Kwalitee'--it sounds like a '70s marketing slogan! Right up there in the 'grates like fingernails on a blackboard' stakes with "Kwik-e-mart", "Kwik print" and "Magick" of all forms), and possibly corporate QA/third-party code approval departments.

    I have misgivings about the need to put my unit tests in a separate file in order to use the Test::* modules. It creates couplings where none existed. Codependent development across files, below the level of a specified and published interface, is bad. It is clumsy to code and use. It means that errors found are reported in terms of the file in which they are detected, rather than the file--and line--at which they occurred. That makes tracking the failure back to the point of origin (at least) a two-stage affair. In-line assertions, which can be enabled or disabled via a switch on the command line or in the environment, just serve the developer so much better at the unit test level.
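
    As a rough sketch of the kind of switchable in-line assertion I mean (the MY_MODULE_DEBUG environment variable and My::Module are invented for the example):

        package My::Module;
        use strict;
        use warnings;
        use Carp qw( confess );

        # Assertions are gated on an environment switch; when the constant is
        # false, Perl optimises the guarded checks away at compile time.
        use constant DEBUG => $ENV{MY_MODULE_DEBUG} ? 1 : 0;

        sub set_uri {
            my ( $self, $uri ) = @_;

            if ( DEBUG ) {
                # Reported at this file and line, with a full backtrace,
                # rather than in terms of a separate test file.
                confess 'set_uri() expects a plain scalar, got ' . ref $uri
                    if ref $uri;
            }

            $self->{uri} = $uri;
            return $self;
        }

        1;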

    And finally, I have misgivings about having to reduce all my tests to boolean, is_ok()/not_ok() calls. The only purpose this seems to serve is the accumulation of statistics which I see little benefit in anyway. I realise that extra parameters are available to log a textual indication of the cause of the failure, but again: Who do these serve?

    Not the user, they don't care why it failed, only that it did.

    And not the developer. Having to translate stuff like:

    • Did all the bytes get saved?
    • URI should be a string, not an object.
    • URI should be string, not an object
    • URI shouldn't be an object
    • URI shouldn't be an object
    • URI should be a plain scalar, not an object
    • URI shouldn't be an object
    • Find one form, please
    • Should have five elements

    back to the file and line number of the failing code; what the inputs were that caused them; why they failed; and what to do about it; just does not seem logical to me. Most of those should just be asserts that live in the modules, stay permanently enabled, and report the file & line number (and preferably the parameters of the assertion, in terms of both the variable names and their contents) when they occur. Test::Harness should then capture that information and present either "An assertion failure occurred" when in 'user mode', or the assertion text including all the information if run in 'developer mode'. That is,

    Assertion failed: Module.pm(257): "$var(My::Module=HASH(0xdeadbeef)) != 'SCALAR'"

    would be a lot more useful than

    t\test03: test 1 of 7 failed: "URI should be a plain scalar, not an object"

    I'm grateful to eyepopslikeamosquito (who started this sub-thread), for mentioning the Damian's module Smart::Comments. From my quick appraisal so far, I think that it is likely to become a permanent addition to my Perl file template. It appears to be similar in operation to another module that I have mentioned, Devel::StealthDebug, but is possibly a better implementation. It seems to me that it would make an ideal way of incorporating unit tests into code in a way that is transparent during production runs, but easily enabled for testing. With the addition of a Test::Harness-compatible glue module that simply enables the tests, captures and analyses the output, and converts it to boolean is_ok/not_ok checks, it might be close to what I've been looking for.
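
    From my reading of the Smart::Comments documentation so far, usage along these lines is roughly what I have in mind. The -ENV option (which, as I understand it, activates the smart comments only when $ENV{Smart_Comments} is true) is what keeps them transparent in production; the save_bytes() routine is entirely invented:

        use strict;
        use warnings;
        use Smart::Comments -ENV;    # active only when $ENV{Smart_Comments} is set

        sub save_bytes {
            my ( $uri, $bytes ) = @_;

            # When enabled, this dies if the condition is false, reporting the
            # expression and the value of $uri; otherwise it is just a comment.
            ### require: !ref $uri

            # When enabled, this dumps the variable, labelled with its name.
            ### $bytes

            return length $bytes;
        }

        save_bytes( 'http://www.example.com/', 'some data' );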


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      just using the module in the normal way would, IMO, be better than use_ok(). The standard error texts report a lot of very useful information that is simply discarded by the institutional wrappers.

      Possibly this has changed at some point, but to the best of my recollection use_ok has always reported the same error messages that a plain use would supply as part of its test diagnostics.
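
      For example, the usual BEGIN-block idiom (My::Module standing in for a real module) surfaces the underlying "Can't locate ..." text in its diagnostics, and can bail out early when nothing else could possibly pass:

          use strict;
          use warnings;
          use Test::More tests => 1;

          BEGIN {
              # On failure, use_ok's diagnostics include the original error
              # text that a plain use would have produced.
              use_ok( 'My::Module' )
                  or BAIL_OUT( 'My::Module failed to load; nothing else can pass' );
          }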

      I don't have a problem with shipping the full test suite with the module. In the event of failures that cannot be explained through other means, then having the user run the test suite in their environment and feed the results back to the author makes good sense. But running the test suite on every installation doesn't.

      Here my experiences differ from yours. I have lost count of the number of times that modules' test suites have saved my bacon by failing due to some bizarre platform/version/infrastructure issue.

      I've found running the tests post-install much less useful because I have to spend time chasing down the dependencies, running their tests, etc. Pre-installation tests are one of the things that make modular distributions work for me rather than against me. I can't imagine not doing it.

      And to my mind, the summary statistics and other output presented by Test::Harness serve neither target audience. They are 'more than I needed to know' as a module user, and 'not what I need to know' as a developer.

      While I see what you mean, I don't find the situation quite as dire as you seem to.

      As a module user I personally find the test harness output just about right. Enough info to let me know stuff is happening while it runs. Summary info that gives me a pointer to where things are broken if stuff breaks.

      As a module author there are a few things that I wish were slightly easier. Writing a patch for prove to run test scripts in most-recently-modified order has been on my list for ages, along with more flexible ways of running and reporting on Test::Class-based test suites.

      That said, none of these itches have been irritating enough for me to scratch. The current setup hits some kind of 80/20 sweet spot for me most of the time. When it doesn't, I can easily roll a custom test runner that gives me what I need.

      I have misgivings about the need to put my unit tests in a separate file in order to use the Test::* modules.

      Then don't :-) Stick 'em in modules, subroutines, closures, etc. Whatever allows you to write your tests in a useful way.

      It creates couplings where none existed.

      A bad test suite that's evolved over time, hasn't been maintained, hasn't had commonalities factored out, with functionality spread over dozens of files can certainly be a royal pain.

      Alternatively, when done well, separating out different aspects of a module into different scripts/classes can help make the behaviour of classes easier to see and maintain. If all of FooModule's logging behaviour is tested in logging.t then I've a much easier time of it when it comes to test/tweak/maintain/debug the logging behaviour.
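
      A trivial sketch of what I mean, with FooModule and its log_to()/last_log() methods invented for the example, keeping every logging check in one obvious place:

          # t/logging.t -- all of FooModule's logging behaviour in one script
          use strict;
          use warnings;
          use Test::More tests => 2;

          use FooModule;

          my $foo = FooModule->new;

          $foo->log_to( 'memory' );
          $foo->do_something;
          like( $foo->last_log, qr/do_something/,
              'actions are recorded when logging to memory' );

          $foo->log_to( 'none' );
          $foo->do_something;
          is( $foo->last_log, undef,
              'nothing is recorded when logging is disabled' );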

      Codependent development across files, below the level of a specified and published interface, is bad. It is clumsy to code and use.

      I certainly don't find it so. Quite the opposite.

      It means that errors found are reported in terms of the file in which they are detected, rather than the file--and line--at which they occurred. That makes tracking the failure back to the point of origin (at least) a two-stage affair. In-line assertions, which can be enabled or disabled via a switch on the command line or in the environment, just serve the developer so much better at the unit test level.

      Contrariwise...

      Inline assertions mean that errors found are reported in terms of the file and line where they occurred, rather than in terms of the context that caused the error to occur. That makes tracking the failure back to the point of origin (at least) a two-stage affair. Automated tests that allow continual, repeatable regression tests just serve the developer so much better at the unit test level.

      Actually I don't believe the "so much better" bit :-)

      Inline assertions are a useful tool, and I've found design by contract to be an effective way of producing software. However, in my experience, they perform complementary tasks to tests.

      Personally I find tests a more generally effective tool since I've found it easier to incrementally drive development via tests than I have via assertions/contracts.

      And finally, I have misgivings about having to reduce all my tests to boolean, is_ok()/not_ok() calls.

      I'm sorry - I'm not understanding your point here. Whether you have an inline assertion/contract or an external test, you are still describing a bit of behaviour as a Boolean ok/not_ok, aren't you?

      The only purpose this seems to serve is the accumulation of statistics which I see little benefit in anyway.

      The benefit is not the accumulation of statistics. It's about checking that my new bit of code does what I want it to do, or that bug #126 is fixed, or that the new version of Log::Log4perl doesn't break the existing installation, etc. For me testing is all about answering questions, not meaningless stats.

      Not the user, they don't care why it failed, only that it did.

      Depends on the user. I'm usually very interested in why it fails because I need to get the damn thing to work - even when I didn't develop it :-)

      And not the developer.

      ...

      That is

      Assertion failed: Module.pm(257): "$var(My::Module=HASH(0xdeadbeef)) != 'SCALAR'"

      would be a lot more useful than

      t\test03: test 1 of 7 failed: "URI should be a plain scalar, not an object"

      It depends on what information you're after. Most of the time I am far more interested in what caused the error than where it occurred. If I know and can reproduce the cause then I can easily find out where an error occurred. I find doing the opposite considerably harder.

      I'm grateful to eyepopslikeamosquito (who started this sub-thread), for mentioning the Damian's module Smart::Comments.

      If you like this I suspect you'll really like design by contract. If you've not played with them already take a look at Class::Contract and Class::Agreement, and give Meyer's "Object Oriented Software Construction" a read.

        If you like this I suspect you'll really like design by contract. If you've not played with them already take a look at Class::Contract and Class::Agreement, and give Meyer's "Object Oriented Software Construction" a read.

        Yes. I am a fan of DbyC.

        I first read OOSC shortly after it came out ('90 or '91?). Later it was the course reference material for a friend who I mentored. The course language was Eiffel/S, probably the best teaching language I ever encountered.

        I will definitely take a look at those two modules. Thanks for the links.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

      [Assertion failed] would be a lot more useful than [Test Failed]

      I think this comment indicates that you don't have the same expectations of a test framework as I do. When I use a test framework I'm testing behaviour, such as the output of a function under certain circumstances. I'm not concerned with implementation; I'm concerned with output and side effects. It really doesn't matter to me if Module.pm(257) expected to get a SCALAR ref and got something different; what matters to me is how that affected the output, or whether it prevented something from happening that should have happened.

      As an example, in DDS I have assertions in various places, as well as "insane case" handlers on the important branch points. However, knowing that one of these assertions failed doesn't help me nearly as much as knowing what input set led to the assertion failing. Because of the structure of DDS I can end up executing those assertions on almost any input, so knowing that the assertion failed simply tells me that "for some unknown input something doesn't do the right thing". That doesn't help me nearly as much as knowing that "for a stated input something doesn't do the right thing". So when a test reports a failure I can easily identify the input code responsible, and then work towards the logic failure that caused it. Simply knowing that a logic failure occurred doesn't help me.

      When I download a module and it fails its tests, the first thing I do is open up the test file to see what was happening; then I look at the relevant code in the module, and usually I can find a fix right away, with the definition of "fix" being that I can patch it so it passes its tests. I then continue on my way, satisfied that I know what's going on. On the other hand, when I install a module and it fails, I have to take considerable time to figure out why, because I don't know whether it's me doing something wrong or the module doing something wrong; furthermore, since I don't know exactly what it should be doing, it's much harder to fix.

      So I guess, to me, in simple terms, regression tests are about why the code fails, not so much what code failed. Or, in other words, it's about identifying initial states that lead to failure, not about the exact internal cause of it.
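
      To sketch what I mean (flatten() here is just an invented stand-in, not DDS itself), a data-driven test names every input, so a failure report points straight at the initial state that triggered it:

          use strict;
          use warnings;
          use Test::More;

          # A hypothetical function under test: flattens a value to a string.
          sub flatten {
              my ( $thing ) = @_;
              return ref $thing eq 'ARRAY' ? join( ',', @$thing ) : $thing;
          }

          # Each case is labelled with its input.
          my @cases = (
              [ 'plain scalar',    'abc',       'abc'   ],
              [ 'empty string',    '',          ''      ],
              [ 'array reference', [ 1, 2, 3 ], '1,2,3' ],
          );

          plan tests => scalar @cases;

          for my $case ( @cases ) {
              my ( $name, $input, $expected ) = @$case;
              is( flatten( $input ), $expected, "flatten: $name" );
          }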

      ---
      $world=~s/war/peace/g

        Assertion failed would be a lot more useful than Test Failed

        If I were suggesting reducing the error text to a bare "Assertion failed", I might agree, but I am not.

        Any assert mechanism worth its salt is going to give you a full Carp::confess-style traceback, so you not only get which line in the module failed, but also get

        1. which line (not just which test number),
        2. in which test file, and
        3. an all-points bulletin of the execution path between the two.

        Which makes the need to have, and laboriously maintain, test numbers within your testcases redundant. This is all the standard stuff you expect to see from an Assert mechanism in any language.

        However, this being Perl, a dynamic language with full introspection right down to the names of the variables involved in the assertion, and even the text of the source code if wanted, the assert mechanism can provide even more useful information. It can also save the testcase writer from having to come up with textual descriptions for the tests, descriptions that result in the wild variations for the same situations I posted above.
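
        Something along these lines is all it takes. This is a hand-rolled sketch, not any particular CPAN module; here the variable's name and value have to be passed in explicitly, though a smarter mechanism could introspect them:

            package My::Assert;
            use strict;
            use warnings;
            use Carp qw( confess );
            use Exporter qw( import );
            our @EXPORT_OK = qw( assert );

            # confess() supplies the full backtrace: the module file and line,
            # the test file and line that called in, and the path between.
            sub assert {
                my ( $ok, $name, $value ) = @_;
                return 1 if $ok;
                confess sprintf 'Assertion failed: %s = %s',
                    $name, defined $value ? "'$value'" : 'undef';
            }

            1;

            # Elsewhere, e.g. in Module.pm:
            #     assert( !ref $uri, '$uri', $uri );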

        Your point about regression tests is well taken, but it also reinforces my point about different tests having different audiences. I mentioned 'developer mode' and 'user mode' controlling the volume and type of information that the test harness displays. There is no reason not to have an intermediate 'regression mode' as well. In reality, these are all just levels of information, and provided the full information is available, suppressing some levels of it for different audiences is trivial--but it has to be there in the first place. The problem with the current toolset is that this information is either never available in the first place or is silently dropped, unasked.

        The way you describe your interests, when you get a failure from a module that you are using, you want to see developer-type information. And that's okay: if the test harness has that ability, you can simply turn it on and get it. Any 'just-a-user' user need never see it if they do not have your 'developer as user' interest.


        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.
      I don't have a problem with shipping the full test suite with the module. In the event of failures that cannot be explained through other means, then having the user run the test suite in their environment and feed the results back to the author makes good sense. But running the test suite on every installation doesn't.

      Shipping a test suite that I, as a user of the module, can inspect and run to verify that things are just peachy-keen with all the scary internals of some CPAN module makes me very happy.

      In most cases, I don't have the time or desire to read every line of a module, but I do have enough time to skim the tests and make sure that there's test coverage for the main functionality I need. Or, if not, that I can easily add that coverage before writing code that depends on that functionality.

      Without the full test suite, that's much harder. And, should I find it necessary to make local changes to the module, a full test suite makes it much, much easier to be confident that a) I didn't break something, and b) if it looks like the change might be generally useful, it could be submitted back to the maintainer without causing him/her all sorts of grief trying to debug it.