
Re^5: Self Testing Modules

by eyepopslikeamosquito (Bishop)
on Dec 19, 2005 at 01:15 UTC ( #517648=note )

in reply to Re^4: Self Testing Modules
in thread Self Testing Modules

From Perl Testing: A Developer's Notebook by chromatic and Ian Langworth, page 40:

There's no technical reason to keep all of the tests for a particular program or module in a single file, so create as many test files as you need, organizing them by features, bugs, modules, or any other criteria.

In my experience, this is sound advice. Dividing a large number of tests into smaller, cohesive units is essentially just divide and conquer, the only fundamental technique we have to fight complexity. It also helps ensure that each test runs in isolation, independent of the others. To give a specific example, notice that WWW::Mechanize, written by Phalanx leader petdance, contains 3 lib .pm files and 49 .t files.

Apart from all that, a single large .t file makes developing new tests inconvenient because, after adding a new test to the .t file, you must run all the other tests as well (intolerable if the single .t file contains stress tests taking hours to run). Or are you recommending that we don't use the standard Test::Harness/Test::More framework?

Re^6: Self Testing Modules
by BrowserUk (Pope) on Dec 19, 2005 at 02:29 UTC
    Or are you recommending that we don't use the standard Test::Harness/Test::More framework?

    I'm not recommending anything, which may be reason enough of itself to ignore what I am saying.

    Which is, that a test framework that forces greater complexity in the unit tests than is required by the code under test has to be suspect. Test code is still code and should be subject to the same rigours as any other code. We rightly reject repetitious, c&p code in applications and modules in favour of once and once only. Why tolerate it in test code?

    I'll use WWW::Mechanize as an example because you brought it up, not because it is a bad one. In fact, I suspect that it is a rather good example of its type. (*)

    From 1700 non-blank lines spread across those 49 files, a simple de-duping reduces them to 888. That means 812 lines are duplicated! Taking a few simple measures to standardise non-differentiating aspects like whitespace and punctuation further reduces that tally to ~800, giving about 16 lines per file.

    More importantly, over 50% of the lines are duplicates. That's without removing comments or variations in variable names being used for the same purpose, or variations in text constants used to report the same pass/fail criteria etc.

    Over 50% of the code in the test suite is repetitious, mostly construction code that would only be required once if the tests were in one file.

    At ~800 lines of actual test code, testing 954 non-blank, non-pod, non-comment lines in the 3 modules, the ratio of tests to code seems about right. But 1700 lines to do the same job seems kind of wasteful. We wouldn't tolerate that amount of duplication in our application code.

    (*) Please don't take these figures too literally. They were derived using mechanical means without verification of the total accuracy of the processing used. I may have made mistakes. They are just to serve as an indication of the source and direction of my concerns, not an accurate assessment of this particular module and its test suite.
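
For anyone wanting to repeat the experiment, a mechanical count of this kind could look roughly like the sketch below. It is a guess at the method, not the script that produced the figures above; the sub name count_lines, the whitespace-only normalisation, and the script name in the usage comment are all assumptions.

```perl
use strict;
use warnings;

# Count non-blank lines and distinct non-blank lines across a set of files.
# Normalisation (an assumption): trim leading/trailing whitespace and
# collapse internal runs of whitespace to a single space.
sub count_lines {
    my @files = @_;
    my $total = 0;
    my %seen;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        while ( my $line = <$fh> ) {
            next unless $line =~ /\S/;    # skip blank lines
            $line =~ s/^\s+|\s+$//g;      # trim both ends
            $line =~ s/\s+/ /g;           # collapse inner whitespace
            $total++;
            $seen{$line}++;
        }
        close $fh;
    }
    my $unique = keys %seen;
    return ( $total, $unique );
}

# Usage (hypothetical script name): perl count_dups.pl t/*.t
if (@ARGV) {
    my ( $total, $unique ) = count_lines(@ARGV);
    printf "%d non-blank lines, %d unique, %d duplicated (%.0f%%)\n",
        $total, $unique, $total - $unique,
        $total ? 100 * ( $total - $unique ) / $total : 0;
}
```

Note that this counts a line as duplicated whenever the same text appears anywhere else, so it says nothing about whether the repetition is harmful.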

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    Lingua non convalesco, consenesco et abolesco. -- Rule 1 has a caveat! -- Who broke the cabal?
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      We rightly reject repetitious, c&p code in applications and modules in favour of once and once only. Why tolerate it in test code?

      I'm in favor of reducing duplication in test code. It's sort of why Schwern and I built Test::Builder.

      Remember though, copy and paste code is only a problem if and when it becomes a maintenance burden. If it is, someone should refactor it -- and perhaps create a new test library either for the project or, if possible, for the CPAN in general.

      If it's not getting in the way though, what's the problem?
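
As an illustration of the refactoring route, a repeated check can be pulled into a small test library built on Test::Builder. This is only a sketch: the package name My::Test::Scalar and the function is_plain_scalar are invented for the example, and in a real distribution the package would live in its own file rather than inline.

```perl
use strict;
use warnings;

# Hypothetical project-local test library; normally this would be a
# separate file such as t/lib/My/Test/Scalar.pm.
package My::Test::Scalar;
use Test::Builder;

my $Test = Test::Builder->new;    # shared singleton, so test counts stay in sync

# Pass if $value is defined and is not a reference of any kind.
sub is_plain_scalar {
    my ( $value, $name ) = @_;
    $name = 'value is a plain scalar' unless defined $name;
    # Report failures at the caller's line, not inside this library
    local $Test::Builder::Level = $Test::Builder::Level + 1;
    my $ok = $Test->ok( defined $value && !ref $value, $name );
    $Test->diag( 'got: ' . ( defined $value ? "$value" : 'undef' ) ) unless $ok;
    return $ok;
}

package main;
use Test::More tests => 2;

My::Test::Scalar::is_plain_scalar( 'http://example.com/', 'URI is a string' );
My::Test::Scalar::is_plain_scalar( 42, 'number is a plain scalar' );
```

Because Test::Builder->new returns the same object that Test::More uses, checks from the library and ordinary ok()/is() calls share one plan and one set of test numbers.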

        This isn't really a response to the above post, but more follow up to your previous questions about "my problem" with the Test::* modules, slightly prompted by your statement above:

        I'm in favor of reducing duplication in test code.

        I think a good part of my problem is that I don't see what I would consider proper segregation between the types (or purposes) of tests within typical test suites that utilise the Test::* framework.

        A few examples, which probably won't be well organised. I may borrow specific examples from particular modules, but that's not a critique of those modules or their authors, just an attempt to ground the examples with some reality.

        To start with, let's imagine we are writing an OO module.

        • Starting with the tests that get shipped with modules and run pre and post installation.

          What should these do?

          I'll assert that these should be minimal. And that most modules test too much and the wrong things at this time.

          Taking these two test types in reverse order:

          1. Post-installation tests.

            My assertion is that it should do just three things.

            1. Check that the installation has succeeded:

              The module can be found and loaded along with all its compile-time dependencies.

              To this extent, just use-ing the module in the normal way would (IMO) be better than use_ok(). The standard error texts report a lot of very useful information that is simply discarded by the institutional wrappers.

            2. Check that an instance of the object can be instantiated given known good instantiation parameters.

              The purpose is to test the installation--not unit or integration test the module. See later.

            3. Check that the instance is correctly destroyed.

              Any destructor code gets called correctly when an instance goes out of scope with no remaining references.

              I'm not sure that this really deserves to be here, or how one could check it in Perl, but it is typical to test this at this time.

            And that should be it.
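
The three checks above might be sketched like this. My::Widget is a made-up stand-in defined inline so the example runs on its own, and the $destroyed counter is an assumption: it only works if the class exposes some observable effect of destruction, which many real modules do not.

```perl
use strict;
use warnings;

# Stand-in for the installed module so the sketch is self-contained.
# A real post-installation test would simply say:  use My::Widget;
# and let Perl's own error message report any load failure.
package My::Widget;
our $destroyed = 0;
sub new     { my ( $class, %args ) = @_; return bless {%args}, $class }
sub DESTROY { $destroyed++ }

package main;
use Test::More tests => 2;

# Check 2: an instance can be created from known-good parameters.
{
    my $obj = My::Widget->new( name => 'example' );
    isa_ok( $obj, 'My::Widget' );
}    # $obj goes out of scope here with no remaining references

# Check 3: the destructor ran when the last reference went away.
is( $My::Widget::destroyed, 1, 'DESTROY called once' );
```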

          2. Pre-installation tests.

            The only things that should be run at this point are, what I would class as simple environmental checks. This might include such things as:

            • The version of Perl available.
            • The availability and versioning of dependent modules.
            • The type and version of the OS if this is important.
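
A minimal sketch of such environmental checks follows. The sub name environment_problems, its argument layout, and the version numbers are all illustrative assumptions, not part of any existing module.

```perl
use strict;
use warnings;

# Returns a list of human-readable problems; an empty list means "go ahead".
sub environment_problems {
    my (%want) = @_;
    my @problems;

    # Perl version: $] is the numeric form, e.g. 5.008001 for 5.8.1
    push @problems, "Perl $want{perl} or later required, this is $]"
        if defined $want{perl} && $] < $want{perl};

    # Dependent modules: present, loadable, and new enough
    for my $module ( sort keys %{ $want{modules} || {} } ) {
        my $min = $want{modules}{$module};
        if ( !eval "require $module; 1" ) {
            push @problems, "$module is not installed or failed to load";
        }
        elsif ( !eval { $module->VERSION($min); 1 } ) {
            push @problems, "$module $min or later required";
        }
    }

    # Operating system, only if the module really cares
    push @problems, "unsupported OS: $^O"
        if $want{os} && !grep { $_ eq $^O } @{ $want{os} };

    return @problems;
}

my @problems = environment_problems(
    perl    => 5.006,
    modules => { 'File::Spec' => 0, 'Scalar::Util' => 0 },
);
print "not ok - $_\n" for @problems;
print "environment ok\n" unless @problems;
```

Nothing here exercises the dependencies; it only asks whether they load and whether their reported version is high enough.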

          What should these not do?

          I assert, pretty much anything else. That is, they should not test

          1. the module's parameter handling.
          2. nor its error/exception handling.
          3. nor its functionality.
          4. nor any of the above for its dependent modules.
          5. nor any of the above for Perl's built-ins.

          Items 1 through 3 are all tests that should be run and passed prior to the module being shipped--by the author. It doesn't make sense to re-run these tests at installation time, unless we are checking that electronic pixies haven't come along and changed the code since it was shipped.

          That's slightly complicated by the fact that authors will rarely be in a position to validate their modules in all the environments that they will run in, but Perl's own build-time test suite should (and does) test most factors that vary across different OSs and compilers. If the Perl installation has passed its test suite, then there is little purpose in re-checking these things for every module that is installed.

          For example, if a module has a dependency upon LWP to fetch URLs from the Internet, there is no reason to run tests that exercise LWP on every installation. LWP has its own test suite. Modules that use LWP should only need to ensure that it is available, accessible and of a sufficiently high version for their needs. Everything beyond that is duplication. The module's system test should go out to a live Internet site and fetch data in order to do an end-to-end test, but that should only need to be run by the author prior to shipping.

          I don't have a problem with shipping the full test suite with the module. In the event of failures that cannot be explained through other means, having the user run the test suite in their environment and feed the results back to the author makes good sense. But running the test suite on every installation doesn't.

        Different test phases each have a different target audience

        And that brings me back to my main problem with the Test::* framework. Its main function appears to be to capture, collate and present in a potted form, the results of running the entire test suite; but whom is this aimed at?

        As a module user, I am only concerned with "Did it install correctly?".

        As a module developer, I am only concerned with "Did my last change work?", "Did it break anything else?", if no to either, "Where did the failure occur?"--and preferably "Take me there!".

        And to my mind, the summary statistics and other output presented by Test::Harness serve neither target audience. They are 'more than I needed to know' as a module user, and 'not what I need to know' as a developer.

        About the only people they serve directly are groups like the Phalanx project (Please stop using the cutesy 'Kwalitee'--it sounds like a '70s marketing slogan! Right up there on the 'grates like fingernails on a blackboard' stakes with "Kwik-e-mart", "Kwik print" and "Magick" of all forms). And possibly corporate QA/Third party code approval Dept's.

        I have misgivings about the need to put my unit tests in a separate file in order to use the Test::* modules. It creates couplings where none existed. Codependent development across files, below the level of a specified and published interface, is bad. It is clumsy to code and use. It means that errors are reported in terms of the file in which they are detected, rather than the file--and line--at which they occurred. That makes tracking a failure back to its point of origin (at least) a two-stage affair. In-line assertions that can be disabled or enabled via a switch on the command line or in the environment serve the developer so much better at the unit test level.
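
One way such switchable in-line assertions could be done in plain Perl is sketched below. The environment variable MY_ASSERT and the subs assert and average are names made up for this example; this is not how any particular module implements it.

```perl
use strict;
use warnings;

# Assertions are on when MY_ASSERT is set in the environment (an invented
# variable name). Using a constant lets perl discard the disabled checks
# at compile time, so they cost nothing in production runs.
use constant ASSERT => $ENV{MY_ASSERT} ? 1 : 0;

# Die with the file and line of the caller, i.e. where the check sits.
sub assert {
    my ( $condition, $message ) = @_;
    return 1 if $condition;
    my ( undef, $file, $line ) = caller;
    die sprintf "Assertion failed: %s at %s line %d.\n",
        ( defined $message ? $message : 'no message' ), $file, $line;
}

sub average {
    my @numbers = @_;
    assert( @numbers > 0, 'average() needs at least one number' ) if ASSERT;
    my $sum = 0;
    $sum += $_ for @numbers;
    return @numbers ? $sum / @numbers : undef;
}

print average( 2, 4, 6 ), "\n";
```

Run normally, the check inside average() is compiled away; run with MY_ASSERT=1 in the environment, a bad call dies naming the file and line of the assertion itself.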

        And finally, I have misgivings about having to reduce all my tests to boolean, is_ok()/not_ok() calls. The only purpose this seems to serve is the accumulation of statistics which I see little benefit in anyway. I realise that extra parameters are available to log a textual indication of the cause of the failure, but again: Who do these serve?

        Not the user, they don't care why it failed, only that it did.

        And not the developer. Having to translate stuff like:

        • Did all the bytes get saved?
        • URI should be a string, not an object.
        • URI should be string, not an object
        • URI shouldn't be an object
        • URI shouldn't be an object
        • URI should be a plain scalar, not an object
        • URI shouldn't be an object
        • Find one form, please
        • Should have five elements

        back to the file and line number of the failing code, what the inputs were that caused them, why they failed, and what to do about it, just doesn't seem logical to me. Most of those should just be asserts that are in the modules, stay permanently enabled, and report the file and line number (and preferably the parameters of the assertion, in terms of both the variable names and their contents) when they fail. Test::Harness would then capture that information and present either "An assertion failure occurred" when in 'user mode', or the assertion text including all the information when run in 'developer mode'. That is:

        Assertion failed: "$var(My::Module=HASH(0xdeadbeef)) != 'SCALAR'"

        would be a lot more useful than

        t\test03: test 1 of 7 failed: "URI should be a plain scalar, not an object"
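
To make the two reporting modes concrete, here is a rough sketch. The subs assertion_report and assert_plain_scalar and the mode names 'user' and 'developer' are assumptions for illustration. The variable name is passed in by hand as a string, since recovering it automatically would need extra machinery not shown here.

```perl
use strict;
use warnings;

# Build a failure report: terse for users, detailed for developers.
sub assertion_report {
    my (%arg) = @_;
    return "An assertion failure occurred\n" if $arg{mode} eq 'user';
    my $shown = defined $arg{value} ? "$arg{value}" : 'undef';
    return sprintf "Assertion failed: %s(%s) %s at %s line %d\n",
        $arg{name}, $shown, $arg{expected}, $arg{file}, $arg{line};
}

# Check that a value is a plain (non-reference) scalar.
sub assert_plain_scalar {
    my ( $name, $value, $mode ) = @_;
    return 1 if defined $value && !ref $value;
    my ( undef, $file, $line ) = caller;    # where the assertion was written
    die assertion_report(
        mode     => $mode,
        name     => $name,
        value    => $value,
        expected => 'should be a plain scalar, not a reference',
        file     => $file,
        line     => $line,
    );
}

my $uri = bless {}, 'My::Module';    # the kind of value that should be rejected
for my $mode (qw(user developer)) {
    eval { assert_plain_scalar( '$uri', $uri, $mode ) };
    print "[$mode] $@";
}
```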

        I'm grateful to eyepopslikeamosquito (who started this sub-thread) for mentioning Damian's module Smart::Comments. From my quick appraisal so far, I think that it is likely to become a permanent addition to my Perl file template. It appears to be similar in operation to another module that I have mentioned, Devel::StealthDebug, but is possibly a better implementation. It seems to me that it would make an ideal way of incorporating unit tests into code in a way that is transparent during production runs, but easily enabled for testing. With the addition of a Test::Harness-compatible glue module that simply enables the tests, captures and analyses the output, and converts it to boolean is_ok/not_ok checks, it might be close to what I've been looking for.
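
The glue idea can be sketched without Smart::Comments at all, on the simplifying assumption that a failed in-line assertion dies. The sub name run_checks is invented, and the TAP lines are emitted by hand only to keep the example dependency-free; a real glue module would more likely sit on top of Test::Builder.

```perl
use strict;
use warnings;

# Run each named check; a check "fails" if its code dies, as a failed
# in-line assertion would. Returns the TAP lines as a list of strings.
sub run_checks {
    my @checks = @_;    # list of [ name, coderef ] pairs
    my @tap    = ( '1..' . scalar(@checks) );
    my $n      = 0;
    for my $check (@checks) {
        my ( $name, $code ) = @$check;
        $n++;
        if ( eval { $code->(); 1 } ) {
            push @tap, "ok $n - $name";
        }
        else {
            my $why = $@;
            chomp $why;
            $why =~ s/\n/\n# /g;    # keep multi-line messages inside TAP comments
            push @tap, "not ok $n - $name", "# $why";
        }
    }
    return @tap;
}

print "$_\n" for run_checks(
    [ 'sum is positive' => sub {
          my $sum = 2 + 3;
          die "Assertion failed: \$sum > 0\n" unless $sum > 0;
      } ],
    [ 'list is empty' => sub {
          my @list = (1);
          die "Assertion failed: \@list == 0\n" unless @list == 0;
      } ],
);
```

The second check is written to fail on purpose, so the output shows both an "ok" line and a "not ok" line followed by the assertion text as a comment.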

