PerlMonks  

Re: Test/ Code Ratio

by Anonymous Monk
on Jan 28, 2005 at 16:00 UTC (#426012)


in reply to Test/ Code Ratio

I don't think the test/code ratio is a useful measurement. If there's a useful measurement of the number of tests, it is related to the pre- and postconditions. Or, if you will, the amount of flexibility. To take it to the extreme, if something is supposed to do one thing, all you need is one test, no matter how much code you've written. If pressing a button means the lights go on, and pressing it again means the lights go out, you only need two test cases:

  1. Press button. Success iff lights go on.
  2. Press button again. Success iff lights go off.
But a one-line regular expression might require thousands of tests, because it needs to give the correct answer for every possible string that it might be matched against.
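
A minimal sketch of that contrast using Test::More. The `Light` package and the `$date_re` pattern are hypothetical, made up for illustration; neither comes from the thread. The toggle is fully specified by two tests, while the one-line pattern already needs a list of accept and reject cases, and that list is nowhere near exhaustive.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical toggle: one button, two states.
{
    package Light;
    sub new   { my ($class) = @_; return bless { on => 0 }, $class }
    sub press { my ($self)  = @_; $self->{on} = !$self->{on}; return $self }
    sub is_on { my ($self)  = @_; return $self->{on} ? 1 : 0 }
}

my $light = Light->new;
$light->press;
ok( $light->is_on, 'first press turns the light on' );
$light->press;
ok( !$light->is_on, 'second press turns the light off' );

# Hypothetical one-line pattern: a YYYY-MM-DD date (no calendar check).
my $date_re = qr/\A\d{4}-(?:0[1-9]|1[0-2])-(?:0[1-9]|[12]\d|3[01])\z/;

my @accept = qw(2005-01-28 1999-12-31 2000-02-29);
my @reject = ( '2005-13-01', '2005-00-10', '2005-01-32', '05-01-28', '' );

like(   $_, $date_re, "accepts '$_'" ) for @accept;
unlike( $_, $date_re, "rejects '$_'" ) for @reject;

done_testing();
```

Every further class of input (trailing whitespace, non-ASCII digits, impossible dates such as 2005-02-30) adds more cases for the same single line of code.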


Re^2: Test/ Code Ratio
by BrowserUk (Pope) on Jan 28, 2005 at 16:24 UTC

    Nicely summarised++.

    And I think that, for all the words I have written (but not yet published) attempting to explain my distaste for the Test::* modules, this is the crux of that distaste.

    Test::Harness, and many of the others, tend to emphasise quantity over quality.

    They also put the emphasis on percentage passed, rather than what failed.

    Those two factors tend to combine to encourage the writing of lots of little tests, and ignore the effect of duplicate tests--"Hey, you can never have enough testing!".

    The result is that the one failing test is swamped in the high volume of (often duplicate) tests passed.

    So the headline is a feel-good "99.98% passed" rather than the realistic and crucial "1 test failed".

    Testing is a bit like condoms...99.98% safe isn't any comfort when the 0.02% happens.


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.

      Two comments on this:

      1. 99.98% isn't feel good if your goal is 100%. When I'm evaluating test results, I don't care about percentage -- I want to see zero test failures. Feeling good at 99.98% is an attitude problem, not a Test::Harness problem.
      2. I've personally found that test-driven development works for me and that Test::Harness, et al., make that quick and easy. The power of writing tests first is in having to be absolutely clear what output I expect before I write my code. If that leads to tons of little tests, so be it. The point isn't that I've written lots of tests, it's that I've clearly specified the requirements of my module/application in a verifiable way.

        If the tests don't flag some broken behavior despite having tons of little tests, that's a failure on my part to write a good specification, not a failure of Test::Harness. E.g., if I don't specify what the application should do when input is faulty, then any behavior is acceptable because I haven't constrained it. Defensive coding ("open or die") is just a coder's response to make the best of a poorly specified situation.
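
As a rough illustration of constraining faulty input in the specification itself, here is a sketch with Test::More. The function `parse_port` and its error message are hypothetical, not taken from any module mentioned in the thread. The tests state what must happen for bad input, so "any behavior" is no longer acceptable.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical function under test: accepts a TCP port given as text.
sub parse_port {
    my ($text) = @_;
    die "invalid port: undefined\n" unless defined $text;
    die "invalid port: '$text'\n"
        unless $text =~ /\A[0-9]{1,5}\z/ && $text >= 1 && $text <= 65535;
    return $text + 0;
}

# The well-formed case.
is( parse_port('8080'), 8080, 'valid port is returned as a number' );

# The faulty cases are part of the specification too.
for my $bad ( '', 'http', '0', '65536', '-1' ) {
    my $lived = eval { parse_port($bad); 1 };
    my $error = $@;    # copy at once, before anything else can reset $@
    ok( !$lived, "rejects '$bad'" );
    like( $error, qr/\Ainvalid port/, "reports why '$bad' was rejected" );
}

done_testing();
```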

      Like most tools, Test::* modules are only constructive in the hands of a skilled user. To the OP's point, are lots of lines of test code relative to lines of application code a sign of redundancy or inelegance or a well-thought-out and comprehensive specification of behavior? The answer depends entirely on the specific application and code (and it might be a combination of those, as well).

      -xdg

      Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

        I've personally found that test-driven development works for me...

        I am a strong advocate of test-driven development, and I have been for a long time, although I didn't know it would end up being called that for most of that time. I am *not* criticising test-driven development. I am expressing my personal doubts about the tools that I have seen available for supporting that.

        My point, badly made, is that I have no interest in seeing screenfuls of "xxxx.t .....ok". I do not care how many tests passed, or what their names were, or what the percentages are.

        The only thing I am interested in is "0 failures" or "Failure: test nn at file.pl:(nnn)". The output from the test harness doesn't tell me what I need to know: what failed (code, not test), how it failed, and where (which source code file, not test file).

        Instead I have to go off on a hunting spree, first locate the test that failed, then re-run it having added extra print statements to find out how it failed. Then track that back to the source code where the failure originates.

        The Test::* modules are set up to make the writing, running and reporting of tests easy.

        But writing tests is not the objective. The objective is the finding and fixing of failures to meet specification.

        To this end, I want a process that puts the tests close to the code under test. That way, when failures occur, the messages can take me directly to the code that needs fixing, not to a test script in another directory that leaves me with nothing except grep in order to track back the failing test to the failing code.

        All that said, I do not have an alternative to offer. I have played a little with some ideas based upon Devel::StealthDebug.

        I am also very impressed by my reading of tmoertel's LectroTest.

        I have ideas for combining these two notions--automated test generation inline, with the ability to turn those tests off for production use such that they have zero impact upon the tested code when disabled.

        That is where I think the future lies--inline, automated (unit) testing that can be enabled and disabled via command line switch.

        I think that P6 is moving in this direction with its PRE{}, POST{}, FIRST{}, ENTER{} & LEAVE{} blocks. I have yet to see (or visualise) enough P6 code, and the specifications are rather loose and changeable, for me to decide whether they are flexible enough to achieve everything I would like, but they appear as if they might.

        So, as I think I have remembered to say each time I have mentioned it--my reservations are purely my own. As with everything, what exists now is infinitely better than what might exist one day.
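
One way such a switch could look in plain Perl 5, offered as a rough sketch and not as anything proposed in this thread: a compile-time constant driven by an environment variable. The names `INLINE_TESTS` and `clamp` are hypothetical. When the constant is false, perl folds the guarded checks away at compile time, so they cost nothing when disabled. When a check fails, the message points at the source line of the code under test.

```perl
use strict;
use warnings;

# Hypothetical switch: run as  INLINE_TESTS=1 perl script.pl  to enable checks.
use constant INLINE_TESTS => $ENV{INLINE_TESTS} ? 1 : 0;

# Hypothetical code under test: limit $x to the range [$lo, $hi].
sub clamp {
    my ( $x, $lo, $hi ) = @_;

    # Precondition, written next to the code it protects.
    die sprintf( "PRE failed at %s line %d: lo (%s) > hi (%s)\n",
        __FILE__, __LINE__, $lo, $hi )
        if INLINE_TESTS && $lo > $hi;

    my $result = $x < $lo ? $lo : $x > $hi ? $hi : $x;

    # Postcondition: the result must lie inside the range.
    die sprintf( "POST failed at %s line %d: %s outside [%s, %s]\n",
        __FILE__, __LINE__, $result, $lo, $hi )
        if INLINE_TESTS && ( $result < $lo || $result > $hi );

    return $result;
}

print clamp( 15, 0, 10 ), "\n";    # prints 10
```

This covers only the on/off half of the idea; generating the test inputs automatically, as LectroTest does, is a separate problem.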


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.

      The one who emphasises quantity is the programmer reading the output. A craftsman shouldn't be blaming his tools. That said, I've always found the percentage readings useless and prefer to run my test cases directly rather than under the harness.

      chromatic and Ovid have recently been better men than you or I and got down to actually doing something about this in the form of better test suite output. I particularly enjoy chromatic's suppression of the output of passing tests. That would seem to be what you're after, as well.

      Makeshifts last the longest.

        Ad hominem (again!).

        Please read 426139


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
      So the headline is a feel-good "99.98% passed" rather than the realistic and crucial "1 test failed".

      I hadn't thought of it that way, but you're exactly right. The percentage is effectively meaningless.

      Rewriting how Test::Harness summarizes results is one of the things on my to-do for the reasonably-near future. When I do, I will probably leave out the percentages.

      xoxo,
      Andy

        Thank you for taking my comments in the light in which they were intended.


        Examine what is said, not who speaks.
        Silence betokens consent.
        Love the truth but pardon error.
Re^2: Test/ Code Ratio
by moot (Chaplain) on Jan 28, 2005 at 20:45 UTC
    Right, but the average case over 40k lines of code will be somewhere in the middle - some functions/ methods/ whatever will require 2 tests, and some will require 1000s. Of course I realise your example is intended to be simplistic to the extreme, but few realistic conditions have such narrow scope.

    Test/LOC ratio is at least as useful a metric as LOC itself - it provides some guide as to the complexity of a real project (as opposed to twee little 'yes but *this* code is a million lines all printing "Hello World"' projects just to prove my statement wrong ;) )

    Of course I didn't expect any response to say "You have it exactly right" or "No, you must be doing twice as many tests as LOC", I'm just after a general "am I on the right path, or am I spinning wheels writing tests that will eventually be duplicated in some way".
