Test/ Code Ratio

by moot (Chaplain)
on Jan 28, 2005 at 03:21 UTC

Unit tests are used to verify small pieces of code as well as larger integrated pieces. I'm in the beginning phases of a new project that currently has about 4000 LOC, 5400 lines of test code, and 1400 actual tests (Test::More::ok()s), with only about 10% of the app actually written - basically only the business objects. This suggests that once the app is complete I'll have about 14,000 tests to 40k LOC, or roughly 1 test per 3 lines of code.

I'm curious as to other monks' ratio of either tests or lines-of-test to application code. A large part of this app is being designed by test case ("FooDoesBarWithFrobnitz" is a test that informs a Foo instance that it must do Bar when presented with Frobnitz; the need for a Frobnitz class did not exist before the test case was written), so I'm wondering if I'm being over-cautious in testing everything, or if this ratio is about average.

Replies are listed 'Best First'.
Re: Test/ Code Ratio
by cog (Parson) on Jan 28, 2005 at 08:32 UTC
    If you're suspecting you might be testing the same thing several times, why don't you use Devel::Cover to check on that?

    Devel::Cover (very easy to use and with exceptional output, including colored HTML) will tell you what's being tested and what's not.

    It will not tell you (that I know of) that two tests are doing the same thing, but it might prevent you in the future from duplicating (or triplicating, or whatever) tests.

      If you're suspecting you might be testing the same thing several times, why don't you use Devel::Cover to check on that?
      And how's Devel::Cover going to tell you? All Devel::Cover is going to tell you is whether you have executed certain pieces of code. And how many times. But that won't tell you whether you have tested the same thing twice. Well, unless you think that "running the code once" equals "testing" it.

      Say for instance, you have this function to calculate the cube of a number (pretty stupid function, but the function is not the point):

      sub cube {
          $_[0] >= 1 ? $_[0] * $_[0] * $_[0]
                     : - (abs($_[0]) * abs($_[0]) * abs($_[0]))
      }
      You test it with 2, 1, 0 and -1. Devel::Cover tells you you've executed each branch at least twice. Does that mean you've done twice as many tests as necessary? Not at all; in fact, you haven't done enough tests, as the function gives the wrong answer for numbers between 0 and 1.
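      To make that concrete, here is a minimal sketch (the file name and run command are just suggestions, not from the post above) of a test script that gives cube() full branch coverage under Devel::Cover while still missing the bug:

      # cube.t - a sketch only; full branch coverage, yet the bug goes unnoticed
      use strict;
      use warnings;
      use Test::More tests => 4;

      sub cube {
          $_[0] >= 1 ? $_[0] * $_[0] * $_[0]
                     : - (abs($_[0]) * abs($_[0]) * abs($_[0]))
      }

      is( cube(2),   8, 'cube(2)'  );   # true branch
      is( cube(1),   1, 'cube(1)'  );   # true branch
      is( cube(0),   0, 'cube(0)'  );   # false branch
      is( cube(-1), -1, 'cube(-1)' );   # false branch

      # Run with, e.g.:  perl -MDevel::Cover cube.t && cover
      # Branch coverage reads 100%, but is( cube(0.5), 0.125 ) would still fail.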

      Devel::Cover is a great tool, but as with all tools, it becomes pretty useless if you misuse it. Devel::Cover does not tell you what you have tested. Devel::Cover will tell you which statements have run, which branches have been taken, and which subs have been called. From that, you can deduce you haven't tested certain parts of the program, purely because you haven't run that code. But that's it. It doesn't do more. It cannot be used to decide you have successfully tested something. It will only show negatives, never positives.

      Ah, thanks. Hadn't found that one. Testing the same thing multiple times is indeed one of my concerns, given that I like to test features in isolation.
Re: Test/ Code Ratio
by tstock (Curate) on Jan 28, 2005 at 04:12 UTC
    I think the amount of testing needed on a given app is particular to each app and its business needs. I think I test more than most around me, but I don't come close to having 1/3 of my code as test code. If you include inline testing, validation, and defensive coding as test code, then I may come close to that; otherwise, my test code is probably in the 1/5 to 1/10 range.

    I think that if the act of writing a test and running it every time you run the test harness causes more pain than the advantage of having that test, then you probably went overboard. I understand it's hard to measure something that hasn't happened yet (and may never happen); this is just a rough low bar for writing a test.

    Something else that a recent Joel on Software article touched on: "if the cost of fixing a bug is more expensive than what leaving the bug in is likely to cost in the long run, then maybe the bug doesn't need to be fixed". Horrors. In those cases, maybe writing tests for those possible bugs is also going overboard?

    OK, burn me at the stake now...

    Tiago
      maybe writing tests for those possible bugs is also going overboard?
      There are two situations that your statement made me think of. In both, I'll argue in favor of writing the test.

      The first case is some edge case that "will always be true...duh!". Dollars to donuts, something will change to break your assumption. At that point, you'll be glad that you had the test in place.

      The second situation is a known bug that your customer has told you is acceptable. In this case, write the test, but set it to be ignored in your suite. That way, when some bigwig walks up and says "did we know about this? the client is fuming!", you can say "yep, and here's the test case that Mary Jones told us to ignore on 17Nov2004".
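      For what it's worth, one way to "write the test but set it to be ignored" with Test::More (a sketch only; the function, values, and note are made up) is a TODO block, which still runs the test on every pass but reports its failure as expected rather than failing the suite:

      use Test::More tests => 1;

      sub widget_count { 41 }   # stand-in for the real (known-buggy) code

      TODO: {
          # Known, client-accepted bug; see the (hypothetical) note from 17Nov2004.
          local $TODO = "client accepted incorrect widget count (Mary Jones, 17Nov2004)";
          is( widget_count(), 42, 'widget count matches the spec' );
      }

      If the bug ever gets fixed, the harness flags the TODO test as unexpectedly passing, which is a handy reminder to promote it back to a real test.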

      thor

      Feel the white light, the light within
      Be your own disciple, fan the sparks of will
      For all of us waiting, your kingdom will come

      Most of my tests aren't there to find bugs in the current version. They're regression tests to stop me introducing bugs next week.
Re: Test/ Code Ratio (1x..3x)
by tye (Sage) on Jan 28, 2005 at 06:19 UTC

    We spend at least as much time writing unit tests as writing the functional code. In many cases, we spend two or three times as much time working on unit tests as we spend writing the code to be tested.

    And we still could use more unit tests in some cases (usually the cases that are harder to test).

    - tye        

      I haven't even got to the use cases yet! This is still testing basic object interaction ("does Zobcrack auto-populate correctly when Fribnortz is instantiated?"). My worry is that I'll spend so long making sure the objects work that I'll run out of time on the project, or conversely will miss some object test that bites me later and proves difficult to track down.

      I have a lot of semi-deterministic behaviour to model that is very sensitive to input conditions, and many object dependencies. *ponder* wouldn't it be great if there were some magic module that could take test harnesses and turn them into working code? Then all we'd have to do is write the test cases!

      package Test::ToCode;
      Any takers? :)
Re: Test/ Code Ratio
by dragonchild (Archbishop) on Jan 28, 2005 at 13:46 UTC
    I'd be more concerned with the number of testing scenarios and your test coverage of that. Happy-day scenarios are relatively easy to cover, and most people who write tests tend to cover those with about a 90% coverage rate (in my experience). However, what many people I've seen don't test is the rainy-day scenarios - the error handling. Do you correctly fail when you should fail?

    For example, I own Excel::Template which has 3600 LOC over 28 files. (This includes POD and comments.) I have about 75-90% of the standard usages tested for - I only have 61 tests. But, each one of my scenarios is tested with 4 tests and I have 15 scenarios I test for. The reason I can run just 15 scenarios is that I have looked at what the featureset is and have written tests to deal with the commonly-used features. And, most importantly, I have shown myself, through code inspection, that the features do not bleed over into one another. So, I can test each feature (for the most part) in isolation and be fairly confident things will work together.

    Now, I need to add another roughly 30 scenarios - all my error cases. I currently have no way of proving that this thing will fail correctly. I also want to add further branch and conditional tests, as shown by Devel::Cover. D::C shows me that I have a 77.9% overall coverage. It also shows me where my testing coverage is worst, which shows me where additional tests will have the most impact.

    . . . I'm wondering if I'm being over-cautious in testing everything, . . .

    Testing can never prove the absence of bugs - it can only prove that the pathways you have tests for work as expected, under the conditions you're testing them. Now, if you can show that you are testing all the usages the code can reasonably expect to see in production, you've got a minimal set of tests. You can always test more.

    Being right, does not endow the right to be rude; politeness costs nothing.
    Being unknowing, is not the same as being stupid.
    Expressing a contrary opinion, whether to the individual or the group, is more often a sign of deeper thought than of cantankerous belligerence.
    Do not mistake your goals as the only goals; your opinion as the only opinion; your confidence as correctness. Saying you know better is not the same as explaining you know better.

Re: Test/ Code Ratio
by xdg (Monsignor) on Jan 28, 2005 at 12:35 UTC

    The other thing to consider is if your tests have a lot of repetitive code. You might look at Test::Class as a way of refactoring, particularly if your code has a lot of classes/subclasses and your tests are checking them for similar behaviors.
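    For instance, here's a rough sketch of the kind of refactoring I mean (My::Foo and My::Foo::Sub are invented stand-ins, stubbed so the example runs standalone): shared behaviour goes into one Test::Class, and the subclass's test class re-runs it against its own class.

    # Hypothetical classes under test.
    package My::Foo;
    sub new { bless {}, shift }
    sub bar { 'ok' }

    package My::Foo::Sub;
    our @ISA = ('My::Foo');

    # Shared behaviour lives in one Test::Class...
    package My::Foo::Test;
    use base 'Test::Class';
    use Test::More;

    sub class_under_test { 'My::Foo' }    # subclass tests override this

    sub bar_handles_frobnitz : Test(2) {
        my $obj = shift->class_under_test->new;
        ok( $obj->can('bar'), 'bar() is implemented' );
        is( $obj->bar('frobnitz'), 'ok', 'bar() accepts a frobnitz' );
    }

    # ...and the subclass's test class inherits and re-runs the shared tests.
    package My::Foo::Sub::Test;
    use base 'My::Foo::Test';
    sub class_under_test { 'My::Foo::Sub' }

    package main;
    Test::Class->runtests;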

    On the other hand, it wouldn't surprise me a priori if you have that kind of ratio. Perl is a very concise language and you can do a lot in only a few lines of code, but testing all the ways that code could be used could be quite lengthy. You've got to consider the branch and condition density of your code -- if that's high, then you'll need a lot of tests to verify behavior in all circumstances.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re: Test/ Code Ratio
by adamk (Chaplain) on Jan 28, 2005 at 08:48 UTC
    Ignoring the coverage stats and just looking at pure size, what you might be interested in doing is to take PPI, load each file from the code and the tests, and count the number of significant tokens in each.

    That would give you some idea of size in non-line terms.

    I've been thinking about writing something like this for a while.
      What's PPI?
        Eep, sorry.

        PPI is (originally) short for Parse::Perl::Isolated. It parses perl as a Document, without using perl itself.

        The "Signficicant Tokens" metric ignores lines, variable name lengths, string length, POD and just goes on the complexity of the code itself, and can be found using something like the following.

        use PPI;

        # Load a perl document
        my $Document = PPI::Document->load( 'mycode.pl' );
        my $significant = grep { $_->significant } $Document->tokens;
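        And to get at the ratio the OP is asking about, a rough sketch (the lib/ and t/ directory layout and the use of File::Find::Rule are assumptions on my part) that compares the significant-token counts of the application code and the test code:

        use PPI;
        use File::Find::Rule;

        # Sum the significant tokens across a list of files.
        sub significant_tokens {
            my $total = 0;
            for my $file (@_) {
                my $doc = PPI::Document->load($file) or next;
                $total += grep { $_->significant } $doc->tokens;
            }
            return $total;
        }

        my $code  = significant_tokens( File::Find::Rule->file->name('*.pm')->in('lib') );
        my $tests = significant_tokens( File::Find::Rule->file->name('*.t' )->in('t')   );
        printf "test/code ratio by significant tokens: %.2f\n", $tests / $code;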
Re: Test/ Code Ratio
by Anonymous Monk on Jan 28, 2005 at 16:00 UTC
    I don't think the test/code ratio is a useful measurement. If there's a useful measure of the number of tests you need, it is related to the pre- and postconditions. Or, if you will, to the amount of flexibility. To take it to the extreme, if something is supposed to do one thing, all you need is one test, no matter how much code you've written. If pressing a button means the lights go on, and pressing it again means the lights go out, you only need two test cases:
    1. Press button. Success iff lights go on.
    2. Press button again. Success iff lights go off.
    But a one-line regular expression might require thousands of tests, because it needs to give the correct answer for every possible string that it might be matched against.

      Nicely summarised++.

      And that, I think, is the crux of my distaste for the Test::* modules, for all the words I have written (but not yet published) attempting to explain it.

      Test::Harness, and many of the others, tend to emphasise quantity over quality.

      They also put the emphasis on percentage passed, rather than what failed.

      Those two factors tend to combine to encourage the writing of lots of little tests, and ignore the effect of duplicate tests--"Hey, you can never have enough testing!".

      The result is that the one failing test is swamped in the high volume of (often duplicate) tests passed.

      So the headline is a feel-good "99.98% passed" rather than the realistic and crucial "1 test failed".

      Testing is a bit like condoms...99.98% safe isn't any comfort when the 0.02% happens.


      Examine what is said, not who speaks.
      Silence betokens consent.
      Love the truth but pardon error.

        Two comments on this:

        1. 99.98% isn't feel good if your goal is 100%. When I'm evaluating test results, I don't care about percentage -- I want to see zero test failures. Feeling good at 99.98% is an attitude problem, not a Test::Harness problem.
        2. I've personally found that test-driven development works for me and that Test::Harness, et al., make that quick and easy. The power of writing tests first is in having to be absolutely clear what output I expect before I write my code. If that leads to tons of little tests, so be it. The point isn't that I've written lots of tests, it's that I've clearly specified the requirements of my module/application in a verifiable way.

          If the tests don't flag some broken behavior despite having tons of little tests, that's a failure on my part to write a good specification, not a failure of Test::Harness. E.g., if I don't specify what the application should do when input is faulty, then any behavior is acceptable because I haven't constrained it. Defensive coding ("open or die") is just a coder's response to make the best of a poorly specified situation.

        Like most tools, Test::* modules are only constructive in the hands of a skilled user. To the OP's point, are lots of lines of test code relative to lines of application code a sign of redundancy or inelegance, or of a well-thought-out and comprehensive specification of behavior? The answer depends entirely on the specific application and code (and it might be a combination of those as well).

        -xdg

        Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

        The one who emphasises quantity is the programmer reading the output. A craftsman shouldn't be blaming his tools. That said, I've always found the percentage readings useless and prefer to run my test cases directly rather than under the harness.

        chromatic and Ovid have recently been better men than you or I and actually got down to doing something about this in the form of better test suite output. I particularly enjoy chromatic's suppression of the output of passing tests. That would seem to be what you're after, as well.

        Makeshifts last the longest.

        So the headline is a feel-good "99.98% passed" rather than the realistic and crucial "1 test failed".

        I hadn't thought of it that way, but you're exactly right. The percentage is effectively meaningless.

        Rewriting how Test::Harness summarizes results is one of the things on my to-do list for the reasonably near future. When I do, I will probably leave out the percentages.

        xoxo,
        Andy

      Right, but the average case over 40k lines of code will be somewhere in the middle - some functions/methods/whatever will require 2 tests, and some will require 1000s. Of course I realise your example is intended to be simplistic in the extreme, but few realistic conditions have such narrow scope.

      Test/LOC ratio is at least as useful a metric as LOC itself - it provides some guide as to the complexity of a real project (as opposed to twee little 'yes but *this* code is a million lines all printing "Hello World"' projects just to prove my statement wrong ;) )

      Of course I didn't expect any response to say "You have it exactly right" or "No, you must be doing twice as many tests as LOC", I'm just after a general "am I on the right path, or am I spinning wheels writing tests that will eventually be duplicated in some way".

Re: Test/ Code Ratio
by BrowserUk (Patriarch) on Jan 28, 2005 at 10:56 UTC

    This all sounds an awful lot like "Never mind the quality, feel the width!" to me.


    Examine what is said, not who speaks.
    Silence betokens consent.
    Love the truth but pardon error.
Re: Test/ Code Ratio
by brian_d_foy (Abbot) on Jan 29, 2005 at 05:57 UTC

    Don't worry about what other people do. Write as many tests as you need and don't worry too much if you have some extra. Don't think of tests in terms of lines of code. Test things that do something. Everything that does something gets a test. Test for expected input, unexpected input, and every situation you can think of, even if you think it's trivial. Use a coverage tool to ensure you test everything.

    If you go back and refactor the tests to bring down the line count, great. If not, oh well. Worrying about the line count as the main goal is only going to get in the way of your testing mindset.

    If you are still curious, simply download some CPAN modules and look for yourself. :)

    --
    brian d foy <bdfoy@cpan.org>
Re: Test/ Code Ratio
by sleepingsquirrel (Chaplain) on Jan 28, 2005 at 21:17 UTC
    If you've got a lot of tests, it might be an indicator that your testing system isn't working at a high enough level. You might want to look into something like LectroTest.
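    For example (a sketch only, not from this node; the generator range and tolerance are arbitrary), the buggy cube() function from earlier in the thread could be checked as a single Test::LectroTest property instead of a handful of hand-picked cases, and the randomly generated inputs would quickly find the failure between 0 and 1:

    use Test::LectroTest;

    sub cube {
        $_[0] >= 1 ? $_[0] * $_[0] * $_[0]
                   : - (abs($_[0]) * abs($_[0]) * abs($_[0]))
    }

    Property {
        ##[ x <- Float( range => [-10, 10] ) ]##
        abs( cube($x) - $x ** 3 ) < 1e-9;
    }, name => "cube(x) equals x**3 for all x";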


    -- All code is 100% tested and functional unless otherwise noted.
