PerlMonks  

Re: Does anybody write tests first?

by BrowserUk (Pope)
on Feb 22, 2008 at 05:35 UTC ( #669466 )


in reply to Does anybody write tests first?

Not me. Whilst the 'tests first' brigade are writing tests to see if Perl remembers how to load a module, detect a non-existent or privileged file, and hasn't forgotten how to do math, I'll be writing an* application that uses the module.

Functions/methods within that module simply return reasonable constant data/predefined status until the application is written, compiles and runs. Once the application runs, I start filling in the bodies of those APIs one by one, checking that the application continues to run correctly as I go: adding asserts in the APIs to check parameters, and asserts in the application to check returns.

I find it infinitely preferable to have the application or module die at a named line, of a named file, so that I can go directly to the failing assertion, than to

  • sit and watch streams of numbers, dots and percentages scroll off the top of my screen.
  • And then have to re-run the tests and pause/abort it to find the name of a .t file and some imaginary number.
  • Then go find that .t file and try and count my way through the tests to find which one failed.
  • Then try and work out what output the test actually produced that didn't match the expected output.
  • Then try and relate that to something in the module that failed to work as expected.
  • Then go find the appropriate line of the module.

And that's the abbreviated version of many of the test failures I've had reported by modules using the Test::* philosophy.

A test should be located as close as possible to the point of failure--and tell me what went wrong and where. Anything else is just mastication.

*An, not (necessarily) the application. A contrived and simplified sample that exercises the full API is fine. It can also act as user documentation/sample code.


Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
"Science is about questioning the status quo. Questioning authority".
In the absence of evidence, opinion is indistinguishable from prejudice.

Replies are listed 'Best First'.
Re^2: Does anybody write tests first?
by xdg (Monsignor) on Feb 23, 2008 at 01:29 UTC
    • sit and watch streams of numbers, dots and percentages scroll off the top of my screen.
    • And then have to re-run the tests and pause/abort it to find the name of a .t file and some imaginary number.
    • Then go find that .t file and try and count my way through the tests to find which one failed.
    • Then try and work out what output the test actually produced that didn't match the expected output.
    • Then try and relate that to something in the module that failed to work as expected.
    • Then go find the appropriate line of the module.

    I respect the point you're trying to make, but I think the issues you present above can be addressed by greater granularity in the test-code cycle. I never write a complete test file and then write code to meet it. That's just asking for unnecessary pain.

    Instead, before I write a subroutine, I think of how I'll know that I coded it correctly. What inputs does it take? What output should it provide? Then I go write a test for just that piece.

    When I test just that *.t file (e.g. with "prove"), I should see any existing tests pass and that one test I just wrote fail.

    Then I write the code to make that test pass. Then I prove that *.t file again -- if the test passes, great. If not, then either the code is wrong or the test is wrong -- and sometimes it's the test (hey, it's just code and bugs can show up there, too.) It's like double-entry accounting.

    So I don't generally have to go through all the (legitimately annoying) test output analysis you describe. I know what test I just wrote and I know what code I just wrote.

    To the point about "counting tests": test failures should report where in the *.t file the failing test is located. That said, I often code with data structures that loop over test cases -- and so I write defensively, with good labels for each case that let me quickly find the failing one.

    Only periodically, after I've finished a substantial chunk of work, do I run the full test suite to make sure I haven't inadvertently broken things elsewhere.

    As with everything, you make a great point about avoiding pedantry. Is it really necessary to test that a file loads? Or that math is done correctly? No, of course not. What's that line about 'foolish consistency'?

    That said, the cost of writing a line of test for trivial code is pretty minimal, so I often think it's worth it: what starts out trivial (e.g. addition) sometimes blossoms over time into function calls, edge cases, etc., and having the test for the prior behavior is like a safety net in case the trivial becomes more complex.

    -xdg

    Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

      When I test just that *.t file (e.g. with "prove"), I should see any existing tests pass and that one test I just wrote fail.

      That's fine when you're writing a given piece of code. You know what you just added or changed and it's probably sitting in your editor at just the right line. With the minuscule granularity of some of the test scripts I've seen, a failing test probably translates to an error in a single line or even sub clause thereof.

      But most of my interactions with Test::* (as someone who doesn't use them) are as a maintenance programmer on newly built or installed modules, running make test.

      I've not just forgotten how the code is structured, and how the (arbitrarily named) test files relate to the structure of the code. I never knew either.

      All I see is:

      ... [gobs of useless crap omitted] ...
      t\accessors.......ok 28/37# Looks like you failed 1 test of 37.
      t\accessors.......dubious
              Test returned status 1 (wstat 256, 0x100)
      DIED. FAILED test 14
              Failed 1/37 tests, 97.30% okay
      ... [gobs more crap omitted] ...
      • Something failed.
      • 28 from 37 == 9. But it says: "# Looks like you failed 1 test of 37."
      • Then it usefully says: "dubious". No shit Sherlock.
      • Then "Test returned status 1 (wstat 256, 0x100)". Does that mean anything? To anyone?
      • Then (finally) something useful: "DIED. FAILED test 14".

      Now all I gotta do is:

      1. Work out which test--and they're frequently far from easy to count--is test 14.
      2. Where in accessors.t it is located.
      3. Then work out what API(s) that code is calling.
      4. And what parameters it is passing.

        Which, if they are constants, is fine; but if the test writer has been efficient and structured a set of similar tests to loop over some data structure, I've a problem.

        I can't easily drop into the debugger, or even add a few trace statements to display the parameters as the tests loop, cos that output would get thrown away.

      And at this point, all I've done is find the test that failed. Not the code.

      The API that is called may be directly responsible, but I still don't know for sure what file it is in.

      And it can easily be that the called API is calling some other API(s) internally.

      • And they can be located elsewhere in the same file.
      • Or in another file used by that file and called directly.
      • Or in another file called through 1 or more levels of inheritance.

      And maybe the test writer has added informational messages that identify specific tests. And maybe they haven't. If they have, they may be unique, hard coded constants. Or they could be runtime generated in the test file and so unsearchable.

      And even if they are searchable, they are a piss-poor substitute for a simple bloody line number. And they require additional discipline and effort on the part of someone I've never met and who does not work for my organisation.

      Line numbers and traceback are free, self maintaining, always available, and unique.

      If tests are located near the code they test, when the code fails to compile or fails an assertion, the information takes me directly to the point of failure. All the numbering of tests, and labelling of tests, is just a very poor substitute and costly extra effort to write and maintain--if either or both is actually done at all.

      That said, the cost of writing a line of test for trivial code is pretty minimal, so I often think it's worth it since what starts out trivial (e.g. addition) sometimes blossoms over time into function calls, edge cases, etc. and having the test for the prior behavior is like a safety net in case the trivial becomes more complex.

      This is a prime example of the former of the two greatest evils in software development today: what-if pessimism and wouldn't-it-be-nice-if optimism. Writing extra code (especially non-production code) now, "in case it might become useful later", costs in both up-front effort and ongoing maintenance.

      And Sod's law (as well as my own personal experience) suggests that the code you think to write now "just in case" is never actually used -- though inevitably the piece that you didn't think to write is needed.

      Some code needs to cover all the bases and consider every possible contingency. If your code is zooming around a few million miles away at the other end of a low-speed data-link, burned into EPROM, then belt & braces--or even three belts, two sets of braces and a reinforced safety harness--may be legitimate. But very few of us, and a very small amount of the world's code base, live in such environments.

      For the most part, the simplest way to improve the ROI of software, is to write less of it! And target what you must write, in those areas where it does most good.

      Speculative, defensive, non-production crutches to future possibilities will rarely if ever be exercised, and almost never produce a ROI. And code that doesn't contribute to the ROI is not just wasted capital investment, but an ongoing drain on maintenance budgets and maintenance team mind-space.

      Far better to expend your time making it easy to locate and correct real bugs that crop up during systems integration and beta testing, than trying to predict future developments and failure modes. And the single best contribution a test writer can make to that goal is to get the maintenance programmer as close to the source of the failure, when it occurs, as quickly as possible.

      Yes. It is possible to speculate about what erroneous parameters might be fed to an API at some point in the future. And it is possible to write a test to pass those values to the code and ensure that the embedded range checks will identify them. But it is also possible to speculate about the earth being destroyed by a meteorite. How are you going to test that? And what would you do if you successfully detected it?

      And yes, those are rhetorical questions about an extreme scenario, and there is a line somewhere between what is a reasonable speculative possibility and the extremely unlikely. But that line is far lower down the scale than most people think.

      Finally, it is a proven and incontestable fact that the single, simplest, cheapest, most effective way to avoid bugs is to write less code. And tests are code. Testing is a science and tests should be designed, not hacked together as after-thoughts (or pre-thoughts).

      Better to have 10 well-designed and targeted tests than 100 overlapping, redundant what-ifs.


      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
        a simple bloody line number.
        Test and line number are not mutually exclusive.
        $ cat 669457.t
        use strict;
        use warnings;
        use Test::More;
        BEGIN { plan tests => 1; }
        cmp_ok(1, q{==}, 2, q{Expect 1==2});
        __END__

        $ prove 669457.t
        669457....NOK 1/1
        #   Failed test 'Expect 1==2'
        #   at 669457.t line 5.
        #          got: 1
        #     expected: 2
        # Looks like you failed 1 test of 1.
        669457....dubious
                Test returned status 1 (wstat 256, 0x100)
        DIED. FAILED test 1
                Failed 1/1 tests, 0.00% okay
        Failed Test Stat Wstat Total Fail  List of Failed
        -------------------------------------------------------------------------------
        669457.t       1   256     1    1  1
        Failed 1/1 test scripts. 1/1 subtests failed.
        Files=1, Tests=1,  0 wallclock secs ( 0.03 cusr +  0.00 csys =  0.03 CPU)
        Failed 1/1 test programs. 1/1 subtests failed.
        $
        --
        Andreas
        Finally, it is a proven and incontestable fact that the single, simplest, cheapest, most effective way to avoid bugs is to write less code. And tests are code.

        And program code is code. Therefore, if you write no code at all, you'll have no bugs. Of course, you'll also have no features.

        Testing is a science and tests should be designed, not hacked together as after-thoughts (or pre-thoughts).

        So is your objection to writing tests first as opposed to after the fact? Or to hacky, poorly-designed tests, regardless of whether they were written first or last?

        My hypothesis would be that tests are more likely to be designed well when they are viewed by the developer as an integral part of the development of program code rather than something to be added afterwards -- at least with respect to individual developers.

        If your development model has QA developers writing tests independently, then maybe the advantage is less.

        -xdg

        Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.

Re^2: Does anybody write tests first?
by amarquis (Curate) on Feb 22, 2008 at 15:39 UTC

    First, I appreciate the dissenting opinion, I was looking particularly for the proverbial "other ways to do it."

    I suspect that the difference in experience between you and me is the main point here: your method works best for you because you have the skill to avoid the pitfalls. Were I to design something that way, I'm sure I'd miss an assertion or return check somewhere, or the coverage would be incomplete in some other way. The Test::* method lets me keep all my tests in the same place, where gaps are obvious. The extra work initially has a big payoff downstream when I'm hunting bugs.

Re^2: Does anybody write tests first?
by dragonchild (Archbishop) on Feb 24, 2008 at 18:26 UTC
    A few points here:
    • Yes, what gets reported in a failing test could be seriously improved. I have fully drunk the TDD kool-aid, and I hate the results when I have a failing test in a .t file with 523 test cases. Often, I have to throw a bunch of __END__'s and if(0)'s around to find the failing test case.
    • The major benefit of test suites, IME, is the repeatability. I can make a change, then run 5000 tests to make sure nothing else got bolloxed up. That's nice.
    • You're a maintenance dev. I didn't write a lot of .t tests as a maintenance dev. Frankly, it wasn't cost-effective.

    My criteria for good software:
    1. Does it work?
    2. Can someone else come in, make a change, and be reasonably certain no bugs were introduced?
