I'll be giving a talk at work about improving
our test automation.
Feedback on talk content and general approach is welcome,
along with any automated testing anecdotes you'd like to share.
My initial ideas for possible talk sections are listed below.
Automation Benefits
- Reduce cost.
- Improve testing accuracy/efficiency.
- Regression tests ensure new features don't break old ones. Essential for continuous delivery.
- Automation is essential for tests that cannot be done manually: performance, reliability, stress/load testing, for example.
- Psychological. More challenging/rewarding. Less tedious. Robots never get tired or bored.
Automation Drawbacks
- Opportunity cost. Time spent automating is time not spent on manual testing, which might have found more bugs.
- Automated test suite needs ongoing maintenance. So test code should be well-designed and maintainable; that is, you should avoid the common pitfall of "oh, it's only test code, so I'll just quickly cut n paste this code".
- Cost of investigating spurious failures. It is wasteful to spend hours investigating a test failure only to find out the code is fine, the tests are fine, it's just that someone kicked out a cable. This has been a chronic nuisance for us, so ideas are especially welcome on techniques that reduce the cost of investigating test failures.
- May give a false sense of security.
- Still need manual testing. Humans notice flickering screens and a white form on a white background.
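One idea for cutting the cost of spurious failures is to probe the test environment automatically whenever a test fails, so that "someone kicked out a cable" is reported as an environment failure rather than left for a human to discover. Here is a minimal sketch in Python; the names `tcp_probe` and `classify_failure` are my own invention for illustration, not part of any real framework.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Build a probe that reports whether a TCP connection to host:port succeeds."""
    def probe():
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False
    return probe


def classify_failure(test_passed, probes):
    """Label a result as 'pass', an environment failure, or a genuine test failure.

    probes maps a descriptive name to a zero-argument callable that
    returns True when that part of the environment is healthy.
    """
    if test_passed:
        return "pass"
    # Only probe the environment when a test has actually failed.
    broken = [name for name, probe in probes.items() if not probe()]
    if broken:
        return "environment failure: " + ", ".join(broken)
    return "test failure"
```

A harness might call `classify_failure(False, {"database": tcp_probe("db.example.com", 5432)})` after each failure (the host name is a placeholder) and route environment failures to whoever looks after the lab rather than to the developer.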
When and Where Should You Automate?
- Testing is essentially an economic activity. There are an infinite number of tests you could write. You test until you cannot afford to test any more. Look for value for money in your automated tests.
- Tests have a finite lifetime. The longer the lifetime, the better the value.
- The more bugs a test finds, the better the value.
- Stable interfaces provide better value because it is cheaper to maintain the tests. Testing a stable API is cheaper than testing an unstable user interface, for instance.
- Automated tests give great value when porting to new platforms.
- Writing a test for customer bugs is good because it helps focus your testing effort around things that cost you real money and may further reduce future support call costs.
Adding New Tests
- Add new tests whenever you find a bug.
- Around code hot spots and areas known to be complex, fragile or risky.
- Where you fear a bug. A test that never finds a bug is poor value.
- Customer focus. Add new tests based on what is important to the customer. For example, if your new release is correct but requires the customer to upgrade the hardware of 1000 nodes, they will not be happy.
- Documentation-driven tests. Go through the user manual and write a test for each example given there.
- Add tests (and refactor code if appropriate) whenever you add a new feature.
- Boundary conditions.
- Stress tests.
- Big tests, but not too big. A test that takes too long to run is a barrier to running it often.
- Tools. Code coverage tools tell you which sections of the code have not been tested. Other tools, such as static (e.g. lint) and dynamic (e.g. valgrind) code analyzers, are also useful.
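To illustrate the boundary conditions item, here is a small sketch using Python's standard `unittest` module. The function under test, `parse_port`, and its valid range of 1..65535 are assumptions made up for this example. The tests probe values at, just inside, and just outside each boundary.

```python
import unittest


def parse_port(text):
    """Hypothetical function under test: parse a port number in the range 1..65535."""
    port = int(text)
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortBoundaryTest(unittest.TestCase):
    def test_values_at_and_just_inside_the_boundaries(self):
        cases = [("1", 1), ("2", 2), ("65534", 65534), ("65535", 65535)]
        for text, expected in cases:
            # subTest reports each failing case separately instead of stopping at the first.
            with self.subTest(text=text):
                self.assertEqual(parse_port(text), expected)

    def test_values_just_outside_the_boundaries(self):
        for text in ["0", "-1", "65536"]:
            with self.subTest(text=text):
                with self.assertRaises(ValueError):
                    parse_port(text)
```

Saved to a file, this can be run with `python -m unittest`.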
Test Infrastructure and Tools
- Single step, automated build and test. Aim for continuous integration.
- Clear and timely build/test reporting is essential.
- Keep metrics (via test metadata, say) on the test suite itself. Is a test providing "value"? How often does it fail validly? How often does it fail spuriously? How long does it take to run?
- Quarantine flaky failing tests quickly; run separately until solid, then return to main build. No broken windows.
- Make it easy to find and categorize tests. Use test metadata.
- Integrate automated tests with revision control, bug tracking, and other systems, as required.
- Divide test suite into components that can be run separately and in parallel. Quick test turnaround time is crucial.
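As a rough sketch of the metrics and quarantine ideas above, the following Python keeps a few numbers per test and flags tests whose spurious failure rate is too high. The record layout, the name `should_quarantine`, and the 5% threshold are all assumptions for illustration; a real suite would pick its own fields and limits.

```python
from dataclasses import dataclass


@dataclass
class MetricsRecord:
    """Hypothetical per-test metadata accumulated across many runs."""
    name: str
    runs: int
    valid_failures: int      # failures that exposed a real bug
    spurious_failures: int   # failures traced to the environment or the test itself
    total_seconds: float

    @property
    def spurious_rate(self):
        return self.spurious_failures / self.runs if self.runs else 0.0

    @property
    def mean_seconds(self):
        return self.total_seconds / self.runs if self.runs else 0.0


def should_quarantine(record, max_spurious_rate=0.05):
    """Flag a test as flaky when too many of its runs fail spuriously."""
    return record.spurious_rate > max_spurious_rate
```

A nightly job could run `should_quarantine` over every record and move the flagged tests into a separate run until they are solid again.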
Design for Testability
- It is easier/cheaper to write automated tests for systems that were designed with testability in mind in the first place.
- Interfaces Matter. Make them: consistent, easy to use correctly, hard to use incorrectly, easy to read/maintain/extend, clearly documented, appropriate to audience, testable in isolation.
- Dependency Injection is perhaps the most important design pattern in making code easier to test.
- Mock Objects are frequently useful, and not only in unit tests - for example, a mock server written in Perl (e.g. a mock SMTP server) can simulate errors, delays, and so on.
- Consider ease of support and diagnosing test failures during design.
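To show why Dependency Injection helps testability, here is a minimal Python sketch. The `Session` class and `FakeClock` are hypothetical names invented for this example. Because the clock is passed in rather than hard-wired, a test can control time directly instead of sleeping.

```python
import time


class Session:
    """Hypothetical class whose time source is injected rather than hard-wired."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock          # the injected dependency
        self._started = clock()

    def expired(self):
        return self._clock() - self._started >= self._ttl


class FakeClock:
    """Test double standing in for time.monotonic; the test moves time by hand."""

    def __init__(self, now=0):
        self.now = now

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds
```

In a test, `Session(ttl_seconds=30, clock=FakeClock())` lets you check expiry at 29 and 30 seconds instantly; production code simply omits the `clock` argument and gets the real one.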
Test Driven Development (TDD)
- Improved interfaces and design. Especially beneficial when writing new code. Writing a test first forces you to focus on interface - from the point of view of the user. Hard to test code is often hard to use. Simpler interfaces are easier to test. Functions that are encapsulated and easy to test are easy to reuse. Components that are easy to mock are usually more flexible/extensible. Testing components in isolation ensures they can be understood in isolation and promotes low coupling/high cohesion. Implementing only what is required to pass your tests may help prevent over-engineering.
- Easier Maintenance. Regression tests are a safety net when making bug fixes. A tested component cannot break accidentally without a test failing. A fixed bug that has a regression test cannot silently recur. Essential when refactoring.
- Improved Technical Documentation. Well-written tests are a precise, up-to-date form of technical documentation. Especially beneficial to new developers familiarising themselves with a codebase.
- Debugging. Spend less time in crack-pipe debugging sessions. When you find a bug, add a new test before you start debugging (see practice no. 9 at Ten Essential Development Practices).
- Automation. Easy to test code is easy to script.
- Improved Reliability and Security. How does the code handle bad input?
- Easier to verify the component with memory checking and other tools (e.g. valgrind).
- Improved Estimation. You've finished when all your tests pass. Your true rate of progress is more visible to others.
- Improved Bug Reports. When a bug comes in, write a new test for it and refer to the test from the bug report.
- Improved test coverage. If tests aren't written early, they tend never to get written. Without the discipline of TDD, developers tend to move on to the next task before completing the tests for the current one.
- Psychological. Instant and positive feedback; especially important during long development projects.
- Reduce time spent in System Testing. The cost of investigating a test failure is much lower for unit tests than for complex black box system tests. Compared to end-to-end tests, unit tests are fast and reliable, and they isolate failures (making it easy to find the root cause). See also Test Pyramid.
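Here is a small sketch of the test-first habit in Python. The function `mean` and the bug it guards against are hypothetical, made up for this example. The idea is that the three tests were written first (the last one as a regression test for a reported bug) and `mean` contains only what is needed to make them pass.

```python
import unittest


def mean(values):
    """Minimal implementation: just enough to pass the tests below."""
    values = list(values)
    if not values:
        raise ValueError("mean() of an empty sequence")
    return sum(values) / len(values)


class MeanTest(unittest.TestCase):
    def test_typical_input(self):
        self.assertEqual(mean([2, 4, 6]), 4)

    def test_single_value(self):
        self.assertEqual(mean([5]), 5)

    def test_empty_input_is_rejected(self):
        # Regression test for a hypothetical bug report: empty input used to
        # crash with ZeroDivisionError. Written before debugging started, and
        # the bug report can refer to this test by name.
        with self.assertRaises(ValueError):
            mean([])
```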
Test Doubles
- Dummy objects are passed around but never actually used. Usually they are just used to fill parameter lists.
- Fake objects actually have working implementations, but usually take some shortcut which makes them not suitable for production (an InMemoryTestDatabase for example).
- Stubs provide canned answers to calls made during the test, usually not responding at all to anything outside what's programmed for the test.
- Spies are stubs that also record some information based on how they were called; for example an email service that records how many messages were sent.
- Mocks are pre-programmed with expectations which form a specification of the calls they are expected to receive; they can throw an exception if they receive a call they don't expect and are checked during verification to ensure they got all the calls they were expecting. Note that only mocks insist upon behavior verification. The other doubles can, and usually do, use state verification. Mocks behave like other doubles during the exercise phase because they need to make the SUT (System Under Test) believe it's talking with its real collaborators - but mocks differ in the setup and the verification phases. While mocks are valuable when testing side-effects, protocols and interactions between objects, note that overuse of mocks inhibits refactoring due to tight coupling between the tests and the implementation (instead of just the interface contract).
- Mocks Aren't Stubs, article by Martin Fowler (mockists vs classicists. Classicist: use real objects if possible and a double if the real thing is awkward to use; use state verification. Mockist: always use a mock for any object with interesting behavior; use behavior verification, where mocks are pre-programmed with expectations, a specification of the calls they are expected to receive, and verification ensures they got all the calls they were expecting)
- Concise version of Fowler's Mocks Aren't Stubs
- Visual Studio 11 Fakes, Stubs and Shims (run-time method interceptors) (Stub: state-based verification, "Arrange, Act, Assert". Mock: behavior-based verification; a mock provides not only a fake implementation but also logic for verifying how calls were made on the fake. When you are testing side-effects, protocols and interactions between objects, they are extremely valuable. Some folks fall into behavior verification when none is needed)
- Test double flavours (Test Stub, Test Spy, Mock Object, Fake Object, ...)
- What is the difference between a mock and a stub? (Stack Overflow)
- Verified fakes in Python
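To make the test double flavours described above a little more concrete, here is a sketch in Python using the standard `unittest.mock` module. The system under test, `notify_overdue`, and the classes `InMemoryAccounts` and `SpyMailer` are hypothetical, invented for this example. Note that `unittest.mock` verifies calls after the exercise phase rather than pre-programming expectations up front, so it only approximates a mock in Fowler's strict sense; passing `spec` at least makes unexpected calls fail.

```python
from unittest import mock

SUBJECT = "Your payment is overdue"


def notify_overdue(accounts, mailer):
    """Hypothetical SUT: email every overdue account and return how many were sent."""
    sent = 0
    for account in accounts.overdue():
        mailer.send(account["email"], SUBJECT)
        sent += 1
    return sent


class InMemoryAccounts:
    """Fake: a working implementation that takes a shortcut (no real database)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def overdue(self):
        return [row for row in self._rows if row["days_late"] > 0]


class SpyMailer:
    """Spy: does nothing useful but records how it was called."""

    def __init__(self):
        self.sent = []

    def send(self, to, subject):
        self.sent.append((to, subject))


def make_stub_accounts():
    """Stub: returns a canned answer and knows nothing else."""
    stub = mock.Mock()
    stub.overdue.return_value = [{"email": "a@example.com", "days_late": 3}]
    return stub


def check_with_dummy():
    # Dummy: object() only fills the parameter list; with no overdue
    # accounts the mailer is never touched.
    return notify_overdue(InMemoryAccounts([]), object())


def check_with_spy():
    # State verification: inspect what the spy recorded afterwards.
    spy = SpyMailer()
    notify_overdue(make_stub_accounts(), spy)
    return spy.sent


def check_with_mock():
    # Behavior verification: assert on the exact calls the SUT made.
    mailer = mock.Mock(spec=["send"])   # any other attribute raises AttributeError
    notify_overdue(make_stub_accounts(), mailer)
    mailer.send.assert_called_once_with("a@example.com", SUBJECT)
    return mailer
```

The spy and mock versions check the same behaviour; the difference is whether the test inspects recorded state afterwards or asserts on the interaction itself.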
Updated 2019: Added Test Doubles section.