|Welcome to the Monastery|
Note 20110114: Thank you all for your valuable comments on the original RFC. After 10 days and no negative feedback or further suggestions to this RFC, I posted the formal tutorial here Using Test-Point Callbacks for Automated Diagnostics in Testing
This second iteration incorporates changes, ideas and comments on the first version, as well as further research on the subject to correctly identify the potential value of using this technique.
The main idea is to give your testing software the ability to "look" inside subroutines for intermediate results at specific places, namely those problematic spots where you tend to leave debug-level messages to diagnose problems, and especially those places where you would tend to leave a $DB::single=1 to remind yourself and others that this is a place to stop and check a local variable, branching decision, side-effect, etc. The idea is inspired in electronic circuit-board Test-Points and I2C tests and auto-diagnostic circuitry.
Benefits of using this technique include:
The proposed Tutorial follows:
Using Test-Point Callbacks for Automated Diagnostics in Testing
What is a "Test-Point" and why should I care?
Tests usually fail, at most, with a diagnostic message regarding the overall result of the test or some observable side-effect available to the test code. Then it's up to the programmer to elevate the debugging level and re-run the tests in the hopes that the diagnostic debug messages will give some clue as to what went wrong inside a specific method or subroutine. If the diagnostic messages don't help, the programmer then has to dive into the code, usually placing warn/debug statements or firing up the debugger and use breakpoints, watches, etc. to determine under what conditions this particular test was failing.
It's quite customary to leave these debug-level messages (and even DB breakpoints) in the code as they are basically innocuous at runtime, but are left there, precisely as a reminder that this is usually a good place to stop and inspect something when testing and debugging. It is these spots in the code which we will call "Test-Points" in the context of this tutorial. They are places in the code that help you diagnose a failure, akin somewhat to assertions, but targeted not at unexpected results or side-effects, but rather at expected flows, values, etc. based on certain conditions created by a specific test.
It is important to note that contrary to assertions, Test-Points (or TP) are mainly used to give the test code the ability to inspect inside methods at specific places, when running a specific test, and just as debug statements and pre-defined breakpoints, they can be safely left in the code, and activated when needed. The TPs are analogous to those found in printed circuit boards which are sometimes used by special self-test / diagnostic circuitry such as I2C, for example, which is commonly found in most modern Phillips appliances.
Anywhere you place a permanent debug-level statement, or have found the need to place a DB breakpoint, qualifies as a potential TP.
By using the TP techniques described here, you can actually do stuff in tests like: evaluate partial results within a sub, monitor some local/automatic var, check an object attribute or side-effect at a given spot, or even condition further tests based on these intermediate results.
What Test-Points are not
Test-Points are not a substitute for good design. Having said that, the techniques outlined here can actually be used to test otherwise un-testable software. And while this is better than no testing at all, it is always best to re-think and factorize your code to make it more test-able in the first place. In fact, by learning the Test-Point techniques in this tutorial, you might be inspired to use some functional techniques (yes, even in your OO code!) to make your software more flexible and elegant. The underlying concepts of this technique were borrowed directly from a book titled "Higher Order Perl" by Mark Jason Dominus. One particular paragraph that will hopefully encourage you to read this book, and come to really appreciate the beauty and power of a multi-paradigm language such as Perl:
... by parametrizing some part of a function to call some other function instead of hardwiring the behavior, we can make it more flexible. This added flexibility will pay off when we want the function to do something a little different, such as performing an automatic self-check. Instead of having to clutter up the function with a lot of optional self-testing code, we can separate the testing part from the main algorithm. The algorithm remains as clear and simple as ever, and we can enable or disable the self-checking code at run time if we want to, by passing a different coderef argument.
Last paragraph, section 1.3 "The Tower of Hanoi"
To illustrate the method, we will be using Moose code as an example, although this technique can be applied to any and all Perl code, as it is based on powerful features/capabilities of the core language such as code references and callbacks.
The example code below uses a single callback as a class attribute. In reality, you would probably want protect this attribute with specific read and write methods, use configuration options, environment variables and so forth, but for the sake of simplicity, we will just use the default get/setters to map the single callback to a code reference in the test code itself.
Any potential hook in software can lead to unscrupulous use by crackers and malware. It could also lead to inadvertent use by a clever hacker to exploit some other use for your TP, possibly not even realizing that it should only be used for testing and no code should depend on these callbacks whatsoever. Also in the example below, you will notice that these callbacks are used just to send scalar values in anonymous hashes, giving the testing software a chance to inspect values inside a sub, much like watches. Likewise, they could be also used bi-directionally to force certain behavior in a specific test (e.g. to simulate some data inside the subroutine) but this is highly unrecommended, unless you really need to and know what you're doing. Although, such inbound manipulation could be used to further diagnose a failure by re-running the test and forcing conditions to assert the failure, perhaps based on particular experience or expertise.
Setting up the Callback
This example Moose class implements the TP technique by setting up a single callback attribute (tp_callback), which is preset to be a code reference. In the BUILD method, we initialize the callback with a simple sub, making it innocuous unless someone deliberately points this code reference elsewhere. In this example, this may occur at creation time (in the new() method of the class), or ad-hoc later on.
In real life, you would probably want to limit this by wrapping/conditioning the actual calls to the callback with a debug-level or diagnostic-mode flag of some sort. Similar to the way debug messages are conditioned by level, you could make the attribute private or simply unavailable if certain conditions are not met. One way to accomplish this in Moose, could be to verify the debug-level (config param, environment var, etc.), and by using a code block at the beginning of your class, modify the class before creation, making the whole callback mechanism is simply disabled or non-existent to external code. A simple example of this is given at the end of this tutorial.
Creating the Test Point
In the example code below, a TP is created by simply calling the global callback with two parameters (a recommendation, but the parameter list is really up to you). The first parameter is the TP name, this will allow the testing code to identify which TP is being called. The second parameter is optional as you may just want to check the object's public attributes or side-effect. We recommended that you standardize this to a single anonymous hash, containing key-value pairs of the things you want to inspect, or even modify by returning references. References would allow the test code to perform more elaborate actions, but again, this is not recommended unless really needed, usually if there is no other way to accomplish this using the regular input of the sub in question, or because you want to drill-down into the diagnosis.
Remember that the principal objective of the Test-Point is for better diagnostics, but in some particular situations it might be useful to force certain things and analyze the behavior. In other words, forcing something may allow the testing code to diagnose the problem even further and pin-point exactly not only what failed but possibly even why. In an electronic circuit this would be similar to injecting a signal at a TP to verify or assert the diagnosis. Of course, this is what a test software is meant to accomplish in the first place, but it's generally limited to the sub's standard input parameters, whilst this technique may allow a test to alter or force certain things inside the sub itself. Again, use with care and only if it makes sense to enhance the diagnostic capabilities of your tests.
The test code below implements the Test Points. The general structure of the test file is just like any other except that the Test-Points are handled in an event-driven fashion, meaning that they will be called at any time during the execution of a regular test, and a specific Test-Point may be called by several tests as well, so you may need additional parameters or test logic to differentiate.
This test example below has four main sections. The first is just the basic declarations and common use_ok tests of any test file. The second, is the declaration of the dispatch table that maps the Test-Point names to their servicing sub-routines. The third is the standard tests you would usually perform on this class, and the fourth are the Test-Point service subroutines.
TP names of course, should be unique, so you may want to include the sub name as part of their name, or could also be numbered with some special TP scheme such as those used in circuit boards. Also, bear in mind as mentioned above, that a TP may be called by different tests. This can be addressed by conditioning the callback in the test code itself, or in the class code as well (e.g. invoke this TP only if x,y,z).
The final implementation details depend very much on your your specific needs so you must adapt this technique accordingly. The technique described here is intended to be simple and introduce the subject of Test-Points by using Callbacks. Other options you may want to consider are: using multiple callbacks or allowing your subs to accept coderefs as parameters. This is commonly used in functional programming as was described above in the reference Mark Jason Dominus' "Higher order Perl".
How to run the examples
Download the examples to the same directory and run with:
prove -v test_class.t
You should see something like the example results below using bash shell:
Avoiding TP abuse by Meta manipulation in Moose
This las section deals with some security concerns that may arise by allowing external code to peek or modify the behavior of your subs. It is not meant as an exhaustive analysis, but rather as a simple example to prevent inadvertent use of your Test-Points. The following code is almost exactly the same as above, but with a simple mechanism to disable the TP mechanism unless an environment variable is set to a specific debugging level.
The class is almost identical to the original example above. Nevertheless, a code block at the top of the class modifies the class' meta information disabling the TP capabilities. To explain this code in detail is beyond the scope of this tutorial but it should clearly illustrate the point. To further understand it please refer to the POD for Class::MOP, Moose, Moose::Manual and Moose::Manual::Attributes.
The test code is almost exactly the same as the previous one except for the conditional on setting-up the callback.
How to run the examples
Download the examples to the same directory and run with:
prove -v cond_tp.t
Test the TP conditionals by setting the MYDEBUG_LEVEL to 5 and above. You should see something like the example results below using bash shell: