good chemistry is complicated,
and a little bit messy -LW
A while ago I was involved in developing an automated test suite for unit testing and functional verification of the OS/2 GUI. Both the graphics primitives to be tested and the tests to verify them were generated programmatically.
E.g. a pseudo-random sequence of primitives would be generated to draw a random number of lines, boxes, circles etc. The size, position, color, fill characteristics, patterns etc. would all be randomly chosen. Every time the testcase generator generated a random graphic, it also generated one or more tests to verify it into a separate file. If it generated a line from (10,10) to (20,20) in red, it would generate tests to verify:
The pixels at (9,9),(10,9),(11,9),(9,10),(11,10),(9,11),(10,11) were all background colored and (10,10) & (11,11) were red. Same thing at the other end, plus one or two bracketing triples down the length of that line.
It's more complicated than that with thickness, dotted and dashed lines etc., but the principle remains the same.
If transforms were randomly applied to the drawing primitives, they were also applied to the check codes.
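To make the line example concrete, here is a minimal sketch in Python (the original was C & REXX, so this is an illustration, not the real generator). The function name `endpoint_checks`, the color tuples and the dict layout are all made up for the example, and it only handles a 1-pixel-wide 45-degree line.

```python
RED = (255, 0, 0)
BG = (255, 255, 255)   # assumed background color for the example


def endpoint_checks(start, step, fg, bg):
    """Return {(x, y): expected_rgb} for the 3x3 block around one line end.

    `step` is the unit step along the line away from `start`,
    e.g. (1, 1) for a line running down and to the right.
    """
    x, y = start
    # Only the end pixel and its first neighbor along the line are drawn.
    on_line = {start, (x + step[0], y + step[1])}
    checks = {}
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            pos = (x + dx, y + dy)
            checks[pos] = fg if pos in on_line else bg
    return checks


# Checks for the start of the red line from (10,10) to (20,20).
start_checks = endpoint_checks((10, 10), (1, 1), RED, BG)
# The other end uses the opposite step.
end_checks = endpoint_checks((20, 20), (-1, -1), RED, BG)
```

For the start of the line this gives the same seven background pixels and two red pixels listed above.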
To verify these, the test-harness that ran the generated code would do a screen grab (Alt-printscreen so that only the application main frame was grabbed) after the picture was drawn. It would then run the checkcodes against the grabbed bitmap.
All the testcodes were essentially the same: Position(x,y) => expected(rgb). In this way, when later primitives overlaid earlier ones, the earlier tests were also overwritten. (It got complicated when trying to account for primitives that combined (src xor dest; not(src)and(dst) etc.) but the principle remained.)
Generate a random primitive and a few random tests to verify it.
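A rough sketch of how a Position(x,y) => expected(rgb) table lets later primitives overwrite earlier checks, again in Python and again with invented names (`add_filled_box`, `run_checks`). The bitmap here is just a list of rows standing in for the grabbed screen, and only plain overwriting is modelled, not the xor/and combining modes.

```python
RED, BLUE, BG = (255, 0, 0), (0, 0, 255), (255, 255, 255)


def add_filled_box(checks, left, top, right, bottom, color):
    """Record checks for a filled box with inclusive pixel bounds."""
    # Any earlier check the box covers must now expect the box's color.
    for (x, y) in list(checks):
        if left <= x <= right and top <= y <= bottom:
            checks[(x, y)] = color
    # Plus a few checks of the box's own (just its corners here).
    for pos in ((left, top), (right, top), (left, bottom), (right, bottom)):
        checks[pos] = color


def run_checks(bitmap, checks):
    """Return a list of (position, expected, actual) for every failed check."""
    failures = []
    for (x, y), expected in sorted(checks.items()):
        actual = bitmap[y][x]
        if actual != expected:
            failures.append(((x, y), expected, actual))
    return failures


# Stand-in for the screen grab: a 32x32 background-colored bitmap.
bitmap = [[BG] * 32 for _ in range(32)]

# First primitive: a single red pixel, with one check.
checks = {(10, 10): RED}
bitmap[10][10] = RED

# Second primitive: a blue box drawn over the top of it.
for y in range(5, 16):
    for x in range(5, 16):
        bitmap[y][x] = BLUE
add_filled_box(checks, 5, 5, 15, 15, BLUE)

failures = run_checks(bitmap, checks)   # empty if the picture is right
```

Because the table is keyed by position, there is never more than one expectation per pixel, so the order the checks are run in doesn't matter.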
By using the Alt-printscreen function, we removed dependence upon the position and size of the application window, which was also randomly chosen at each run. Each run also applied a randomly chosen set of transforms (scale, rotate, shear etc.) to the generated picture. The same transforms were applied to the checkcodes before running.
It worked very well. Each new testcase generated usually took a developer's attention to correct anomalies, but once generated and tested, it became part of the regression test suite as well.
Prior to the RTG (the random testcase generator), we had half a dozen guys writing testcases for several weeks, but when they were analysed, they all covered 15% or so of the same ground (in terms of the primitives exercised) over and over, another 10% or so partially, and the rest not at all. After a few weeks of generating testcases, the coverage was much, much more evenly spread across the range of primitives and many more combinations of parameters had been exercised.
Besides, writing the generator was an awful lot more fun than writing the boring tests themselves :)
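Applying the same transform to the checkcodes could look something like the sketch below. It assumes the transform is a 2x3 affine matrix and that rounding to the nearest pixel is good enough; both are assumptions for illustration, as are the names `apply_affine` and `transform_checks`. A real suite would have to match the rasterizer's own rounding, and scaling down can map two checks onto the same pixel.

```python
RED, BG = (255, 0, 0), (255, 255, 255)


def apply_affine(m, pos):
    """Map a pixel position through a 2x3 affine matrix ((a, b, tx), (c, d, ty))."""
    (a, b, tx), (c, d, ty) = m
    x, y = pos
    return (round(a * x + b * y + tx), round(c * x + d * y + ty))


def transform_checks(checks, m):
    """Return a new check table with every position moved by the transform."""
    return {apply_affine(m, pos): color for pos, color in checks.items()}


# Scale by 2, then shift by (5, 3).
scale_and_shift = ((2, 0, 5), (0, 2, 3))

checks = {(10, 10): RED, (9, 9): BG}
moved = transform_checks(checks, scale_and_shift)
# (10, 10) ends up at (25, 23) and (9, 9) at (23, 21).
```

The colors are untouched; only the positions move, which is why the same check format keeps working whatever transforms a run picks.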
I just wish I had been using Perl to write it instead of C & REXX.