note chunlou <p>I think I'm the one who's not being clear, not you.</p>
<p> I suppose we're both talking about "best judgement" introducing bias. </p>
<blockquote> <i>I fully understand the mechanism whereby it is possible to estimate how many bugs <b>will be found</b> on the basis of how many have been found, and projecting that forward, once the test cases are being produced randomly.</i> </blockquote>
<p> Perhaps the fuzziness of human language gets in the way here. Any estimate is of how many <i>could</i> be found, never of how many <i>will</i> be found. To see that, I'll use the catch-and-release example. </p>
<p> Suppose the total number (the actual T) of unknown bugs is actually 100. Tester One was assigned 20 (A) test cases; Tester Two also 20 (B). 2 (C) bugs in common were found. The estimate is A &times; B / C = 20 &times; 20 / 2 = 200 total (possible) bugs (notice the large margin of error). Does it mean you <i>will</i> find 200 bugs given infinite time? Of course not, since we already know that there are 100 actual bugs. The estimate is 200, nevertheless. 200 is the <i>possible</i> total number of bugs you <i>could</i> find, based on the counts actually available at the moment. </p>
<p> The technique and the skill set will affect the accuracy of an estimate, but the principle is still the same. </p>
<p align="center"> *&nbsp;&nbsp;&nbsp;&nbsp; *&nbsp;&nbsp;&nbsp;&nbsp; *&nbsp;&nbsp;&nbsp;&nbsp; *&nbsp;&nbsp;&nbsp;&nbsp; *&nbsp;&nbsp;&nbsp;&nbsp; * </p>
<p> One side note, not to critique their method but to provide complementary information: one should be careful when using a polynomial to fit data. A polynomial of high enough degree can approximate any continuous function on a closed interval as closely as you like (it's a theorem). Similarly, it can fit any data, including white noise. </p>
<p> Suppose you're testing the response time of your server under various levels of workload. You try a linear fit (a straight line) and a polynomial of degree two (a+bx+cx^2). The polynomial fits the data better and you get the following. 
</p>
<pre>
[Plot: response time against workload; original layout not recoverable]

.: data points
X: fitted to actual data
*: prediction, extrapolation
</pre>
<p> But the extrapolation defies common sense: it predicts that response time improves as the workload increases. This kind of error is very hard to detect in higher dimensions, especially when you don't actually know what to expect. </p>
<p> The moral: A more complicated model does not always improve your <i>prediction</i>; it could even worsen it in some cases. </p>
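<p> A minimal sketch of the server example above, in Python with NumPy. The workload values and the logarithmic response curve are made-up, noise-free stand-ins (not real measurements), and the names <code>workload</code>, <code>response_ms</code> and <code>fit_and_score</code> are hypothetical. It only illustrates how a degree-two fit can beat a straight line on the measured range and still extrapolate badly. </p>

```python
import numpy as np

# Assumed data: response time (ms) rises with workload but flattens out.
# These numbers are synthetic and noise-free, chosen only for illustration.
workload = np.arange(1, 11, dtype=float)      # 1..10 concurrent requests
response_ms = 100.0 * np.log1p(workload)      # concave, always increasing


def fit_and_score(degree):
    """Least-squares polynomial fit of the given degree.

    Returns the coefficients (highest power first, as numpy.polyfit does)
    and the residual sum of squares on the measured points.
    """
    coeffs = np.polyfit(workload, response_ms, degree)
    residuals = response_ms - np.polyval(coeffs, workload)
    return coeffs, float(np.sum(residuals ** 2))


linear_coeffs, linear_rss = fit_and_score(1)
quad_coeffs, quad_rss = fit_and_score(2)

# Extrapolate both fits to workloads well beyond the measured range.
future_workload = np.array([15.0, 20.0, 30.0])
linear_forecast = np.polyval(linear_coeffs, future_workload)
quad_forecast = np.polyval(quad_coeffs, future_workload)

if __name__ == "__main__":
    print("residual sum of squares, linear:   ", linear_rss)
    print("residual sum of squares, quadratic:", quad_rss)
    for w, lin, quad in zip(future_workload, linear_forecast, quad_forecast):
        print(f"workload {w:4.0f}: linear {lin:8.1f} ms, quadratic {quad:8.1f} ms")
```

<p> On this data the quadratic has the smaller residual on the measured points, yet its x^2 coefficient is negative, so its forecasts fall as the workload grows, while the underlying curve keeps rising. A lower residual on the data you have says little about predictions outside it. </p>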