Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

I think I'm the one who's not being clear, not you.

I suppose we're both talking about the "best judgement" introducing bias.

I fully understand the mechanism whereby it is possible to estimate how many bugs will be found on the basis of how many have been found, and projecting that forward, once the test cases are being produced randomly.

Perhaps the fuzziness of human language get in the way here. Any estimate is to estmate how many could be found, never ever how many will be found. To see that, I'll use the catch-and-release example.

Suppose the total number (the actual T) of unknown bugs is actually 100. Tester One was assigned with 20 (A) test cases; Tester Two 20 (B) also. 2 (C) bugs in common were found. The estimate is 200 total (possible) bugs (notice the large margin of error). Does it mean you will find 200 bugs given infinite time? Of course not, since we already know that there're 100 actual bugs. The estimate is 200, nevertheless. 200 is the possible total bugs you could find, based on actual available counts at the moment.

The technique and the skillset will affect the accurary of an estimate but the principle is still the same.

*     *     *     *     *     *

One side note, not to critique their method, just to provide complementing information, one should be careful when using a polynimial to fit data. Polynimial can fit any mathematical functions, given enough degrees (it's a theorem). Similarly it can fit any data, include white noise.

Consider you're testing the response time of your server in response to various levels of workload. You try a linear fit (a straight line) and polynomial of degree two (a+bx+cx^2). The polynomial fits the data better and you have the following.

            X X
   .  .  X  .    * 
. .   X    .       * 
    X  .             *
 .X  .
.X .  .

.: data points
X: fitted to actual data
*: prediction, extrapolation

But it doesn't fit into the common sense (response time improves as the workload increases). This kind of error is very hard to detect in higher dimension, especially when you don't actually know what to expect.

The moral: A more complicated model does not always improve your prediction; it could even worsen it in some cases.

In reply to Re: Re: Re: Re: Software Design Resources by chunlou
in thread Software Design Resources by Anonymous Monk

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
Domain Nodelet?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (2)
As of 2021-08-02 05:26 GMT
Find Nodes?
    Voting Booth?
    My primary motivation for participating at PerlMonks is: (Choices in context)

    Results (20 votes). Check out past polls.