Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??
Within a month it was shown that all of the test case produced by programmers/testers targeting their efforts according to their (or their superiors) best judgment, had covered less than 15% of the total APIs with 10% having been duplicated over and over, 5% of the possible parameter combinations, and found less than 1% of the possible bugs.

According to the limited description, the repeated 10% "duplication" was probably due to "best judgment" bias. If random sampling is used instead, it's almost impossible to have such a duplication across testers and overtime (though you still will have duplications "locally").

"Best judgement" bias is like trying to estimate the total number of Perl programmers in the world by asking people what language they use in Perl websites alone.

The "1% of the possible bugs" estimate was possibly derived from something like the catch-and-release method I mentioned in my another reply in this thread.

The the unprobabilistic approach of "best judgement" and the probabilistic estimate of "1% of the possible bugs" seem strange to occur together, however.

*     *     *     *     *     *

In case "probability" may sound like a voodoo, consider this: In a population of 100, if all are ages 25, what sample size do you need to come up with a 100% confidence estimate of the population average age? One, of course.

If 90 of them are of age 30 and 10 age 20 (average 29), a random sample of size one give you an "average" 30 90% of the time. Pretty good "estimate" actually, considering you don't have to ask all 100 of them.

The worst case (in term of sample size needed) is a 50/50 split.

So, a population of one million all aged 25 only need a sample size one to get a good (perfect in this case) estimate, whereas a population of 30, 10 aged 30, 10 aged 20, 10 aged 5 needs a larger sample size.

The moral: the quality of a statistical estimate is affected by the heterogeneity of the population, not its size. It's very counterintuitive to many people, I know.

In reply to Re: Re: Software Design Resources by chunlou
in thread Software Design Resources by Anonymous Monk

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (3)
As of 2021-06-16 13:16 GMT
Find Nodes?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)

    Results (74 votes). Check out past polls.