Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery

comment on

( #3333=superdoc: print w/replies, xml ) Need Help??

Seems like you're operating at an extremely low level of fault allowance more like building an airplane than a buddy website.

Instead of trying to write a book, guess I'll start off with some somewhat randomly selected thought.

In order to know the "quality" of your code, it's good to know how many bugs are there, which of course is unknown. Finding how many bugs in your code is like finding how many fish of certain species in a particular area of an ocean. Rarely could we find it by exhaustive count but need a probabilistic approach instead--say, a catch-and-release approach (language borrowed from fish-counting).

Example (rewording of something I posted elsewhere a while back). Your code accepts various combinations of input. Some bugs are to be found by entering those input (whereas some others by load-testing, etc.) Normally, all possible such combinations are too vast to be practical, timely and productive to test them all. Instead we randomly select, say, two subsets among all such possible combinations for Tester One and Two to test.

Let T be the total unknown number of possible bugs associated with all combinations.
Let A be the number of bugs found by Tester One.
Let B be the number of bugs found by Tester Two.
Let C be the number of bugs found by both Tester One and Two.

Hence (let P(X) be probability of X)

P(A and B) = P(C)     (by definition)
P(A)P(B) = P(C) (independence assumption) A B C --- * --- = --- T T T A*B ----- = T C

That means, the less bugs both Tester One and Two found at the same time, the more likely there're still a large number of unknown bugs yet to be found. Or, the more common bugs found by both Tester One and Two, the more likely that they have found most of the bugs. The idea can be visualized with a Venn diagram:

   |                                  |
   |    +------------+                |
   |    |            |        T       |
   |    |    A       |                |
   |    |            |                |
   |    |     +------|-------+        |
   |    |     |  C   |       |        |
   |    +-----|------+       |        |
   |          |          B   |        |
   |          |              |        |
   |          +--------------+        |
   |                                  |

Since A and B are only random subsets of all possible combinations, they are not going to detect all possible unknown bugs (associated with data-input).

The key lies on C, the common area. If you look at the Venn diagram and if you imagine squeezing the superset T smaller and smaller, it will be less likely for C to be small--A and B must tend to overlap.

The whole point is to estimate the total number of bugs (again, associated with data-input, not everything else, such as workload) without having to go through an exhaustive testing.

Of course, that simple estimate probably won't be statistically very valid, since bugs are not independent. But it still gives a good conceptual insight--if a bunch of independent testers tend not to find common bugs, there're probably still pretty of bugs out there.

NOTE: It's not testing results that are most important but the process (as learning). The results are only as good as what you learn from them and how you understand the problems better and differently.

It is the same meta process with programming. A programmer is considered "good" not just because he writes code that works but he writes code that actually solves the problem pertaining to the users, not the problem the programmer finds interesting and feels like to solve. He doesn't stop just because "it works."

If the testing process only helps you answer the what (such as how many bugs) but not the why, the process is flawed.

Flawed in what sense? The what (such as benchmarking) only tells you where you are. The why helps you predict and plan. If someone doesn't learn something new (both what and why) out of a testing, the testing is pointless, no matter what fancy technique used and statistics derived. (Repeated testing without being able to pinpoint and solve anything is symptom. For one to blame testing that fails him is like he blaming his car driving him to the wrong destination.)

In reply to Re: Software Design Resources by chunlou
in thread Software Design Resources by Anonymous Monk

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or or How to display code and escape characters are good places to start.
Log In?

What's my password?
Create A New User
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (5)
As of 2021-06-16 20:15 GMT
Find Nodes?
    Voting Booth?
    What does the "s" stand for in "perls"? (Whence perls)

    Results (80 votes). Check out past polls.