Session 26: Testing and debugging

Testing
    White-box testing
    Black-box testing
Large-scale testing (Section 10.10)
    Unit testing
    Integration testing
    Regression testing
Debugging
    Hand traces
    Print statements
    Debuggers

Testing

Solid testing requires the following four steps.

  1. Generate a test case.
  2. Determine what the results should be.
  3. Run the program to observe its results.
  4. Observe how the program's results differ from expectations.
This is really very similar to the classic scientific method (hypothesis, procedure, results, conclusion).
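
As a rough illustration, the steps listed above might look like this in code. This is only a sketch: the class name FourStepTest and the array-based max method are made up for the example and stand in for whatever program is being tested.

```java
class FourStepTest {
    // Hypothetical method under test: returns the largest value in a
    // non-empty array.
    static int max(int[] values) {
        int max = values[0];
        for (int i = 1; i < values.length; i++) {
            if (values[i] > max) {
                max = values[i];
            }
        }
        return max;
    }

    // Runs one test case and reports whether the result matched expectations.
    static boolean check(int[] input, int expected) {
        int actual = max(input);              // step 3: run the program
        boolean ok = (actual == expected);    // step 4: compare with expectations
        if (!ok) {
            System.err.println("Expected " + expected + " but got " + actual);
        }
        return ok;
    }

    public static void main(String[] args) {
        int[] input = { 3, 9, 2, 7, 5 };      // step 1: generate a test case
        int expected = 9;                     // step 2: work out the answer by hand
        System.out.println(check(input, expected) ? "pass" : "FAIL");
    }
}
```

The important habit is deciding on the expected value before running the program, so that the observed result cannot influence what you believe the right answer is.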

Testing falls into two categories: white-box testing and black-box testing.

White-box testing

White-box testing involves generating test cases while looking at the code. Generally, you're looking for a large enough set of test cases to hit all the cases in the code.

For example: Consider the following piece of code to find the maximum of 5 numbers typed by the user.

int max = IO.readInt();

for(int i = 0; i < 4; i++) {
    int q = IO.readInt();
    if(q > max) {
        max = q;
    }
}
If we were trying to hit all execution paths, we'd have to cover all 16 combinations of hitting/missing the if condition. Such examples could include the following.
1 1 1 1 1     1 1 1 1 5
1 1 1 4 1     1 1 1 4 5
1 1 3 1 1     1 1 3 1 5
1 1 3 4 1     1 1 3 4 5
1 2 1 1 1     1 2 1 1 5
1 2 1 4 1     1 2 1 4 5
1 2 3 1 1     1 2 3 1 5
1 2 3 4 1     1 2 3 4 5
If we modified the program to find the maximum of 20 numbers, the loop would run 19 times, giving 2^19 = 524,288 different cases. This is just not reasonable.

If you settle instead for the weaker goal of executing every statement at least once, then there is just one case to test, and that case must make the if condition true at some point.

1 2 1 1 1
Of course, this isn't as thorough, but it at least checks the fundamentals: That the statement wasn't a complete catastrophe.
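
The 16 path-covering inputs listed earlier follow a pattern that a program can generate. The sketch below assumes the loop has been moved into a method that reads from an array instead of IO.readInt(), so that it can be run automatically; the names PathCoverage, inputFor, and hits are made up for this example.

```java
import java.util.Arrays;

class PathCoverage {
    // Number of times the if condition was true in the latest call to max.
    static int hits;

    // Same logic as the loop above, but reading from an array.
    static int max(int[] input) {
        hits = 0;
        int max = input[0];
        for (int i = 1; i < input.length; i++) {
            int q = input[i];
            if (q > max) {
                max = q;
                hits++;
            }
        }
        return max;
    }

    // Builds the input for one path. Bit (i - 1) of mask says whether the
    // condition should be true at index i: if so, the value is i + 1 (larger
    // than anything before it); otherwise it is 1.
    static int[] inputFor(int mask, int n) {
        int[] input = new int[n];
        input[0] = 1;
        for (int i = 1; i < n; i++) {
            boolean hit = (mask & (1 << (i - 1))) != 0;
            input[i] = hit ? i + 1 : 1;
        }
        return input;
    }

    public static void main(String[] args) {
        int n = 5;
        int paths = 1 << (n - 1);   // one path per hit/miss combination
        for (int mask = 0; mask < paths; mask++) {
            int[] input = inputFor(mask, n);
            int result = max(input);
            int expected = Arrays.stream(input).max().getAsInt();
            boolean ok = result == expected && hits == Integer.bitCount(mask);
            System.out.println(Arrays.toString(input) + " -> " + result
                + (ok ? "" : "  WRONG"));
        }
        System.out.println(paths + " paths tested");
    }
}
```

Checking hits against the number of bits set in mask confirms that each input really follows the path it was built for, which is the whole point of path coverage.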

Black-box testing

In black-box testing, the tester generates test cases without reference to the source code - that is, the tester is treating the program as a black box, into which the tester cannot look.

Beta testing obviously always involves black-box testing. But even the original software developer does this. In fact, it's probably the primary kind of testing you've been doing on your laboratories: Once you have the program coded, you run it by acting like a regular user.

Good black-box testing will include tests falling into three categories.

After finding a bug in black-box testing, it's often a good idea to prune the test case down to determine exactly what's going on. Do this before you even begin to debug, because the simpler test case will generally illustrate the actual problem better.
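
Pruning is usually done by hand, but the idea can be sketched in code. The example below is only one simple approach (drop one input value at a time and keep the shorter input whenever the bug still shows up). The buggyMax method is a made-up faulty program that ignores its last value, and every name here is an assumption for the sake of illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Predicate;

class TestCasePruner {
    // Repeatedly tries removing a single element; whenever the shorter input
    // still fails, it becomes the new test case and the search starts over.
    static List<Integer> prune(List<Integer> failing,
                               Predicate<List<Integer>> stillFails) {
        List<Integer> current = new ArrayList<>(failing);
        boolean shrunk = true;
        while (shrunk) {
            shrunk = false;
            for (int i = 0; i < current.size(); i++) {
                List<Integer> candidate = new ArrayList<>(current);
                candidate.remove(i);          // removes the element at index i
                if (stillFails.test(candidate)) {
                    current = candidate;
                    shrunk = true;
                    break;
                }
            }
        }
        return current;
    }

    // Hypothetical buggy program: the loop stops one element too early.
    static int buggyMax(List<Integer> values) {
        int max = values.get(0);
        for (int i = 1; i < values.size() - 1; i++) {
            if (values.get(i) > max) {
                max = values.get(i);
            }
        }
        return max;
    }

    // True if the input exposes the bug (an empty input is not a valid test).
    static boolean fails(List<Integer> values) {
        return !values.isEmpty() && buggyMax(values) != Collections.max(values);
    }

    public static void main(String[] args) {
        List<Integer> original = Arrays.asList(3, 1, 4, 1, 5);
        System.out.println(prune(original, TestCasePruner::fails));
    }
}
```

Here the five-value test case shrinks to the two values 1 and 5, which points much more directly at the last value being skipped.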

Large-scale testing

Textbook: Section 10.10

With large-scale programs (of more than 100,000 lines, built by teams of programmers), it's not appropriate to wait until the program is entirely complete to begin testing.

Unit testing

In unit testing, each piece of the program is thoroughly tested before it is accepted. In Java, the most convenient way to break up a program into pieces will be into its separate classes. For example, for the video store program, you would write individual tests of the various classes (Customer, Store, Video, and Main) before putting them together.

This necessitates writing new classes whose sole purpose is to test others. For example, you might write the following program to test the checkOut method of the Customer class.

public class CustomerTest {
    public static void main(String[] args) {
        Video[] vids = { new Video("A"), new Video("B"), new Video("C"),
            new Video("D"), new Video("E"), new Video("F") };

        // The first five checkouts should all succeed.
        Customer test = new Customer("Me");
        for(int i = 0; i < 5; i++) {
            try {
                test.checkOut(vids[i]);
            } catch(Exception e) {
                System.err.println("Unexpected exception on " + i + ": " + e);
            }
        }

        // A sixth checkout should be rejected because the limit is reached.
        try {
            test.checkOut(vids[5]);
            System.err.println("Exception not thrown when limit reached");
        } catch(Exception e) {
            // expected, so there is nothing to report
        }

        // A video that is already checked out should be rejected, too.
        Customer other = new Customer("You");
        try {
            other.checkOut(vids[0]);
            System.err.println("Exception not thrown when video already checked out");
        } catch(Exception e) {
            // expected, so there is nothing to report
        }
    }
}
It's not uncommon for the unit-testing code to be longer than the code it is meant to test!

Unit testing is problematic when there are dependencies between pieces. For example, there may be different people in charge of the Customer and Video classes. This causes a problem for the person writing the Customer class, as it cannot even be compiled until the Video class is complete.

To get around this problem, the Customer author would write a short stub class, which simply defines non-functional versions of the methods that Video is to provide. Then the Customer class can at least be compiled. But this isn't adequate for testing purposes.
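
A stub for Video might look something like the following sketch. Only the constructor taking a title comes from the test program above; the methods getTitle, isCheckedOut, and setCheckedOut are hypothetical, since these notes don't list what the real Video class provides.

```java
// Stub: just enough of Video for other classes to compile against.
class Video {
    private final String title;

    Video(String title) {
        this.title = title;
    }

    // Hypothetical accessor for the title.
    String getTitle() {
        return title;
    }

    // Hypothetical method; the stub always gives the same answer.
    boolean isCheckedOut() {
        return false;
    }

    // Hypothetical method; the stub ignores the request entirely.
    void setCheckedOut(boolean value) {
    }
}

class StubDemo {
    public static void main(String[] args) {
        Video v = new Video("A");
        v.setCheckedOut(true);
        // Still prints false, because the stub remembers nothing.
        System.out.println(v.getTitle() + " checked out? " + v.isCheckedOut());
    }
}
```

The demo shows why a stub is not enough for testing: code that depends on the video remembering its status would misbehave even if that code were perfectly correct.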

Integration testing

This dependency problem is resolved by integration testing. Integration testing requires that you draw a picture of which classes use which other classes, called a dependency graph. For example, for the first Drawer lab, you might draw the following picture.

From this, you could work out an order in which individual classes can be tested. Here, we would have to start out with Rectangle, then move to Drawing, then Canvas, and finally Drawer.
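
Working out such an order can be automated. The sketch below assumes the simplest graph consistent with the order just given (Drawer uses Canvas, Canvas uses Drawing, and Drawing uses Rectangle); the real graph for the lab may have more edges. The class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TestOrder {
    // Returns the classes in an order where each class comes after every
    // class it uses. Assumes the graph contains no cycles.
    static List<String> order(Map<String, List<String>> uses) {
        List<String> result = new ArrayList<>();
        Set<String> done = new HashSet<>();
        for (String name : uses.keySet()) {
            visit(name, uses, done, result);
        }
        return result;
    }

    // Depth-first visit: first handle everything this class uses, then
    // add the class itself.
    static void visit(String name, Map<String, List<String>> uses,
                      Set<String> done, List<String> result) {
        if (!done.add(name)) {
            return;   // already handled
        }
        for (String dep : uses.getOrDefault(name, Collections.<String>emptyList())) {
            visit(dep, uses, done, result);
        }
        result.add(name);
    }

    // Assumed dependency graph for the first Drawer lab.
    static Map<String, List<String>> drawerLabGraph() {
        Map<String, List<String>> uses = new LinkedHashMap<>();
        uses.put("Drawer", Arrays.asList("Canvas"));
        uses.put("Canvas", Arrays.asList("Drawing"));
        uses.put("Drawing", Arrays.asList("Rectangle"));
        uses.put("Rectangle", Collections.<String>emptyList());
        return uses;
    }

    public static void main(String[] args) {
        System.out.println(order(drawerLabGraph()));
    }
}
```

Under the assumed graph this prints the classes in the order Rectangle, Drawing, Canvas, Drawer, matching the order worked out by hand.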

The dependency graph quickly gets much more complex as you add more classes to a program. Here's a dependency graph for the second part of the drawing lab.

And, in fact, that laboratory assignment is constructed with a view toward integration testing: You will successively build in new pieces that are relatively independent of each other.

Regression testing

When a software system is relatively complete, and the designers are engaged in incrementally adding new features, they often use regression testing. In regression testing, the developers build up a large library of tests associated with the program. Preferably, these tests will be automated.

When a developer thinks a feature is complete, the developer submits the modifications. But before they are accepted as valid, all the regression tests in the library are run to test whether the modifications break any existing programs. You don't want to accept a modification if it ends up introducing bugs into the system.

In very large systems, regression testing is often a nightly job, executed every night when the developers aren't using the computers.
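
An automated test library can be quite simple at its core. The following is a minimal sketch, not how any particular testing tool works: the RegressionSuite class and its methods are made up for this example, and the two sample tests merely exercise Math.max.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

class RegressionSuite {
    // The library of tests, kept in the order they were added.
    private final Map<String, BooleanSupplier> tests = new LinkedHashMap<>();

    void add(String name, BooleanSupplier test) {
        tests.put(name, test);
    }

    // Runs every test and returns the number of failures. A test that
    // throws a runtime exception counts as a failure.
    int runAll() {
        int failures = 0;
        for (Map.Entry<String, BooleanSupplier> entry : tests.entrySet()) {
            boolean passed;
            try {
                passed = entry.getValue().getAsBoolean();
            } catch (RuntimeException e) {
                passed = false;
            }
            if (!passed) {
                failures++;
                System.err.println("FAILED: " + entry.getKey());
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        RegressionSuite suite = new RegressionSuite();
        suite.add("max of positives", () -> Math.max(3, 9) == 9);
        suite.add("max of negatives", () -> Math.max(-3, -9) == -3);
        int failures = suite.runAll();
        System.out.println(failures == 0 ? "accept modification"
                                         : "reject modification");
    }
}
```

Each time a bug is fixed, a test that reproduces it would be added to the library, so the same bug cannot quietly return with a later modification.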

Debugging

Hand traces

Tracing through the code by hand, to see how variables change, is extremely common - much more common than you might initially think. It's often much easier to trace through the code than to repeatedly recompile and run a test case.

Print statements

Adding print statements is another useful technique. You may think that it's antiquated, but it's in wide use and will continue to be. It's just so simple.
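
For instance, here is the maximum-finding loop from earlier with print statements added. This is a sketch under a couple of assumptions: the loop reads from an array rather than IO.readInt(), and the DEBUG flag and the class name PrintDebug are made up for the example.

```java
class PrintDebug {
    // Set to false to silence the debugging output without deleting it.
    static final boolean DEBUG = true;

    static int max(int[] input) {
        int max = input[0];
        for (int i = 1; i < input.length; i++) {
            int q = input[i];
            if (DEBUG) {
                // Show the loop variables at the top of each iteration.
                System.err.println("i=" + i + " q=" + q + " max so far=" + max);
            }
            if (q > max) {
                max = q;
            }
        }
        if (DEBUG) {
            System.err.println("final max=" + max);
        }
        return max;
    }

    public static void main(String[] args) {
        System.out.println(max(new int[] { 1, 2, 1, 1, 1 }));
    }
}
```

Printing to System.err keeps the debugging output separate from the program's real output, and labeling each value makes the output readable when many lines scroll past.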

Some useful tips for deciding where to put your print statements:

Debuggers

There is another tool called a debugger. I don't want to overemphasize its usefulness - I use a debugger far less frequently than I use hand traces and print statements. But a debugger is still often useful.

Good debuggers have at least the following two features.

Forte has a debugger built into it. I'll demonstrate it in class.