Next: None. Up: Large-scale software development. Previous: Large-scale software.


Waterfall model

Early in the history of computing (around 1970), people started noticing how big programs were getting and, more crucially, how buggy they were. They founded the field of software engineering within computer science to study the process of developing large, reliable software.

One of the first problems was getting some handle on how people should develop large-scale software, and one of the first efforts was called the waterfall model, usually drawn as a picture that looks like the following. (The details of the waterfall model vary, but two things remain constant: there are boxes stepping from the upper left down to the lower right, and there are arrows connecting each box to its successor (preferably blue, to connote the idea of water falling). But people aren't unanimous as to how many boxes there are, or what goes into the boxes. This is my own variation.)


[Figure: The waterfall model of software development]

The basic concept of the waterfall model is that software development is a sequence of stages, and software developers should clearly delineate the stages: First we design; and after we finish designing, we begin coding; and after we finish coding, we begin testing; and after we finish testing, we deploy.

The arrows on the bottom are for handling errors in executing the process. As we're coding, we may discover that there's something wrong with our design, and so we would halt coding and go back to the previous stage to modify our design to correct for the error. Or, during testing, we may discover a minor coding bug, which sends us back to coding, but we may also discover a larger design bug, which sends us all the way back to design.
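One way to picture the arrows is as a small state machine. The sketch below is only an illustration: the stage names follow the headings in these notes, and the function `next_stage` and its parameter `flaw_found_in` are hypothetical names invented for the example.

```python
# Stage names taken from the section headings of these notes.
STAGES = ["design", "coding", "testing", "deployment"]

def next_stage(current, flaw_found_in=None):
    """Return the stage to work on next (hypothetical helper).

    With no flaw reported, move forward one stage (the downward arrows).
    With a flaw traced to this stage or an earlier one, return to the
    stage where the flaw lives (the return arrows along the bottom).
    """
    i = STAGES.index(current)
    if flaw_found_in is not None:
        j = STAGES.index(flaw_found_in)
        if j > i:
            raise ValueError("a flaw cannot be traced to a later stage")
        return STAGES[j]
    # Deployment is the last stage, so stay there once it is reached.
    return STAGES[min(i + 1, len(STAGES) - 1)]
```

For instance, `next_stage("coding")` moves on to testing, while `next_stage("testing", flaw_found_in="design")` sends the project back to design.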

Design

Design itself breaks down into smaller stages.

In this class, all three of these things have been given to you. A more realistic scenario would give you only the specification to work with. But so far you've always been given the design as part of the problem, so that you can get a feel for how a well-designed system would work.

Coding

This is the stage that we've emphasized most in this class. It's also the least important piece. I'm not going to discuss it more now.

Verification and testing

Software engineers distinguish strongly between verification and testing. Verification refers to formal mathematical proofs of program correctness, while testing refers to experimental trials that look for potential errors. Verification is quite rare; it's more a subject for researchers and for programmers of nuclear weapons control systems. For most software, the only step toward determining correctness is the experimental trials.
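To make "experimental trials" concrete, here is a minimal sketch of testing one small function on random inputs. The function `largest` and the helper `run_trials` are made up for this illustration, and Python's built-in `max` is assumed to be a trustworthy reference.

```python
import random

def largest(values):
    """Function under test: return the largest element of a non-empty list."""
    best = values[0]
    for v in values[1:]:
        if v > best:
            best = v
    return best

def run_trials(trials=1000, seed=0):
    """Run random experimental trials and return the number of failures."""
    rng = random.Random(seed)  # fixed seed so the trials are repeatable
    failures = 0
    for _ in range(trials):
        size = rng.randint(1, 20)
        data = [rng.randint(-100, 100) for _ in range(size)]
        # Compare against the trusted reference implementation.
        if largest(data) != max(data):
            failures += 1
    return failures
```

A result of zero failures only says that the function worked on the inputs that were tried. Verification, by contrast, would prove that `largest` is correct for every possible non-empty list.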

In this class, you've probably often tested as follows: You've written everything in the design document, and then you've run your entire program on a few random test cases to see whether it works. In a larger system, this process breaks down, since there are so many pieces that can go wrong. It just doesn't work to code everything without testing and only then put it all together.

Even in our labs, I've been pushing the concept of iterative development, where you slowly grow the program, each time checking it to make sure it is still working. But this process is really only well-suited to single-person jobs.

Properly done, large-scale testing has three stages.

For its large systems, Microsoft uses a testing system that works as follows: Programmers check out and work on a set of modules. When they complete a module, they check it in. The system automatically runs the module through a sequence of tests to verify that it is internally correct. Every night, the system compiles all the checked-in modules and automatically runs a sequence of tests on the complete system to ensure it still passes them all. (There are so many automatic tests that they can only be run overnight.) If the system fails with a new module incorporated, it rejects the module and produces a report of the error for the person who made the latest update to investigate.
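The check-in and nightly-build process described above can be sketched roughly as follows. This is not Microsoft's actual system: the dictionary layout for a module, the function names, and the choice to try new modules one at a time are all assumptions made for the illustration.

```python
def module_passes(module):
    """Check-in gate: run the module's own tests (hypothetical callables)."""
    return all(test() for test in module["tests"])

def nightly_build(accepted, checked_in, system_tests):
    """Try each module checked in today against the complete system.

    accepted:     dict mapping name -> module already in the build
    checked_in:   list of modules, in check-in order
    system_tests: callables that take the dict of modules and return a bool
    Returns the new build and a list of (author, module name, reason) reports.
    """
    reports = []
    build = dict(accepted)
    for module in checked_in:
        if not module_passes(module):
            reports.append((module["author"], module["name"], "module tests failed"))
            continue
        # Incorporate the new module into a trial copy of the build.
        candidate = dict(build)
        candidate[module["name"]] = module
        if all(test(candidate) for test in system_tests):
            build = candidate  # the complete system still passes: keep it
        else:
            # Reject the module and report to whoever checked it in.
            reports.append((module["author"], module["name"], "system tests failed"))
    return build, reports

# Usage with made-up modules and one made-up system test.
good = {"name": "parser", "author": "ann", "tests": [lambda: True], "version": 2}
bad = {"name": "lexer", "author": "bob", "tests": [lambda: True], "version": -1}
system_tests = [lambda b: all(m["version"] > 0 for m in b.values())]
build, reports = nightly_build({}, [good, bad], system_tests)
```

Here both modules pass their own tests, but the second one breaks the complete system, so it is left out of the build and a report goes to its author.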

Microsoft employs a large division of people dedicated to beta testing. Their job is simply to run software, looking for problems. When they find a problem, they have to nail it down as much as possible, and then they produce a report for the software developers to tackle.

Deployment

Conclusion

How big are all these pieces? Fred Brooks, author of The Mythical Man-Month (an extremely readable, short book about software engineering, which I highly recommend), estimated the following for a large-scale project he supervised (OS/360, an IBM operating system that at the time was among the largest systems ever developed).

1/3  design
1/6  coding
1/4  module tests
1/4  system tests

So you see, the coding stage (which this course has emphasized almost exclusively) is only a small piece of the overall process.
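As a quick worked example, the fractions can be applied to a project of a given length. The 12-month project below is hypothetical, chosen only because it divides evenly; the names `SHARES` and `schedule` are invented for the sketch.

```python
from fractions import Fraction

# Brooks's estimated shares, as quoted above.
SHARES = {
    "design": Fraction(1, 3),
    "coding": Fraction(1, 6),
    "module tests": Fraction(1, 4),
    "system tests": Fraction(1, 4),
}

def schedule(total_months):
    """Split a project of total_months among the stages (hypothetical helper)."""
    return {stage: share * total_months for stage, share in SHARES.items()}

# The shares account for the whole project.
assert sum(SHARES.values()) == 1

# A hypothetical 12-month project: 4 months of design, 2 of coding,
# 3 of module tests, and 3 of system tests.
plan = schedule(12)
```

Under these estimates the two testing stages together take half the schedule, three times as long as coding.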

