Session 36: Large-scale software development

Large-scale software
Waterfall model
Design
Coding
Verification and testing
Deployment
Conclusion

Large-scale software

In this course, we have emphasized almost exclusively the details of a single programming language (Java), trying to understand the language enough to be able to write our own programs to do our simple tasks.

In the real world, the process is complicated by the fact that the programs people actually use aren't tiny things at all. In this class, we've written some largish programs, some even with a few hundred lines. But compare this to real systems that people use.

300 lines	a typical lab assignment solution
7,500 lines	Logisim, an educational circuit-building program
2,000,000 lines	Solaris, the operating system running on our Sun computers
29,000,000 lines	Windows 2000 operating system

Logisim represents a single-person effort, spanning a few weeks of development time. Even it is in a different class from our lab programs. But it bears no resemblance to a heavy-duty operating system like Solaris or Windows 2000, each of which involves huge teams of programmers working full-time for years.

What we've been doing is what I call "programming in the small." It's a necessary first step, but it's nothing like what happens in real software development projects.

What I want to do today is to introduce a general overview of the large-scale software development process, using a specific model called the waterfall model.

Waterfall model

Early in the history of computing (around 1970), people started noticing how big programs were getting - and, more crucially, how buggy they were. They invented the field of software engineering within computer science to study the process of developing reliable, large software.

One of the first problems was trying to get some handle on how people should develop large-scale software, and one of their first efforts was called the waterfall model, a picture that looks like the following. (The details of the waterfall model vary, but two things remain constant: There are boxes going from left downward to the right; and there are arrows connecting each to its successor (preferably blue, to connote the idea of water falling). But people aren't unanimous as to how many boxes there are, or what goes into the boxes. This is my own variation.)

[Figure: The waterfall model of software development]

The basic concept of the waterfall model is that software development is a sequence of stages, and software developers should clearly delineate the stages: First we design; and after we finish designing, we begin coding; and after we finish coding, we begin testing; and after we finish testing, we deploy.

The arrows on the bottom are for handling errors in executing the process. As we're coding, we may discover that there's something wrong with our design, and so we would halt coding and go back to the previous stage to modify our design to correct for the error. Or, during testing, we may discover a minor coding bug - but we may also discover a larger design bug.
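The forward and backward arrows can be pictured as a small state machine. The following is only a sketch of my own variation of the model: the names WaterfallSketch, Stage, next, and fallBack are hypothetical, invented for this illustration.

```java
class WaterfallSketch {
    // The four stages of the model, in the order the water falls.
    enum Stage {
        DESIGN, CODING, TESTING, DEPLOYMENT;

        // Forward arrow: finishing a stage moves to its successor.
        // Deployment is the last stage, so it has no successor.
        Stage next() {
            return this == DEPLOYMENT ? this : values()[ordinal() + 1];
        }
    }

    // Backward arrow: an error traced to an earlier stage sends the
    // project back to that stage so the error can be corrected there.
    static Stage fallBack(Stage current, Stage sourceOfError) {
        if (sourceOfError.ordinal() > current.ordinal()) {
            throw new IllegalArgumentException("cannot fall back to a later stage");
        }
        return sourceOfError;
    }

    public static void main(String[] args) {
        Stage s = Stage.DESIGN;
        s = s.next();                    // design finished, begin coding
        s = s.next();                    // coding finished, begin testing
        s = fallBack(s, Stage.DESIGN);   // testing uncovers a design bug
        System.out.println(s);           // prints DESIGN
    }
}
```

Note that falling back to design means passing through coding and testing again before deployment, which is why a design bug discovered late is so expensive.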

Design

Design itself breaks down into smaller stages.

Specification: You meet with the person paying for the work to determine what exactly the software's purpose is, and what particular things it ought to do.
User interface design: You decide on how exactly the software should appear to the user, and how the user will be able to accomplish things. Frequently, this is a matter of simply designing menus and dialog boxes. But there are radically different user interfaces too (a voice interface via telephone, perhaps).
Module design: You decide on how to break up the overall task into smaller, more manageable pieces. In an object-oriented system, this is basically a matter of deciding what the classes will be.

In this class, all three of these things have been given to you. A more realistic scenario would give you only the specification to work with. But so far you've always been given the design as part of the problem, so that you can get a feel for how a well-designed system would work.

Coding

This is the stage that we've emphasized most in this class. It's also the least important piece. I'm not going to discuss it more now.

Verification and testing

Software engineers distinguish strongly between verification and testing. Verification refers to formal mathematical proofs of program correctness, while testing refers to experimental trials to look for potential errors. Verification is quite rare - it's mostly confined to research and to safety-critical work such as programming nuclear weapons control systems. For most software, the only step toward determining correctness is the experimental trials.

In this class, you've probably often tested as follows: You've written everything in the design document, and then you run your entire program to ensure it works, running a few random test cases to see what works. In a larger system, this process breaks down, since there are so many pieces to go wrong. It just doesn't work to try to code it all without testing, then put it together.

Even in our labs, I've been pushing the concept of iterative development, where you slowly grow the program, each time checking it to make sure it is still working. But this process is really only well-suited to single-person jobs.

Properly done, large-scale testing has three stages.

module tests: As each person codes a module, they test it themselves before releasing it to be part of the overall package. Often, this will involve writing some specialized test code (maybe even more code than there is in the module itself) to exercise the various parts of the module thoroughly.

Basically, a programmer won't release the module until confident that it has no problems within it internally. It may not interact with other modules well, but independently of the rest it should work.
system tests (alpha testing): With enough modules complete, the developers begin to put modules together into a program, running their own tests. In a large system, you may want to work your way up gradually, testing coherent groups of modules separately before testing all the groups put together.

Basically, you want to avoid big bang testing as much as possible - the approach where each person writes a draft of their code and ensures only that it compiles, so that the first real test is on a draft of the complete program. What you will find is simply that nothing even remotely resembles a working product. Also, big bang testing tends to be highly sequential - you can't distribute the work among many people, and one bug frequently prevents you from testing any other pieces of the system.
beta testing: After the system appears to be working, they release it to beta testers - people who know the product well, but aren't necessarily programmers. They exercise the system in a more lifelike situation. A good beta tester will find aspects that the developers may not have considered - like some rare situation that doesn't occur often, but is worth considering.

In our drawing program, here is one case you may not have considered: What happens if you drag the mouse from within the window to outside the window? I imagine a rectangle would appear, even though its top left coordinates might be negative. (I'm not speculating on whether this is right or wrong - but it's definitely worth considering whether this should be legal.) A good beta tester would try this particular case to see what would happen.
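To make the idea of specialized module test code concrete, here is a minimal sketch built around the drag case above. Everything in it is an assumption made for illustration: the method fromDrag is a hypothetical stand-in for the part of the drawing program that turns a mouse drag into a rectangle, and the hand-rolled check method stands in for a real testing library. The last test only records what the module currently does with a drag past the window's corner; it takes no position on whether that behavior should be legal.

```java
class DragRectTest {
    // Hypothetical module under test: converts a drag from (x0, y0) to
    // (x1, y1) into a rectangle stored as {left, top, width, height}.
    static int[] fromDrag(int x0, int y0, int x1, int y1) {
        int left = Math.min(x0, x1);
        int top = Math.min(y0, y1);
        return new int[] { left, top, Math.abs(x1 - x0), Math.abs(y1 - y0) };
    }

    // Minimal hand-rolled check, so the sketch needs no testing library.
    static void check(boolean condition, String description) {
        if (!condition) {
            throw new AssertionError("failed: " + description);
        }
        System.out.println("passed: " + description);
    }

    static boolean same(int[] r, int left, int top, int width, int height) {
        return r[0] == left && r[1] == top && r[2] == width && r[3] == height;
    }

    public static void main(String[] args) {
        // Ordinary drag, down and to the right.
        check(same(fromDrag(10, 20, 60, 50), 10, 20, 50, 30),
              "drag down-right");

        // Same two corners, dragged in the opposite direction.
        check(same(fromDrag(60, 50, 10, 20), 10, 20, 50, 30),
              "drag up-left gives the same rectangle");

        // The case from the text: the drag starts inside the window and
        // ends above and to the left of it, so the corner is negative.
        check(same(fromDrag(50, 40, -20, -10), -20, -10, 70, 50),
              "drag past the top left corner");
    }
}
```

Even for a four-line method, the test code is several times longer than the module itself, which is typical.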

For its large systems, Microsoft uses a testing process that works as follows: Programmers check out and work on a set of modules. When they complete a module, they check it in. The system will automatically run it through a sequence of tests to verify that the module works internally. Overnight, each night, the system will compile all the checked-in modules, and it will automatically run a sequence of tests on the complete system to ensure it still passes all of them. (There are so many automatic tests that the run can only be done overnight.) If the system does not pass with the new module incorporated, it will reject the module and produce a report of the error for the person who produced the latest update to investigate.
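The check-in and overnight steps can be sketched in a few lines. This is only a toy model under my own assumptions, not a description of Microsoft's actual tools: the class NightlyBuildSketch, its methods, and the module and author names are all hypothetical, and the "system tests" are reduced to a single yes-or-no function over the list of module names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class NightlyBuildSketch {
    private final List<String> accepted = new ArrayList<>();   // modules in the build
    private final List<String[]> pending = new ArrayList<>();  // {module, author} pairs
    private final Predicate<List<String>> systemTests;         // stand-in for the full test suite

    NightlyBuildSketch(Predicate<List<String>> systemTests) {
        this.systemTests = systemTests;
    }

    // Check-in: the module's own tests run immediately. A module that
    // fails them never reaches the overnight build.
    boolean checkIn(String module, String author, boolean moduleTestsPass) {
        if (!moduleTestsPass) {
            return false;
        }
        pending.add(new String[] { module, author });
        return true;
    }

    // Overnight: try each pending module with everything accepted so far.
    // If the system tests fail, reject the module and report to its author.
    List<String> runNightly() {
        List<String> reports = new ArrayList<>();
        for (String[] entry : pending) {
            List<String> candidate = new ArrayList<>(accepted);
            candidate.add(entry[0]);
            if (systemTests.test(candidate)) {
                accepted.add(entry[0]);
            } else {
                reports.add("rejected " + entry[0] + ": report sent to " + entry[1]);
            }
        }
        pending.clear();
        return reports;
    }

    List<String> accepted() {
        return accepted;
    }

    public static void main(String[] args) {
        // Assumption for the demo: the system breaks whenever the
        // hypothetical "printing" module is included.
        NightlyBuildSketch build =
            new NightlyBuildSketch(modules -> !modules.contains("printing"));
        build.checkIn("drawing", "alice", true);
        build.checkIn("printing", "bob", true);
        build.checkIn("export", "carol", false);   // fails its own module tests
        System.out.println(build.runNightly());
        System.out.println(build.accepted());
    }
}
```

The point of the design is that a bad module is caught within a day and traced to one person's latest change, rather than surfacing weeks later in a big bang test.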

Microsoft employs a large division of people dedicated to beta testing. Their job is simply to run software, looking for problems. When they find a problem, they have to nail it down as much as possible, and then they produce a report for the software developers to tackle.

Deployment

documentation: The group develops final documentation of how to use the system, and possibly how the system works. People like to skimp on this stage, but for large-scale systems, poor documentation means that people can't really use the system at all.
distribution: The software is distributed to its users. In software companies, this means marketing and selling the software. For internally written software in a company, it means installing it on the proper computers so that people can access the new software.
maintenance: As people use it, bugs will pop up. Also, as hardware and software develop, new compatibility issues will arise. Finally, people may need new, relatively small features to be added to the system. A relatively small group of programmers will have the job of updating the system to meet these challenges.

Conclusion

How big are all these pieces? Fred Brooks, author of The Mythical Man-Month (an extremely readable, short book about software engineering, which I highly recommend), estimated the following for a large-scale project he supervised (OS/360, an IBM operating system that at the time was among the largest systems ever developed).

1/3	design
1/6	coding
1/4	module tests
1/4	system tests

So you see, the coding stage (which this course has emphasized almost exclusively) is only a small piece of the overall process.
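To see what these fractions mean in practice, here is a small worked calculation. The 120 person-month total is a made-up figure chosen so the arithmetic comes out evenly; it is not a number from Brooks. The class and method names are likewise hypothetical.

```java
class BrooksSplit {
    static final String[] STAGES = { "design", "coding", "module tests", "system tests" };

    // Brooks's fractions over a common denominator of 12:
    // 1/3 = 4/12, 1/6 = 2/12, 1/4 = 3/12, 1/4 = 3/12 (these sum to 12/12).
    static final int[] TWELFTHS = { 4, 2, 3, 3 };

    // Splits a total effort among the stages. Uses integer division, so
    // the result is exact only when the total is a multiple of 12.
    static int[] split(int totalMonths) {
        int[] months = new int[TWELFTHS.length];
        for (int i = 0; i < TWELFTHS.length; i++) {
            months[i] = totalMonths * TWELFTHS[i] / 12;
        }
        return months;
    }

    public static void main(String[] args) {
        int total = 120;   // hypothetical project size in person-months
        int[] months = split(total);
        for (int i = 0; i < STAGES.length; i++) {
            System.out.println(STAGES[i] + ": " + months[i] + " person-months");
        }
    }
}
```

For the hypothetical 120 person-month project, that is 40 for design, 20 for coding, and 30 each for module and system tests. Testing alone takes half the schedule, three times as much as coding.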