by Carl Burch, Hendrix College, September 2012
Being able to prove programs correct is a desirable element of a programming language, since this leads to more reliable programs. While the proof process is very tedious and is impractical for most situations, knowing how to prove a program correct can help a programmer reason about a program's correctness. For example, when I write an algorithm, I often informally think about how it might be proven, and sometimes this leads to insights about important, forgotten cases.
In this chapter we examine a system for proving programs in an imperative language. We use a proof system designed for a severely restricted version of C, but the system can easily be extended to a more complete language.
Straight-line programs are programs that include no control statements. We'll begin with this type of program to gain an understanding of how the proof system works; we'll incorporate control statements into our proof system later. For this restricted version, in fact, we'll consider only assignment statements.
A hypothesis involves a combination of program code with assertions that say what is true at particular points in the program's execution. Assertions are enclosed in braces and are written in mathematical notation. Consider the following hypothesis, for example.
{ fact = i ! ∧ i ≥ 1 }
i = i + 1;
fact = fact * i;
{ fact = i ! ∧ i ≥ 1 }
This hypothesis consists of three parts: an assertion called the precondition, followed by some C code, followed by another assertion called the postcondition. It indicates that, were the precondition to hold before executing the given C code, then the postcondition would hold upon completing the code.
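A hypothesis of this kind can be spot-checked at runtime with C's assert macro. The sketch below is our own illustration, not part of the proof system: it runs the two statements from a state satisfying the precondition and reports whether the postcondition holds afterward (the name fact_of is a hypothetical reference helper).

```c
#include <assert.h>

/* Reference factorial, used only to evaluate the assertions. */
static long fact_of(long n) {
    long f = 1;
    for (long k = 2; k <= n; k++)
        f *= k;
    return f;
}

/* Runs the code of the hypothesis from a state satisfying the
   precondition; returns 1 if the postcondition holds afterward. */
int check_hypothesis(long i, long fact) {
    assert(fact == fact_of(i) && i >= 1);   /* precondition */
    i = i + 1;                              /* the code under proof */
    fact = fact * i;
    return fact == fact_of(i) && i >= 1;    /* postcondition */
}
```

Of course, a successful run only checks one starting state; the proof system below establishes the claim for every state satisfying the precondition.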
We will prove such a hypothesis using a sequence of formal steps. Each step will be justified either by being a straightforward mathematical fact (involving no assertions) or by applying one of a set of rules, or axioms. We will begin with the following five axioms useful for proving straight-line programs.
Rule of Sequence:
    {P} S {Q}
    {Q} T {R}
    ─────────────
    {P} S T {R}

Rule of Consequence:
    P ⇒ Q
    ─────────────
    {P} {Q}

Rule of Assignment:
    ─────────────
    {Px → E} x = E; {P}

Rule of Preconditions:
    P ⇒ Q
    {Q} S {R}
    ─────────────
    {P} S {R}

Rule of Postconditions:
    {P} S {Q}
    Q ⇒ R
    ─────────────
    {P} S {R}
Axioms are written in a form involving some set of facts above a line, followed by a single fact below the line. This indicates that if we can prove every fact above the line, then we can infer the fact below the line using the axiom.
For example, the Rule of Sequence requires that we first show two things. The first, {P} S {Q}, directs us to show that if P is true, then after executing S, Q is true. We also need to show that if Q is true, then after executing T, R is true. If we can show both of these, then we can conclude that if P is true and we execute S followed by T, then R is true. (If this sounds obvious, that's exactly the point: Axioms are supposed to be self-evident, here just as in geometry.) This rule gives us a way of combining statements into a sequence.
Then we have the Rule of Consequence, which requires a previous step of the form P ⇒ Q. This is a mathematical assertion with no reliance on programming at all; consequently, the justification that would allow us to include such a step would be a mathematical argument.
In the Rule of Assignment, the notation Px → E indicates the proposition where we take P and replace every occurrence of x by the expression E. This axiom says that, if Px → E is true, then after we assign E to x, P is true. We don't need any previous steps to conclude this, and so nothing appears above the line in the axiom statement.
This rule is somewhat peculiar in that conceptually it proceeds backward in time: You take a proposition P that you want to be true after the assignment statement, and from this you can derive a proposition (Px → E) that, if true before the assignment takes place, would make P be true after the assignment.
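The backward flavor of the rule can be made concrete in code. In the sketch below (our own illustration, with sample choices of P and E), the precondition the rule derives is simply P evaluated at E; whenever that holds before the assignment, P holds after it.

```c
#include <assert.h>

/* A sample expression E and postcondition P for the assignment x = E. */
static long E(long x) { return x + 1; }   /* E is x + 1 */
static int  P(long x) { return x >= 2; }  /* P is x >= 2 */

/* P with E substituted for x -- the precondition the rule derives. */
static int wp(long x) { return P(E(x)); }

/* If the derived precondition holds, P must hold after x = E. */
int assignment_rule_holds(long x) {
    if (!wp(x))
        return 1;        /* precondition false: nothing to check */
    x = E(x);            /* execute the assignment */
    return P(x);         /* P should now hold */
}
```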
Finally, we have the Rule of Postconditions and Rule of Preconditions. Technically, these are unnecessary, as each is simply a combination of the Rule of Sequence and the Rule of Consequence.
Below, we use our five axioms to demonstrate a complete formal proof building up to the conclusion in the final step. This proof is a sequence of steps numbered in increasing order. Each step consists of a fact, followed by a justification of the fact. When the axiom used requires previous steps, they're referenced in the proof by listing the lines where those steps can be found.
1. fact = i ! ∧ i ≥ 1 ⇒ fact = ((i + 1) − 1)! ∧ i + 1 ≥ 1   (Mathematical fact)
2. { fact = ((i + 1) − 1)! ∧ i + 1 ≥ 1 } i = i + 1; { fact = (i − 1)! ∧ i ≥ 1 }   (Assignment)
3. { fact = i ! ∧ i ≥ 1 } i = i + 1; { fact = (i − 1)! ∧ i ≥ 1 }   (Preconditions: 1, 2)
4. fact = (i − 1)! ∧ i ≥ 1 ⇒ fact ⋅ i = i ! ∧ i ≥ 1   (Mathematical fact)
5. { fact ⋅ i = i ! ∧ i ≥ 1 } fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Assignment)
6. { fact = (i − 1)! ∧ i ≥ 1 } fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Preconditions: 4, 5)
7. { fact = i ! ∧ i ≥ 1 } i = i + 1; fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Sequence: 3, 6)
For example, the justification for step 3 states that the fact is proven using the Rule of Preconditions, based on steps 1 and 2. Recall the Rule of Preconditions.
P ⇒ Q
{Q} S {R}
─────────────
{P} S {R}
The Rule of Preconditions states that we need a statement of the implication “P ⇒ Q”, which we have in step 1, and a statement of the form “{Q} S {R}” whose precondition matches the implication's consequent, which we have in step 2. From these we can conclude “{P} S {R}” in step 3, which matches “{Q} S {R}” except that the precondition has changed.
Step 2 involves the Rule of Assignment.
{Px → E} x = E; {P}
The Rule of Assignment requires nothing to be proven, and so the justification of step 2 refers to no other lines of the proof. However, you can see that step 2 follows the required form “{Px → E} x = E; {P}”: Every instance of i in its postcondition is replaced by i + 1 in the precondition.
Step 7 illustrates the Rule of Sequence at work.
{P} S {Q}
{Q} T {R}
─────────────
{P} S T {R}
In this case, the step references steps 3 and 6. The postcondition of step 3 matches the precondition of step 6, so this shared assertion serves as the Q of the Rule of Sequence. In the conclusion given in step 7, the precondition of step 3 stands in for P and the postcondition of step 6 stands in for R.
Steps 1 and 4 are justified as “mathematical facts.” This is a cheat: Technically, they too should be justified by a sequence of logical inferences. In the case of step 1 of our proof, we can observe that fact = i ! implies fact = ((i + 1) − 1)! as a simple matter of algebra, since (i + 1) − 1 = i; and of course if i is at least 1, then i + 1 is too, since it is even larger. In the case of step 4, we can multiply both sides of fact = (i − 1)! by i to get fact ⋅ i = (i − 1)! ⋅ i = i !. Writing such reasoning out formally, however, is tedious, and it lies beyond the scope of proving programs. Thus, we'll settle for simply asserting that such steps are easily proven. (Students sometimes take this informality as a license to claim some things as mathematical facts that are actually false. No: It's incumbent on the proof writer to verify that such facts are indeed easily proven.)
That one can write a proof is nice, but ultimately we want to know how to construct one. Some guidelines for constructing such a proof are in order.
To construct a program proof, it's most convenient to go backwards: We'll start with our desired conclusion and eventually work ourselves back to the beginning. Suppose we're trying to prove the following as the conclusion of our proof.
100. { fact = i ! ∧ i ≥ 1 } i = i + 1; fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (???)
We've numbered this as step 100 in our proof. Of course, we won't be using all of steps 1 to 99, but we don't know how many there will be yet.
We reason that the code of step 100 is a combination of two smaller pieces of code; we'll have to use the Rule of Sequence to combine them. Thus, we assume that the Rule of Sequence must be the rule implying step 100. To apply this rule to conclude with step 100, we need to have proven two facts previously.
50. { fact = i ! ∧ i ≥ 1 } i = i + 1; { A }   (???)
99. { A } fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (???)
We don't know exactly what A is yet, and we don't know how to prove these two steps. But we do know that if we can prove steps 50 and 99, then we could apply the Rule of Sequence using steps 50 and 99 to derive step 100.
We'll work on proving step 99 first.
Step 99 looks very much like something proven by the Rule of Assignment. We can determine the nature of assertion A from the Rule of Assignment, by substituting fact ⋅ i for fact in the desired postcondition. Thus, step 99 becomes the following.

99. { fact ⋅ i = i ! ∧ i ≥ 1 } fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Assignment)
The assertion A thus derived is called the weakest precondition. The name comes from its being the least restrictive condition that must hold beforehand to guarantee the desired postcondition.
With step 99 out of the way, only step 50 remains to be proven. With A determined, we know exactly what we have left to prove.
50. { fact = i ! ∧ i ≥ 1 } i = i + 1; { fact ⋅ i = i ! ∧ i ≥ 1 }   (???)
This statement doesn't match any of our rules exactly. However, it does contain an assignment statement, which indicates that the Rule of Assignment must be involved. The Rule of Assignment doesn't apply directly to step 50, but we can go ahead and add a step derived from it, using the same postcondition required for step 50.
49. { fact ⋅ (i + 1) = (i + 1)! ∧ i + 1 ≥ 1 } i = i + 1; { fact ⋅ i = i ! ∧ i ≥ 1 }   (Assignment)
This doesn't match step 50 exactly, since the preconditions don't match. But the rest does match, which suggests that the Rule of Preconditions could come into play. To use the Rule of Preconditions, we'd also need the following step.
48. fact = i ! ∧ i ≥ 1 ⇒ fact ⋅ (i + 1) = (i + 1)! ∧ i + 1 ≥ 1   (???)
We can quickly examine this to verify that it is a mathematical fact: If we know fact = i !, then we can multiply both sides by i + 1 to get fact ⋅ (i + 1) = i ! ⋅ (i + 1) = (i + 1)!; and i ≥ 1 certainly gives i + 1 ≥ 1. With this demonstrated, our overall proof has all its steps completed. Putting the parts together, the final product is below.
48. fact = i ! ∧ i ≥ 1 ⇒ fact ⋅ (i + 1) = (i + 1)! ∧ i + 1 ≥ 1   (Mathematical fact)
49. { fact ⋅ (i + 1) = (i + 1)! ∧ i + 1 ≥ 1 } i = i + 1; { fact ⋅ i = i ! ∧ i ≥ 1 }   (Assignment)
50. { fact = i ! ∧ i ≥ 1 } i = i + 1; { fact ⋅ i = i ! ∧ i ≥ 1 }   (Preconditions: 48, 49)
99. { fact ⋅ i = i ! ∧ i ≥ 1 } fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Assignment)
100. { fact = i ! ∧ i ≥ 1 } i = i + 1; fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Sequence: 50, 99)
This proof is actually shorter than our earlier one: The backward-derived proof has only five steps, while the other had seven. More important than its brevity, however, was our systematic construction of it. You could even conceive of writing a program to generate a proof like this automatically.
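As a hint of how such mechanization might look, here is a toy weakest-precondition generator for a single assignment statement. It is only a sketch under strong assumptions of our own: assertions are plain strings, the assigned variable is a single letter, and the substitution is textual (with added parentheses) rather than based on parsing expressions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Derive the precondition of "var = expr;" for postcondition `post`
   by replacing each standalone occurrence of `var` with "(expr)". */
void wp_assign(const char *post, char var, const char *expr,
               char *out, size_t outlen) {
    size_t o = 0;
    for (const char *p = post; *p != '\0'; p++) {
        /* is this character part of a longer identifier? */
        int in_ident = (p > post && (isalnum((unsigned char)p[-1]) || p[-1] == '_'))
                    || isalnum((unsigned char)p[1]) || p[1] == '_';
        if (*p == var && !in_ident)
            o += (size_t)snprintf(out + o, outlen - o, "(%s)", expr);
        else if (o + 1 < outlen)
            out[o++] = *p;
    }
    assert(o < outlen);   /* toy: we assume the buffer is big enough */
    out[o] = '\0';
}
```

Applying it to the postcondition of step 99 with the assignment i = i + 1 produces exactly the precondition we derived for step 49 by hand.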
Things get somewhat more difficult when we throw control statements like if and while statements into the mix. We'll add two new axioms to give us the ability to prove programs that incorporate if and while statements. These are the last two axioms we'll see. (In case you've forgotten your logical notation: The symbol ∧ stands for logical AND and the symbol ¬ stands for logical NOT.)
Rule of Condition:
    {P ∧ B} S {Q}
    {P ∧ ¬B} T {Q}
    ─────────────
    {P} if (B) { S } else { T } {Q}

Rule of Iteration:
    {I ∧ B} S {I}
    ─────────────
    {I} while (B) { S } {I ∧ ¬B}
The Rule of Condition states that if we want to prove that P implies that executing an if statement makes Q hold, then we must have two previous steps: { P ∧ B } S { Q } and { P ∧ ¬ B } T { Q }. If we know these things, and we reach the if statement with P holding, then B is either true or false. If B is true, then both P and B hold and we will execute S; because we have proven “{ P ∧ B } S { Q },” we know that Q must hold after executing S. If B is false, then P and ¬ B hold before executing T; because we have proven “{ P ∧ ¬ B } T { Q },” we know that Q must hold after executing T. Regardless of whether B is true, then, Q holds upon completing the if statement.
The Rule of Iteration states that, if we know that the body of a while loop preserves the truth of some assertion I (assuming that the while condition B also holds before executing the body), then we can conclude that if I holds on reaching the while statement, it will continue to hold after each iteration, and thus it will hold once B is finally false and the loop completes. The assertion I is called an invariant of the while loop, since it invariably holds as the while loop completes each iteration.
Now suppose we want to prove the following.
{ n ≥ 1 }
fact = 1;
i = 1;
while (i != n) {
    i = i + 1;
    fact = fact * i;
}
{ fact = n ! }
This hypothesis boils down to proving that this C code correctly computes n !: If n is at least 1 before we start executing the code, then fact holds n ! following it.
To prove that the code works, we'll have to decide on the invariant for the loop. This invariant I needs three properties: it holds on first reaching the loop, it holds after each execution of the loop's body, and I ∧ ¬ B implies the desired postcondition after the loop.
In general, coming up with such an invariant involves a bit of insight; practice definitely helps in recognizing invariants. In this case, we'll use the following invariant for I.
fact = i ! ∧ i ≥ 1

This is typical of invariants for loops that iterate over a sequence of integers: There is a reference to the loop control variable i within the invariant; the value of i on first reaching the loop makes for a trivial assertion, while the value of i on completing the loop gives us what we want. (We'll see an example of a similar invariant in the next section.)
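One informal way to gain confidence in a candidate invariant, before attempting a proof, is to check it at runtime. The sketch below is our own instrumentation of the factorial code; fact_of is a hypothetical reference helper used only inside the assertions.

```c
#include <assert.h>

/* Reference factorial, used only to evaluate the assertions. */
static long fact_of(long n) {
    long f = 1;
    for (long k = 2; k <= n; k++)
        f *= k;
    return f;
}

/* The factorial code, instrumented with its invariant. */
long factorial_checked(long n) {
    assert(n >= 1);                           /* precondition */
    long fact = 1;
    long i = 1;
    assert(fact == fact_of(i) && i >= 1);     /* invariant on reaching loop */
    while (i != n) {
        i = i + 1;
        fact = fact * i;
        assert(fact == fact_of(i) && i >= 1); /* invariant preserved */
    }
    assert(fact == fact_of(n));               /* postcondition */
    return fact;
}
```

A passing run doesn't prove anything, but a failing assertion immediately tells you the candidate invariant is wrong.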
Below is a completed proof using this invariant.
1. { fact = i ! ∧ i ≥ 1 } i = i + 1; fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (from previous section)
2. fact = i ! ∧ i ≥ 1 ∧ i ≠ n ⇒ fact = i ! ∧ i ≥ 1   (Mathematical fact)
3. { fact = i ! ∧ i ≥ 1 ∧ i ≠ n } i = i + 1; fact = fact * i; { fact = i ! ∧ i ≥ 1 }   (Preconditions: 1, 2)
4. { fact = i ! ∧ i ≥ 1 } while (i != n) { i = i + 1; fact = fact * i; } { fact = i ! ∧ i ≥ 1 ∧ ¬(i ≠ n) }   (Iteration: 3)
5. fact = i ! ∧ i ≥ 1 ∧ ¬(i ≠ n) ⇒ fact = n !   (Mathematical fact)
6. { fact = i ! ∧ i ≥ 1 } while (i != n) { i = i + 1; fact = fact * i; } { fact = n ! }   (Postconditions: 4, 5)
7. { fact = 1! ∧ 1 ≥ 1 } i = 1; { fact = i ! ∧ i ≥ 1 }   (Assignment)
8. { 1 = 1! ∧ 1 ≥ 1 } fact = 1; { fact = 1! ∧ 1 ≥ 1 }   (Assignment)
9. { 1 = 1! ∧ 1 ≥ 1 } fact = 1; i = 1; { fact = i ! ∧ i ≥ 1 }   (Sequence: 8, 7)
10. n ≥ 1 ⇒ 1 = 1! ∧ 1 ≥ 1   (Mathematical fact)
11. { n ≥ 1 } fact = 1; i = 1; { fact = i ! ∧ i ≥ 1 }   (Preconditions: 10, 9)
12. { n ≥ 1 } fact = 1; i = 1; while (i != n) { i = i + 1; fact = fact * i; } { fact = n ! }   (Sequence: 11, 6)
The proof begins by referencing our earlier proof and continues in three basic parts. Steps 1 to 3 establish that the loop's body preserves the invariant as required by the Rule of Iteration, and step 4 uses this in employing the Rule of Iteration. Steps 5 and 6 repair the postcondition following the loop to be the desired postcondition (i.e., fact holds n !). Steps 7 to 11 establish that the variable initializations achieve the loop invariant before we start the loop. Joined to the loop in step 12, we get the final result.
(You may notice that we never actually used the precondition n ≥ 1: It was simply discarded in step 10. On the other hand, the program doesn't work unless this precondition is true. The reason is that our proof technique only establishes “partial correctness,” an issue that we'll discuss in Section 3.2.)
To get a better feel for how this works, let's examine another program. This one finds the sum of all the factors of n (excepting n itself).

{ n ≥ 1 }
i = 1;
sum = 0;
while (i != n) {
    if (n % i == 0) {
        sum = sum + i;
    } else {
    }
    i = i + 1;
}
{ sum = ∑_{j=1}^{n−1} j ⋅ δdiv(n, j) }
The postcondition uses the notation δdiv(n, j); this specialized mathematical notation expresses a function that is 1 when j divides into n exactly and 0 otherwise.

    δdiv(n, j) = 1   if n mod j = 0
    δdiv(n, j) = 0   otherwise
The postcondition, then, says that sum is the sum of the factors of n, up to but not including n itself.
The first thing we need to do is to determine our invariant. For this loop, a good invariant is the following.
sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j)

Note the similarity to the invariant for the factorial program: fact = i !. In both cases, each additional iteration adds a new term to the summation/product.
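As with the factorial loop, this candidate invariant can be checked at runtime before we attempt the proof. The sketch below is our own instrumentation; partial_sum is a hypothetical helper that evaluates the invariant's summation directly.

```c
#include <assert.h>

/* Evaluates the invariant's summation: the sum of j * deltadiv(n, j)
   for j = 1 .. upto, i.e., the factors of n that are at most upto. */
static long partial_sum(long n, long upto) {
    long s = 0;
    for (long j = 1; j <= upto; j++)
        if (n % j == 0)
            s += j;
    return s;
}

/* The sum-of-factors code, instrumented with its invariant. */
long factor_sum_checked(long n) {
    assert(n >= 1);                            /* precondition */
    long i = 1;
    long sum = 0;
    while (i != n) {
        assert(sum == partial_sum(n, i - 1));  /* invariant */
        if (n % i == 0) {
            sum = sum + i;
        }
        i = i + 1;
    }
    assert(sum == partial_sum(n, n - 1));      /* postcondition */
    return sum;
}
```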
1. n ≥ 1 ⇒ 0 = ∑_{j=1}^{1−1} j ⋅ δdiv(n, j)   (Mathematical fact)
2. { 0 = ∑_{j=1}^{1−1} j ⋅ δdiv(n, j) } i = 1; { 0 = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Assignment)
3. { n ≥ 1 } i = 1; { 0 = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Preconditions: 1, 2)
4. { 0 = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) } sum = 0; { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Assignment)
5. { n ≥ 1 } i = 1; sum = 0; { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Sequence: 3, 4)
6. sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n ∧ n mod i = 0 ⇒ sum + i = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j)   (Mathematical fact)
7. { sum + i = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) } sum = sum + i; { sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) }   (Assignment)
8. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n ∧ n mod i = 0 } sum = sum + i; { sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) }   (Preconditions: 6, 7)
9. sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n ∧ ¬(n mod i = 0) ⇒ sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j)   (Mathematical fact)
10. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n ∧ ¬(n mod i = 0) } { sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) }   (Consequence: 9)
11. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n } if (n % i == 0) { sum = sum + i; } else { } { sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) }   (Condition: 8, 10)
12. { sum = ∑_{j=1}^{(i+1)−1} j ⋅ δdiv(n, j) } i = i + 1; { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Assignment)
13. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ i ≠ n } if (n % i == 0) { sum = sum + i; } else { } i = i + 1; { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) }   (Sequence: 11, 12)
14. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) } while (i != n) { if (n % i == 0) { sum = sum + i; } else { } i = i + 1; } { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ ¬(i ≠ n) }   (Iteration: 13)
15. sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) ∧ ¬(i ≠ n) ⇒ sum = ∑_{j=1}^{n−1} j ⋅ δdiv(n, j)   (Mathematical fact)
16. { sum = ∑_{j=1}^{i−1} j ⋅ δdiv(n, j) } while (i != n) { if (n % i == 0) { sum = sum + i; } else { } i = i + 1; } { sum = ∑_{j=1}^{n−1} j ⋅ δdiv(n, j) }   (Postconditions: 14, 15)
17. { n ≥ 1 } i = 1; sum = 0; while (i != n) { if (n % i == 0) { sum = sum + i; } else { } i = i + 1; } { sum = ∑_{j=1}^{n−1} j ⋅ δdiv(n, j) }   (Sequence: 5, 16)
The proof has several parts, structured around the application of the Rule of Iteration in step 14. Steps 1 to 5 prove that the initial assignments preceding the loop establish the loop's invariant. (Step 1 involves the sum of an empty list of numbers, yielding 0.) Steps 6 to 13 prove that the loop's body preserves the loop invariant. Steps 14 to 16 apply the Rule of Iteration to construct the while loop and correct the postcondition to match the overall postcondition. Step 17 pastes together the initialization code with the loop to arrive at the intended destination.
The proof system examined in this chapter has several shortcomings that are worth noting.
In this proof system, we have treated program expressions as if they could be incorporated directly into assertions' mathematical expressions. Unfortunately, integer arithmetic in many languages does not correspond to mathematical arithmetic: Languages such as C have a maximum integer beyond which arithmetic is invalid. This is a significant shortcoming of the proof of the factorial program's correctness: Although our “proof” shows that it operates correctly, in fact it behaves incorrectly when n is even moderately large (like 25), due to problems with overflow. For such languages, a complete proof of a program's correctness would involve a proof that overflow is impossible. Such additional details add new degrees of complexity to the proof. The complexity is even worse in programs involving floating-point arithmetic.
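To make the overflow concern concrete, here is a sketch of a factorial that detects overflow explicitly. It assumes a GCC or Clang compiler, since it uses the nonstandard __builtin_mul_overflow extension; a complete proof of correctness for such code would have to establish that the failure branch is never taken for the inputs allowed by the precondition.

```c
#include <stdbool.h>

/* Computes n! into *out; returns false if the product overflows
   a long long.  (__builtin_mul_overflow is a GCC/Clang extension.) */
bool factorial_safe(long long n, long long *out) {
    long long fact = 1;
    for (long long i = 2; i <= n; i++)
        if (__builtin_mul_overflow(fact, i, &fact))
            return false;        /* n! no longer representable */
    *out = fact;
    return true;
}
```

Even with 64-bit arithmetic, the overflow branch triggers at n = 21, well before the n = 25 mentioned above.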
Another shortcoming of this proof system is that it demonstrates only partial correctness. A partial correctness proof shows that if the program terminates, then the postcondition holds; it says nothing about whether the loops actually terminate. For an example of this shortcoming, consider the following program intended to compute n !.

fact = 1;
i = 1;
while (i != n) {
}
The same invariant we examined for our earlier factorial program works for this loop, too: Since nothing happens in executing the loop's body, nothing happens that changes the invariant's truth. We could easily establish a partial correctness proof for this program, even though it clearly does not work.
Total correctness requires, in addition to partial correctness, a proof that each loop terminates. For the factorial program, we would need to argue that i starts out no greater than n and becomes closer to n with each iteration; thus, it must eventually reach n. This is where we would bring in the fact that n ≥ 1, which we ignored in our prior proof.
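This termination argument can itself be phrased as checkable assertions: pick a variant expression, here n − i, that is positive before each iteration and strictly decreases across it. The sketch below is our own instrumentation of the factorial loop.

```c
#include <assert.h>

/* The factorial loop instrumented with a termination variant:
   n - i is positive before each iteration and strictly decreases,
   so the loop cannot run forever. */
long factorial_terminating(long n) {
    assert(n >= 1);                /* needed so i starts no greater than n */
    long fact = 1;
    long i = 1;
    while (i != n) {
        long variant = n - i;
        assert(variant > 0);       /* positive before the body runs */
        i = i + 1;
        fact = fact * i;
        assert(n - i < variant);   /* strictly smaller afterward */
    }
    return fact;
}
```

Note that the do-nothing loop above would fail this check on its very first iteration, since its variant never decreases.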
This chapter's proof system is already difficult to use; the added complexity of handling computer arithmetic, total correctness, and a more complete language only increases the difficulty of proving a program correct. Because of this complexity, it is extremely rare that anybody produces a complete proof of a program's correctness.
The backward-proof technique provides some hope that much of the proving process can be mechanized. Researchers have made substantial progress in alleviating the work of program proofs, but the technology isn't close to mainstream acceptance.
That doesn't mean that program proofs are irrelevant in practice, however. It helps for a programmer, when writing some confusing code, to occasionally think: If I were proving this, how would the argument go? This can be particularly helpful in reasoning about a difficult loop: You can think about what the invariant might be. A formal argument isn't necessary, but some awareness of the invariant can help in formulating the program and in catching any missing pieces.
Following are a few more small programs that you could try proving to gain additional practice.
{ y ≥ 0 }
z = 0;
i = 0;
while (i != y) {
    z = z + x;
    i = i + 1;
}
{ z = x ⋅ y }

{ n ≥ 1 }
a = 0;
b = 1;
i = 1;
while (i != n) {
    c = a + b;
    a = b;
    b = c;
    i = i + 1;
}
{ b = fib_n }

{ n ≥ 2 }
i = 2;
while (n % i != 0) {
    i = i + 1;
}
b = (i == n);
{ b = isPrime(n) }

{ x ≥ 0 }
z = 0;
b = x;
i = y;
while (i != 0) {
    if (i % 2 == 0) {
        i = i / 2;
    } else {
        z = z + b;
        i = (i - 1) / 2;
    }
    b = 2 * b;
}
{ z = x ⋅ y }