Program Correctness

Computer programs that most people run in their daily lives tend to have bugs, so that occasionally (or more often) they fail to do what they are meant to do. So it is a cause for some concern when computers take over things where failure is not an option - things like nuclear reactors, airplanes, the traffic lights of New York, and so forth. In these situations, programmers often try to rigorously prove that their programs are correct, and hopefully avoid death and destruction.

Asking the Question

The first thing to do is to decide what it means for a program to be correct - and the normal way is with preconditions and postconditions. The program is allowed to assume that the preconditions are true, but has to guarantee that the postconditions are true when it finishes, and that it will, eventually, finish. To prove the program correct, it's then only necessary to show that whenever the program is run and the preconditions are true, the postconditions will be true afterwards.

Here's an example: suppose that we have a plan for the day and we want to check that it will make us happy and fulfilled, provided our computer is working. Then the precondition would be computer is working, and the postcondition would be I am happy AND I am fulfilled. As the postcondition shows, you can use boolean logic in conditions to make them easier to deal with mathematically.

To save on space, this is normally written as follows:


[computer is working]
Perform plan for the day
[I am happy AND I am fulfilled]

Or, if we elaborate on the plan, the following is also correct:



[computer is working]
Turn on computer
Log on to h2g2
Check out the Post
Write an entry
Log off h2g2
Turn off computer
[I am happy AND I am fulfilled]
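
The same idea carries straight over to real code. Here is a minimal Python sketch (an invented example, not part of the plan above) that checks its precondition and postcondition while it runs:

def integer_square_root(n):
    # Precondition: n is a non-negative integer
    assert isinstance(n, int) and n >= 0
    root = 0
    while (root + 1) * (root + 1) <= n:
        root = root + 1
    # Postcondition: root is the largest integer whose square does not exceed n
    assert root * root <= n < (root + 1) * (root + 1)
    return root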

Partial Correctness

Partial correctness of a program doesn't mean that it is partly correct and partly incorrect. Instead, it is just like normal correctness, except the program no longer promises to finish - it might go into an endless loop. This is considerably less useful, but it turns out that it is often easiest to prove that a program is correct by first proving that it is partially correct, and then, separately, proving that it will eventually finish (otherwise known as termination).

Here is an example of a partially correct program:



{I am penniless}
start: Busk to earn a pound
Buy a Lottery Ticket
If ticket is not a jackpot winner, GO TO start
If ticket is a jackpot winner, celebrate
{I am a millionaire}

If this program ever finishes, then you'll be rich - the problem is that there's nothing to stop it going round and round in circles forever!
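
In code, the same shape might look like the following Python sketch (invented for illustration - the one-in-ten-million jackpot chance is made up):

import random

def busk_until_jackpot():
    # {I am penniless}
    pounds = 0
    won = False
    while not won:
        pounds = pounds + 1                # busk to earn a pound
        won = random.random() < 1e-7       # buy a lottery ticket
    pounds = pounds + 1_000_000            # celebrate the jackpot
    # {I am a millionaire} - true if we ever get here, but nothing
    # above guarantees that the loop will ever end
    assert pounds >= 1_000_000
    return pounds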

Getting the specs wrong

The difficulty in this process is that if you get the specs wrong, then proving that your program meets the specifications is worse than useless. Consider our original specifications for happiness and fulfillment - they could equally well be satisfied by the following program:



[computer is working]
Remember computer deleted your homework
Smash computer with a large axe as revenge
[I am happy AND I am fulfilled]

Chances are, that's not what we want - in this case the postconditions weren't strong enough because we didn't specify that we wanted the computer to still be working after following the plan. A better postcondition would be I am happy AND I am fulfilled AND the computer is working - which would reject the 'smash the computer' plan, while still allowing the 'use h2g2' plan.

Naturally, when dealing with computer programs the pre and postconditions are statements about the values of numbers and suchlike, which are clearly mathematically true or false, and it's all much more formal and much less obvious. Ahh well.

Prove It!

So now we have an idea of what it means for something to be correct - but this doesn't help very much if there's no way to tell which is which! Unfortunately, it's impossible to take an arbitrary program and automatically determine whether it's correct or not1. There will be some programs which are correct but which we can't prove correct - and programmers have to live with this.

Floyd-Hoare Logic

Floyd-Hoare logic is what most people think of as a proof - it consists of a number of lines, each of which builds on the previous ones according to some set of rules, and there is at least one rule for each available type of command. There are two types of rules that can be used:

Axioms in logic are the simplest possible true statements - so in arithmetic, it is taken as an axiom that the predecessor of the successor of any number is that number, and that zero is the smallest non-negative number.

When proving programs, typically there is at least one axiom for each of the simple commands. So, there might be axioms for 'turn on computer', reading:


[computer is working AND computer is off]
Turn on computer
[computer is on]

[computer is on]
Turn on computer
[computer is on]

Rules of inference in logic are ways of obtaining true statements, given other true statements - for example, if you know that 'Fred is tall' and you also know that 'Fred is a man', then you can infer that 'Fred is tall and a man'. The rule of inference here might be: "If some person has property A, and the same person has property B, then that person has property A and property B."

In program proving, these rules are normally used to combine simple commands into more complicated commands. Some of them can be used in almost any programming language, while others are specific to the particular commands available (typically constructs like loops and so forth). One of them is the 'sequencing rule', which is:

If the following two statements of correctness are true:


[condition 1]
Command 1
[condition 2]

[condition 2]
Command 2
[condition 3]

Then the following statement of correctness is true:


[condition 1]
Command 1
Command 2
[condition 3]
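
For instance, the sequencing rule lets the two small Python steps below (an invented example) be glued together through the shared middle condition:

x = 0               # establishes condition 1: x is 0

x = x + 1           # Command 1
assert x == 1       # condition 2: true after Command 1, assumed by Command 2

x = x * 10          # Command 2
assert x == 10      # condition 3: the postcondition of the combined sequence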

There are many other similar rules, and by putting them all together you can prove some very complicated statements of correctness. The difficulty is working out how to prove them - typically this is a very tedious process in practice, especially if you want to be absolutely rigorous.

Verification Conditions

The idea behind verification conditions is that people and computers are pretty lousy at proving things about the correctness of commands, but are a lot better at proving things about logical statements. Therefore, it is easier to compile the correctness statements into logical statements, called verification conditions.

For example, take the following command for getting a full kettle of water:


[kettle=empty]
Put water in kettle
[kettle=full of water]

This might compile down to the logical statement: "A kettle is a container", which is pretty simple, and happens to be true. The logical statements are generally this simple to prove, and in practice they can often be proven automatically.
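
For a more computer-flavoured case (an invented example, not from the kettle plan), the correctness statement 'assuming x + 1 > 0, the command set x to x + 1 leaves x > 0' might compile down to the verification condition 'x + 1 > 0 implies x + 1 > 0', which is obviously true and needs no knowledge of what the command itself does.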

Generally the compiler needs a little help, though, because it is not obvious which logical statements should be generated. To help the compiler along, therefore, a human expert needs to insert regular 'assertions' into the program. Assertions are just like pre and postconditions - except they must be true every time they are passed, rather than just at the beginning or end of the program.

Assertions, as well as many other things in this field, are now used in a more informal manner by programmers in general. Typically, when a program is run in 'debug' mode, the computer will check that every assertion is true as it is passed - but this is slow. When run in 'nondebug' mode, the assertions are ignored - but hopefully any bugs will have been picked up before this point.
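
Python's built-in assert statement behaves exactly this way - a minimal sketch, with an invented withdraw function:

def withdraw(balance, amount):
    # Assertion: nobody should ever try to take out more than is in the account
    assert 0 <= amount <= balance, "overdraft attempted"
    new_balance = balance - amount
    # Assertion: the balance never goes negative
    assert new_balance >= 0
    return new_balance

print(withdraw(100, 30))    # prints 70

Run normally, every assert is checked as it is passed; run with 'python -O', the asserts are skipped entirely, which plays the part of 'nondebug' mode.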

What Else?

People putting together nuclear reactors typically trust nobody. The compiler might have bugs in it, for example - or the rules of inference could be wrong, or the specifications could have a hole in them. Because of this it is never enough to prove something correct and then sit back and go home.

Typically systems are also put through thorough testing - which means that operators try to ensure that all of the code has been 'exercised'. The idea is that any errors will hopefully show themselves in the tests. Testing doesn't tend to help much with code that won't be run terribly often though - and ironically that is often the code that is most critical, like disaster recovery code.

So a combined approach of proof and testing is always used for critical systems, along with a philosophy of Keep it Simple, Stupid and If it ain't broke, don't fix it2. In especially critical systems, it is usual to run three programs, on three different systems, designed by three separate teams in complete isolation from one another - and use votes to decide which one is correct. And with all this, disasters still happen - Murphy's Law is stronger than any human attempt to restrain it.

1 This is related to the 'halting problem', which is unsolvable by Turing machines.
2 Fixes of bugs or perceived bugs are the number one cause of introducing new bugs.
