23 Apr 2013

# Proving versus testing programs

One of the advantages of the purely-functional paradigm over the imperative or object-oriented ones is that functional programs are easier to reason about. For example, consider the following distributivity law between `reverse` and `++`:

``reverse (xs++ys) = reverse ys ++ reverse xs``

We can prove that this property holds for all finite lists using only high-school algebra and induction. In fact, such techniques have been taught in first-year university courses on functional programming since the 1980s.

But properties can also be used for testing programs when you don't want to invest the effort to do a formal proof. QuickCheck is an excelent Haskell library that allows expressing properties about programs and testing them with randomly-generated values.

A property for QuickCheck is simply a truth-valued function whose arguments are implicitly universally quantified. The library exports function called `quickCheck` that tests a property with 100 randomly-generated values (by default). QuickCheck uses types to generate test data, starting with short lists and small integers and generating progressively larger values. If the property fails for some value, a counter-example is shown; otherwise the test succeeds.

Run the code bellow to test our distributivity law for `reverse`.

``````import Test.QuickCheck

prop_revapp :: [Int] -> [Int] -> Bool
prop_revapp xs ys = reverse (xs++ys) == reverse xs ++ reverse ys

main = quickCheck prop_revapp``````

The tests failed! Can you spot what is wrong?

The failure of the test case above shows another cool QuickCheck feature: when a counter-example is found, QuickCheck attempts to shorten it before presenting using a shrinking heuristic. This is great because randomly-generated data contains a lot of uninformative "noise"; a short counter-example is much more useful for debugging our programs.

In the test above QuickCheck actually found the minimal counter-example: the wrong equation still holds trivally if either of the lists is empty or if they contain identical values (e.g. if both lists are `[0]`) but fails for `[0]` and `[1]`.

# A case-study: string splitting

Consider the following list processing problem I gave my first-year functional programming students:

Define a recursive function `split :: Char -> String -> [String]` that breaks a string into substrings delimited by a given character. Some usage examples:

``````split '@' "pbv@dcc.fc.up.pt" = ["pbv","dcc.fc.up.pt"]
split '/' "/usr/include" = ["", "usr", "include"]``````

Sugestion: use the `takeWhile` and `dropWhile` functions from the standard Prelude.

The sugestion is to use some standard higher-order functions from the Prelude to break the string recursively. Here's an initial attempt at a solution:

``````split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs
``````

However, the above definition has one subtle mistake. Let us employ testing to spot and correct the bug.

## Manual testing

Let us start by testing the defintion with a few sample cases:

``````split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs

-- show
examples = [('@',"pbv@dcc.fc.up.pt"), ('/',"/usr/include")]

test (c,xs) = unwords ["split", show c, show xs, "=", show ys]
where ys = split c xs

main = mapM_ (putStrLn.test) examples
-- /show``````

Our two test cases gave the right answers. At this point you might be able to guess some more interesting cases to check for "edge" conditions (you may edit the `examples` list above). But another issue is figuring out what the right answers should be for such cases. This is not too difficult for a simple function, but can be much harder for more complex problems --- even for experienced programmers.

Let us instead ask ourselves one question: can we think of general properties that `split` should satisfy?

We can devise a property for `split` by consider the inverse function of `split` which I shall call `join`. Here are some examples of what we want `join` to do:

``````join '@' ["pbv", "dcc.fc.up.pt"] = "pbv@dcc.fc.up.pt"
join '/' ["", "usr", "include"] = "/usr/include"``````

Generalizing from such examples we can define `join` as the composition of `concat` and `intersperse` (from `Data.List`).

``````join :: Char -> [String] -> String
join c = concat . intersperse [c]``````

We can now express one interesting property: if we first split and then join with the same delimiter, we should get back the original string (i.e. `join` is the left inverse of `split`). This is expressed as the following QuickCheck property:

``prop_join_split c xs = join c (split c xs) == xs``

You should convince yourself that the property above should hold for all characters c and strings xs regarless of how many times c occurs in xs (including zero occurences).

## A first attempt

We can now attempt to test `prop_join_split` using QuickCheck.

``````import Test.QuickCheck
import Data.List (intersperse)

split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs

join :: Char -> [String] -> String
join c = concat . intersperse [c]

-- show
prop_join_split c xs = join c (split c xs) == xs

main = quickCheck prop_join_split
-- /show``````

Most likely you have observed that 100 random tests passed (or you may have hit a counter-example if you were lucky). In any case, we should be aware of Dijkstra's remark that testing shows only the presence of bugs, not their absence!

For random testing it is important to ensure that the test data is reasonably distributed, otherwise we simply get a false sense of security. QuickCheck allows us to collect information on data distribution easily: for example, let us collect the lengths of the `split` results.

``````import Test.QuickCheck
import Data.List (intersperse)

split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs

join :: Char -> [String] -> String
join c = concat . intersperse [c]

-- show
prop_join_split c xs
= let ys = split c xs in
collect (length ys) \$ join c ys == xs

main = quickCheck prop_join_split
-- /show``````

The report shows that around 90% of the strings are split into lists of length 0 or 1. This is because when both the delimiter c and string xs are choosen independently it is unlikely that c occurs in xs, and if so, more than once. The property might still fail for other cases, which means we haven't tested our program thoroughly.

## More thorough testing

To conduct a more thorough testing we should modify our property slightly: select delimiters that do occur in the string. We can do so by introducing a variant `prop_join_split'` with an explicit quantifier for choosing the delimiter.

``````import Test.QuickCheck
import Data.List (intersperse)

split :: Char -> String -> [String]
split c [] = []
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs

join :: Char -> [String] -> String
join c = concat . intersperse [c]

-- show
prop_join_split' xs
= forAll (elements xs) \$ \c ->
join c (split c xs) == xs

main = quickCheck prop_join_split'
-- /show``````

This time we found a short counter-example very quickly where the string contains a single delimiter character.

``````split 'a' "a" = [""]
join 'a' [""] = ""``````

This is a counter-example to our property because `"a" /= ""`. But what should we modify to fix the problem? In general, we would have three alternatives:

1. modify the definition of `join`;
2. modify the definition of `split`;
3. revise the property `prop_join_split` to exclude such cases.

In our case we can reject #1 and #3 because `join` and `prop_join_split` are short, self-evident definitions; hence we will fix the definition of `split`.

## Fixing split

It it easy to see that our property will hold in the counter-example case found if `split` gave the following result:

``split 'a' "a" = ["",""]``

This implies that the equation defining `split` for the empty list is wrong; it should instead be:

``split c [] = [""]``

Better still: we can just delete this equation altogether because the general case will give this result anyway after a single recursion step. With this correction our solution passes all tests.

``````import Test.QuickCheck
import Data.List (intersperse)

join :: Char -> [String] -> String
join c = concat . intersperse [c]

-- show
split :: Char -> String -> [String]
split c xs = xs' : if null xs'' then [] else split c (tail xs'')
where xs' = takeWhile (/=c) xs
xs''= dropWhile (/=c) xs

prop_join_split' xs
= forAll (elements xs) \$ \c ->
join c (split c xs) == xs

main = quickCheck prop_join_split'
-- /show``````

Note that, considered in isolation, both definitions `split c []=[]` and `split c []=[""]` seem acceptable (our property holds for the empty string in both cases); the first definition may appear simpler. It is only when we consider the recursion as a whole that we see that defining `split c []=[]` is less orthogonal because it breaks our property for strings terminated by the delimiter character.

# Conclusion

QuickCheck is a great way to test Haskell programs, not just because it allows conducting many tests cheaply, but mostly because it encorages thinking about general properties that our code should satisfy rather than just specific instances. Thinking early about properties leads to a cleaner, more orthogonal design (i.e., with fewer arbitrarily-specified corner cases).

There is a lot more about QuickCheck that I have not covered in this tutorial namely:

• writing custom generators and shrinking heuristic for user-defined data types;
• controling the size of generated data;
• controling the number of tests performed;
For more information check the original QuickCheck paper or the one on monadic testing; you should also lookup the `QuickCheck library documentation`.