SmallCheck in Rust - Part One

This is the first in a series of experience reports on the implementation of SmallCheck in Rust. You can find my source-code on Github. This article focuses on SHA 6b758b.

SmallCheck is a property testing approach proposed by Runciman, Naylor and Lindblad in their 2008 "SmallCheck and Lazy SmallCheck: Automatic Exhaustive Testing for Small Values". Runciman, Naylor and Reich sum up the approach in their 2013 "Advances in Lazy SmallCheck":

A property-based testing library uses a strategy to search the test data space for counterexamples to a given property. For example, QuickCheck randomly selects a fixed number of test-data values. SmallCheck, on the other hand, exhaustively constructs all possible values of a particular type, bounded by the depth of construction (or some appropriate metric for non-algebraic types).

That might be a little inside-baseball if property based testing is new to you. The idea is straightforward. You've got some piece of software you want to test, maybe it's a function, maybe it's an integrated piece of a program. Whatever it is, we'll call the general sense of thing-we-want-to-test a System Under Test (SUT). The common way of testing is this: you, the developer, think up some inputs for the SUT, feed those inputs to the SUT and then verify that the outputs and/or side-effects are what you need them to be. Testing in this manner conflates two things:

  • production of input data
  • verification of software property

Property-based testing separates those two. Production of input data becomes an automatic process done by software. Properties are still produced by developers. QuickCheck – introduced by Claessen and Hughes in their 2000 "QuickCheck: A Lightweight Tool for Random Testing of Haskell Programs" – works by producing values from a pool of inputs at random. That is, if you want to verify that a + b == b + a for all positive integers (not subject to overflow) then QuickCheck will produce N random values of a and b and check the above property. It works super well.

Now, what about that caveat "not subject to overflow"? One problem with QuickCheck is that if a failure condition is suitably rare it's possible that QuickCheck will miss the inputs. This is a function of the exact production strategy your QuickCheck implementation is using as well as how many test cases you're running. In Rust

<u8>::max_value() + <u8>::max_value()

is not even a u8! This is true for any a: u8 and b: u8 where a >= <u8>::max_value() / 2 and b >= <u8>::max_value() / 2. Finding two values where this holds in the pool of possible u8s is not unlikely but kick that up to u128 and it's going to be less so.

SmallCheck trades this possibility of missing inputs for time. Rather than choose random inputs from the pool it enumerates over all possible inputs, up to some "depth" proposed by the tester. A type u8 will be the sequence 0, 1, 2, 3, ... <u8>::max_value() etc. The original SmallCheck paper asserts that

(t)he principal motivation for this approach can be summarised in the following observations, akin to the small scope hypothesis behind model-checking tools such as Alloy (Jackson 2006). (1) If a program fails to meet its specification in some cases, it almost always fails in some simple case. Or in contrapositive form: (2) If a program does not fail in any simple case, it hardly ever fails in any case. A successful test-run using our tools can give exactly this assurance: specified properties do not fail in any simple case.

It's an interesting hypothesis. The authors are well aware that exhaustive searching is time consuming – which is maybe okay – but also wasteful – which is not okay. The motivating example in the 2008 paper is insertion into a set represented by a list.

type Set a = [a]
insert :: Ord a => a -> Set a -> Set a
ordered :: Ord a => Set a -> Bool

This insert should always preserve the property:

prop_insert Set :: Char -> Set Char -> Property
prop_insert Set c s =
  ordered s ==> ordered (insert c s)

In English: if a character c is inserted into a set s the resulting set s' must also be sorted. SmallCheck will have to enumerate two types here: a character c and a set s. For the test to be performed s must be sorted, requiring that any s which is not be thrown out and the next enumeration tried. "Strict" SmallCheck will generate boatloads of these things, even if it's clear that ordered will always be false. Consider the infinite number of sets which start [1, 0 ...]. All that construction is wasted time, which, if you're capping SmallCheck runs by time implies that you're testing less actual cases. The 2008 paper additionally proposes a "Lazy" SmallCheck to overcome this waste problem.

This ability to see the result of a function on many inputs in one go is very attractive to property-based testing: if a property holds for a partially-defined input then it will also hold for all fully defined refinements of that input. The aim of Lazy SmallCheck is to avoid generating such fruitless refinements.

Again, a little inside-baseball. What the authors have done is to exploit a characteristic of Haskell's laziness – partially defined inputs – to exclude all fully defined inputs which overlap in some important way. In the above, any fully defined s which overlaps with [1, 0 ...] can be tossed out.

I'm interested in replicating the author's work on Lazy SmallCheck in Rust. I think it should be possible, even though Rust does not, itself, have some of the key concepts that the author's exploit in their original paper. I started laying the groundwork in smallcheck 6b758b and stopped when I ran out of weekend. I'd originally hoped to produce a 'strict' variant of SmallCheck first but found out that's going to be more trouble than it's worth, unless I have some great insight through the week.

// Well, crap. I think to implement `next` for an arbitrary
// structure I'm going to have to look into the lazy
// approach. The problem here being that, while I can fiddle
// with simple types directly this is a cheat. `Serial`
// really has some state or, absent state, must _always_ be
// able to be interpreted into the 'next' value. 

We'll see!