Desirable Unit Tests

A Test Desiderata Perspective

Jan 20, 2022

The Test Desiderata collect desirable properties for tests. You can think of them as sliders—more of this, sometimes less of that. Alternatively, they form a 12-dimensional space of all possible tests.

I wrote the Test Desiderata when I noticed that some people fixate on a single kind of test, a single point in that space, and claim that those are the only valuable tests. We aren’t done exploring all of what testing can do for us as programmers or for those whose lives we affect with our programs.

Tim Ottinger recently tweeted about properties of unit tests, in particular the FIRST principles he compiled with Jeff Langr. They looked at good unit tests and described their properties. I realized I could do just the opposite: look at the properties and use “settings” on each property to describe the attractor we call “unit tests”.

Here they are:

Isolated — Unit tests are completely isolated from each other, creating their own test fixtures from scratch each time. (Note that I’m not saying these are the only useful tests, just that if tests aren’t isolated you’re going to have a hard time making the case that they are “unit tests”.)
Composable — Follows from isolation.
Deterministic — Should be. Code that uses a clock or random numbers should be passed those values by its unit tests.
Specific — If a unit test fails, the code of interest should be short.
Behavioral — If the behavior changes accidentally, a unit test should fail.
Structure-insensitive — This can be a challenge for unit tests. Too much mocking, especially strict mocking, is a structure sensitivity nightmare.
Fast — Yep.
Writable — Good interface design makes writing unit tests easier. Alternatively, difficult-to-write unit tests are the canary in the bad interface coal mine.
Readable — It can be a challenge to write readable unit tests, because you are seeing so little context compared to the whole system.
Automated — Yep.
Predictive — Unit tests passing likely gives little confidence that the whole system is working. Unit tests failing should give great confidence that the whole system is not working.
Inspiring — A frequently-run unit test suite gives great confidence the programming is progressing. Sometimes I run my unit tests 2 or 3 times, just because it feels good (and they’re wicked fast).
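
To make the Isolated and Deterministic settings concrete, here is a minimal sketch in Python. Everything in it (Session, pick_winner, make_session, the dates) is hypothetical and made up for illustration, not taken from any real codebase. The idea it tries to show: the code under test receives the current time and the random source as arguments, and each test builds its own fixture from scratch.

```python
import random
from datetime import datetime, timedelta


class Session:
    """Hypothetical code under test: a login session that expires."""

    def __init__(self, started_at, lifetime=timedelta(minutes=30)):
        self.started_at = started_at
        self.lifetime = lifetime

    def is_expired(self, now):
        # The current time is a parameter, so the caller (or a test) decides it.
        return now - self.started_at >= self.lifetime


def pick_winner(entrants, rng):
    # The random source is a parameter; a test can pass in a seeded one.
    return rng.choice(entrants)


def make_session():
    # Each test calls this to get a brand-new fixture, so no test can
    # leak state into another and the tests can run in any order.
    return Session(started_at=datetime(2022, 1, 20, 12, 0))


def test_session_is_live_before_lifetime_elapses():
    session = make_session()
    assert not session.is_expired(now=datetime(2022, 1, 20, 12, 29))


def test_session_expires_once_lifetime_elapses():
    session = make_session()
    assert session.is_expired(now=datetime(2022, 1, 20, 12, 30))


def test_same_seed_picks_same_winner():
    entrants = ["ann", "bob", "cy"]
    first = pick_winner(entrants, random.Random(42))
    second = pick_winner(entrants, random.Random(42))
    assert first == second
```

The functions are named so that a runner such as pytest would collect them, though they are plain functions and can be called directly. Because nothing reads the real clock or the global random state, they should give the same answer on every run and in any order.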

Tim Ottinger

Jan 31, 2024

Regarding "when I noticed that some people fixate on a single kind of test, a single point in that space, and claim that those are the only valuable tests"

I hope it's clear that I wasn't suggesting that only one kind of test has value. The FIRST qualities of microtests (we didn't have the term back then) suggested that these are the ones best for TDD, because you can afford them in the tight, inner loop of coding.

Other tests have value, but slower tests are unlikely to be run after each minor refactoring step, like a variable rename.

Terry Yin

May 15, 2022

Inspired by your tweet, I define a unit test as "a small test." It is not about the size of the code under test. Small means fewer lines of test code, specifically in the first A (Arrange) of the AAA pattern. If the tests can be differentiated from each other by just a few lines of code, they are unit tests to me. I can usually achieve everything on your list, even if my tests exercise thousands of lines of code, use a database, and number in the hundreds. My e2e tests using multiple services can achieve almost everything on your list, except they are a bit slow. You might not want to run all of them 2 or 3 times for no reason, but they are bearable.

In summary, I don't do unit testing. My units (whatever that means) do not have their tests. I do unit tests, which are small tests decoupled from the code structure.

Software Design: Tidy First?