arnau

snapshots, property and simulation testing

Your test suite passes. Your coverage is 90%. You deploy to production. Your software breaks.

Traditional testing measures the wrong things. It counts lines of code covered, not bugs prevented. The best way to avoid bugs is to not write them in the first place, but when you do, you need tests that actually catch them. Unit tests mostly don't. They test components in isolation and miss the interactions that cause real failures. There are better tools. Most engineers never find them.

snapshot testing

I was reviewing tests with someone. They compared two structs field by field. Add a field, the test breaks. Rename a field, the test breaks. Nothing was wrong with the code. I suggested serializing the struct and comparing the output instead. If the shape changes intentionally, you update the snapshot. If it changes by accident, the test catches it.

Snapshot testing captures the output of a function and compares it to a stored version. Not field by field. The whole thing. When something changes, the test fails and shows you exactly what changed. You decide if it was intentional. You stop asserting what the output should be and start asserting it should not change unexpectedly.
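A minimal sketch of the idea, in Python with only the standard library. The `Invoice` type, `render`, and `check_snapshot` are hypothetical names made up for illustration, not any library's API. Real snapshot tools add review workflows on top, but the core is roughly this:

```python
import difflib
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Invoice:  # hypothetical type, for illustration only
    customer: str
    amount_cents: int
    currency: str


def render(invoice: Invoice) -> str:
    """Serialize deterministically: sorted keys, fixed indentation."""
    return json.dumps(asdict(invoice), indent=2, sort_keys=True) + "\n"


def check_snapshot(path: Path, actual: str, update: bool = False) -> None:
    """Compare `actual` to the stored snapshot.

    The snapshot is written on the first run, or when `update` is set.
    Otherwise any difference raises AssertionError with a unified diff.
    """
    if update or not path.exists():
        path.write_text(actual)
        return
    expected = path.read_text()
    if expected != actual:
        diff = "".join(
            difflib.unified_diff(
                expected.splitlines(keepends=True),
                actual.splitlines(keepends=True),
                "snapshot",
                "actual",
            )
        )
        raise AssertionError("snapshot mismatch:\n" + diff)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        snap = Path(tmp) / "invoice.snap"
        check_snapshot(snap, render(Invoice("acme", 1250, "EUR")))  # records
        check_snapshot(snap, render(Invoice("acme", 1250, "EUR")))  # passes
```

Adding or renaming a field changes the rendered text, so the test fails with a diff. If the change was intended, rerun with `update=True` and commit the new snapshot.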

property testing

Unit tests cover the cases you think of. Property tests cover the cases you don't.

Instead of writing specific inputs and expected outputs, you define properties that must always hold. Your payment amount should never be negative. Your array count should never exceed capacity. Your serialization should always round-trip. The framework generates thousands of random inputs and tries to break them.
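The round-trip property can be sketched like this, in Python with only the standard library. The run-length codec and the names `rle_encode`, `rle_decode`, and `check_round_trip` are hypothetical stand-ins for your own serializer. A real framework such as Hypothesis also shrinks failing inputs to a minimal case, which this hand-rolled loop does not:

```python
import random


def rle_encode(s):
    """Run-length encode a string into (char, count) pairs."""
    out = []
    for ch in s:
        if out and out[-1][0] == ch:
            out[-1] = (ch, out[-1][1] + 1)
        else:
            out.append((ch, 1))
    return out


def rle_decode(pairs):
    """Inverse of rle_encode."""
    return "".join(ch * n for ch, n in pairs)


def check_round_trip(runs=1000, seed=0):
    """Property: decode(encode(s)) == s for every generated s."""
    rng = random.Random(seed)  # seeded, so a failure is reproducible
    for _ in range(runs):
        # A tiny alphabet makes long runs likely, which is what RLE cares about.
        s = "".join(rng.choice("ab") for _ in range(rng.randint(0, 20)))
        assert rle_decode(rle_encode(s)) == s, f"round-trip failed for {s!r}"
    return runs
```

Note that nothing here states what `rle_encode("aab")` should return. The only claim is that encoding and then decoding gives back the input.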

This is where integer overflows surface. Edge cases you never imagined. Inputs that only fail at i32::MAX. You did not write a wrong test. You wrote a correct property and the framework found where your code violates it. Jane Street runs this in CI on every change. The bugs it finds are the ones that matter.
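To make the overflow case concrete, here is a sketch that simulates 32-bit signed arithmetic in Python, whose own integers never overflow. Everything in it is an illustrative assumption: `wrap_i32`, the two midpoint functions, and the boundary-biased generator `gen_bound`. The property is that the midpoint of `lo` and `hi` lies between them.

```python
import random

I32_MIN, I32_MAX = -2**31, 2**31 - 1


def wrap_i32(x):
    """Wrap an integer into the i32 range, like two's-complement overflow."""
    return (x - I32_MIN) % 2**32 + I32_MIN


def midpoint_naive(lo, hi):
    # lo + hi can exceed I32_MAX and wrap around to a negative number.
    return wrap_i32(lo + hi) // 2


def midpoint_safe(lo, hi):
    # hi - lo stays in range when 0 <= lo <= hi, so nothing wraps.
    return lo + (hi - lo) // 2


def gen_bound(rng):
    """Generate a non-negative i32, biased toward the boundaries."""
    if rng.random() < 0.25:
        return rng.choice([0, 1, I32_MAX - 1, I32_MAX])
    return rng.randint(0, I32_MAX)


def find_counterexample(midpoint, runs=1000, seed=0):
    """Return the first (lo, hi) violating lo <= midpoint <= hi, or None."""
    rng = random.Random(seed)
    for _ in range(runs):
        lo, hi = sorted((gen_bound(rng), gen_bound(rng)))
        if not lo <= midpoint(lo, hi) <= hi:
            return lo, hi
    return None
```

A unit test with `midpoint_naive(2, 8)` passes. The property check fails as soon as `lo + hi` exceeds `I32_MAX`, and finds nothing to complain about in `midpoint_safe`.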

Property testing forces you to think in invariants. What must always be true? That question alone makes your code better before you write a single test.

simulation testing

Some bugs only appear under specific sequences of operations. Not wrong inputs. Wrong order.

You push, pop, push again, and something breaks on the third operation, but only if the second one ran concurrently with the first. Simulation testing generates random sequences of operations, runs them against your implementation and a reference model, and verifies that invariants hold after every step. TigerBeetle runs its entire database through this. Millions of operation combinations. Bugs that would take months to find in production surface in minutes.
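A small single-threaded sketch of the loop described above, in Python. It does not model concurrency, and it is nowhere near what a database simulator does. `RingQueue` and `simulate` are hypothetical names: the ring buffer is the code under test, and a plain `deque` is the reference model.

```python
import random
from collections import deque


class RingQueue:
    """Fixed-capacity FIFO queue on a circular buffer (the code under test)."""

    def __init__(self, capacity):
        self.buf = [None] * capacity
        self.head = 0  # index of the oldest element
        self.count = 0

    def push(self, value):
        if self.count == len(self.buf):
            return False  # full: reject the value
        self.buf[(self.head + self.count) % len(self.buf)] = value
        self.count += 1
        return True

    def pop(self):
        if self.count == 0:
            return None
        value = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.count -= 1
        return value


def simulate(queue_cls, capacity=4, steps=200, seed=0):
    """Run random operations against queue_cls and a deque reference model."""
    rng = random.Random(seed)  # the seed makes any failure replayable
    queue, model = queue_cls(capacity), deque()
    for step in range(steps):
        if rng.random() < 0.6:
            value = rng.randint(0, 99)
            op = ("push", value)
            expected = len(model) < capacity
            if expected:
                model.append(value)
            actual = queue.push(value)
        else:
            op = ("pop",)
            expected = model.popleft() if model else None
            actual = queue.pop()
        # Both must agree on every result...
        assert actual == expected, (
            f"seed={seed} step={step}: {op} returned {actual!r}, "
            f"model says {expected!r}"
        )
        # ...and the invariant must hold after every step.
        assert queue.count == len(model) <= capacity
    return steps
```

The small capacity is deliberate: it forces the buffer to fill up and wrap around often, so index arithmetic gets exercised within a few steps. Run `simulate` over many seeds, and log the failing seed so the exact sequence can be replayed.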

You will not find this technique until you are writing systems where correctness actually matters. The hard stuff. Which is exactly why most engineers never find it.

Write hard things. Then write tests hard enough to test them.

Unit tests are not enough. They never were.