The two-second pause: the freeze that hid inside the rain


The Long Watch is a slow game on purpose. The whole register is unhurried, golden-hour calm — a world you can sit with. So the bar we set ourselves was a plain one: a finished world should hold your attention for fifteen minutes of simply watching it breathe.

And for a long time it couldn’t, because every second and a half or so the entire game would lock solid for roughly two seconds, lurch, and lock again. The stutter had been there for ages, predating the work around it; we’d just never chased it down. This is the story of chasing it down — and of how the culprit turned out to be the rain.

The wrong suspect, cleared early

The world’s terrain is reshaped over time by a rainfall-and-erosion simulation: roughly once a second it works out how much rain is falling everywhere, then hands a fast carving pass to the graphics hardware to actually wear the land down. Erosion is meant to be slow — it plays out over in-game years — but the bookkeeping behind it ticks about once a second, which lined up suspiciously well with a freeze that recurred about that often.

So we reached for the obvious suspect first. Handing a job to the graphics card and reading the answer back is exactly the kind of operation that stalls a game — the processor sits idle, waiting on the hardware to finish and report back. It was the natural thing to blame.

Then we measured it instead of guessing. The graphics round-trip came back healthy and fast — a couple of milliseconds, nothing you’d ever feel. It was innocent. The freeze was somewhere else entirely, and it had been sitting in plain sight the whole time.
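For readers who want to run the same kind of check, here is a minimal sketch of timing two suspects side by side. It is not our engine code: `stage_timer`, `gpu_round_trip` and `build_rain_map` are hypothetical names, and the two stand-in functions only simulate work so the example runs on its own.

```python
import time
from contextlib import contextmanager


@contextmanager
def stage_timer(name, results):
    """Record wall-clock milliseconds spent inside the with-block under results[name]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.setdefault(name, []).append(elapsed_ms)


def gpu_round_trip():
    # Hypothetical stand-in for dispatching the carving pass and reading it back.
    time.sleep(0.002)


def build_rain_map():
    # Hypothetical stand-in for the CPU-side rainfall sampling.
    return [((i * 31) % 97) / 97.0 for i in range(50_000)]


timings = {}
with stage_timer("gpu_round_trip", timings):
    gpu_round_trip()
with stage_timer("build_rain_map", timings):
    build_rain_map()

for name, samples in timings.items():
    print(f"{name}: {max(samples):.2f} ms worst case")
```

The point of the design is that both suspects are measured the same way in the same run, so the comparison does not rest on intuition about which one ought to be slow.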

The obvious suspect is the one you have to measure first, precisely because it’s obvious.

It was the rain, all at once

Before the carving pass can run, the simulation needs a fresh map of how much rain is falling at every point on the world grid — tens of thousands of cells. And the old code built that whole map in a single stroke, the instant each erosion tick came due, on the very thread that has to keep drawing frames and keeping the game responsive. One giant burst of sampling, about once a second, blocking everything else while it ran. That was the freeze.

Each cell on its own is cheap, but it isn’t a simple lookup. Working out the rainfall at one spot walks the weather’s storm schedule, layers in a few kinds of noise, and applies the seasonal swing. Tiny per cell — ruinous times tens of thousands, all in one frame, with everything else the frame has to do competing for the same moment. The processor wasn’t waiting on anything. It was buried in honest work, doing all of it at once.

A golden-hour voxel valley under rain, shown as one heavy downpour on one side and a soft even drizzle on the other. Concept art · pre‑alpha
Same rain, same result — delivered all in one burst, or trickled gently across many moments.

The property that let us cut it up

The fix turned on one quiet, important fact about the weather model: asking it how much rain falls here, at this spot, at this moment always returns the same answer for the same inputs. The weather carries no hidden state that drifts between frames — it’s a pure, repeatable calculation. (That purity is the same discipline the whole simulation leans on, and it has its own story.)
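To make "pure" concrete, here is a small sketch of what such a function could look like. Everything in it is an assumption made for illustration: the storm table, the integer-hash noise, the 360-unit year and the weights are invented, and the real weather model is more involved. The shape is what matters: the result depends only on the arguments and on constants.

```python
import math

# Hypothetical storm schedule: (start, end, centre_x, centre_y, radius, intensity)
STORMS = (
    (0.0, 40.0, 10.0, 12.0, 8.0, 1.0),
    (25.0, 90.0, 30.0, 5.0, 12.0, 0.6),
)


def hash_noise(x, y, seed):
    """Deterministic value in [0, 1) derived only from its integer inputs."""
    h = (x * 374761393 + y * 668265263 + seed * 2147483647) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    return (h ^ (h >> 16)) / 2**32


def rainfall(x, y, t, seed=7):
    """Rain at cell (x, y) at time t. Keeps no state and mutates nothing."""
    total = 0.0
    # Walk the storm schedule and add every storm active at time t.
    for start, end, cx, cy, radius, intensity in STORMS:
        if start <= t < end:
            dist = math.hypot(x - cx, y - cy)
            total += intensity * max(0.0, 1.0 - dist / radius)
    # Layer two scales of noise, then apply the seasonal swing.
    noise = 0.5 * hash_noise(x, y, seed) + 0.25 * hash_noise(x // 4, y // 4, seed + 1)
    seasonal = 1.0 + 0.5 * math.sin(2.0 * math.pi * t / 360.0)
    return (total + 0.2 * noise) * seasonal


# Same inputs, same answer, no matter how often or in what order we ask.
assert rainfall(10, 12, 30.0) == rainfall(10, 12, 30.0)
```

Note what is absent: no random number generator with hidden state, no "last frame" cache, no clock read inside the function. Any of those would make the answer depend on when the question was asked.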

If the answer never depends on when you ask, then you don’t have to ask for all of it at once. We could fill the rainfall map a little at a time — a few thousand cells each frame — spreading the identical work across many frames and accumulating the result, and the finished map would be exactly the same as if we’d built it in one stroke. The erosion tick simply waits for a complete map before it carves; it never works from a half-filled one. And because erosion is so deliberately slow, taking a few extra seconds to assemble the map is imperceptible against a process that models years.

The crucial line is that this changes only when the rainfall values are computed, never what they are.
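A minimal sketch of the time-slicing idea follows. The class name `SlicedRainMap`, the flat cell index, the stand-in `sample_rain` and the sizes are all hypothetical. One detail is our own reading of what the approach requires rather than a quote from the engine: the time passed to the sampler is frozen when the fill begins, so every cell is sampled at the same moment even though the sampling happens on different frames.

```python
def sample_rain(index, tick_time):
    """Hypothetical pure stand-in for the real per-cell rainfall calculation."""
    return ((index * 2654435761 + int(tick_time) * 40503) % 1000) / 1000.0


class SlicedRainMap:
    """Fills a rainfall map a fixed number of cells per frame."""

    def __init__(self, cell_count, cells_per_frame):
        self.cell_count = cell_count
        self.cells_per_frame = cells_per_frame
        self.values = [0.0] * cell_count
        self.cursor = 0
        self.tick_time = 0.0

    def begin(self, tick_time):
        """Start a fresh map for the erosion tick at tick_time."""
        self.tick_time = tick_time  # frozen for the whole fill
        self.cursor = 0

    def step(self):
        """Do one frame's share of sampling. Returns True once the map is complete."""
        end = min(self.cursor + self.cells_per_frame, self.cell_count)
        for i in range(self.cursor, end):
            self.values[i] = sample_rain(i, self.tick_time)
        self.cursor = end
        return self.cursor == self.cell_count


rain = SlicedRainMap(cell_count=40_000, cells_per_frame=3_000)
rain.begin(tick_time=12.0)
frames = 1
while not rain.step():  # in a game, each step() call would sit in a separate frame
    frames += 1
print(frames)  # 14
```

The caller only hands the map to the carving pass after `step()` has returned True, which is how the erosion tick avoids ever seeing a half-filled map.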

Spread heavy work thin across time. Refuse to let any single frame do too much.

Proving it, not assuming it

“The same calculation, just split up” is the kind of claim that feels obviously true and is exactly where bugs hide. So we didn’t trust it; we proved it. The gradually built map was checked, byte for byte, against the all-at-once map, and that exact match was the hard gate: the fix only counted if the answer never moved by a single bit. A faster way to get a different world is not an optimization. It’s a regression wearing a disguise.
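A byte-for-byte gate of this kind can be sketched in a few lines. The function names, the stand-in sampler and the slice sizes below are hypothetical. The one deliberate choice is comparing raw bytes rather than float values, because float equality is looser than "never moved by a single bit": it treats 0.0 and -0.0 as equal.

```python
import math
from array import array


def sample_rain(index, tick_time):
    """Hypothetical pure stand-in for the real per-cell rainfall calculation."""
    return 0.5 + 0.5 * math.sin(index * 0.37 + tick_time)


def build_all_at_once(cell_count, tick_time):
    """The old path: every cell in one stroke."""
    return array("d", (sample_rain(i, tick_time) for i in range(cell_count)))


def build_in_slices(cell_count, tick_time, cells_per_frame):
    """The new path: the outer loop stands in for successive frames."""
    out = array("d", [0.0]) * cell_count
    for start in range(0, cell_count, cells_per_frame):
        stop = min(start + cells_per_frame, cell_count)
        for i in range(start, stop):
            out[i] = sample_rain(i, tick_time)
    return out


def bit_identical(a, b):
    """Compare raw bytes, so even a sign flip on a zero counts as a mismatch."""
    return a.tobytes() == b.tobytes()


reference = build_all_at_once(40_000, tick_time=12.0)
for cells_per_frame in (1, 777, 3_000, 40_000, 100_000):
    sliced = build_in_slices(40_000, 12.0, cells_per_frame)
    assert bit_identical(reference, sliced), f"map drifted at slice size {cells_per_frame}"
```

Running the comparison across awkward slice sizes (one cell, a size that does not divide the grid, a size larger than the grid) is cheap insurance against off-by-one errors at the slice boundaries.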

There was one honest detour worth admitting. We first sized the per-frame slice using a cost we’d measured in isolation — and that estimate turned out to be roughly ten times too optimistic once the sampling had to share the thread with terrain rendering and the real scene streaming in. Under that real contention each cell cost far more than the clean-room number, so a slice that looked comfortable on paper still tanked the frame rate. The lesson there is older than this bug: tune against the contended conditions the work will actually run in, not the idealized cost of it running alone.
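The arithmetic behind that detour is simple enough to write down. Every number here is invented for illustration (the grid size, the 3 ms budget and both per-cell costs); only the tenfold gap between the isolated and contended cost comes from what we saw.

```python
def cells_per_frame(budget_ns, cost_per_cell_ns):
    """Largest whole number of cells that fits in the per-frame time budget."""
    return budget_ns // cost_per_cell_ns


def frames_to_fill(cell_count, per_frame):
    """Frames needed to finish the map, rounding up."""
    return -(-cell_count // per_frame)


CELL_COUNT = 40_000        # hypothetical grid size
BUDGET_NS = 3_000_000      # hypothetical: 3 ms of each frame reserved for rainfall
ISOLATED_COST_NS = 100     # hypothetical clean-room cost per cell
CONTENDED_COST_NS = 1_000  # ten times worse once the thread is shared

optimistic = cells_per_frame(BUDGET_NS, ISOLATED_COST_NS)   # 30_000 cells
realistic = cells_per_frame(BUDGET_NS, CONTENDED_COST_NS)   # 3_000 cells

# What the optimistic slice really costs under contention.
actual_ms = optimistic * CONTENDED_COST_NS / 1_000_000      # 30.0 ms against a 3 ms budget

print(optimistic, frames_to_fill(CELL_COUNT, optimistic))   # 30000 2
print(realistic, frames_to_fill(CELL_COUNT, realistic))     # 3000 14
print(actual_ms)                                            # 30.0
```

With these made-up numbers, the slice sized from the clean-room cost overruns its budget tenfold on every frame it runs, while the slice sized from the contended cost fits the budget and simply takes more frames to finish.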

The clever guard we built, then deleted

We very nearly over-engineered the fix. While the terrain was still warming up — loading and streaming in — the frame was at its most crowded, so we added a guard that would pause the rainfall sampling until the terrain had settled. Reasonable on its face.

Two things killed it. First, the engine offered no reliable signal for “terrain finished loading,” so the guard was leaning on something it couldn’t actually know. Second, when we tested step by step whether it was pulling any weight, it wasn’t: simply choosing a smaller, well-measured per-frame slice held the frame rate on its own. The guard was redundant. So we took it out, in favour of the simplest thing that worked. As a bonus, that kept the simulation cleanly separated from anything that peers at the rendering side, which is a boundary we like to keep clean.

The simplest fix that holds is the one to keep. Delete the clever machinery the moment it stops earning its place.


What’s left standing guard

The recurring freeze is gone. The world now runs at a steady frame rate, and you can sit and watch it the way the game always meant you to. But a fix you can’t defend will quietly rot, so three always-on checks now stand watch to keep the stutter from creeping back unnoticed: one that trips if any single frame takes too long, one that watches the frame-rate floor, and one that re-confirms the spread-out map still matches the all-at-once map exactly. If the freeze ever returns, or the trick ever stops being honest, one of them goes red before a player ever feels it.
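The first two checks can be sketched as a small frame-time watcher. This is an illustration under assumptions, not our actual guard: the class name `FrameWatch`, the thresholds and the rolling-window approach are all hypothetical. The third check is the exact map comparison and is not repeated here.

```python
from collections import deque


class FrameWatch:
    """Flags any single long frame and any drop below a frame-rate floor."""

    def __init__(self, max_frame_ms, min_fps, window=120):
        self.max_frame_ms = max_frame_ms
        self.min_fps = min_fps
        self.recent = deque(maxlen=window)  # rolling window of frame times
        self.alarms = []

    def record(self, frame_ms):
        self.recent.append(frame_ms)
        # Check 1: no single frame may exceed the limit.
        if frame_ms > self.max_frame_ms:
            self.alarms.append(f"long frame: {frame_ms:.1f} ms")
        # Check 2: average rate over a full window must stay above the floor.
        if len(self.recent) == self.recent.maxlen:
            fps = 1000.0 * len(self.recent) / sum(self.recent)
            if fps < self.min_fps:
                self.alarms.append(f"fps floor: {fps:.1f}")


watch = FrameWatch(max_frame_ms=50.0, min_fps=30.0, window=4)
for ms in (16.0, 16.0, 2000.0, 16.0):  # one simulated two-second freeze
    watch.record(ms)
print(watch.alarms)
```

The two checks catch different failures: a single long frame can hide inside a healthy average, and a uniformly sluggish run never produces one standout frame.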

This was one freeze in one system, but the shape of it travels. Measure the obvious suspect before you trust it. Find where the time actually goes. Lean on a calculation being pure so you can split it across time without changing its answer — and then prove the answer didn’t change. Tune against real load, not a quiet room. And keep deleting the cleverness you don’t need. The broader, profile-first pass that taught us where a whole frame’s time really hides is its own story; this one was just the rain.

A calm world is an engineering achievement, not only an art-direction choice. The way you keep it calm is by refusing to let any single moment carry more than it can.
