Handing the rivers to the graphics card: our first whole simulation, moved off the processor

The land in The Long Watch is meant to erode continuously and slowly as the world runs — rain gathering, water finding the downhill, the ground wearing a hair thinner here and building a hair thicker there. The trouble was the bill.

Re-computing erosion across the whole map, every world tick, on the same processor that has to keep drawing the scene, simply would not fit inside the time a single frame gets. So we did something we had never done before in this game: we lifted an entire simulation off the main processor and handed it, wholesale, to the graphics card.

This is the engineering story of that handoff: the first time a real, ongoing world system in The Long Watch lived not on the processor but on the card. The carving itself, what the eroding land feels like to live in, is told from the world’s side in “A world that weathers”. This one is about where the computation runs, and what it cost us to trust it there.

Why it had to move

A graphics card is built for exactly the shape of work erosion is: the same small calculation, run independently across an enormous grid of cells, all at once. The map of land we erode is a grid — roughly a quarter of a million cells, each one standing for a square of ground about eight metres on a side. Stepping that grid means walking every cell and doing the same handful of arithmetic at each: let moisture accumulate, flow it downhill toward the steepest of its neighbours, and let the ground carve or deposit a sliver. On a processor, that is a quarter-million little jobs queued one behind the other. On a card, it is a quarter-million jobs done shoulder to shoulder.
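A minimal sketch of one such step, written in plain Python for readability rather than as a card kernel. The constants `RAIN`, `FLOW` and `CARVE`, the four-neighbour rule, and every function name here are assumptions made for illustration; only the three stages (accumulate, flow toward the steepest neighbour, carve or deposit) come from the description above. Each cell reads only the old grids and writes only its own new value, which is the property that lets every cell run at once.

```python
# Sketch only: constants and function names are assumptions, not from the game.
RAIN = 0.01    # moisture added to every cell each step (assumed)
FLOW = 0.5     # fraction of a cell's moisture that runs downhill (assumed)
CARVE = 0.001  # ground moved per unit of flowing moisture (assumed)

# Fixed neighbour order, so ties and sums resolve the same way every run.
NEIGHBOURS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def steepest_neighbour(height, x, y):
    """Return (nx, ny) of the lowest strictly-lower neighbour, or None."""
    rows, cols = len(height), len(height[0])
    best, best_h = None, height[y][x]
    for dx, dy in NEIGHBOURS:
        nx, ny = x + dx, y + dy
        if 0 <= nx < cols and 0 <= ny < rows and height[ny][nx] < best_h:
            best, best_h = (nx, ny), height[ny][nx]
    return best


def outflow(height, moisture, x, y):
    """Return (amount, target): moisture leaving (x, y) and where it goes."""
    target = steepest_neighbour(height, x, y)
    if target is None:
        return 0.0, None
    return (moisture[y][x] + RAIN) * FLOW, target


def step_cell(height, moisture, x, y):
    """New (height, moisture) for one cell, reading only the old grids."""
    rows, cols = len(height), len(height[0])
    out, _ = outflow(height, moisture, x, y)
    inflow = 0.0
    for dx, dy in NEIGHBOURS:
        nx, ny = x + dx, y + dy
        if 0 <= nx < cols and 0 <= ny < rows:
            amount, target = outflow(height, moisture, nx, ny)
            if target == (x, y):
                inflow += amount
    new_moisture = moisture[y][x] + RAIN - out + inflow
    # The cell loses ground with what flows out and gains it with what flows in.
    new_height = height[y][x] - out * CARVE + inflow * CARVE
    return new_height, new_moisture


def erosion_step(height, moisture):
    """Step every cell; on a card each step_cell call would be one thread."""
    rows, cols = len(height), len(height[0])
    cells = [[step_cell(height, moisture, x, y) for x in range(cols)]
             for y in range(rows)]
    new_height = [[c[0] for c in row] for row in cells]
    new_moisture = [[c[1] for c in row] for row in cells]
    return new_height, new_moisture


# A three-cell slope: water gathers at the bottom, ground moves downhill.
height, moisture = erosion_step([[3.0, 2.0, 1.0]], [[0.0, 0.0, 0.0]])
```

The sketch gathers inflow per cell instead of having each cell push water into its neighbour. That is a design choice of this example: a push would have several cells writing to the same target, which on parallel hardware makes the order of additions unpredictable.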

So we wrote the whole step as a single batched job for the card, dispatched once per world tick — and the world ticks about once a second, the slow heartbeat the rest of the simulation already runs on. The card walks every cell in parallel; the processor only has to ask for the step and move on.
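One way to picture "dispatched once per world tick" is a clock that collects frame time and fires the step whenever a full tick has elapsed. This is a sketch under assumptions: `WorldClock`, `TICK_MS` and the `dispatch` callback are hypothetical names, and integer milliseconds are used so the count never drifts with floating-point rounding.

```python
TICK_MS = 1000  # world tick interval: "about once a second" (assumed exact here)


class WorldClock:
    """Hypothetical helper: turns per-frame time into once-a-tick dispatches."""

    def __init__(self):
        self.accumulated_ms = 0
        self.ticks_dispatched = 0

    def advance(self, frame_ms, dispatch):
        """Add one frame's elapsed time; dispatch once per full tick elapsed."""
        self.accumulated_ms += frame_ms
        while self.accumulated_ms >= TICK_MS:
            self.accumulated_ms -= TICK_MS
            dispatch()  # stand-in for asking the card to run the erosion step
            self.ticks_dispatched += 1


clock = WorldClock()
steps = []
for _ in range(125):                       # 125 frames of 16 ms = 2000 ms
    clock.advance(16, lambda: steps.append("erosion step"))
```

After those 125 frames the callback has fired twice, once for each whole second of world time.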

The speed was the easy part

The card paid off immediately, and almost embarrassingly. A single step of the entire grid — the full quarter-million cells — cost only a couple of milliseconds. And because that step fires just once a second, while the visible change it produces takes many seconds to add up to anything, you can amortise its cost across all the frames in between. Spread that way, its contribution to any one frame came out well under a tenth of a millisecond — effectively nothing against a frame’s budget. Under the combined load of rendering and simulating, the frame rate held comfortably around sixty-two frames a second, with plenty of room above the floor we refuse to drop below.
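The amortisation is simple enough to check by hand. The sketch below takes "a couple of milliseconds" as 2 ms, which is an assumption, along with one tick per second and the sixty-two frames a second quoted above; `amortised_cost_ms` is a hypothetical helper name.

```python
def amortised_cost_ms(step_ms, ticks_per_second, frames_per_second):
    """Average per-frame cost of a job that runs a few times a second."""
    return step_ms * ticks_per_second / frames_per_second


# Assumed inputs: 2 ms per step, 1 tick per second, 62 frames per second.
per_frame_ms = amortised_cost_ms(2.0, 1.0, 62.0)
frame_budget_ms = 1000.0 / 62.0
share = per_frame_ms / frame_budget_ms

print(f"{per_frame_ms:.3f} ms per frame, {share:.1%} of the frame budget")
```

With those inputs the step averages about 0.03 ms per frame, roughly a fifth of one percent of a 16 ms frame. It is an average: the figure says nothing about how the step's work is scheduled within the frame that carries it.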

That is the part of this story that sounds like a win and was. We had taken a calculation the processor could not afford and made it nearly free. If speed had been the whole problem, we would have shipped it that afternoon and written nothing down.

Graphics hardware is fast but fickle. The speed was never the question. The question was whether we could trust the answer.

The harder problem: making the card honest

The Long Watch rests on one promise that everything else leans against: the same seed must grow the same world, every run, on every machine. Hand a friend a seed and you hand them the exact same place. Reload a save and it comes back byte-for-byte identical. That reproducibility is not a feature you can bolt on afterward — it is load-bearing, and the wider machinery that guards it has its own story.

The catch is that fast parallel hardware does not hand you bit-exact, repeatable results for free. The card is wonderful at doing the same thing a quarter-million times at once; it is not, by default, careful about doing it identically, run after run, machine after machine. Two problems stood between the working erosion step and an erosion step we could trust to honour the promise. We had to solve both before the card could be allowed anywhere near the saved world.

One: the numbers that quietly vanished

The first was a rounding quirk in the hardware we tested on. When a value got vanishingly small, the card quietly snapped it to zero rather than keeping it. For most arithmetic that is invisible and harmless. For ours it was a slow poison: in the low-moisture corners of the map, a cell’s moisture can drift down toward almost-nothing, and if the hardware rounds that almost-nothing away on one run but not on another, two otherwise-identical worlds begin, cell by cell, tick by tick, to diverge. Nothing dramatic happens at once. But reproducibility is a thing you have either perfectly or not at all, and a single cell quietly disagreeing is already broken.

The fix was small and blunt: never let a cell’s moisture fall below a tiny fixed minimum. Hold every value above that floor and the arithmetic stays inside the range the hardware treats the same way every time. The almost-nothing never reaches the threshold where the card decides to round it away, so the two runs can’t part company there. A one-line guard against a one-bit betrayal.
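A small sketch of why the floor works. It reads the quirk as flush-to-zero of subnormal 32-bit floats, which is an assumption about the hardware, and emulates it on the processor. `MOISTURE_FLOOR`, its value, and the helper names are hypothetical; the only part taken from the fix above is clamping moisture to a fixed minimum.

```python
import struct

F32_MIN_NORMAL = 2.0 ** -126   # smallest normal 32-bit float
MOISTURE_FLOOR = 2.0 ** -20    # hypothetical floor (about 9.5e-7), assumed value


def to_f32(x):
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def flush_to_zero(x):
    """Emulate hardware that snaps subnormal 32-bit values to zero."""
    return 0.0 if abs(x) < F32_MIN_NORMAL else x


def clamp_moisture(m):
    """The one-line guard: never let moisture fall below the floor."""
    return max(m, MOISTURE_FLOOR)


def dry_out(m, steps, flushes, use_floor):
    """Halve a cell's moisture repeatedly on flushing or non-flushing hardware."""
    for _ in range(steps):
        m = to_f32(m * 0.5)
        if flushes:
            m = flush_to_zero(m)
        if use_floor:
            m = clamp_moisture(m)
    return m


# Without the floor the two kinds of hardware disagree.
unguarded_a = dry_out(1.0, 130, flushes=True, use_floor=False)
unguarded_b = dry_out(1.0, 130, flushes=False, use_floor=False)

# With the floor the value never gets near the subnormal range, so they agree.
guarded_a = dry_out(1.0, 130, flushes=True, use_floor=True)
guarded_b = dry_out(1.0, 130, flushes=False, use_floor=True)
```

Here `unguarded_a` ends at zero while `unguarded_b` keeps a tiny non-zero value; both guarded runs end exactly at `MOISTURE_FLOOR`.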

Two: proving it, not believing it

The second problem was simply proving the result was stable. Having been bitten before, we no longer take “it should be reproducible” as anything but a hypothesis. So we ran the whole erosion pipeline twice, in two separate launches of the program, and checked that the entire moisture field across the whole grid came out bit-for-bit identical between them. Same seed, same world, two cold starts, and not one cell of difference. Only once that held did we believe the card was honest.
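A bit-for-bit check has to compare raw bytes, not numbers, because ordinary floating-point equality hides some differences (it treats 0.0 and -0.0 as equal) and invents others (a NaN never equals itself). A minimal sketch, assuming the field is stored as 32-bit floats; `field_bytes` and `first_difference` are hypothetical names.

```python
import struct


def field_bytes(field):
    """Pack a 2D field into raw little-endian 32-bit floats, row by row."""
    flat = [value for row in field for value in row]
    return struct.pack(f"<{len(flat)}f", *flat)


def first_difference(field_a, field_b):
    """Return (x, y) of the first cell whose bits differ, or None if none do.

    Assumes both fields have the same shape.
    """
    for y, (row_a, row_b) in enumerate(zip(field_a, field_b)):
        for x, (a, b) in enumerate(zip(row_a, row_b)):
            if struct.pack("<f", a) != struct.pack("<f", b):
                return (x, y)
    return None


run_one = [[0.25, 0.0], [1.5, 0.125]]
run_two = [[0.25, -0.0], [1.5, 0.125]]   # equal by ==, different in bits

identical = field_bytes(run_one) == field_bytes(run_two)
where = first_difference(run_one, run_two)
```

For a two-launch check like the one described, each launch could write its `field_bytes` output to a file and a separate step could compare the two files.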

Then we locked it down. We folded that exact, verified result into a single fingerprint and recorded it as the value to watch, so that any future change which silently nudges the math — a reordered operation, a different rounding, a well-meant tweak to the flow rule — trips the guard the moment it lands, instead of surfacing three sessions later as a save that won’t load. The general shape of that tripwire is the determinism story; here it just meant the card’s erosion shipped with a permanent witness standing over it.
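The tripwire can be sketched as a hash over the field's raw bytes, compared against a recorded value. SHA-256 is an assumption, since the text above does not say which fingerprint is used, and `fingerprint` and `check_fingerprint` are hypothetical names.

```python
import hashlib
import struct


def fingerprint(field):
    """Hash a 2D field's raw 32-bit float bytes into one hex string."""
    flat = [value for row in field for value in row]
    raw = struct.pack(f"<{len(flat)}f", *flat)
    return hashlib.sha256(raw).hexdigest()


def check_fingerprint(field, expected):
    """Raise if the field no longer matches the recorded fingerprint."""
    actual = fingerprint(field)
    if actual != expected:
        raise AssertionError(
            f"erosion fingerprint changed: expected {expected}, got {actual}"
        )


verified_field = [[0.25, 0.5], [1.5, 0.125]]
recorded = fingerprint(verified_field)       # stored once, after verification

check_fingerprint(verified_field, recorded)  # passes silently
```

Any change to a single bit of a single cell produces a different hash, so a reordered operation or a different rounding would fail the check at once.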

How slow it really is

Each step moves the ground by far less than a single voxel — deliberately sub-voxel, so the per-tick change is invisible and only the years add up to anything; what that slowness feels like to live in is its own story.

That deliberate slowness is also why the slow orbiting world you look at before you settle into a save shows the land before any of this has run — the surface exactly as the world begins, un-carved. The erosion only starts once you actually step in and time begins to pass; that first long look is its own story.

What it set the template for

The reason this work mattered to us is not really the rivers. It is that erosion was the first whole simulation we proved could live on the card, continuously, underneath everything else — cheap enough to vanish into the frame, and reproducible enough to trust with the saved world. Getting both of those true at once, on real hardware, is what turned the graphics card from a thing that only draws the scene into a thing that can quietly run part of the world. That proof is the template the rest of the slow world tick gets to follow. The first one is always the expensive one, because the first one is where you find out whether the bet pays. This one did.
