Every test passed, and it still broke
There is a particular flavour of humility that only a green check board can deliver. We had finished three new powers, run the whole suite, watched everything pass with nothing failing, called it done, and pushed it. The next day someone opened the real game. It crashed at launch — instantly, before a single frame of the world could draw. Every test we owned said the feature was finished. The game disagreed.
The powers themselves are the first that act on the sky rather than the land or the creatures — summon rain over a patch, calm a storm, push back frost to warm a little ground. How we taught a read-only sky to carry a change at all is its own engineering story, told in teaching a read-only sky to change. This post is about something narrower and, for us, more lasting: the gap between a board that says done and a game a person can actually start — and the rule we wrote down the day we fell into it.
Green on the wrong axis
The build had come together block by block, exactly the way you want it to. Each piece had a check. Each check passed. When the last one went green we ran the full suite, saw it come back clean with zero failures, and treated that as proof the feature was real. We even went a step further the next day and signed off the formal close, settling a last design call along the way — that calm a storm should quiet only the storm’s extra gusts and leave the everyday seasonal wind alone.
So by the time the crash showed up, this was not a feature anyone thought was in doubt. It was a closed, signed, pushed feature. That is exactly what made the crash instructive instead of merely annoying. A red check is a feature telling you the truth about itself. What we had was the opposite: a feature that had passed every question we knew to ask, and was still broken in the one place we’d never thought to ask about.
A full suite of passing checks is a statement about the world your tests built — not the world the player loads.
The world the tests never booted
The bug underneath the crash was small and humbling. When a world is created, a whole chain of references is handed forward into the live scene the player actually plays. The new piece that drives the weather powers had been built correctly — but it was never copied across that one boundary in the hand-off. The startup code reached for it, found nothing where it expected something, and fell over before the scene could finish loading.
The reason no test caught that is the real lesson, and it is uncomfortable. Not one of our checks ever booted the real played scene. Every one of them stood up its own little world by hand — a convincing stand-in, assembled another way — and reached the new weather machinery through a side door rather than through the genuine startup path. The one place where the reference was supposed to cross was the one place no test ever walked through. So the suite was green precisely because it was testing a world that never had the bug in it.
There was a check that booted the real scene. It had simply never been switched on — it existed, off to the side, registered with nothing that actually ran it. A perfectly green run sat on top of a hole shaped exactly like the door the player enters through.
The fix, and the rule the fix wasn’t
The immediate repair was a single thread. The weather machinery had a sibling — the read-only weather system — that already made the crossing correctly; we threaded the new piece across the same boundary the same way, with a guard on both the fresh-world path and the loaded-save path. The scene booted. The crash was gone.
But the durable change wasn’t the thread. It was a rule we wrote down for ourselves the same day, and it is the part of this story worth keeping:
A feature that touches the real startup path is not done until a registered check boots the real played scene — not a stand-in, however convincing.
To make the rule more than a good intention, we added a check that drives the genuine startup end to end — the actual scene a player gets, brought up the actual way the game brings it up. We proved it the only honest way: we ran it against the broken code first and watched it fail, then ran it against the fix and watched it pass. From now on, this whole class of gap shows up as a red check before anything ships, not as a crash after.
This was not the first time a green close measured the wrong axis — an earlier chapter was declared correct by its own checklist while almost none of it could be seen happening in the running game. And it sits inside a larger family of green-but-hollow checks we’ve catalogued elsewhere — checks that were registered but unloadable, or filed but never run — in the test that proved nothing. The thread connecting all of them is one stubborn question we’ve learned to ask of any green run: not “did it pass?” but “could it have failed, in the place that actually matters?”
The humbler bug in the same hour
While we were in there, a second, smaller indignity surfaced — the kind that no amount of correct code can catch, because it isn’t about whether the code runs but whether a person can reach it. The three new powers, along with the controls for looking around, had been bound to keys past the end of an ordinary keyboard — up in the far function row, off the edge of the twelve keys most keyboards actually have. The code was flawless. It was just stranded somewhere your hands can’t go.
So we remapped everything down onto keys that exist on real hardware: the three weather powers onto 1, 2, 3, the camera modes onto three plain letters, and a written list of the controls so a player can find them. “The code runs” and “a human can use it” turn out to be two entirely separate claims, and that day taught us both at once.
There was even a quiet third lesson in housekeeping. The little fix that threaded the missing reference nudged the startup file just over a size ceiling we hold ourselves to — a soft cap meant to keep any one file from quietly becoming a place too big to reason about. Rather than wave the line through, we lifted the overlay-building code out into a neighbouring file. No behaviour changed; the startup path simply got smaller and easier to read, which is exactly the moment you want it to, right after it has surprised you.
What actually shipped
It is worth being plain about what these three powers do and don’t do yet, because “done” is the word that nearly fooled us. For now they are look-and-feel: they change the sky, the falling particles, the mood over a patch of ground for a while, and then fade on their own. They do not yet reach down and change what the plants and creatures beneath them do — teaching a wish to bite the simulation is a deliberate later step, not part of this work. And because a player’s weather wishes are rebuilt from the record of what they did whenever a world reloads, none of this needed a single byte of new save data.
That is a modest list for a feature we’d called finished a day early. But the close that finally stuck wasn’t the all-green board — it was the moment the world’s caretaker actually started the game, played it, felt the gust quiet under calm a storm, and only then signed off. A green run can tell you the machinery moves. It can’t tell you the door opens. We added a check to ask that last question, and then we let the game answer it the way it always has the final say — by being played.



