Shut Up and Compute | islam.ninja

The interpretation of AI is running faster than the evidence. The fix is less prophecy, more contact.

There’s an entire industry now for having opinions about AI.

Think pieces. Panels. Forecasts. One side says it will free us. The other says it will wreck us. Almost none of it is testable.

The interpretation is running faster than the evidence.

Physics went through something like this once. In the early decades of quantum mechanics, the predictions worked. The math was productive. The arguments about what it all meant never really ended. Decades later, David Mermin gave that attitude a name: shut up and calculate. Not because meaning did not matter. Because the formalism was moving faster than the story, and if you waited for a final interpretation you would wait forever.

AI is in that phase now. The tools are moving faster than the consensus about what they mean. That does not make the bigger questions fake. Labor effects, institutional lag, power concentration, distributional harm — all real. But if your goal is to understand what is changing at the level of work itself, altitude becomes a trap.

The unit that matters is not “work.” It is tasks. Handoffs. Approvals. The little steps that take forever.

That is the altitude to look from. Not because the higher questions are unimportant, but because this is where the higher questions become measurable. A claim about replacement, leverage, dependence, or harm has to pass through some actual chain of work before it becomes real.

When thinking gets cheap, what stays expensive?

Usually not the whole job. Usually the judgment around it.

Drafting gets cheap. Deciding what deserves a draft gets expensive. Code gets cheap. Defining the constraints and verifying correctness get expensive. Research gets cheap. Knowing what to ignore gets expensive.

The work does not disappear. It moves. And you do not find out where it moved by arguing from a distance. You find out by touching the workflow.

In one engineering workflow I care about, the drafting step has collapsed. A feature that used to take an hour of blank-page effort now appears in minutes. But the bottleneck did not vanish. It moved upstream — into framing the problem clearly enough that the model can act without wandering — and downstream, into verification: whether the output is merely plausible, whether it respects hidden constraints, whether it fits the system around it. The gain was real. The new cost was real too. That is the shape of the shift worth studying.

The same thing shows up in smaller places. Give an agent a bug report and it can usually produce a patch before a human has finished warming up. But the real question is not “can it code.” It is whether the bug report was specific enough, whether the reproduction steps actually triggered the failure, whether the test actually covers the failure, whether the fix touched a boundary the report did not mention. If you measure only generated code, AI looks like pure acceleration. If you measure the whole loop, the speedup is still there, but the expensive part has moved into problem definition and review.

This is where most commentary becomes less useful. Grand theorizing is often a way of not building. If the implications are big enough, you can justify standing back. If the meaning is unsettled, you can stay in critique mode indefinitely.

Some theorizing is necessary. Theory is how you connect local observations to bigger structures. But theory built too early is often just mood with a vocabulary. Mermin was not telling physicists to stop thinking. He was telling them not to confuse stories with results.

A better posture for AI than prophecy is the one Taiichi Ohno took on the factory floor. He did not begin with a grand theory of industrial civilization. He watched where parts piled up. Where workers waited. Where defects slipped through. The unit was the handoff, the blockage, the wasted motion. The theory that later emerged — what became the Toyota Production System — was powerful precisely because it was reverse-engineered from contact with the system.

Treat AI less like a debate and more like an instrument that reveals where your previous assumptions were wrong. Check workflows, not vibes. Ask small questions you can answer. What got faster? What got cheaper? What got riskier? What now needs tighter review? What suddenly became possible because a coordination cost collapsed?

That means keeping a boring ledger before drawing the interesting conclusion. How long did the task take before and after? Which steps disappeared? Which new review steps appeared? How many outputs were accepted without revision, and how many merely looked acceptable until someone with context touched them? The ledger will not answer the civilization-scale questions by itself. It will keep the answers from floating away from reality.
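One way such a ledger might look in code, as a minimal sketch rather than a prescribed method. Every name here (LedgerEntry, summarize, the field names) is hypothetical, and the numbers in the usage lines are illustrative placeholders, not measurements. The only real assumption is that times cover the whole loop, including framing and review, not just generation.

```python
from dataclasses import dataclass, field


@dataclass
class LedgerEntry:
    """One task, measured across the whole loop (hypothetical schema)."""
    task: str
    minutes_before: float            # end-to-end time without AI assistance
    minutes_after: float             # end-to-end time with it, review included
    steps_removed: list[str] = field(default_factory=list)
    review_steps_added: list[str] = field(default_factory=list)
    accepted_as_is: bool = False     # accepted without revision
    failed_context_review: bool = False  # looked acceptable until someone with context checked


def summarize(entries: list[LedgerEntry]) -> dict:
    """Aggregate the ledger into the few numbers the questions above ask for."""
    if not entries:
        raise ValueError("ledger is empty")
    total_before = sum(e.minutes_before for e in entries)
    total_after = sum(e.minutes_after for e in entries)
    if total_after <= 0:
        raise ValueError("minutes_after must sum to a positive value")
    n = len(entries)
    return {
        "tasks": n,
        "whole_loop_speedup": total_before / total_after,
        "accepted_as_is_rate": sum(e.accepted_as_is for e in entries) / n,
        "failed_context_review_rate": sum(e.failed_context_review for e in entries) / n,
        "steps_removed": sorted({s for e in entries for s in e.steps_removed}),
        "review_steps_added": sorted({s for e in entries for s in e.review_steps_added}),
    }


# Illustrative placeholder entries, not real data.
ledger = [
    LedgerEntry("draft feature", 60, 20,
                steps_removed=["blank-page draft"],
                review_steps_added=["check hidden constraints"],
                accepted_as_is=True),
    LedgerEntry("fix reported bug", 30, 25,
                review_steps_added=["confirm reproduction"],
                failed_context_review=True),
]
report = summarize(ledger)
```

The design choice worth noting is that the speedup is computed from end-to-end totals. A ledger that timed only the generation step would show the acceleration and hide where the cost moved.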

Real understanding tends to arrive later, from people who stayed close enough to reality to be corrected by it. They will still have views about ethics, labor, and power. They will just have earned them.

Shut up and compute. Not because meaning does not matter. Because this is how you get close enough for meaning to show up.