
The Engine Hunts

The fastest cognitive engine ever built needs the same thing Watt’s steam engine needed in 1788: not more power, but a mechanism to make power usable.


In 1788, James Watt had a problem that had nothing to do with power.

His improved steam engine was the most capable prime mover in the world. It could drive looms, pumps, millstones — anything a factory owner connected to its rotating shaft. But factory owners were complaining. The engines surged and stalled. When a loom engaged or a millstone bit into grain, the load on the engine changed, and the engine could not adapt. Too much steam and the machinery ran dangerously fast — belts snapped, gears stripped, workers were hurt. Too little and the factory ground to a halt. The engine had more than enough force. What it lacked was the ability to match its output to what the world around it could actually use.

Drawing on a device already used in windmills, Watt adapted the centrifugal governor. Two heavy metal balls mounted on hinged arms, spinning with the engine shaft. As the engine sped up, centrifugal force swung the balls outward, which partially closed the steam valve, which reduced power, which slowed the engine. As the engine slowed, the balls fell inward, opening the valve, admitting more steam. A continuous mechanical feedback loop. The governor did not generate power. It regulated power. It sat between the engine and its work and continuously adjusted the one to match the other.

Before the governor, Watt’s engine was a machine with a temperament — powerful but unpredictable, useful only where its surges could be tolerated. After the governor, it was a machine that could be trusted. Factories that had relied on water wheels for their predictability began switching to steam. Not because the engine got stronger. Because it got governable.

We have built the most powerful cognitive engine in history. It can reason, plan, write, code, and analyze faster than any system before it. We are now connecting it to the world — to tools, APIs, databases, browsers, email, money, physical actuators. And it is surging and stalling. Generating more actions than the world can absorb. Committing to plans before the consequences of its last plan are visible. Overcorrecting at a speed no human feedback loop can match.

The driveshaft exists. The governor does not.

I. The Current Machinery

A language model, on its own, can only talk. The generation of agent runtimes built over the last two years — LangGraph, CrewAI, OpenClaw, OpenAI’s Assistants API, Anthropic’s tool-use architecture — changed that. They gave models persistence and tool access. The model can now call APIs, execute code, query databases, send messages, navigate browsers. These are driveshafts — infrastructure that couples the cognitive engine to the machinery of the world. The coupling is a genuine achievement. But a driveshaft transmits force. It does not regulate it.

A customer service agent receives a complaint about a delayed shipment. It diagnoses the issue in two seconds: the package is stuck at a distribution center. It initiates a reshipment through the fulfillment API — a call that will take forty-five seconds to process. While the call is in flight, the agent drafts an apology email, generates a discount code through the promotions system, and queues a follow-up message for twenty-four hours later. By the time the customer reads the first automated confirmation — “Your replacement is on its way” — the agent has already committed three more actions downstream.

Fifteen seconds into the reshipment call, the customer sends a follow-up: the original package just showed up. It had been sitting at a neighbor’s house. They don’t need a replacement — they’d like a refund for the two-day delay.

The reshipment is in flight — a second package heading to someone who already has theirs. The discount code exists for a problem that no longer does. The follow-up is queued. The apology email is in the customer’s inbox, promising a replacement they no longer need. None of these actions can be easily unwound, because none of them were designed to be unwound. Each was a one-shot tool call — fire and forget. The runtime transmitted each cognitive output to the world faithfully. It did not ask whether the premise behind the plan had changed since the plan was made.

This is not a failure of intelligence. The agent’s reasoning was sound at every step. It is a failure of regulation. The system generated actions faster than the world settled from previous actions, and nothing in the architecture modulated that rate. The agent does not know that its reshipment call has not completed. It does not have a way to stage actions as tentative commitments that can be revised before they become irrevocable. It acts at the speed it thinks, and the world receives those actions at whatever pace it can manage, and the gap between the two is where things go wrong.

This is the simplest failure: no governor at all. The more instructive failure is what happens when you add one.

II. The Hunting Problem

Watt’s governor worked. But not forever.

In the decades after 1788, steam engines grew faster and more powerful. The simple centrifugal governor, adequate for Watt’s relatively slow engines, began to fail in a specific and instructive way. As the engine sped up, the governor would respond — closing the throttle. But it would close the throttle too far. The engine slowed too much. The governor detected the slowdown and opened the throttle. Too far again. The engine surged. The governor slammed the throttle closed. The engine oscillated — fast, slow, fast, slow — hunting for a stable speed it could never quite reach.

By the mid-nineteenth century, with roughly seventy-five thousand steam engines running across England, hunting was a critical industrial problem. An oscillating engine damaged machinery, wasted fuel, and endangered workers. The governor was supposed to stabilize the system. Instead, it was destabilizing it — because its corrections were larger than the deviations they were correcting.

In 1868, James Clerk Maxwell published “On Governors” in the Proceedings of the Royal Society — the first mathematical analysis of feedback control. Maxwell showed that the hunting problem was not a defect in any particular governor. It was a property of the relationship between the governor, the engine, and the load. Stability required that the corrective signal be proportional to the error — not too aggressive, not too delayed, not too large relative to the system’s inertia. Get the proportions wrong and the governor amplifies the very instability it was built to suppress. Eighty years later, Norbert Wiener would cite Maxwell’s paper as a founding document of cybernetics.

This failure mode has a direct analogue in AI agents, and it is not the same problem as agents reasoning without feedback — the idle loop, the feedback desert, where the agent thinks faster than the world responds and fills the gap with unsupervised speculation. The hunting problem is the opposite. It is what happens when the agent acts faster than its own feedback settles, and each action generates new signals that trigger further action before the consequences of the previous action are fully visible.

A coding agent detects a failing test. It generates a fix. The fix passes the immediate test but breaks a dependency. The agent detects the broken dependency. It generates another fix. That fix introduces a subtle regression. The agent detects the regression. Each correction is locally rational. The sequence is globally unstable. The agent is not reasoning in the dark — it has feedback. But it is acting on that feedback faster than the system’s state has finished changing, so each action is responding to a snapshot that is already stale by the time the response lands. The governor is present — the agent has a feedback loop — but the governor is untuned, and an untuned governor hunts.

Maxwell’s insight was that you cannot fix hunting by making the governor stronger. A more aggressive corrective response makes the oscillation worse. You fix hunting by tuning the governor to the system — matching the speed and magnitude of the correction to the speed and inertia of the thing being corrected. This is not a lesson about caution. It is a lesson about engineering: the feedback mechanism must be calibrated to the dynamics of the system it regulates, or it becomes part of the problem.
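
Maxwell's point is easy to see in a toy simulation. The sketch below (illustrative, not a physical engine model — all names are mine) applies a proportional correction to a delayed measurement of engine speed. The only difference between the two runs is the gain: small corrections converge on the setpoint, corrections too large relative to the measurement lag oscillate with growing amplitude — the loop hunts.

```python
# A minimal discrete-time sketch of the hunting problem. The governor
# corrects speed toward a setpoint, but it acts on a *stale* measurement
# (the engine's state from `delay` steps ago). Whether the loop settles
# or oscillates depends on the gain relative to that delay.

def run_governor(gain: float, steps: int = 60, delay: int = 2) -> list[float]:
    """Simulate engine speed under delayed proportional correction."""
    setpoint = 100.0
    speeds = [0.0] * (delay + 1)  # seed history so the governor can look back
    for _ in range(steps):
        observed = speeds[-1 - delay]            # snapshot, already stale
        correction = gain * (setpoint - observed)
        speeds.append(speeds[-1] + correction)
    return speeds[delay + 1:]

def hunts(trace: list[float], setpoint: float = 100.0) -> bool:
    """Hunting: the error envelope at the end exceeds the one at the start."""
    errors = [abs(s - setpoint) for s in trace]
    quarter = len(errors) // 4
    return max(errors[-quarter:]) > max(errors[:quarter])

tuned = run_governor(gain=0.2)       # correction smaller than the error: settles
aggressive = run_governor(gain=0.9)  # correction overshoots: oscillation grows

print(f"tuned final error:      {abs(tuned[-1] - 100.0):.2f}")
print(f"aggressive final error: {abs(aggressive[-1] - 100.0):.2f}")
```

Note that the aggressive run fails not because the governor is weaker or slower — it is more forceful — but because its corrections are out of proportion to the lag in its measurements, which is exactly Maxwell's diagnosis.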

III. What the Governor Becomes

If the feedback mechanism must be calibrated to the system it regulates, what does calibration look like when the system is not a steam engine but a cognitive agent acting on the world through software?

Early operating systems trusted each program to yield the processor voluntarily. Cooperative multitasking. It worked until one program misbehaved — an infinite loop, a request that never returned — and the entire system froze. Every other program waited, helpless, for a process that would never yield. The shift to preemptive scheduling, where the operating system could interrupt any process at any time based on priorities, deadlines, and resource constraints, was the governor arriving in computing. The OS stopped trusting the engine to regulate itself and built the regulation into the architecture.

Agent runtimes are still cooperative. The agent decides when to act, when to wait, what to do next. Nothing outside the agent modulates that rate. When the world pushes back — a Stripe API rejecting requests because it cannot keep up, a fulfillment system queued thirty seconds deep, a human who has not responded — the agent does not feel the resistance as a signal to decelerate. It retries, or fails, or moves on. In fluid dynamics, this missing signal has a name: backpressure. When a pipe’s downstream capacity is lower than its upstream flow, the system physically resists the excess, and the sender slows. Agent architectures have the resistance — the slow APIs, the rate limits, the human latency. What they lack is the propagation. The signal that would close the loop between the world’s capacity and the agent’s output rate never reaches the agent’s decision-making.

Most runtimes have timeouts. Almost none have backpressure. A timeout kills the action. Backpressure slows the actor. One is a circuit breaker. The other is a governor.
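
The difference is small in code and large in behavior. Here is one way backpressure propagation could look — a hypothetical sketch, not any real runtime's API: instead of killing a slow call, every congestion signal from downstream (a rate-limit response, rising latency) widens the pause before the agent's next action, and every sign of headroom narrows it again.

```python
# A sketch of backpressure for an agent runtime (names are illustrative).
# A timeout would kill the in-flight action; this governor instead slows
# the actor, translating downstream congestion into upstream pacing.

class BackpressureGovernor:
    def __init__(self, base_delay: float = 0.0, max_delay: float = 30.0):
        self.delay = base_delay    # seconds to pause before the next dispatch
        self.max_delay = max_delay

    def before_dispatch(self) -> float:
        """How long the agent should wait before its next action."""
        return self.delay

    def on_response(self, status: int, latency: float) -> None:
        """Propagate the world's resistance back into the agent's pace."""
        if status == 429 or latency > 1.0:           # congestion: back off x2
            self.delay = min(max(self.delay, 0.1) * 2, self.max_delay)
        else:                                        # headroom: speed back up
            self.delay = max(self.delay / 2, 0.0)

# The pause tracks the world's capacity, not a fixed schedule:
governor = BackpressureGovernor()
for status, latency in [(200, 0.05), (429, 0.05), (429, 0.05), (200, 0.05)]:
    governor.on_response(status, latency)
    print(f"status={status} -> pause {governor.before_dispatch():.2f}s")
```

The multiplicative-increase, multiplicative-decrease shape here is one choice among many; the structural point is only that the congestion signal reaches the thing deciding when to act, rather than dying inside a retry wrapper.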

And even if the rate were governed, there is the problem of actions already dispatched. The customer service agent’s reshipment was irrevocable the moment it fired. In a governed system, it would have been staged — proposed, held pending confirmation that the premise still holds, committed only when the downstream system acknowledges readiness. The principle is familiar outside engineering: an escrow account holds funds until both parties confirm the deal, and releases them only when conditions are met. Database engineers formalized the same idea decades ago — transactions proposed, validated, and committed as a single unit, or rolled back entirely if conditions change. Treating agent actions as transactions against the world rather than one-shot invocations is what connects regulation to reversibility. Not every action can be staged — an email lands in someone’s inbox, a payment clears a bank account. For irreversible actions, the governed system’s role shifts from staging to gating: ensuring the premise is sound before the action dispatches, because there will be no rollback after.
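
Staging is also cheap to express. The sketch below (hypothetical names; no real framework is assumed) records each action with the premise it depends on, and re-checks that premise at commit time — so the reshipment from the earlier scenario would simply never fire once the customer reports the package arrived.

```python
# A sketch of agent actions as staged commitments rather than one-shot
# calls (names are illustrative). Each action carries the premise it was
# planned under; a premise that has gone stale aborts the action cleanly
# instead of dispatching it into the world.

from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class StagedAction:
    name: str
    premise: Callable[[], bool]        # must still hold at commit time
    execute: Callable[[], None]
    compensate: Optional[Callable[[], None]] = None  # rollback, if one exists

class ActionLedger:
    def __init__(self):
        self.staged: list[StagedAction] = []
        self.committed: list[str] = []

    def stage(self, action: StagedAction) -> None:
        self.staged.append(action)

    def commit_all(self) -> list[str]:
        """Fire only actions whose premise still holds; return the rest."""
        dropped = []
        for action in self.staged:
            if action.premise():
                action.execute()
                self.committed.append(action.name)
            else:
                dropped.append(action.name)  # premise went stale: never fires
        self.staged.clear()
        return dropped

# The customer-service scenario, replayed with staging:
world = {"package_missing": True}
ledger = ActionLedger()
ledger.stage(StagedAction("reship",
                          premise=lambda: world["package_missing"],
                          execute=lambda: print("reshipment dispatched")))
world["package_missing"] = False          # customer: the package showed up
print("dropped:", ledger.commit_all())    # the reshipment is never sent
```

The `compensate` field marks the boundary the essay describes: actions that have a rollback can be committed optimistically and unwound; actions without one belong behind the premise gate, because there is no undoing them after dispatch.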

Scheduling, backpressure, staged commitments — each is a form of the calibration Maxwell described: matching the agent’s output to the world’s capacity to absorb it. The raw materials exist in the downstream infrastructure already — rate limits, transaction support, slowdown signals in every API. What does not exist is the feedback loop that connects them to the agent’s decisions.

The industry asks what agents can do. The better question is what governs how fast they do it. Without that feedback loop, capability alone is Watt’s engine before the governor: powerful, unpredictable, trusted only where its surges can be tolerated. The biggest failure mode is not an agent too dumb to act. It is an agent smart enough to generate more plausible actions than the world can absorb.

The governor did not make the engine weaker. It made the engine trustworthy. Factories switched from water wheels to steam not when the engine became more powerful — it was already more powerful — but when the governor made its behavior predictable under changing loads. More useful precisely because more governed. When thinking becomes cheap, the bottleneck moves to its complements. The governor is one of them — not safety in the sense of preventing catastrophic misuse, but safety in the engineering sense. The mechanism by which a powerful system produces stable output under variable conditions. The thing that makes the engine worth trusting with real work.

IV. Where This Breaks

The obvious objection: steam is a dumb force. It does not plan. It does not model the world. It pushes a piston, and that is all. A language model is not steam. It reasons, anticipates, adjusts. A sufficiently capable model might govern itself — might learn when to act and when to wait, when to commit and when to stage, when the world is ready and when it is not. The external governor might be a transitional artifact, necessary only because current models lack the judgment to self-regulate, and destined to be absorbed into the model itself as metacognition improves.

This objection deserves serious engagement, because it may be right. If models develop reliable uncertainty calibration — the ability to know what they do not know, and to modulate their actions accordingly — then many of the governor’s functions could migrate into the model. The scheduling, the backpressure sensitivity, the commitment staging — all of these could, in principle, become learned behaviors rather than architectural constraints.

But control theory suggests something stronger than a practical limitation. To regulate itself, a system must observe its own state, evaluate whether correction is needed, and act on the evaluation. Each step takes time. During that time, the system continues running — its state changes even as the observation is being processed. When the system is its own governor, the correction cycle runs at the same speed as the process it is correcting. The measurement is always trailing the reality. This is not a gap that faster models close. It is a property of self-regulation in any system whose state changes meaningfully during its own observation cycle. Maxwell would have recognized the structure immediately: it is the hunting problem, internalized.

A model might learn to predict rather than react — to steer toward where the world is going rather than where it was. But the world that agents act on — APIs, humans, markets, physical systems — resists reliable prediction for a structural reason: the agent’s actions are part of the causal system it is trying to predict. Its predictions inform its actions, and its actions change the world the predictions were about. The faster it acts, the faster it invalidates the model it is steering by.

I suspect the external governor will shrink as models improve. But the need for architectural mediation between a fast process and a slow world is structural, not temporary. The governor’s form may change — from crude centrifugal balls to sophisticated adaptive control — but the function endures, because the problem it solves does not go away when the engine gets smarter. It gets sharper.

Whether the governor is permanent infrastructure or a temporary scaffold while models mature is an open question. What is not open is the need for it now. The engines are running. The driveshafts are connected. The hunting has already started.


Watt treated the governor as a correction — a device bolted onto a finished machine to stop the surging. I think Maxwell saw something Watt did not: that the governor was the other half of the machine. Stability, as he had demonstrated, is not a property of the engine alone — it is a property of the relationship between engine, governor, and load. An engine without a governor is not merely unregulated. It is incomplete, because the capacity to match output to load is not an accessory to the engine’s purpose. It is the purpose. Raw power was never the point. Useful work — power governed to what the world can absorb — was always the point.

The cognitive engine is running. The governor is the work.

