
When the Code Writes Itself

15 min read

On specification, taste, and the skills that survive automation

In 1952, Grace Hopper wrote the first compiler. It took instructions a human could read and write and translated them into the binary that machines could execute. Her colleagues told her it was impossible—that computers could not write programs. She built it anyway, and the entire profession of software engineering grew up in the space her invention created. For seventy years, that space has been defined by a single act: translation. A human understands a problem. A machine can execute instructions. The engineer bridges the gap.

That gap is closing.

Large language models can generate working code from natural language descriptions. They can read a codebase and explain what it does. They can find bugs, suggest refactors, write tests, scaffold entire applications. They are not yet reliable enough to replace engineers, but they are reliable enough to change what engineering means. Hopper automated the translation from human thought to machine code. AI is automating the next layer up—from human intent to working software. The distance between wanting and having is shrinking at a rate that would have seemed absurd five years ago.

This essay is about what that shift demands of the people who build software. Not a prediction about which jobs survive, but an attempt to think clearly about which skills become more valuable, which become less, and why. The core claim is simple: when execution gets cheap, direction gets expensive. The question is what “direction” means when the thing being directed is a software system.

I. What Is Actually Being Automated

It is important to be precise about this, because imprecision leads to either panic or complacency, and neither is useful.

What AI is automating is the mechanical middle of software engineering: the translation of a well-understood specification into working code. Given a clear description of what a function should do, an LLM can produce it. Given a known pattern—an API endpoint, a database query, a UI component matching a standard design—it can generate it faster than a human and often with fewer errors. Given a failing test, it can usually fix the code. The mechanical middle is where most junior engineering hours are spent, and it is where the bulk of the profession’s economic value has historically been located.

But the mechanical middle is not the whole act. Above it sits specification: deciding what to build, why, under what constraints, and how to know if it works. Below it sits diagnosis: understanding why a system behaves unexpectedly, tracing failures across layers of abstraction, reasoning about emergent behavior in complex architectures. And surrounding the entire process sits judgment: knowing when a technically correct solution is the wrong one.

AI is excellent at the middle. It is weak at the top and bottom. This asymmetry is the key to understanding everything that follows—but it is also an assumption, and we will return to test it.

II. The Skills That Depreciate

If the mechanical middle is being automated, then the skills that made an engineer fast at the mechanical middle lose relative value.

Syntax fluency—the ability to write correct code quickly in a given language—matters less when the machine writes correct code faster. Memorization of APIs, framework conventions, and boilerplate patterns matters less when the machine has memorized all of them. The ability to scaffold an application from scratch matters less when scaffolding is a prompt away. Even certain kinds of debugging—finding the typo, spotting the off-by-one error, tracing the null pointer—are increasingly within the model’s competence.

This is uncomfortable for engineers whose identity is built around these skills, and that discomfort deserves respect. A craft practiced for years does not become less meaningful because a machine can replicate parts of it. But the market does not trade in meaning. It trades in scarcity.

The engineer who is purely an implementer—who needs a detailed spec handed to them and returns working code—is occupying the exact space AI is most aggressively entering.

III. The Skills That Appreciate

What remains scarce tracks closely with the constraints that survive any explosion in capability. When power becomes abundant, value migrates to the things that power cannot buy: attention, coordination, trust, the awareness of irreversibility, and judgment. In software, these take concrete forms.

Attention becomes specification. The ability to determine what should be built—not in the loose sense of product intuition, but in the rigorous sense of defining behavior, constraints, edge cases, failure modes, and success criteria precisely enough that a system can execute against it. Fred Brooks identified this in 1986: the essential difficulty of software is not coding but deciding what the code should do. AI makes this insight louder, because it removes the accidental difficulty that used to muffle it. When the machine can write any function you can describe, the quality of your description becomes the bottleneck.

Specification is harder than it looks. Most software failures are not bugs in the implementation. They are bugs in the specification—requirements that were incomplete, ambiguous, contradictory, or simply wrong. AI does not fix this. If anything, it makes it worse, because the speed of implementation means that badly specified software gets built faster. The feedback loop between “this isn’t what I meant” and “but it’s what you asked for” gets tighter and more punishing. The spec is the real code now. Everything else is compilation.
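To make the point concrete, here is a minimal sketch in Python. The function, its contract, and its edge cases are invented for illustration; what matters is that the edge cases and failure modes, written down as executable checks, are the specification. Until they exist, “shorten long strings for display” is a wish, not a spec.

def truncate(text: str, limit: int) -> str:
    """Return text unchanged if it fits; otherwise cut it to exactly
    `limit` characters, the last of which is an ellipsis. Requires limit >= 1."""
    if limit < 1:
        raise ValueError("limit must be >= 1")
    if len(text) <= limit:
        return text
    return text[:limit - 1] + "…"

# The edge cases are the spec. Leave any of these undecided and
# "shorten long strings" admits a dozen incompatible implementations.
assert truncate("hello", 10) == "hello"        # fits: unchanged
assert truncate("hello world", 5) == "hell…"   # cut, ellipsis counted in the limit
assert truncate("", 3) == ""                   # empty input is legal
assert truncate("abc", 1) == "…"               # degenerate but defined
try:
    truncate("abc", 0)                         # out-of-contract input fails loudly
except ValueError:
    pass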

Irreversibility becomes system thinking. As AI handles more component-level work, the engineer’s value migrates to architecture—understanding how pieces interact, where coupling creates fragility, how a system will behave under load, under failure, under the slow drift of changing requirements over years. Architectural decisions are the irreversible commitments of software: choose your database, your service boundaries, your consistency model, and you will live with those choices long after the code around them has been rewritten ten times. System thinking is irreducibly holistic. You cannot prompt your way to it, because it requires holding the entire system in your head and reasoning about properties that emerge only from the interactions between parts. LLMs are excellent at local coherence. Architecture is a global property.

Trust becomes evaluation. Knowing whether the output is good. When an AI generates code, someone has to determine whether it is correct, performant, secure, maintainable, and appropriate for the context. The AI generates candidates. The engineer judges them. And judgment, here, is a form of trust calibration—knowing when to trust the machine’s output and when to doubt it, knowing which parts of a generated solution to accept and which to rewrite. Generation is a solved problem. Judgment never was. In a world of abundant generation, evaluation becomes the scarce resource.

Coordination becomes problem framing. Before specification, before architecture, someone must decide what problem is actually being solved. This is distinct from what the user says they want, which is often a solution rather than a problem. The ability to interrogate a request, identify the underlying need, and reframe it in terms that admit a better solution requires empathy, domain knowledge, and the kind of cross-boundary negotiation that is itself a coordination challenge—aligning stakeholders around a shared understanding of what they are building and why.

And then there is taste. This is the hardest skill to categorize, because it is not about managing constraints. It is about discernment. Taste in software is the ability to make decisions that are not derivable from the specification—choices about abstraction, naming, structure, and interface design that make a system legible, maintainable, and adaptable. Two implementations can satisfy the same spec and have radically different qualities: one is a pleasure to work in, the other accumulates technical debt with every change. AI can generate both. It cannot reliably tell them apart. The engineer with taste can.
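A small, hypothetical sketch of what that difference looks like. Both functions below satisfy the same discount spec; the rules and rates are invented. One buries the decision in nested branches, the other turns it into a table the next engineer can read, extend, and test. Nothing in the spec distinguishes them. Taste does.

# Version A: correct, and hostile to the next person who touches it.
def price_a(total, user):
    if user.get("vip"):
        if total > 100:
            return total * 0.80
        else:
            return total * 0.90
    else:
        if total > 100:
            return total * 0.95
        return total

# Version B: the same behavior, with the decision made legible.
DISCOUNTS = [
    # (applies, multiplier): checked in order, first match wins
    (lambda total, user: user.get("vip") and total > 100, 0.80),
    (lambda total, user: user.get("vip"),                 0.90),
    (lambda total, user: total > 100,                     0.95),
]

def price_b(total, user):
    for applies, multiplier in DISCOUNTS:
        if applies(total, user):
            return total * multiplier
    return total

# Both pass the same tests; only one survives the next five rule changes.
assert price_a(150, {"vip": True}) == price_b(150, {"vip": True})
assert price_a(50, {}) == price_b(50, {})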


These skills are not a list. They are a loop—and the loop is where the real competitive advantage lives.

If creation is easy, then creation is worthless. Anyone can generate a v1. AI makes the first version of almost anything close to free: the prototype, the scaffold, the initial implementation. This means v1 is no longer a moat. The moat moves to iteration—the ability to get from v1 to v10 faster and more intelligently than anyone else. And iteration is where specification, evaluation, system thinking, taste, and problem framing converge in practice. You ship. You read the feedback. You diagnose what is actually wrong versus what users say is wrong. You decide whether the fix is a code change, a spec change, or a reframing of the problem entirely. You make the trade-off between patching and rebuilding. You hold the system’s architecture in your head while absorbing information from production that no one anticipated. Then you do it again.

Each cycle through this loop requires every skill the essay has described, applied under time pressure, with incomplete information, against real-world feedback that is messy in exactly the ways AI struggles with. A model can generate a candidate fix from a bug report. It cannot yet look at declining retention metrics, a confusing onboarding flow, and three contradictory user interviews and determine that the actual problem is the mental model the product assumes—and that the fix is not a code change but a design rethink. That act of interpretation, sitting between ambiguous human behavior and precise technical response, is what iteration demands. It may also be more durable than specification as a human advantage, because it depends not on describing what you want in advance but on recognizing what is wrong after the fact—and the space of what can go wrong in production is far larger and stranger than any prompt can anticipate.

The engineer whose instinct after shipping is to watch, listen, and adjust—who treats the first version as a hypothesis rather than a deliverable—is practicing the skill that matters most when creation is cheap. Iteration is not a phase of the project. It is the project. Everything before it is a rough draft.

IV. The Author-to-Editor Problem

If the appreciating skills are specification, system thinking, evaluation, taste, and problem framing, then the job of the software engineer is shifting from author to editor—from the person who writes the code to the person who directs and judges it. This is not a simple promotion. It is a different cognitive mode, and the difference is more treacherous than it appears.

Writing code is an act of construction: you start from nothing and build until the thing works. The architecture reflects your reasoning because you made the decisions that produced it. Reviewing AI-generated code is an act of reverse engineering: you start from something that mostly works and must infer the intent behind decisions you did not make, then evaluate whether that intent matches yours. This is cognitively harder in a specific way—you are reasoning about someone else’s thought process, except there is no someone else. The generator has no intent. It has patterns. You are searching for coherence in output that was produced without it, and the danger is that you will find coherence that isn’t there, because humans are built to see reasons where there are only correlations. The hardest bug is the one in code you didn’t write and no one intended.

The new workflow is iterative and conversational. The engineer describes what they want. The AI produces a candidate. The engineer evaluates, identifies gaps, refines the description, iterates. This loop is faster than manual coding but only if the engineer can evaluate quickly and specify precisely. An engineer who cannot tell good code from bad code—who cannot spot the subtle architectural mistake, the hidden performance cliff, the security hole that passes all tests—is not accelerated by AI. They are endangered by it, because they will ship bad software faster and with more confidence. The machine is a force multiplier. It multiplies whatever the engineer brings to it, including incompetence.
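A toy illustration of what that evaluation has to catch, with invented names and a deliberately simple flaw: two query functions that pass the same happy-path test, one of them open to SQL injection that no test exercises.

import sqlite3

def find_user_generated(conn, username):
    # Reads fine, passes the happy-path test, and is injectable,
    # because the username is interpolated straight into the SQL.
    return conn.execute(
        f"SELECT id FROM users WHERE name = '{username}'"
    ).fetchone()

def find_user_reviewed(conn, username):
    # The version an evaluating engineer should insist on: a parameterized query.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchone()

# The test both versions pass, which is exactly the problem.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES ('alice')")
assert find_user_generated(conn, "alice") == find_user_reviewed(conn, "alice")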

There is a painful irony here. The skills needed to evaluate AI-generated code are best developed by writing code yourself, extensively, for years. The junior engineer who never goes through that apprenticeship—who starts by prompting rather than by building—may never develop the judgment needed to know when the machine is wrong. We are automating the apprenticeship and expecting the masters to appear anyway. The automation of the learning process threatens to undermine the very expertise the automation depends on.

This is the field’s most important unsolved pedagogical problem, and organizations that ignore it will pay for the neglect. If the entry-level work that trained previous generations of senior engineers is automated, how do new engineers develop judgment? Apprenticeship models, pair programming, and deliberate practice on problems below production scale become more important, not less. Organizations that cut the junior pipeline to capture short-term AI productivity gains will find, in five years, that they have no senior engineers—only prompt operators who cannot diagnose a failure they did not generate.

V. The Crack in the Thesis

Everything above rests on an assumption that deserves to be tested: that specification is safe from automation. That the top of the stack—deciding what to build, for whom, under what constraints—is irreducibly human. This is the story engineers tell themselves, and it may be the story they most want to believe, because it places their highest-status skill beyond the reach of the machine. Convenient beliefs should be examined with extra suspicion.

The boundary is already moving. AI systems can talk to users, ask clarifying questions, identify ambiguities in requirements, and propose solutions. They can generate test cases from descriptions, surface edge cases the specifier missed, and iterate on a design through conversation. They are not good at this yet. But “not good yet” is not the same as “cannot,” and the history of AI capabilities over the past five years suggests that the distance between those two claims closes faster than anyone expects.

If specification is eventually automated—or even substantially assisted—then the skill hierarchy this essay describes shifts again. The engineer’s value migrates further upward, to the places where human judgment is most entangled with context, relationships, and the kind of tacit knowledge that resists formalization. Problem framing, stakeholder negotiation, ethical reasoning about what should be built at all—these become the last high ground. But “last high ground” is a military metaphor, and it implies a retreat.

The honest position is this: the boundary between what AI can and cannot do in software is not fixed. This essay’s analysis is an argument about the current boundary, not a permanent one. The skills it identifies as appreciating are appreciating now, and probably for the next several years, and possibly for much longer. But the engineer who builds their career on the assumption that any particular skill is permanently safe is making the same mistake the pure implementer made: confusing the current bottleneck with a law of nature.

There is a second crack worth noting. The essay argues that taste appreciates—that the market will pay more for elegance, maintainability, and good design as generation becomes cheap. This may be wrong. Most enterprise software is not elegant. It is adequate. It ships, it works, it gets maintained by whoever is available, and no one asks whether the abstractions are beautiful. If AI-generated code is consistently adequate—verbose, conventional, but functional—the market may simply stop paying for more than that. Taste may narrow from a broadly valued skill to a niche one, prized at the high end and irrelevant everywhere else. The engineer who bets everything on taste may find that they have refined a palate for a meal the market no longer orders.

VI. The Deeper Shift

Beneath the practical questions about skills and org charts, there is a more fundamental change.

For decades, software engineering has been a field where you could succeed primarily through execution. If you could write code fast and correctly, you had a career. The field rewarded implementers generously because implementation was the bottleneck. This attracted a certain kind of mind: people who enjoy the concrete, the precise, the satisfyingly mechanical act of making something work.

The new era demands that the same people develop capacities they may not have practiced. Specification is an act of thinking, not typing. Evaluation is an act of judgment, not production. System thinking is an act of imagination, not construction. Taste is an act of discernment, not effort. These are not soft skills in the pejorative sense. They are the hardest skills in the profession, the ones that were always most important but least visible because the mechanical middle consumed so much time and attention.

The pattern is not unique to software. It is civilizational: when execution gets cheap, the question of what to execute becomes the primary challenge. In software, “what to execute” means what to build, how to structure it, and how to know if it is right. The engineers who thrive will be those who were always doing this work and are now freed to do more of it—and those who recognize the shift early enough to develop the capacities it demands, including the honesty to notice when the ground shifts again.


The code writes itself now, or will soon enough that it matters. The question that remains is the one that was always underneath: does anyone know what the code should do, and why, and for whom, and how to tell if it worked?

That question has always been the real job. We are about to find out how many of us were actually doing it.

Hopper’s compiler solved the problem of translation. It did not solve the problem of knowing what to say. Seventy years later, a far more powerful translator has arrived. The problem it leaves behind is the same one she left behind. It is still the harder problem.

