Vibe Coding Is Over. Specs Are Still Broken.

The function worked. That was the problem. A team I talked to in May had shipped a pricing module the quarter before, built fast with a coding agent, green tests, clean deploy, nobody complained. Then a new engineer opened the file to add a discount tier and just sat there for a while, scrolling, because the thing computed correct prices through a path that no human had actually designed and no comment explained. It passed. Nobody could say why. That gap, between code that runs and code anyone understands, is the whole story of 2026.

So here is my unpopular opinion, stated plainly. We didn't kill vibe coding. We rebranded the ambiguity and called it a methodology. The industry spent the last year declaring vibe coding dead and anointing spec-driven development as the grown-up replacement, and the second half of that sentence is genuinely good news. The first half is a story we tell ourselves. Moving from a loose prompt to a structured spec changes where the ambiguity lives. It does almost nothing about whether the ambiguity exists.

The numbers behind the 2026 reckoning are not subtle. A CodeRabbit analysis published December 17, 2025 compared AI-co-authored pull requests against human-only ones and found the AI batch averaged 10.83 issues per pull request versus 6.45, roughly 1.7 times more, with logic and correctness problems showing up about 75% more often (CodeRabbit, December 2025). A separate large-scale academic study tracked 304,362 AI-authored commits across 6,275 GitHub repositories and found that 24.2% of the issues those commits introduced are still sitting in the latest version of the code (Liu et al., "Debt Behind the AI Boom," 2026). And Veracode's Spring 2026 security analysis found models pick the insecure way to implement a task about 45% of the time when given the choice (Veracode, Spring 2026). Three different lenses. One picture.

10.83 vs 6.45: issues per pull request in AI-co-authored code versus human-only code, roughly 1.7x more, in CodeRabbit's December 2025 analysis.

Source: CodeRabbit, "AI vs Human Code Generation Report," Dec 2025

Why is 2026 being called the year of technical debt?

Because the bill came due all at once. For two years the dominant story about AI coding was velocity: more lines, more pull requests, more shipped per sprint. That story was true. It was also incomplete, the way "I drove here in record time" is true right up until you mention the speeding ticket and the wrong exit. The output went up. What the output cost in cleanup is only now showing on the books, and it is showing everywhere at the same time.

Look at the survival number again, because it is the one that should keep engineering leaders up at night. In that study of 304,362 AI-authored commits, the researchers did not just count bugs at the moment of writing. They followed them. Nearly a quarter of the problems those commits introduced were still alive in the codebase at the latest revision, which means they were not caught in review, not caught in testing, not caught by the next person to touch the file. They just stayed. Quietly. Compounding into the thing everyone now calls technical debt.

And the adoption is total, which is why this is not a niche problem. JetBrains surveyed 24,534 developers for its 2025 Developer Ecosystem report and found 85% of them regularly using AI tools (JetBrains via InfoWorld, 2025). Eighty-five percent. So whatever AI does to code quality, good or bad, it is now doing it to almost everyone's code, which is exactly why a quality problem that used to be a rounding error has turned into a year-defining theme.

Did spec-driven development actually fix vibe coding?

Partly, and I want to give it full credit before I take some back. Spec-driven development is a real advance. By anchoring an AI agent to a structured, durable specification instead of a throwaway prompt, tools like GitHub Spec Kit and AWS Kiro genuinely fix two of vibe coding's worst failure modes: intent drift, where the agent wanders off over a long run, and context decay, where it forgets what you told it twenty minutes ago. Those were real wounds. Spec-driven development closes them. I am not being polite; I mean it.

Here is the part the celebration skips. A spec is a record of decisions someone already made. That is its entire nature. It encodes what you knew when you wrote it, with more rigor and more structure than a prompt ever did, and then it hands that encoding to a machine that will execute it faithfully at high speed. Faithful execution of a wrong decision is not a fix. It is the same mistake, now formatted, version-controlled, and shipped with confidence. The clearer the spec, the more efficiently it delivers whatever was decided, including the parts that were decided badly or never really decided at all.

I changed my mind about this once, actually, so I will not pretend the line was always obvious to me. Early on I thought a good spec template would force better thinking, and it does nudge people, a little. But a template can only ask about the things its author already imagined. It will dutifully prompt you for error handling and acceptance criteria. It will never prompt you to consider the compliance rule you have never heard of, or the downstream team you didn't know consumed your data. Structure organizes what you know. It is silent on what you don't.

Vibe coding put the ambiguity in the editor, where at least a human had to look at it. Spec-driven development moved the ambiguity one step upstream, into the spec, and made it look resolved. That is not progress on the actual problem. That is better packaging for it.

Where does AI-amplified technical debt really originate?

Upstream of the editor, every time. This is the claim the whole article rests on, so let me be blunt about it. The 2026 technical-debt wave did not start when an agent wrote a clumsy function. It started weeks earlier, when a requirement was assumed instead of validated, when "real-time" meant three different things to three stakeholders, when a security control nobody named simply never made it into the input. The agent didn't invent the gap. It inherited it, then multiplied it across a few thousand lines at machine speed.
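To make the "real-time" example concrete, here is a minimal sketch of how conflicting stakeholder definitions could be surfaced before anything is generated. The stakeholder names, the latency figures, and the `find_conflicts` helper are all hypothetical, invented for illustration; they do not come from any real project or tool.

```python
from collections import defaultdict

# Hypothetical discovery notes: (term, stakeholder, definition as max latency in ms).
# Every name and number here is an illustrative assumption.
DEFINITIONS = [
    ("real-time", "product", 100),
    ("real-time", "operations", 5_000),
    ("real-time", "finance", 60_000),
    ("nightly", "operations", 86_400_000),
]


def find_conflicts(definitions):
    """Return the terms for which stakeholders gave more than one distinct definition."""
    by_term = defaultdict(dict)
    for term, stakeholder, max_latency_ms in definitions:
        by_term[term][stakeholder] = max_latency_ms
    return {
        term: answers
        for term, answers in by_term.items()
        if len(set(answers.values())) > 1
    }


conflicts = find_conflicts(DEFINITIONS)
for term, answers in conflicts.items():
    print(f"'{term}' is unresolved: {answers}")
```

The point of the sketch is the timing, not the code: the same one-word requirement passes cleanly into a prompt or a spec unless someone asks each stakeholder for a number first.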

Think about the Veracode finding through that lens. Models choose the insecure path 45% of the time, and the security pass rate has barely moved off 55% in two years despite dramatically more capable models (Veracode, Spring 2026). People read that as "the models aren't secure enough," and sure, partly. But here is the harder reading. A control that was never specified will never appear in the output, no matter how good the model gets. You cannot generate a safeguard for a threat nobody named during discovery. The missing requirement and the missing safeguard are the same hole, seen from two ends.

I'll concede the counterargument in one sentence, because it is fair: process and talent and code review all matter too, and a disciplined team writes better prompts and catches more in review. True. But none of that reaches back to the moment the requirement was wrong, and that moment is where the most expensive defects are born. That is the through-line of everything we have written, from "Your Specs Are the Problem. Not Your AI Agent." to "The Hidden Cost Your AI Coding Stack Ignores."

From the field

The most useful thing published on this in 2026 did not come from a vendor selling a fix. It came from researchers willing to count. Yue Liu, David Lo, and their colleagues at Singapore Management University ran a study they bluntly titled "Debt Behind the AI Boom," analyzing 304,362 verified AI-authored commits across 6,275 real GitHub repositories in Python, JavaScript, and TypeScript (arXiv, 2026).

What makes the work honest is the second half of the method. They didn't stop at "AI commits introduce issues," a finding that surprises no one. They tracked whether those issues got fixed, and found that 24.2% of them still survived in the latest revision, roughly 37 surviving issues for every 100 AI-authored commits. The debt is not theoretical. It is measurable, and nearly a quarter of it is just sitting there.
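The two figures above also imply a third. As a back-of-the-envelope sketch, and assuming both figures describe the same set of commits, a 24.2% survival rate and roughly 37 surviving issues per 100 commits together suggest on the order of 150 issues introduced per 100 AI-authored commits. That implied total is a derivation from the quoted numbers, not a figure the study is cited as reporting.

```python
# Figures quoted in the text above.
SURVIVAL_RATE = 0.242            # share of introduced issues still present
SURVIVING_PER_100_COMMITS = 37   # approximate, as quoted

# Derived here, NOT quoted from the study: issues introduced per 100 commits.
# Assumes the rate and the per-100 count refer to the same population of commits.
implied_introduced_per_100 = SURVIVING_PER_100_COMMITS / SURVIVAL_RATE

print(f"Implied issues introduced per 100 commits: {implied_introduced_per_100:.0f}")
```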

Credit where it is due: this is the kind of measurement the field needed, and it points the right direction. The researchers are clear that the issues concentrate in predictable failure modes, the ones a clearer upstream intent would have prevented. The lesson is not "AI writes bad code, stop using it." The lesson is that validated intent is the cheapest defense we have, and almost nobody is buying it before they generate.

Vibe coding and spec-driven development both inject intent downstream, so the gap surfaces late. Requirements intelligence injects it at discovery, where the fix is still cheap.

What does requirements discipline change before code is generated?

It changes the input, which is the only place left where a fix is still cheap. By the time a defect reaches the spec, you can catch it. By the time it reaches the code, you can catch it more expensively. By the time it reaches production it can cost orders of magnitude more, which is the entire argument of the 29x rule. Requirements discipline moves the catch all the way to the front, before the machine has multiplied anything. Disclosure time: this is what Specira is built to do, so weigh the bias accordingly.

Concretely, requirements intelligence runs structured discovery before a prompt or a spec exists. It forces each stakeholder to define the fuzzy word in numbers, so "real-time" stops being three secret meanings hiding in one clean line. It asks, every time, which compliance and security obligations touch this data, so the control nobody thought of gets named while naming it is still free. It assigns an owner to every trade-off, so the choice between speed and cost is made by a person rather than defaulted by an agent. Then it hands that validated set downstream. Same Spec Kit. Same Kiro. Same fast agent. Finally aimed at the right system.
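As a rough illustration of the three checks described above, here is a minimal sketch of a pre-generation gate. It is not Specira's implementation or any real tool's API: the `Requirement` fields, the `FUZZY_TERMS` list, and the `blocking_gaps` function are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical list of words that need a number before they mean anything.
FUZZY_TERMS = ("real-time", "fast", "scalable", "secure")


@dataclass
class Requirement:
    text: str
    # Numeric definitions agreed by stakeholders, e.g. {"real-time": "p95 under 2 s"}.
    quantified_terms: dict = field(default_factory=dict)
    # Compliance and security obligations; None means nobody has asked yet.
    obligations: Optional[list] = None
    # Trade-offs named during discovery, and the person accountable for each.
    tradeoffs: list = field(default_factory=list)
    tradeoff_owners: dict = field(default_factory=dict)


def blocking_gaps(req: Requirement) -> list:
    """List the reasons this requirement is not ready to hand to a spec or an agent."""
    gaps = []
    lowered = req.text.lower()
    for term in FUZZY_TERMS:
        # Simple substring match; good enough for a sketch.
        if term in lowered and term not in req.quantified_terms:
            gaps.append(f"'{term}' has no agreed numeric definition")
    if req.obligations is None:
        gaps.append("compliance and security obligations were never reviewed")
    for tradeoff in req.tradeoffs:
        if tradeoff not in req.tradeoff_owners:
            gaps.append(f"trade-off '{tradeoff}' has no owner")
    return gaps


draft = Requirement(
    text="Dashboards must update in real-time.",
    tradeoffs=["speed vs cost"],
)
gaps = blocking_gaps(draft)
for gap in gaps:
    print("BLOCKED:", gap)
```

An empty `obligations` list is deliberately different from `None` in this sketch: the first records that someone asked and found nothing, the second that nobody asked.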

The format was never the problem

Vibe coding and spec-driven development are two answers to the question of how to capture intent. Neither answers the harder question of whether the intent was correct in the first place. That question lives upstream, in discovery, and no file format reaches it.

So adopt the specs. Keep the fast agents. Just stop pretending the spec is where quality is decided. Validate the requirement first, then generate as fast as you like, because speed only becomes debt when it is pointed at the wrong thing. Get the order right and the same tools everyone has become a durable edge. Get it backwards and you ship the wrong thing faster than ever. That is the same thread running through our piece on why AI agents still can't ask the right questions.

What are the most common questions about vibe coding in the enterprise?

What is vibe coding?

Vibe coding is the practice of building software by prompting an AI model in natural language and accepting what it returns without closely reviewing the generated code. Andrej Karpathy coined the term in early 2025. It is fast and feels productive, but it pushes ambiguity straight into a running codebase, because the prompt rarely captures the real requirement. The industry spent 2026 reacting against it, but the underlying problem was never the editor. It was the unvalidated requirement behind the prompt.

Did spec-driven development actually fix vibe coding?

Partly. Spec-driven development fixes intent drift and context decay by anchoring AI agents to a structured specification instead of a loose prompt, and that is a real improvement worth adopting (GitHub Spec Kit, AWS Kiro). But it relocates ambiguity rather than removing it. A spec encodes the decisions someone already made, so if the requirement behind the spec was never validated, the agent now ships the wrong thing precisely and with more confidence. The quality ceiling is set upstream at discovery, not at the spec format. We argue this in detail in "Your Specs Are the Problem. Not Your AI Agent."

Why is 2026 being called the year of technical debt?

Because the cost of two years of fast AI generation is now landing in production. A CodeRabbit analysis published in December 2025 found AI-co-authored pull requests average 10.83 issues each versus 6.45 for human-only PRs, roughly 1.7 times more (CodeRabbit, 2025). A large-scale academic study of 304,362 AI-authored commits across 6,275 repositories found that 24.2% of AI-introduced issues still survive in the latest revision (Liu et al., 2026). The debt did not start in the editor. It started in requirements nobody validated, and faster generation just shipped the gaps faster.

Does AI-generated code introduce security vulnerabilities?

Often, yes. Veracode's Spring 2026 GenAI Code Security analysis found that when models could choose between a secure and an insecure way to implement a task, they chose the insecure option about 45% of the time, and the security pass rate has stayed roughly flat near 55% since 2024 despite far more capable models (Veracode, 2026). The deeper issue is that a security control nobody specified will not appear in the output, no matter how good the model is. Missing requirements produce missing safeguards.

How do you reduce technical debt from AI-generated code?

By fixing the input before the machine multiplies it. Requirements intelligence runs structured discovery before any prompt or spec is written: it surfaces hidden assumptions, resolves conflicting stakeholder definitions, names the trade-offs, and records who decided what. A validated requirement set then feeds whatever you use to generate code, whether that is a prompt, GitHub Spec Kit, or AWS Kiro. The agent stays fast. It is finally fast at building the right thing, which is the only version of speed that does not turn into debt. See what requirements intelligence is.

Nicolas Payette

CEO and Founder, Specira AI

Nicolas Payette has spent 25 years in enterprise software delivery, leading digital transformations at companies like Technology Evaluation Centers and Optimal Solutions. He founded Specira AI to solve the root cause of project failure: unclear requirements, not slow code.

Keep the fast agents. Validate the requirement first.

Related reading

Your Specs Are the Problem. Not Your AI Agent.

The Hidden Cost Your AI Coding Stack Ignores
