This is episode forty-one of Git Good. On February third, two thousand twenty-five, a man named Andrej Karpathy posted something on social media that would name a phenomenon millions of developers were already living but could not articulate. Karpathy was not some random blogger. He was the former director of artificial intelligence at Tesla, a founding member of OpenAI, a Stanford computer science PhD. When he spoke, the industry listened. And what he said was this.
There is a new kind of coding I call vibe coding, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It is possible because the large language models are getting too good. I just talk to Composer with SuperWhisper so I barely even touch the keyboard.
He went further. He described accepting all code changes without reading the diffs. Copying error messages back to the AI with no commentary. Watching it fix itself. He said he just sees stuff, says stuff, runs stuff, and copy pastes stuff, and it mostly works.
The term vibe coding entered the vocabulary of software development faster than almost any neologism in recent memory. Within weeks, it was everywhere. Blog posts. Conference talks. Job listings. Twitter arguments. Thinkpieces about whether programming was dead. The speed of adoption told you something. Karpathy had not invented a practice. He had named one that was already happening in thousands of quiet moments, in thousands of terminals, where developers were accepting AI suggestions without reading them and hoping for the best.
He added a caveat that most people who repeated the phrase conveniently forgot. It is not too bad for throwaway weekend projects, he said. But quite amusing. The caveat implied limits. The repetition erased them.
To understand what vibe coding means for Git, you first need to understand how much code is already being written by machines. The numbers are large enough that they should make you pause.
By late two thousand twenty-five, GitHub's own data showed that forty-six percent of all code committed by Copilot users was generated by the AI. Not assisted. Generated. For Java developers, the number was sixty-one percent. GitHub Copilot had twenty million cumulative users by July of that year, and by January two thousand twenty-six, four point seven million of them were paying subscribers. Ninety percent of Fortune one hundred companies had adopted the tool.
A separate analysis by GitClear, which examined over two hundred and eleven million changed lines of code, found that forty-one percent of code committed globally in two thousand twenty-five was initially generated or suggested by artificial intelligence. And a DX impact report from the fourth quarter of that year measured something more specific. Twenty-two percent of merged code was fully AI-authored. Not AI-assisted, where a human types and the machine suggests. Fully authored. The machine wrote it. The human approved it.
Those three numbers, forty-six, forty-one, and twenty-two, are not contradictions. They measure different things. Forty-six percent is Copilot users specifically. Forty-one percent is all committed code across the industry. Twenty-two percent is the strictest definition, code that was entirely machine-written and still made it through review into production. The point is not which number is correct. The point is that all three numbers would have been zero five years earlier, and all three numbers are climbing.
The tools enabling this are Cursor, Claude Code, GitHub Copilot, Windsurf, and a growing field of competitors. Eighty-four percent of developers now use or plan to use AI coding tools. The market reached seven point three seven billion dollars in two thousand twenty-five. This is not an experiment. It is the water everyone is swimming in.
Here is where the story meets Git. Every one of those AI-generated lines of code arrives in a repository through the same mechanism as every human-written line. A commit. A commit has an author field. A timestamp. A message. Maybe a signature. The metadata tells a story. This person, at this time, made this change, for this reason. Except now, increasingly, the story is fiction.
A developer sits at their desk. They describe a feature in plain English. The AI generates four hundred lines of code. The developer runs it. It works. They commit with the message "add user authentication." Git records their name as the author. Git blame, the command that traces every line back to the commit and author that last touched it, will forever point to them. If that authentication has a vulnerability six months from now, the trail leads to a person who may not be able to explain how the authentication works, because they never read the code.
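You can watch this happen in a throwaway repository. This is a sketch, not a real workflow: the developer name and the two-line stand-in for the generated code are illustrative. The point is that Git records the committing human as the author, with no record of what actually produced the code.

```shell
# Illustrative throwaway repo: the author field records the human who
# committed, regardless of what actually produced the code.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Dev Example"        # hypothetical developer
git config user.email "dev@example.com"
# Stand-in for four hundred AI-generated lines nobody read:
printf 'def login(user, pw):\n    return True\n' > auth.py
git add auth.py
git commit -q -m "add user authentication"
# Every line traces back to the human, not the model:
git blame --line-porcelain auth.py | grep '^author '
```

Run the last command and every line of auth.py reports the human as its author. Nothing in the metadata distinguishes this commit from one the developer typed by hand.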
This is not hypothetical. A study examining fifteen identical applications built with various AI coding tools found sixty-nine vulnerabilities across them. The code worked. It passed basic testing. It was also riddled with security holes that the developers who prompted it into existence did not catch, because they did not read it closely enough, or because reading AI-generated code for security flaws requires exactly the expertise that vibe coding assumes you can skip.
The Git metadata model was designed by Linus Torvalds for the Linux kernel, where knowing who wrote every line was a matter of legal necessity. The author field is not a credit. It is a chain of responsibility. When that chain points to a human who did not write the code and cannot explain it, the chain is broken. The metadata still exists. It just does not mean what it used to mean.
A developer named Aidan Cunniffe built a tool called Git AI specifically to solve this problem. It uses git notes, a little-known feature that attaches metadata to commits without altering the commit itself, to track which prompts generated which line ranges in a commit. The tool works with Claude Code, Cursor, and GitHub Copilot. It preserves attribution through rebases, cherry-picks, and squashes, all the operations that normally destroy metadata.
You cannot know which code in your repository is AI-generated just by counting lines. You have to follow each inserted line through its commit, through any resets, rebases, and merges, through a pull request, and into a production build.
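The underlying mechanism is worth seeing once. A minimal sketch of git notes, the feature Git AI builds on: the ref name and note text below are illustrative, not Git AI's actual format. The key property is that a note attaches metadata to a commit without changing the commit's hash.

```shell
# Throwaway repo with one commit, then a note attached to it.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Dev Example" && git config user.email "dev@example.com"
echo 'code' > app.py && git add app.py && git commit -q -m "add feature"
before=$(git rev-parse HEAD)
# Attach provenance metadata under an illustrative notes ref:
git notes --ref=ai add -m 'prompt: add a feature; lines 1-1 AI-generated' HEAD
git notes --ref=ai show HEAD
# The commit itself is untouched; its hash has not changed:
[ "$before" = "$(git rev-parse HEAD)" ] && echo "hash unchanged"
```

Because notes live under refs/notes/ and are not pushed or fetched by default, a tool like Git AI also has to sync them explicitly, which is part of why preserving attribution through rebases and squashes takes real engineering.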
Cunniffe's tool is clever, but it highlights the deeper problem. Git was built with an assumption so fundamental that it is never stated. The author is a person. The person wrote the code. The code reflects the person's understanding. Remove any of those assumptions and the entire trust model wobbles.
Claude Code, Anthropic's command-line AI tool, automatically adds a co-authored-by line to commits. A reasonable attempt at transparency. But early on, the email address it used was not registered to Anthropic on GitHub, which meant commits were falsely attributed to an unrelated user who happened to have that email. The metadata that was supposed to provide clarity created confusion instead. A small incident, quickly fixed, but a preview of the attribution chaos that scales with adoption.
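Here is why that misattribution was possible. To Git itself, a co-author is just a trailer, a line of text at the end of the commit message. Hosting platforms like GitHub are what resolve the email address to an account. The sketch below uses a placeholder name and email, not the address any real tool uses.

```shell
# A co-author is just a message trailer; Git stores it as plain text.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Dev Example" && git config user.email "dev@example.com"
echo x > f.txt && git add f.txt
git commit -q -m "add rate limiting" \
  -m "Co-authored-by: AI Assistant <ai-placeholder@example.com>"
# Reading the trailer back out of the last commit:
git log -1 --format='%(trailers:key=Co-authored-by)'
```

Git never validates that email. Whoever has it registered on the hosting platform gets the credit, which is exactly how commits ended up attributed to an unrelated user.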
Four responsibility models have begun to emerge in the industry. AI as tool, where the developer owns all results, the way you own what comes out of your compiler. AI as junior developer, where the developer supervises and reviews. AI as agent, where the code requires policy frameworks and audit trails. And AI as teammate, where the machine needs its own identity in the version control system. None of these models has won. Most teams have not even chosen one consciously. They are vibe coding their governance the same way they are vibe coding their software.
In July two thousand twenty-five, an organization called METR published a study that should have been front-page news. They recruited sixteen experienced open-source developers. Not beginners. Not students. Developers who had worked on their repositories for an average of five years, on projects averaging over a million lines of code and twenty-two thousand GitHub stars. Real developers on real codebases.
The study gave them two hundred and forty-six real issues from their own repositories. Bug fixes, features, refactors. Tasks they would have done anyway. Each task was randomly assigned to either allow or prohibit AI tools. When AI was allowed, developers used primarily Cursor Pro with Claude Sonnet, the same frontier models everyone was excited about.
The finding was this. Developers using AI tools took nineteen percent longer to complete their tasks.
Not faster. Slower. On their own codebases. With tools they chose themselves. On tasks they had identified as valuable.
But here is the finding that should keep you awake. Before the study, developers predicted AI would make them twenty-four percent faster. After experiencing the nineteen percent slowdown, after living through it task by task, they still believed AI had made them twenty percent faster.
The perception gap is the most important detail in this entire story. It means you cannot trust developers to self-report whether AI helps them. The feeling of productivity and the reality of productivity have come unglued. Developers accepted less than forty-four percent of the code AI generated, meaning more than half the time, they reviewed, tested, and modified suggestions only to reject them in the end. The time spent evaluating bad suggestions was invisible to them. It felt like progress because the machine was producing output. Output is not progress.
The METR researchers were careful to note the limits of their study. Sixteen developers. Familiar codebases. Large, mature projects. AI might perform differently on greenfield code, on smaller projects, on tasks where the developer has no existing context. But the study measured exactly the scenario where vibe coding is most dangerous. Experienced developers, on code that matters, convinced they are faster when they are not.
In February two thousand twenty-six, Anthropic, the company behind Claude, published a study that examined what happens to skill formation when developers learn with AI assistance. They recruited fifty-two junior engineers, most with at least a year of weekly Python experience. The task was to learn Trio, an unfamiliar asynchronous programming library. Half the group used AI tools. Half did not.
The AI-assisted group finished about two minutes faster. That difference was not statistically significant. But on the comprehension test afterward, the AI-assisted group scored seventeen percent lower. The manual group averaged sixty-seven percent. The AI group averaged fifty percent.
The AI group finished faster but understood less. The biggest gap was on debugging questions, exactly the skill you need most when the code is not yours.
The study found what the researchers called a stark divide within the AI group itself. Those who used AI to ask conceptual questions, to understand what the library was doing and why, scored sixty-five percent or higher. Those who delegated the code generation, who let the machine write and just ran the results, scored below forty percent. How you used AI mattered more than whether you used it.
This is the skill atrophy argument made concrete. And it has a parallel outside software that is worth hearing. In commercial aviation, the industry calls it automation complacency. Pilots who rely heavily on autopilot systems gradually lose proficiency in manual flying. The autopilot handles ninety-nine percent of flight hours perfectly. But when it fails, when conditions exceed its design parameters, the pilot needs skills they have not practiced in months or years. The Federal Aviation Administration has issued multiple advisories about this exact problem. And in aviation, when skills atrophy and automation fails, people die.
Software is not aviation. Nobody dies when an authentication function has a vulnerability. Usually. But the pattern is the same. The tool handles the routine work. The human stops practicing the hard work. The tool encounters something outside its training. And the human no longer has the skills to step in.
The skill atrophy argument assumes you had skills to lose. There is a harder question lurking underneath it. What about the developers who never build those skills in the first place?
Entry-level tech hiring declined twenty-five percent year over year in two thousand twenty-four. Software developers aged twenty-two to twenty-five saw their employment drop nearly twenty percent between late two thousand twenty-two and July two thousand twenty-five. For roles exposed to AI automation, employment fell six percent for ages twenty-two to twenty-five while increasing nine percent for ages thirty-five to forty-nine. Computer science graduates faced six point one percent unemployment. Tech internship postings fell thirty percent since two thousand twenty-three, even as applications rose seven percent.
Seventy percent of hiring managers said they believed AI could perform intern-level jobs. Fifty-seven percent said they trusted AI output over the work of recent graduates.
There is a brutal logic at work here. If AI can write junior-level code, and if reviewing AI code requires senior-level judgment, then the junior role, the one where you learn by writing bad code and having it reviewed, shrinks or disappears. The entry point into the profession narrows. The pipeline that produces the senior developers who are supposed to supervise the AI gets thinner.
Season one of this series told the story of how Git democratized contribution. Anyone could fork, branch, commit, and send a pull request. The barrier to entry was knowing Git, which was hard, but at least it was a learnable barrier. Vibe coding appears to lower that barrier further. Anyone can describe what they want. The machine writes the code. You do not even need to know Git, because the AI handles the commits too.
But here is the thing about the democratization argument. A person who builds an application through vibe coding has built an application. They have not learned to build applications. If the AI changes, if the tool disappears, if the generated code breaks in a way the AI cannot fix, the person is exactly where they started. The tool gave them a result. It did not give them a capability.
The last chapter of this story is about the quiet crisis happening in pull request review queues around the world.
Code review has always been bottlenecked by human attention. The traditional model assumes that a reviewer can read a diff, understand the change, spot potential issues, and approve or request changes. This model works because a human author writes at human speed. A typical pull request might change fifty to two hundred lines. A reviewer can hold that much context in their head.
Now imagine a pull request that changes two thousand lines, generated in thirty seconds by an AI that was given a one-paragraph description. The code works. The tests pass. But is it secure? Is it efficient? Does it introduce patterns that will make the next change harder? Answering those questions requires the reviewer to understand two thousand lines of code they did not write, that follows no particular human style, that may implement the requirement in a way no human would choose.
The review becomes harder at the exact moment it becomes more important. When a human writes code, the reviewer can infer intent from style, from naming conventions, from the structure of the change. These are signals of human thought. AI-generated code has no intent. It has patterns. The reviewer is no longer checking whether a colleague made a mistake. They are auditing whether a statistical model produced something safe. That is a fundamentally different task, and almost nobody has been trained for it.
Some teams have responded by leaning into automated testing. If you cannot read the code, test the behavior. Does the function return the right values? Does the endpoint respond correctly? Does the authentication actually authenticate? This is a reasonable adaptation. But behavioral testing catches what you think to test for. It does not catch what you do not think to test for. Security vulnerabilities, race conditions, subtle data leaks. These live in the code, not in the behavior, and finding them requires reading the code. Which is the thing vibe coding explicitly skips.
Last episode told the story of the invisible maintainers, the people holding up software infrastructure without recognition or pay. This episode is about a different kind of invisibility. The invisibility of understanding. Code exists in repositories that nobody fully comprehends. Commits carry names of authors who cannot explain what they committed. Review queues fill with changes that reviewers cannot meaningfully evaluate. The Git log is a history of the project, but the history is becoming fiction, an accurate record of what happened with no insight into why or how.
The honest question is whether this matters. Software has always been full of code that nobody understands. Legacy systems. Inherited codebases. That one module everyone is afraid to touch. The difference is that the old mystery code was written by humans who understood it at the time. The understanding existed once and was lost to turnover and time. Vibe-coded software was never understood by anyone. The understanding never existed in the first place.
Maybe that is fine. Maybe software is heading toward a model where the code is disposable, regenerated from specifications whenever it needs to change, and the specifications are what matter. Maybe the commit history becomes a log of prompts rather than a log of implementations. Maybe git blame becomes meaningless because no human is to blame.
Or maybe the autopilot fails. Maybe the AI generates something that works perfectly until it does not. And when a developer opens git log to trace the problem, they find a long chain of commits by a person who accepted every suggestion, read no diffs, and vibed their way to a codebase that nobody, human or machine, can explain.
Karpathy called it vibe coding. He meant it as a description. It is becoming a warning. The next episode looks at what happens when the code that nobody reads becomes the code that everybody regulates. That was episode forty-one.
Git diff HEAD tilde one. It shows you exactly what the last commit changed. In the vibe coding world, this might be the first time anyone actually looks at the code. Run it after every commit, even the ones you think you understand. Especially the ones you think you understand. The diff does not care who wrote the code. It does not care whether you prompted it into existence or typed it by hand. It just shows you what changed. In an era when commits arrive faster than humans can read them, the simple act of looking at a diff is becoming a radical practice.
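For anyone reading along rather than listening, here is the command in a throwaway repo with two commits. HEAD~1 means "one commit before HEAD," so this compares the last commit against its parent. The file contents are illustrative.

```shell
# Demo repo with two commits, so there is a "last change" to inspect.
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name "Dev Example" && git config user.email "dev@example.com"
echo 'v1' > app.py && git add app.py && git commit -q -m "first"
echo 'v2' > app.py && git add app.py && git commit -q -m "second"
git diff HEAD~1          # full diff of what the last commit changed
git diff --stat HEAD~1   # or just the per-file summary
```

The diff shows the v1 line removed and the v2 line added. That is the whole practice: after every commit, one command, one honest look at what actually changed.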