Git Good
The Supply Chain: Five Hundred Milliseconds
S2 E36 · 31m · Apr 05, 2026
Andres Freund noticed his SSH logins were 500 milliseconds slower—a tiny delay that unmasked a two-and-a-half-year plot to backdoor Linux itself.

The Chain You Cannot See

This is episode thirty-six of Git Good.

Every piece of software you use is built on other software. Your web browser depends on hundreds of libraries. Your phone's operating system depends on thousands. The server that delivered this podcast episode to your device depends on a compression library, a cryptography library, a networking library, a logging library, each of which depends on other libraries, each of which depends on still others. The chain goes down and down, and at the bottom are packages so small and so foundational that most developers have never heard of them.

This is the software supply chain. Not a metaphor. A real chain, where every link is a piece of code written by someone you have never met, maintained on their own time, published to a repository that anyone can read. You trust all of it. You have to. There is no practical alternative. No company builds everything from scratch. No developer audits every dependency. The chain is too long, too deep, too tangled. So you trust.

In episode nineteen of this series, we told the story of how that trust was exploited. A developer named Andres Freund noticed that his SSH logins were five hundred milliseconds slower than they should have been, and that observation led to the discovery of the most sophisticated supply chain attack ever attempted against open source software. The XZ Utils backdoor. Two and a half years in the making. A near miss that could have compromised every Linux server on the internet.

Episode nineteen told that story through the lens of Git's trust model. Who do you trust, and how do you verify that trust? This episode tells a different story. Not who exploited the chain, but what the chain is, why it is so fragile, and what the world is trying to build to make it stronger. Because the XZ backdoor was not just an attack on one library. It was a stress test of the entire way modern software is assembled. And the results were not encouraging.

How Software Gets Built

To understand why the XZ attack mattered beyond one compression library, you need to understand how software actually reaches your machine.

When a developer writes a program, they do not write every piece of it. They import libraries. A library for handling dates. A library for parsing data formats. A library for encryption. Each of those libraries was written by someone else, published to a package registry like PyPI or npm or crates.io, and made available for anyone to download and use. The developer types a command, and the package manager pulls in the library along with every library that library depends on. Those are called transitive dependencies. You asked for one thing. You got a tree.

A typical modern web application might declare thirty or forty direct dependencies. But when you count the transitive ones, the dependencies of the dependencies, that number can balloon into hundreds or thousands. The JavaScript ecosystem is especially deep. A single npm install can pull in over a thousand packages. Each one is a link in the chain. Each one was written by someone. Each one is a potential point of failure.
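
The arithmetic of that tree is easy to sketch. Here is a minimal Python illustration, with an entirely hypothetical dependency graph, showing how a couple of direct dependencies quietly become many more once you follow every link:

```python
# A toy dependency graph: each package maps to the packages it directly
# depends on. All names here are hypothetical, not real npm/PyPI packages.
DEPS = {
    "my-app": ["web-framework", "date-parser"],
    "web-framework": ["http-client", "template-engine"],
    "date-parser": ["locale-data"],
    "http-client": ["tls-lib"],
    "template-engine": [],
    "locale-data": [],
    "tls-lib": ["compression-lib"],
    "compression-lib": [],
}

def transitive_deps(package, graph):
    """Walk the graph and collect every package reachable from `package`."""
    seen = set()
    stack = list(graph.get(package, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen

direct = DEPS["my-app"]
full = transitive_deps("my-app", DEPS)
print(len(direct), "direct dependencies")   # 2 direct dependencies
print(len(full), "total dependencies")      # 7 total dependencies
```

You asked for two things. You got seven, including a compression library you never named and may never have heard of.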

And here is the part that matters for this story. Nobody reads them all. Nobody can. A developer installing a library to parse JSON dates is not going to read the source code of every transitive dependency that library pulls in. They are not going to check who maintains each one, or when it was last updated, or whether the person who pushed the latest release is the same person who started the project. They trust the chain because the alternative is building everything from scratch, and nobody builds everything from scratch.

Git is the infrastructure underneath all of this. Every package on npm, PyPI, crates.io, and most other registries lives in a Git repository. When you install a dependency, you are trusting the code in that repository. You are trusting that the person who pushed the latest commit is who they claim to be. You are trusting that no one tampered with the code between the Git repository and the compiled package. You are trusting the chain.

And until March two thousand twenty-four, almost nobody was verifying any of it.

The Machine Inside the Machine

Episode nineteen described how Andres Freund traced his slow SSH logins to the XZ Utils compression library and discovered the backdoor. But it did not go deep into how the backdoor actually worked, because episode nineteen was about the trust model, not the technical mechanism. This episode goes into the machine.

The backdoor was not in the source code. That is the first thing to understand, and the thing that made it so dangerous. If you read every line of C in the XZ Utils repository, every header file, every function, every comment, you would not find it. The malicious code was hidden inside binary test files. Files named things like bad-3-corrupt_lzma2.xz. Files that looked like compressed test data, the kind of thing every software project has in its test suite. Unremarkable. Boring. The kind of files that code reviewers skip because they are not human-readable.

But those files were not just test data. Embedded inside them, obfuscated and compressed, was a script. During the build process, when a developer or a distribution compiled XZ Utils from source, a modified build script called build-to-host.m4 would extract the hidden payload from the test files and inject it into the compiled library. The backdoor existed in the distributed tarballs but not in the visible source code. You could audit the repository all day and never see it.
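
The actual XZ obfuscation was far more elaborate, a multi-stage decoding pipeline buried in the build scripts, but the general shape of the trick can be sketched in a few lines of Python. Everything here is illustrative: the marker, the benign payload, the "test file" that carries more than test data:

```python
import base64

# Simplified illustration of the general pattern, NOT the actual XZ
# obfuscation. A "test file" that is mostly plausible-looking binary data,
# with an extra payload appended after a marker only the build script knows.
MARKER = b"\x00\x00TESTDATA_END\x00\x00"
innocent_part = bytes(range(256)) * 4          # looks like binary test data
hidden = base64.b64encode(b"injected build step")
test_file = innocent_part + MARKER + hidden

def extract_payload(blob, marker=MARKER):
    """What a tampered build script would do: find the marker and decode
    whatever follows it. A normal test harness never looks past the data."""
    idx = blob.find(marker)
    if idx == -1:
        return None                             # clean file: nothing hidden
    return base64.b64decode(blob[idx + len(marker):])

print(extract_payload(test_file))               # b'injected build step'
print(extract_payload(innocent_part))           # None
```

The file passes every test that treats it as test data. Only the modified build step knows to look past the end.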

Now here is where it gets precise. In June two thousand twenty-three, eight months before the backdoor was planted, an account called Hans Jansen submitted code that added support for something called ifunc resolvers to the XZ Utils build system. Ifunc, short for indirect function, is a legitimate feature of the GNU C Library that allows a program to choose which version of a function to use at startup. It is used for performance optimization, letting the system pick the fastest implementation for the hardware it is running on. The ifunc code Hans Jansen contributed was innocent. It did what it claimed to do.

But the backdoor exploited it. The malicious payload, once extracted during the build, used the ifunc mechanism to hijack a critical function. Specifically, it replaced a function called RSA_public_decrypt, the function that OpenSSH uses to verify authentication credentials. When a user tried to log in via SSH, the backdoor intercepted the authentication, checked for a specific cryptographic key, an Ed four forty-eight private key held only by the attacker, and if the key matched, granted full remote code execution. Before the login even completed. Before the system logged anything. Silent, invisible access.
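
Ifunc itself is a C and linker mechanism, so Python cannot demonstrate it directly, but the shape of the hijack, a function chosen once at startup by a resolver that has been subverted, can be sketched as an analogy. Every name and key here is a stand-in:

```python
# Analogy only: glibc ifunc is a C/linker mechanism, but the shape of the
# hijack can be sketched in Python. A "resolver" picks which implementation
# of verify() the program will use, once, at startup.

def real_verify(credential):
    """The legitimate check (stand-in for RSA_public_decrypt's role)."""
    return credential == "correct-user-key"

ATTACKER_KEY = "attacker-master-key"   # stands in for the Ed448 private key

def subverted_resolver():
    """A clean resolver would just return real_verify. The subverted one
    returns a wrapper that grants access on the attacker's key first."""
    def backdoored_verify(credential):
        if credential == ATTACKER_KEY:
            return True                # silent acceptance, never logged
        return real_verify(credential) # everyone else sees normal behavior
    return backdoored_verify

# At "startup", the program asks the resolver which function to use.
verify = subverted_resolver()

print(verify("correct-user-key"))   # True  — legitimate users unaffected
print(verify("wrong-key"))          # False — failures look normal
print(verify(ATTACKER_KEY))         # True  — the backdoor
```

Notice what makes it invisible: for every input except the attacker's own key, the hijacked function behaves exactly like the real one.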

There is one more twist that made this possible. OpenSSH does not normally use the XZ compression library. There is no reason for a compression library to be involved in SSH authentication. But many major Linux distributions, including Debian and Red Hat, patch OpenSSH to integrate with systemd, the system and service manager. That patched version of OpenSSH loads a library called libsystemd. And libsystemd loads liblzma, the XZ compression library. A chain of dependencies, each one reasonable on its own, created a path from a compression library to the authentication system of the most widely used remote access tool on the internet.

That is the supply chain in miniature. Not one failure, but a sequence of reasonable decisions that, taken together, created a vulnerability nobody designed and nobody intended. Each link was doing its job. The chain as a whole was a weapon.

The timing was calculated too. The attacker, operating under the name Jia Tan, released the compromised version five point six point zero in late February two thousand twenty-four. Debian Sid and Fedora Rawhide, the unstable development branches of two major Linux distributions, picked it up quickly. From there, the compromised library would have migrated into the stable releases that run on production servers around the world. The window between insertion and mass deployment was narrow, perhaps weeks. And the attacker knew this. When the first release caused Valgrind warnings that might attract attention, Jia Tan released version five point six point one just two weeks later, patching the bugs in their own backdoor. Refining the weapon after its first test flight. They were debugging their attack in production.

The Discoverer

Andres Freund has described himself as a software engineer at Microsoft who works on PostgreSQL. He is, by all accounts, the kind of person who treats half a second of unexplained latency as a problem worth solving rather than a nuisance to ignore.

He was running micro-benchmarks, the precise kind of performance testing where even small amounts of noise in the system can skew the results. To reduce that noise, he needed his machine to be quiet. That is when he noticed that SSH logins were consuming more CPU than they should. Not a lot more. Just enough to be wrong.

I was doing some micro-benchmarking at the time, needed to quiesce the system to reduce noise. I saw sshd processes were using a surprising amount of CPU, despite immediately failing.

He profiled the SSH daemon and found that it was spending time inside liblzma, the XZ compression library. That should not happen. A compression library has no business running code during an SSH login. He remembered an odd Valgrind warning from automated PostgreSQL testing a few weeks earlier, a complaint about memory corruption that he had not gotten around to investigating. Now it clicked. The Valgrind warning and the SSH slowdown were connected. Both pointed to liblzma. Both pointed to something that should not be there.

Freund started reverse-engineering what the library was actually doing during authentication. What he found was not a bug. It was a weapon. A backdoor designed to give its creator silent access to any server running the compromised library.

On March twenty-ninth, two thousand twenty-four, he posted his findings to the oss-security mailing list, the public forum where security researchers disclose vulnerabilities to the community.

After observing a few symptoms I started to investigate. The upstream xz repository and the xz tarballs have been backdoored.

Within hours, GitHub suspended the XZ Utils repository. Debian shut down its build systems. Red Hat and SUSE issued emergency advisories. The vulnerability was assigned a severity score of ten out of ten. The highest possible rating.

Security expert Alex Stamos put the stakes in plain language.

This could have been the most widespread and effective backdoor ever planted in any software product. It would have given its creators a master key to any of the hundreds of millions of computers around the world that run SSH.

A master key to hundreds of millions of machines. Caught because one engineer refused to ignore five hundred milliseconds.

The Weakest Link

The technical sophistication of the backdoor was extraordinary. But the way the attacker got into position to plant it was not technical at all.

Lasse Collin had maintained XZ Utils since around two thousand five. He built it into one of the most critical compression libraries on Linux. It compresses kernel images, package archives, log files. If you run Linux, you use it. Collin maintained it alone, in his spare time, as what he called an unpaid hobby project.

By two thousand twenty-two, he was open about the toll it was taking.

I have not lost interest but my ability to care has been fairly limited mostly due to longterm mental health issues but also due to some other things. It is also good to keep in mind that this is an unpaid hobby project.

A solo maintainer of critical infrastructure, burning out in public, and the system had no mechanism to help. No funding. No co-maintainer pipeline. No organizational support. Just one person and a library that millions of machines depended on.

Then the pressure started. Accounts that security researchers later identified as likely sock puppets began complaining on the mailing list about the slow pace of development. An account called Jigar Kumar pushed hardest.

Progress will not happen until there is new maintainer.

Two weeks later, Kumar made the suggestion even more explicit, directly invoking Jia Tan by name.

Jia, I see you have recent commits. Why can you not commit this yourself?

Another account, Dennis Ens, pushed in the same direction. The pattern was coordinated. Complain about the pace. Praise Jia Tan. Suggest, sometimes subtly and sometimes not, that Jia Tan should be given more responsibility. Whether these accounts were operated by the same group behind Jia Tan has never been proven. But the timing and the message aligned perfectly. They were manufacturing the appearance of community frustration and pointing it at the person they wanted installed as maintainer.

It worked. Collin brought Jia Tan in. By October two thousand twenty-two, Jia Tan was added to the project's GitHub organization. By the end of November, the bug report email was changed to a shared alias that went to both Collin and Jia Tan. By December, Jia Tan had direct commit access to the repository. By March two thousand twenty-three, they were building and releasing official versions. The transition from stranger to trusted co-maintainer took less than eighteen months.

In total, Jia Tan made at least four hundred and fifty commits to the XZ repository. The vast majority of them were genuine improvements. Bug fixes, code cleanups, test enhancements. Only eight commits were malicious. Everything else was the price of entry, two and a half years of real work to earn the position from which the real work could begin. The entire attack surface was one exhausted human being.

What Did Not Catch It

Here is the part that should worry the entire software industry. The XZ backdoor was not caught by any of the systems designed to prevent exactly this kind of attack. It was not caught by code review. It was not caught by automated scanning. It was not caught by distribution maintainers inspecting packages before release. It was not caught by any security tool, any framework, any process, any audit.

It was caught by a PostgreSQL developer who happened to be micro-benchmarking on a machine that happened to have the compromised version installed, who happened to notice five hundred milliseconds of extra latency, and who happened to be the kind of person who could not let it go.

That is luck, not process, and the margin was as thin as luck always is. And the XZ attack was not even the first time the open source supply chain had been exploited this way. In two thousand eighteen, a developer named Dominic Tarr handed over maintenance of the event-stream JavaScript package to a stranger who had offered to help. The stranger inserted code that stole cryptocurrency wallets. The attack was cruder than XZ but the pattern was identical. A burned-out maintainer, a helpful stranger, a betrayal of trust. Season one of this series covered event-stream in episode twelve. Six years later, the same pattern worked again, just with more patience and far more technical sophistication.

The supply chain can be attacked in other ways too. In two thousand twenty-one, a security researcher named Alex Birsan published research describing dependency confusion. The concept is deceptively simple. Many companies use internal packages, libraries they built for their own use, hosted on private registries. Those internal packages often have names that do not exist on public registries. If an attacker publishes a malicious package to a public registry like npm or PyPI using the same name as a company's internal package, the company's build system might pull from the public registry instead of the private one and install the attacker's code. Birsan tested this against thirty-five major technology companies, including Microsoft, Apple, and PayPal. It worked against more than half of them. He did not need to spend two years building trust. He did not need a social engineering campaign. He just needed a package name and a public registry.
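
The resolution flaw is easy to model. This hypothetical Python sketch shows a naive resolver that consults both registries and prefers the higher version number, the behavior the dependency confusion research exploited, followed by a scoped lookup that closes the hole. Registry contents and the package name are invented:

```python
# Sketch of the resolution flaw behind dependency confusion. Package and
# registry contents here are hypothetical.

PRIVATE_REGISTRY = {"acme-internal-auth": "1.2.0"}   # company's own package
PUBLIC_REGISTRY = {}                                  # attacker publishes here

def resolve_naive(name):
    """Flawed: consults both registries and takes the highest version,
    so a public package can shadow a private one with the same name."""
    candidates = []
    if name in PRIVATE_REGISTRY:
        candidates.append(("private", PRIVATE_REGISTRY[name]))
    if name in PUBLIC_REGISTRY:
        candidates.append(("public", PUBLIC_REGISTRY[name]))
    # highest version wins, regardless of source
    return max(candidates, key=lambda c: tuple(map(int, c[1].split("."))))

print(resolve_naive("acme-internal-auth"))  # ('private', '1.2.0') — fine so far

# Attacker publishes the same name publicly with an inflated version.
PUBLIC_REGISTRY["acme-internal-auth"] = "99.0.0"
print(resolve_naive("acme-internal-auth"))  # ('public', '99.0.0') — hijacked

def resolve_scoped(name):
    """The fix most ecosystems adopted: internal names resolve only
    against the private registry, never the public one."""
    if name in PRIVATE_REGISTRY:
        return ("private", PRIVATE_REGISTRY[name])
    return ("public", PUBLIC_REGISTRY[name])

print(resolve_scoped("acme-internal-auth"))  # ('private', '1.2.0')
```

The fix is not cryptography. It is simply refusing to let a public name shadow a private one.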

There is also typosquatting, where an attacker publishes a malicious package with a name almost identical to a popular one. A developer who types "reqeusts" instead of "requests" in their dependency file installs the attacker's package. It sounds trivial. It works often enough that package registries now actively scan for it.
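
The registry-side check can be sketched with a plain edit-distance comparison. The popular-package list and the threshold here are illustrative, not any registry's actual policy:

```python
# Sketch of a typosquat check: flag a new package name that is within a
# couple of edits of a popular existing name. Uses plain Levenshtein distance.

def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

POPULAR = ["requests", "numpy", "pandas", "flask"]      # illustrative list

def looks_like_typosquat(name, popular=POPULAR, threshold=2):
    """Flag names within `threshold` edits of a popular package,
    excluding exact matches (which are just the real package)."""
    return any(0 < edit_distance(name, p) <= threshold for p in popular)

print(looks_like_typosquat("reqeusts"))    # True  — two letters swapped
print(looks_like_typosquat("requests"))    # False — exact match is fine
print(looks_like_typosquat("leftpadder"))  # False — not close to anything
```

Real registries layer more signals on top, but the core idea is exactly this: suspicious proximity to a name people already trust.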

The supply chain has many weak points. The XZ attack hit the most human one.

Building Walls Around the Chain

After the XZ backdoor, the security community did not just wring its hands. Real work began, and some of it had been in progress for years before XZ made the urgency undeniable. Three efforts stand out.

The first is SLSA, pronounced "salsa." Supply-chain Levels for Software Artifacts. Google proposed the framework in two thousand twenty-one, and it is now maintained by the Open Source Security Foundation. SLSA defines four levels of supply chain security, each more rigorous than the last. Level one requires that a build system automatically generates provenance, a record of how a software artifact was built, what went in, and what came out. Level two requires that the provenance is cryptographically signed. Level three requires that the build process itself is hardened, that no single person can tamper with it. Level four requires two-person review for releases and fully reproducible builds.
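
What "generating provenance" means at the lowest level can be sketched in Python. The field names below are illustrative, not the real SLSA or in-toto schema; the point is only that the build records digests of its inputs and outputs so anyone can check them later:

```python
import hashlib

# Minimal sketch of provenance at SLSA level one: the build system records
# what went in, what came out, and how. Field names are illustrative.

def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

def build(source: bytes) -> bytes:
    """Stand-in for a real compiler or packager: any deterministic transform."""
    return b"BUILT:" + source

def build_with_provenance(source: bytes, builder_id: str):
    artifact = build(source)
    provenance = {
        "builder": builder_id,
        "source_digest": sha256_hex(source),
        "artifact_digest": sha256_hex(artifact),
        "build_command": "build(source)",
    }
    return artifact, provenance

def verify(artifact: bytes, provenance: dict) -> bool:
    """Anyone can later check an artifact against the recorded digest."""
    return sha256_hex(artifact) == provenance["artifact_digest"]

artifact, prov = build_with_provenance(b"source tree v1", "ci.example/builder")
print(verify(artifact, prov))                  # True  — artifact matches record
print(verify(artifact + b"tampered", prov))    # False — a swap is detectable
```

Higher SLSA levels add a cryptographic signature over that record and harden the machine that produces it, but the record itself is this simple.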

The XZ backdoor would have been detectable at SLSA Level three. The attack succeeded because one person could modify the build artifacts, the tarballs distributed to Linux distributions, without anyone else reviewing the change. At Level three, the build process would be isolated from individual maintainers, and any modification would leave a verifiable trace. At Level four, a second pair of eyes would have been required before the release went out.

The problem, of course, is adoption. Most open source projects operate at SLSA Level zero. No provenance. No signing. No verified builds. Just a tarball published by whoever has the keys.

The second effort is Sigstore, an open source project that provides free code signing. Traditionally, signing software releases required managing cryptographic keys, a complex and error-prone process that most individual maintainers skip. Sigstore eliminates the key management burden through something called keyless signing. A developer authenticates through an identity provider, Sigstore issues a short-lived certificate, and the signature is recorded in a public transparency log called Rekor. Anyone can verify the signature later without needing the original key.

Think of it as a public notary for software. You prove who you are, the notary stamps your package, and the stamp goes into a permanent public record. If someone later claims to have released version two point three of your library, anyone can check the record. Sigstore does not solve the trust problem, it does not tell you whether the person signing the package is trustworthy, but it creates accountability. If a signed package turns out to be malicious, you know exactly who signed it and when. That alone changes the calculus for an attacker.
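
Rekor's real design uses a Merkle tree, but the core append-only property can be sketched with a simple hash chain: each entry's hash covers the previous one, so rewriting history breaks everything after it. This is an illustration of the idea, not Rekor's implementation:

```python
import hashlib

# Toy append-only transparency log: each entry's hash covers the previous
# entry's hash, so tampering with history is immediately detectable.

def entry_hash(prev_hash: str, record: str) -> str:
    return hashlib.sha256((prev_hash + record).encode()).hexdigest()

class TransparencyLog:
    def __init__(self):
        self.entries = []          # list of (record, hash) pairs

    def append(self, record: str) -> str:
        prev = self.entries[-1][1] if self.entries else ""
        h = entry_hash(prev, record)
        self.entries.append((record, h))
        return h

    def audit(self) -> bool:
        """Recompute the whole chain; any rewrite breaks it."""
        prev = ""
        for record, h in self.entries:
            if entry_hash(prev, record) != h:
                return False
            prev = h
        return True

log = TransparencyLog()
log.append("maintainer@example signed mylib v2.3")
log.append("maintainer@example signed mylib v2.4")
print(log.audit())                                   # True — chain intact

# An attacker quietly rewrites an old entry...
log.entries[0] = ("maintainer@example signed mylib v6.6", log.entries[0][1])
print(log.audit())                                   # False — chain broken
```

That is the accountability property in miniature: you cannot quietly change what was signed, only visibly break the record.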

The third effort is reproducible builds. The idea is simple in principle and extraordinarily difficult in practice. If the same source code, built with the same tools, in the same environment, always produces the same binary output, byte for byte, then anyone can verify that a compiled package actually came from the source code it claims to come from.

Debian has been working on reproducible builds for over a decade. NixOS was designed around the concept from the beginning. The Reproducible Builds project tracks progress across distributions, and as of two thousand twenty-four, over ninety percent of packages in Debian can be reproduced. The goal is to eliminate the gap between source code and compiled binary, the exact gap that the XZ attacker exploited.

Remember, the XZ backdoor was hidden in the distributed tarballs, not in the Git source code. If distributions had been verifying that the tarball produced the same binary as the source repository, the discrepancy would have been visible. The tarball contained files that the repository did not. A reproducible build system would have flagged that difference. The attack worked precisely because nobody was checking whether the thing they downloaded matched the thing the repository said it should be.
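
Both checks, building twice and comparing digests, and diffing a tarball's file list against the repository, can be sketched with Python's standard library. The file names echo the episode, but the mechanics here are deliberately simplified:

```python
import hashlib
import io
import tarfile

# Sketch of the checks reproducible builds enable: identical inputs must
# yield byte-identical outputs, and a release tarball's contents can be
# diffed against the source tree it claims to come from.

def digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

repo_files = {"src/lzma.c", "tests/bad-3-corrupt_lzma2.xz", "Makefile"}

def make_tarball(names) -> bytes:
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name in sorted(names):       # sorted order: deterministic layout
            info = tarfile.TarInfo(name)
            info.mtime = 0               # fixed timestamp: reproducible
            tar.addfile(info, io.BytesIO(b""))
    return buf.getvalue()

# Build twice from the same inputs: the digests must match exactly.
print(digest(make_tarball(repo_files)) == digest(make_tarball(repo_files)))

# A release tarball carrying an extra, repo-absent file is detectable.
release = make_tarball(repo_files | {"build-to-host.m4"})
with tarfile.open(fileobj=io.BytesIO(release)) as tar:
    extra = set(tar.getnames()) - repo_files
print(extra)                             # {'build-to-host.m4'}
```

The second check is the one that matters for XZ: the malicious build script lived only in the tarballs, and a listing diff like this would have surfaced it.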

Together, SLSA, Sigstore, and reproducible builds form the outline of a trustworthy supply chain. Signed provenance so you know who built what. Hardened build systems so no single person can tamper with the output. Reproducible builds so anyone can verify the result. The pieces exist. The challenge is getting the millions of packages in the world's registries to adopt them. And the challenge underneath that challenge is the one that runs through this entire season of Git Good. Who pays for the work?

The AI Amplifier

There is a new variable in the supply chain equation, and it is growing fast.

AI-generated code is entering repositories at an accelerating rate. GitHub Copilot, code generation models, automated pull request bots. The volume of code being contributed to open source is increasing, and an increasing fraction of it is written by machines. This changes the supply chain calculus in ways that are not yet fully understood.

Consider the Jia Tan pattern. A patient attacker builds trust over years through legitimate contributions. An AI could generate those contributions faster, at higher volume, across more projects simultaneously. An attacker using AI assistance could maintain convincing activity across dozens of open source projects at once, building trust in each one, waiting for the right moment. The attack surface is not one burned-out maintainer. It is every burned-out maintainer.

On the defense side, AI also struggles with the XZ pattern. The backdoor was hidden in binary test files and activated through a build script. Current code review tools, including AI-powered ones, focus on source code changes. They read diffs. They flag suspicious patterns in code. But they do not inspect binary blobs. They do not trace build system modifications through to their runtime effects. The XZ backdoor was specifically designed to evade the kind of review that AI tools are best at. It was a test of whether the review process looks at what matters, and the answer was no.

There is a more fundamental issue. AI code review creates a false sense of security. A project that runs AI analysis on every pull request can truthfully say that all changes are reviewed. But "reviewed" and "understood" are not the same thing. An AI that flags common vulnerability patterns will miss a novel attack hidden in a legitimate feature. The XZ backdoor survived human code review too, but at least human review could in theory follow the chain of reasoning from build script to runtime behavior to authentication bypass. AI review, as it exists today, checks patterns against known vulnerability databases. Checking boxes is not the same as catching a patient, creative adversary who is specifically designing around whatever the boxes measure.

The episode thirty-three finding applies here too. That episode showed that repositories using AI coding assistants had forty percent more credential leaks than repositories without them. The same dynamic extends to the supply chain. AI coding assistants produce more code, faster. That code pulls in more dependencies. Those dependencies pull in more transitive dependencies. The chain gets longer and deeper with every generated function. And the tools designed to manage the chain were already struggling to keep up before AI entered the picture.

There is an irony worth naming. The companies building AI coding tools are the same companies whose platforms host the open source supply chain. They are simultaneously making the chain more complex and trying to sell tools to manage that complexity. Whether the tools can keep pace with the complexity they help create is an open question. The XZ attack suggests the answer is not yet.

The Impossible Ask

Strip away the frameworks and the tools and the acronyms and the supply chain problem comes down to one sentence. We are asking volunteers to be the last line of defense for the world's critical infrastructure.

Lasse Collin maintained XZ Utils alone, unpaid, for nearly two decades. His library was embedded in every major Linux distribution. It compressed the packages that updated operating systems, the logs that tracked server health, the archives that stored backups. Billions of devices depended on his work. And the system's response to that dependency was nothing. No funding. No support. No co-maintainer. No organizational home. Just gratitude, occasionally, when things worked, and blame, immediately, when things broke.

This is not just the XZ story. OpenSSL, the cryptography library that secured most of the internet's encrypted connections, was maintained by a handful of volunteers when the Heartbleed vulnerability was discovered in two thousand fourteen. The entire budget for the project was roughly two thousand dollars a year in donations. After Heartbleed, the Linux Foundation created the Core Infrastructure Initiative to fund critical open source projects. It helped. OpenSSL got proper funding. But the fundamental dynamic did not change. Most critical open source infrastructure is still maintained by people who are not paid to do it.

A two thousand twenty study by the Linux Foundation and the Laboratory for Innovation Science at Harvard identified the most commonly used free and open source software components in production applications. The list included dozens of small libraries maintained by one or two people, some of whom had not made a commit in months. These were not obscure projects. They were dependencies of dependencies, deep in the supply chain, present in millions of applications. The developers who wrote them often had no idea how widely their code was used.

The SLSA framework, at Level three and above, requires hardened build systems and multi-person review. That requires more than one person. For thousands of small but critical libraries, there is not even one person being paid to maintain them, let alone two. The framework describes what a secure supply chain looks like. It does not solve the problem of who pays for it.

Episode forty of this series will go deeper into the invisible maintainers, the people holding the internet together on nights and weekends. The XZ backdoor is the sharpest illustration of why their invisibility is dangerous. The attacker did not find a technical vulnerability. They found a human one. A person who was alone, exhausted, and grateful when someone offered to share the weight.

After the backdoor was discovered and removed, Lasse Collin resumed maintaining XZ Utils. The repository was eventually restored. Collin was a victim, not a perpetrator, a fact that bears repeating because the internet's initial reaction included blame directed at the person who had been targeted by what appears to have been a professional intelligence operation. The lesson is not that Collin should have been more careful. The lesson is that a system where one person, unpaid, is the sole guardian of infrastructure used by billions is a system built to fail.

The Next Five Hundred Milliseconds

The XZ backdoor was assigned CVE two thousand twenty-four dash three zero nine four. Severity: ten out of ten. Affected versions: five point six point zero and five point six point one of XZ Utils. Impact: potential remote code execution on any Linux system running a patched OpenSSH server with the compromised library. The response was swift, the damage was contained, and the internet continued to function. Because one person noticed, and because that person refused to ignore half a second of unexplained latency.

The security community responded with urgency. New scrutiny fell on single-maintainer projects. The OpenSSF increased funding for critical infrastructure audits. SLSA adoption accelerated. Sigstore integration became a priority for major package registries. The XZ incident became a case study taught in every supply chain security presentation for the next two years.

But here is the thing that keeps security researchers awake. The XZ backdoor was caught by accident. Not by design. Not by a framework. Not by a tool. Not by a process. By one engineer's refusal to ignore half a second.

The defenses being built, SLSA, Sigstore, reproducible builds, are real and valuable. They make the supply chain harder to attack. They raise the bar. They create verification where before there was only trust. But they are being adopted slowly, unevenly, and the attack surface is growing faster than the defenses. Every new package, every new dependency, every new link in the chain is another point where trust is required and verification is optional.

The next Jia Tan is already out there. Security researchers say this with the weary certainty of people who study patterns for a living. The XZ attack was not the first supply chain compromise and it will not be the last. They might be submitting their first helpful patch right now, to a library maintained by someone who is tired and alone and grateful for the help. They might be generating those patches with AI, maintaining plausible activity across ten projects instead of one. They might be more careful about performance. They might not leave a five hundred millisecond fingerprint. They might be more patient. They might target a smaller library that nobody monitors. They might skip the build system entirely and find a new angle that nobody has thought to defend against yet.

And the question the XZ incident poses is not whether the supply chain can be secured. It can, in theory. SLSA Level four, with signed provenance, hardened builds, reproducible outputs, and multi-person review, would catch most of what the XZ attacker did. The question is whether the world is willing to pay for it. Whether the companies that profit from open source infrastructure are willing to fund the people who maintain it. Whether the frameworks that describe security will ever be matched by the resources to implement them.

Season one of Git Good told you how the tool was built. This season has been showing you what it built. And in Act four, The Shadow, you have seen the dark side of permanence. Secrets that never die. Blame as a weapon. Identity as a trust problem. And now the supply chain, where every link is someone else's trust and every break is everyone's problem.

Git records everything faithfully. Every contribution. Every commit. Every release. It cannot tell you whether the person behind the commit deserves your trust. That question is as old as human civilization, and it still has no technical answer.

Half a second. That was the margin. Next time, there might not be a margin at all. That was episode thirty-six of Git Good.

Git verify-tag checks whether a release tag was actually signed by who it claims to be. When a maintainer tags a release, they can sign it with their cryptographic key. Git verify-tag confirms that the signature is valid and that the tag has not been tampered with. In a world where the most sophisticated supply chain attack in open source history was committed through the official release process, this is the command that asks the one question worth asking: did this release actually come from the person it claims to come from? Most people never run it. After the XZ incident, more people should.
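
For anyone who wants to start running it, a minimal wrapper looks like this. It simply shells out to git verify-tag and reports the exit code; an unsigned, missing, or tampered tag fails. The repository path and tag name are whatever you point it at:

```python
import subprocess

# Thin wrapper around `git verify-tag`. The command exits zero only when
# the tag carries a valid signature that git can verify against a known
# key; unsigned, missing, or tampered tags exit nonzero.

def verify_release_tag(tag: str, repo_path: str = ".") -> bool:
    result = subprocess.run(
        ["git", "-C", repo_path, "verify-tag", tag],
        capture_output=True, text=True,
    )
    return result.returncode == 0

# Usage (hypothetical path): verify_release_tag("v5.6.0", "/path/to/xz")
# False means: do not trust this release until you know why.
```

One function call per release. After the XZ incident, that is a small price for the one question worth asking.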