On a Friday evening in late March two thousand twenty-four, a software engineer named Andres Freund was running benchmarks on his PostgreSQL database. Freund worked at Microsoft, but his real passion was PostgreSQL, the open source database he had been contributing to for over a decade. He was testing on a machine running Debian Sid, the bleeding-edge development version of Debian Linux, and he noticed something strange. His SSH logins were taking about half a second longer than they should.
Half a second. Most people would not notice. Most people who noticed would shrug. A slow network, a loaded server, a background process eating cycles. But Andres Freund was the kind of engineer who noticed half a second and could not let it go.
He started digging. The SSH daemon was using an unusual amount of CPU during authentication. That was wrong. Authentication should be fast. He ran the connections through Valgrind, a tool that watches how programs use memory, and Valgrind threw errors. Something was corrupting memory during the login process.
Freund traced the problem to a library called liblzma, part of the XZ Utils compression package. This made no sense. A compression library should have nothing to do with SSH authentication. There was no legitimate reason for it to be running code during a login handshake.
Unless someone had put code there on purpose.
What Freund had stumbled onto, by accident, because he was the kind of person who could not ignore half a second, was the most sophisticated supply chain attack ever attempted against open source software. An attack that had been two and a half years in the making. An attack that, if it had succeeded, would have given an unknown actor a secret backdoor into virtually every Linux server connected to the internet.
And the story of how that attack was built is a story about Git. Not about Git's code, or its cryptography, or its hash functions. About Git's trust model. Because Git can verify that a commit came from a specific key. But it cannot verify that the person behind the key deserves your trust.
On October twenty-ninth, two thousand twenty-one, a developer using the name Jia Tan submitted a patch to the xz-devel mailing list. The patch was unremarkable. A small improvement, the kind of contribution that open source projects receive every day. Nobody paid much attention.
By February two thousand twenty-two, Jia Tan had their first commit merged into the XZ Utils repository. Then another. Then another. Over the following months, the contributions kept coming. Bug fixes. Code cleanups. Test improvements. Documentation updates. Nothing flashy, nothing suspicious. Just steady, competent, helpful work. In total, Jia Tan would make at least four hundred and fifty commits to the XZ repository.
This is what patience looks like when it is weaponized. Jia Tan was not trying to sneak malicious code into the project. Not yet. They were building something far more valuable than a code change. They were building trust.
Meanwhile, the person they needed trust from was struggling.
Lasse Collin had been the sole maintainer of XZ Utils since around two thousand five. He had built liblzma into a critical piece of infrastructure, the compression library that virtually every Linux distribution depends on. It compresses kernel images, package archives, log files. If you use Linux, you use liblzma, whether you know it or not.
But maintaining critical infrastructure as a solo volunteer takes a toll. By twenty twenty-two, Collin was open about the fact that he was not doing well.
My ability to care has been fairly limited mostly due to longterm mental health issues but also due to some other things.
If you listened to episode twelve of this series, about maintainer burnout and the event-stream attack, this should sound familiar. A solo maintainer of critical infrastructure, burning out under the weight of responsibility nobody asked them to carry and nobody pays them for. The event-stream attack in two thousand eighteen exploited the same vulnerability, not in the code, but in the human. A burned-out maintainer handed the keys to a stranger who seemed helpful.
The XZ attack was event-stream scaled up to something that looks like a professional intelligence operation.
Here is where the social engineering gets precise. In the summer of twenty twenty-two, accounts that appear to be sock puppets started pressuring Collin on the mailing list. An account called Jigar Kumar complained about the pace of development.
Progress will not happen until there is new maintainer.
Another message from Kumar, two weeks later, made the nudge explicit.
Jia, I see you have recent commits. Why can't you commit this yourself?
Another account, Dennis Ens, pushed in the same direction. The pattern was consistent across both accounts. Complain about the slow pace of maintenance. Suggest, sometimes subtly and sometimes not, that Jia Tan should be given more responsibility. Create the impression that the community was frustrated and that bringing Jia Tan on as a co-maintainer was the obvious solution.
Whether these accounts were operated by the same person or group behind Jia Tan has never been proven. But the timing and the message were coordinated. And the target, a burned-out maintainer who had already said publicly that he was struggling, was chosen with care.
And it worked.
In May twenty twenty-two, Collin wrote on the mailing list that Jia Tan had been helping off-list and might have a bigger role in the future. By October twenty-eighth, Jia Tan was added to the Tukaani GitHub organization. By November thirtieth, the bug report email was changed to a shared alias that went to both Collin and Jia Tan, and the project README was updated to list both as maintainers. By December thirtieth, Jia Tan had direct commit access to the repository.
In March twenty twenty-three, Jia Tan released XZ Utils version five point four point two. Their first release. A trusted co-maintainer now, with the keys to one of the most widely deployed compression libraries on the planet.
Two years of patient, helpful, boring contributions. Two years of building a reputation commit by commit. Two years of waiting for the right moment to use the trust they had earned.
The backdoor itself was a masterpiece of concealment. It was not hidden in the source code. If you read every line of C in the XZ Utils repository, you would not find it. Instead, it was buried in binary test files, files that looked like compressed test data but contained obfuscated malicious code that would be extracted and compiled during the build process.
In June twenty twenty-three, another account, this one using the name Hans Jansen, contributed code that added a feature called ifunc support to the build system. This code was innocent on its own. But it would later be exploited as part of the backdoor mechanism, a piece placed on the board months before the endgame.
On February twenty-third, two thousand twenty-four, Jia Tan committed the malicious test files. On February twenty-fourth, they released XZ Utils version five point six point zero.
Here is what the backdoor did. When a Linux system was built with the compromised version of liblzma, and when that system ran an SSH server, the backdoor would intercept the authentication process. An attacker holding a specific cryptographic key, an Ed four forty-eight private key, could send a specially crafted authentication request. The backdoor would recognize the key, bypass all authentication, and grant the attacker remote code execution on the machine.
Not just access. Full remote code execution. Before the SSH authentication even completed. The vulnerability was assigned a severity score of ten out of ten, the highest possible rating.
Think about what that means. SSH is how system administrators connect to servers. It is how cloud infrastructure is managed. It is how millions of Linux servers worldwide are accessed remotely. A backdoor in the SSH authentication chain, silently deployed through a routine library update, would have been catastrophic. Every SSH-enabled Linux server that updated to the compromised version would have been vulnerable. And nobody would have known, because the backdoor was designed to be invisible. It added half a second of latency and a small amount of extra CPU usage. That was the only trace.
On March ninth, Jia Tan released version five point six point one, which patched some of the anomalous behavior that Valgrind had detected in the first version. They were fixing the bugs in their own backdoor, making it harder to detect. Refining the weapon after its first test flight.
The compromised versions had already been picked up by Debian Sid and Fedora Rawhide, the unstable development branches of two major Linux distributions. They were making their way toward the stable releases that run on production servers worldwide. The window between insertion and mass deployment was closing.
And then Andres Freund noticed that his SSH logins were half a second slow.
Freund is a principal software engineer at Microsoft and one of the most respected PostgreSQL developers in the world. He is, by all accounts, the kind of person who reads database source code for relaxation and who treats unexpected latency as a personal affront.
When he traced the SSH slowdown to liblzma, he did not file a bug report and move on. He started reverse-engineering what the library was actually doing during SSH authentication. And what he found made no sense, unless the library had been deliberately tampered with.
On March twenty-eighth, twenty twenty-four, Freund privately reported his findings. A CVE number was assigned the same day. The next day, March twenty-ninth, he posted his full analysis to the oss-security mailing list, the public forum where security researchers disclose vulnerabilities.
After observing a few symptoms I started to investigate. The upstream xz repository and the xz tarballs have been backdoored. At first I thought this was a compromise of Debian's package, but it turns out to be upstream.
That last sentence is the one that sent shockwaves through the industry. Not a compromised downstream package. Upstream. The source itself.
The response was immediate. Within hours, GitHub suspended the XZ Utils repository and Jia Tan's account. Debian shut down its build systems and began rebuilding from trusted sources. Red Hat and SUSE issued emergency advisories. Every major Linux distribution scrambled to check whether the compromised versions had reached their stable releases.
And then Jia Tan vanished. The account went silent. The email addresses stopped responding. Whoever had spent two and a half years building trust, patiently contributing to an open source project, socially engineering a burned-out maintainer, and carefully constructing a backdoor that could have compromised the global internet, simply disappeared.
To this day, nobody knows who Jia Tan is. The sophistication and patience of the attack, combined with the specific target, an SSH authentication bypass activatable with a single cryptographic key, has led many security researchers to conclude that this was a state-sponsored operation. The level of effort, two and a half years of sustained social engineering and technical work, is not the profile of a lone actor looking for a payday. This looks like intelligence work. But there is no proof. Jia Tan is a ghost.
And here is the part that should keep you up at night. This attack was discovered by accident. Andres Freund was not looking for a backdoor. He was benchmarking a database and noticed a performance anomaly that should not have existed. If the backdoor had been slightly more efficient, if it had not added that half second of latency, if Freund had not been running his tests on a machine with the compromised version, if he had been the kind of engineer who shrugged off small delays, the compromised library would have shipped to every major Linux distribution's stable release. Millions of servers. Silent, undetectable access for whoever held the key.
The security of the internet's infrastructure came down to one engineer's refusal to ignore half a second.
The XZ backdoor exploited trust in the supply chain, the path code takes from a developer's machine to your server. But there is another kind of security problem that Git creates, one that is far more common and far less dramatic but no less dangerous.
People accidentally commit secrets.
Database passwords. API keys. Private encryption keys. Cloud credentials. Authentication tokens. Every day, developers push files to Git repositories that contain sensitive information they never intended to share. Sometimes it is a configuration file with a hardcoded password. Sometimes it is a dotenv file that should have been in the gitignore but was not. Sometimes it is a private key generated for testing that ended up in a commit and was pushed to a public repository on GitHub.
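Prevention is cheaper than cleanup. Here is a minimal sketch of the gitignore pattern that keeps a dotenv file out of the repository in the first place; the directory and file names are common conventions, not requirements, and the credential is fake.

```shell
# Sketch: keep local credential files out of Git before they ever land in history.
mkdir app && cd app && git init -q
printf '.env\n*.pem\n' > .gitignore
echo "API_KEY=not-a-real-key" > .env
git add .
git status --short     # .env is ignored; only .gitignore is staged
```

The point is ordering: the ignore rule has to exist before the first `git add`, because once a file is tracked, ignoring it later does nothing.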
According to GitGuardian, a company that scans public repositories for leaked credentials, nearly thirteen million secrets were exposed on public GitHub repositories in twenty twenty-three alone. That is a twenty-eight percent increase from the previous year. By twenty twenty-four, the number had grown to almost twenty-four million. Seven in every thousand commits on GitHub contains at least one secret. Nearly five percent of all active repositories have exposed at least one credential.
Those numbers are staggering. But the truly alarming statistic is what happens after a secret is exposed. GitGuardian sends alert emails when it detects leaked credentials. Ninety percent of exposed secrets remain active five days after notification. Only two point six percent are revoked within the first hour.
This is where Git's design, the same design we celebrated in episodes four and five, becomes a security liability. Recall how Git stores everything as content-addressed objects in a permanent history. When you commit a file, Git creates a snapshot. When you delete that file in the next commit, Git creates another snapshot. But the first snapshot, the one containing the secret, still exists in the object store. Browsing the current state of the project will not show you the deleted file. But the data is there, in the repository's history, recoverable by anyone who clones the repository and knows where to look.
Deleting a file does not delete it from Git's history. It removes it from the current snapshot. The previous snapshot still has it. And every clone of the repository has every snapshot. This is exactly the data preservation we praised when we talked about the reflog in episode fourteen. Git almost never loses anything. Until the thing you want to lose is an AWS key that grants access to your production database.
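This is easy to demonstrate with a throwaway repository. Everything here is hypothetical: the repo name, the file, and the fake credential.

```shell
# Sketch: a "deleted" secret is still recoverable from an earlier snapshot.
mkdir demo && cd demo && git init -q
git config user.name demo
git config user.email demo@example.com
echo "AWS_SECRET=hunter2" > .env
git add .env && git commit -qm "add config"
git rm -q .env && git commit -qm "remove secret"
# Gone from the current snapshot...
git ls-files            # prints nothing
# ...but the previous snapshot still has it:
git show HEAD~1:.env    # prints AWS_SECRET=hunter2
```

One `git show` against the parent commit, and the secret is back. Anyone with a clone can do the same.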
Tools exist to rewrite Git history and truly remove sensitive data. The command git filter-repo can walk through every commit in a repository and excise specific files or patterns. But this is a destructive operation. It changes every commit hash from the point of the removal forward, which means every collaborator's clone is now out of sync with the repository. For large projects with many contributors, rewriting history is painful and sometimes impractical. In practice, the correct response is to assume the secret is compromised the moment it is pushed and rotate it immediately. Change the password. Revoke the API key. Generate new credentials. Treat the old ones as burned.
The irony is sharp. Git's design is brilliant for preventing accidental data loss. Content-addressing means nothing gets silently overwritten. The reflog means nothing gets truly forgotten. Every clone is a full backup. These are features. Until the data you want to lose is a secret that should never have been committed in the first place.
So Git's permanent history makes leaked secrets hard to truly erase. And its social trust model, where anyone with commit access can push code, made the XZ attack possible. Is there a technical solution?
Partly. Git has supported cryptographic commit signing since two thousand twelve. The idea is straightforward. When you create a commit, you sign it with your private key. Anyone who has your public key can verify that the commit came from you and has not been tampered with. Running git log with the show-signature flag displays the verification status of each commit. Running git verify-commit checks a specific commit's signature against known keys. Running git verify-tag does the same for tagged releases.
GitHub shows a green "Verified" badge next to signed commits. It is a small visual indicator that someone has cryptographically proven authorship. You can sign with a GPG key, the traditional approach that requires managing a separate keychain and understanding public key infrastructure, or with an SSH key, a simpler option that GitHub added in twenty twenty-two. Since most developers already have an SSH key for pushing to GitHub, the SSH signing path removes one of the biggest barriers to adoption.
In theory, widespread commit signing would close part of the trust gap. If every commit to XZ Utils had been signed with a verified key, and if distribution maintainers had verified those signatures before building packages, the backdoor would have been traceable to a specific cryptographic identity. It would not have prevented the attack. Jia Tan had legitimate commit access and could have signed commits with their own key. But it would have created a stronger chain of evidence, and it would have made certain types of tampering harder to pull off undetected.
In practice, almost nobody signs their commits. The exact adoption rate is difficult to measure because GitHub does not publish global statistics on signing. But browse any major open source project and count the green "Verified" badges. They are rare. Most commits are unsigned. The friction of setting up keys, the complexity of managing them across machines, the fact that unsigned commits work perfectly fine and nobody stops you from pushing them, all conspire to make signing something that security-conscious developers do and everyone else ignores.
SSH signing has lowered the barrier. Your existing SSH key can now serve double duty as your signing key. A few lines of Git configuration and you are set. But even with the reduced friction, the cultural shift has been slow. Signing feels like extra work for a problem most developers have never personally experienced. It is a seatbelt in a world where most people have never been in a crash.
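Those few lines of configuration look roughly like this, assuming a key at the common default Ed25519 location; adjust the path to your own key.

```shell
# Sketch: reuse an existing SSH key as a commit-signing key.
# The key path is a common default, not a guarantee it exists on your machine.
git config --global gpg.format ssh
git config --global user.signingkey ~/.ssh/id_ed25519.pub
git config --global commit.gpgsign true
# Anyone verifying your commits needs an allowed-signers file
# mapping identities (usually emails) to public keys:
git config --global gpg.ssh.allowedSignersFile ~/.ssh/allowed_signers
```

With `commit.gpgsign` set, every commit is signed automatically, which is the only way signing survives contact with daily habit.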
This creates a gap between what Git can verify and what Git actually verifies in practice. The cryptographic machinery is there, built in, ready to use. The social adoption is not. And so most of the software supply chain operates on implicit trust. You trust that the person who pushed the commit is who they say they are. You trust that the maintainer reviewed the code carefully. You trust that the binary test files are actually test files. You trust, because verifying is hard and trust is easy and nothing has gone wrong yet.
Until it does.
Here is what the XZ backdoor really taught us. Git's trust model is ultimately social, not technical.
Consider what Git can verify. It can verify that a file's content has not been tampered with, because the content hash would change. It can verify that a commit was signed by a specific cryptographic key. It can verify that the chain of commits has not been rewritten, because every commit includes its parent's hash. The technical machinery is sound. Episode five covered this in depth, how content-addressing creates an unbroken chain of verification from the very first commit.
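You can watch content-addressing work from the command line. The `git hash-object` command computes an object's fingerprint without storing anything, and a one-character change produces a completely different ID.

```shell
# Sketch: the same bytes always hash to the same object ID;
# flipping one character changes the fingerprint entirely.
echo "hello" | git hash-object --stdin
echo "Hello" | git hash-object --stdin
```

The first command prints the same forty-character ID on every machine on Earth; the second prints a different one. That determinism is the entire basis of Git's data-layer integrity.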
Now consider what Git cannot verify. It cannot verify that the person behind a key is trustworthy. It cannot verify that a code review was thorough. It cannot verify that a maintainer was not socially engineered over two years by someone with the patience and resources of a nation-state. It cannot verify intent.
Jia Tan did not hack Git. They did not exploit a buffer overflow or bypass an authentication system. They followed every rule. They submitted patches through the proper channels. They got code reviewed by the existing maintainer. They earned commit access through demonstrated competence over hundreds of contributions. They signed off on releases through the project's established process. Every single step was legitimate, right up until the step that was not.
The XZ attack exploited the fundamental assumption underneath all of open source: that contribution is a signal of good faith. Open source works because strangers help each other. The entire model depends on the idea that someone you have never met can submit a patch, and if the patch is good, you merge it. This is the bazaar model from episode two, the one Eric Raymond championed. No gatekeepers. No credentials check beyond the quality of the code. Anyone can participate. That openness is what makes open source powerful. And it is what makes it vulnerable.
After the XZ incident, the security community debated responses. Require two-person review for all changes to critical libraries. Fund maintainers so they are not burned-out volunteers susceptible to social engineering. Mandate commit signing for all releases. Improve reproducible builds so that compiled packages can be verified against their source code. Each of these addresses part of the problem. None of them solve the fundamental issue, which is that trust is a human judgment, and human judgment can be manipulated by someone willing to invest two and a half years of patience.
Lasse Collin, for his part, resumed maintaining XZ Utils after the backdoor was removed. The project's GitHub repository was eventually restored. Collin was a victim, not a perpetrator, a fact worth stating clearly. He was a solo volunteer maintaining critical infrastructure that millions of machines depend on, and he was targeted by what appears to be a professional intelligence operation exploiting the very thing that makes open source work. The lesson is not that Collin made a mistake by trusting a helpful contributor. The lesson is that a system where one exhausted, unpaid person is the sole gatekeeper for infrastructure used by millions is a system designed to fail.
Episode five explained how Git uses content-addressing to create an unbroken chain of data integrity. Every object has a fingerprint. Change one bit and the fingerprint changes. This makes Git excellent at detecting accidental corruption or unauthorized modification. But the XZ backdoor was not unauthorized. Jia Tan had commit access. The backdoor was committed through the proper process, reviewed, merged, released. The fingerprints faithfully recorded exactly what Jia Tan put into the repository. The cryptography worked perfectly. It verified the data. It could not verify the intent behind the data.
This is where the series arc comes together. Git was born from a crisis of trust, the BitKeeper license revocation in episode three. Linus built it to be trustless at the data layer. Every bit verified, every snapshot fingerprinted, every clone independent. Then GitHub, in episodes ten through twelve, built a social trust layer on top of Git, where reputation, contribution history, and community standing substitute for formal verification. The XZ attack showed what happens when someone games the social layer while the technical layer watches, faithfully recording every commit, because the commits are cryptographically valid even when the intent is malicious.
The hardest problem in software security is not cryptography. Cryptography is the part we know how to do. The hardest problem is deciding who to trust, and that problem is as old as human civilization, and it has no technical solution. There is no algorithm for trustworthiness. There is no hash function for good faith.
Andres Freund noticed half a second. The internet owes him a debt it will probably never fully appreciate. But the next Jia Tan is already out there, somewhere, patiently submitting their first helpful patch to a project maintained by someone who is tired and overwhelmed and grateful for the help.
And Git will faithfully record every commit.
Git verify-commit answers one question: did this commit come from the key it claims to come from? That is all. A valid signature does not mean the code is safe. It does not mean the review was thorough. It means a specific cryptographic identity attached their name to this change. In a world where the most sophisticated attack ever attempted against open source infrastructure was committed through entirely legitimate processes, that thin guarantee is the best the machine can offer. The rest is up to humans.