This is episode forty-two of Git Good. In the spring of two thousand twenty-three, Deb Nicholson sat down to write a blog post that she hoped would not be necessary. Nicholson was the executive director of the Python Software Foundation, the nonprofit that oversees the programming language used by millions of developers, data scientists, and students worldwide. The foundation also runs PyPI, the Python Package Index, a public repository where anyone can upload a Python library and anyone else can download it for free. Over four hundred thousand packages. Billions of downloads per month. The entire thing maintained by a small team on a modest budget.
The European Union had just proposed a new regulation called the Cyber Resilience Act. The stated goal was sensible. Too much software was shipping with known vulnerabilities. Too many internet-connected devices were sold without any promise of security updates. The regulation would require that products with digital elements, meaning anything with software in it, meet specific cybersecurity requirements before being sold in Europe. Companies would be liable for vulnerabilities. There would be fines. Up to fifteen million euros or two and a half percent of global annual turnover, whichever was higher.
Nicholson read the draft and saw a problem. The text drew no clear line between a corporation selling a product and a nonprofit hosting a public repository. Under the proposed language, the Python Software Foundation could potentially be held liable for any product that included Python code. Every web application, every data pipeline, every machine learning model, every script some teenager wrote for a school project. All of it touched Python. All of it flowed through PyPI. And the foundation had never charged a cent for any of it.
The risk of huge potential costs would make it impossible in practice for us to continue providing Python and PyPI to the European public.
She was not being dramatic. She was describing the logical endpoint of a law that assigned liability upstream. If the Python Software Foundation could be sued because a company in Munich shipped a product with a bug in a library downloaded from PyPI, then the foundation had two choices. Spend money it did not have on compliance and legal defense, or stop serving European users entirely. Neither option made anyone more secure.
The Python Software Foundation was not alone in noticing the problem. But the alarm it raised would set in motion a debate that reaches to the heart of what this series has been exploring for forty-one episodes. What happens when the thing Linus Torvalds built to avoid bureaucracy becomes the very thing bureaucracy demands?
The story did not start in Europe. It started in the United States, on a Friday afternoon in May of two thousand twenty-one, when President Biden signed Executive Order fourteen-zero-two-eight. The title was "Improving the Nation's Cybersecurity," and it was a direct response to two incidents that had rattled the American government. The SolarWinds breach, in which Russian intelligence had compromised a widely used network management tool and used it to infiltrate federal agencies. And the Colonial Pipeline ransomware attack, which shut down fuel supplies across the southeastern United States for nearly a week.
The executive order did many things, but one requirement stood out. Any company selling software to the United States federal government would need to provide a software bill of materials. An SBOM. A formal record of every component inside the software, every library it depended on, every dependency of every dependency, all the way down. Think of it as a nutrition label for code. Instead of listing calories and sodium, it lists every piece of software baked into the product, who made it, what version it is, and where it came from.
The National Telecommunications and Information Administration published the minimum requirements two months later. Three areas. Data fields, meaning what information each entry must contain. Automation support, meaning it had to be machine readable, not a PDF someone typed up by hand. And practices and processes, meaning how you generate it, how you share it, how you keep it up to date. Three accepted formats. SPDX from the Linux Foundation. CycloneDX from OWASP. SWID tags from an existing industry standard. The deadline for compliance was September of two thousand twenty-three.
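To make the nutrition-label idea concrete, here is a hedged sketch of what a minimal SBOM might look like in the CycloneDX JSON format. The single component listed, the requests library at version 2.31.0, is an invented illustrative entry, not a real product inventory.

```json
{
  "bomFormat": "CycloneDX",
  "specVersion": "1.5",
  "version": 1,
  "components": [
    {
      "type": "library",
      "name": "requests",
      "version": "2.31.0",
      "purl": "pkg:pypi/requests@2.31.0"
    }
  ]
}
```

The purl field, short for package URL, is the machine-readable pointer that lets tooling trace an entry back to a specific artifact in a specific ecosystem, in this case PyPI.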
Then came the framework. NIST, the National Institute of Standards and Technology, published Special Publication eight hundred dash two eighteen in February of two thousand twenty-two. The Secure Software Development Framework, version one point one. It organized secure development into four practice groups. Prepare the Organization. Protect the Software. Produce Well-Secured Software. Respond to Vulnerabilities. Every software vendor selling to the federal government would need to attest that they followed these practices.
The United States was telling its suppliers: show us what is in your software, and prove you built it securely. Europe looked at that and said: we will go further. We will tell everyone.
The Cyber Resilience Act was proposed by the European Commission in September of two thousand twenty-two. Where the American executive order applied only to government procurement, the CRA would apply to every product with digital elements sold in the European market. Not just to government vendors. To everyone. Hardware manufacturers, software companies, anyone putting a connected device or a software product into the hands of a European consumer. The scope was enormous. And the open source community read the draft text and realized, with growing alarm, that it might include them.
The fight over the open source exemption lasted eighteen months. It was the most consequential policy battle the open source community had ever waged, and most developers never heard about it.
The core question was deceptively simple. If a volunteer writes a library on a Saturday afternoon and publishes it to GitHub, and a corporation downloads that library, bundles it into a product, and sells it to a hospital in Berlin, and the library has a vulnerability that gets exploited, who is responsible?
The corporation, obviously. They chose to use the library. They shipped the product. They charged money for it. They should have tested it.
But the original CRA draft was not so clear. The language focused on who "placed the product on the market," and the definition of market activity was broad enough that hosting a package on a public repository could qualify. The Python Software Foundation was not selling Python. But it was placing Python on the market in the sense that anyone in Europe could download it.
Assigning liability to every upstream developer would create less security, not more.
The argument was sharp and it was correct. If a volunteer maintainer in Argentina could be held liable under European law because a German company used their library, the rational response was to stop publishing software. Not to write more secure software. Not to add vulnerability scanning. Just to stop. Remove the library from the public repository, or block European downloads, or shut down the project entirely. The law would not create better security. It would create less software.
The Open Source Initiative, the Eclipse Foundation, the Linux Foundation, and dozens of smaller organizations mobilized. They did not argue against regulation. They argued that regulation needed to understand how open source actually works. A company that bundles open source code into a commercial product is the one making the commercial decision. The liability should follow the money, not the code.
The European Parliament listened. Slowly. The final text of the Cyber Resilience Act, published as Regulation twenty twenty-four slash twenty-eight forty-seven in the Official Journal of the European Union in November two thousand twenty-four and entering into force that December tenth, included a significant concession. Free and open source software that is not monetized by its developers is not considered a commercial activity under the regulation. If you write code, publish it for free, and do not charge for it, you are generally exempt.
Generally. That word is doing heavy lifting. Because the exemption created a new category in European law, one that had never existed before in any regulatory framework anywhere in the world. The open source steward.
An open source steward, according to the CRA, is a legal person, other than a manufacturer, that has the purpose or objective of systematically providing support on a sustained basis for the development of specific products with digital elements qualifying as free and open source software and intended for commercial activities.
Read that sentence again. It means a foundation. The Apache Software Foundation. The Python Software Foundation. The Eclipse Foundation. The Rust Foundation. The Linux Foundation. These organizations do not sell software. But they systematically support the development of software that is used commercially. They are not manufacturers, but they are not uninvolved bystanders either. They occupy a middle ground that, until December of two thousand twenty-four, had no legal name.
Open source stewards get a lighter regulatory burden than manufacturers. They do not need to pass the full conformity assessment. But they do have obligations. They must develop and document a cybersecurity policy. They must cooperate with market surveillance authorities. And starting in September of two thousand twenty-six, they must report actively exploited vulnerabilities and severe incidents.
Mike Milinkovich, the executive director of the Eclipse Foundation, recognized that the open source community could not wait for regulators to define what compliance looked like. In April of two thousand twenty-four, he brought seven foundations together. Apache. Blender. OpenSSL. PHP. Python. Rust. Eclipse. Their goal was to build common specifications for secure software development before the regulation became fully enforceable.
Time is of the essence. The regulation's main obligations apply from two thousand twenty-seven, and the open source community needs to build the processes and specifications that will allow us to comply without destroying the collaborative development model that makes open source work.
The specifications they are building draw from what foundations already do. Coordinated vulnerability disclosure. Peer review. Release processes. The argument Milinkovich and the other foundation leaders are making is subtle but important. Open source already does most of what the CRA requires. It just does it informally. The challenge is not to invent new security practices. It is to document existing ones in a way that satisfies a regulator.
But here is the part that keeps foundation directors awake at night. The CRA exempts non-commercial open source. The steward category gives foundations a lighter burden. But what about the gray area in between? A solo developer who maintains a popular library and accepts fifty dollars a month through GitHub Sponsors. Is that commercial activity? A maintainer whose employer pays them to work on an open source project twenty percent of the time. Is the project commercial because the maintainer is compensated? The Linux kernel is free and open source, but much of its development is funded by corporations with commercial interests. Where does the line fall?
The regulation does not say. The implementation guidance, still being drafted, will have to answer these questions. And the answers will determine whether thousands of small open source projects can continue to operate as they always have, or whether the compliance burden pushes them to stop distributing in Europe entirely.
This is where the story circles back to Git. Because when regulators ask for an audit trail, a verifiable record of who wrote what code, when it was reviewed, and whether it was approved, they are describing something that already exists. They are describing git log.
Every git commit records an author, a date, a message, and a cryptographic hash that chains it to every commit before it. Every pull request on GitHub or GitLab records who reviewed the code, what comments were made, and when it was merged. The entire development history of a project, every decision, every change, every approval, is already there, baked into the repository metadata that most developers never think about.
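You can inspect that metadata directly. The sketch below, with a throwaway repository and placeholder names invented for the example, prints the raw commit object git stores: the tree hash, the parent hash chaining it to the previous commit, and the author and committer lines with their timestamps.

```shell
# Build a throwaway repository so the example is self-contained.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "first" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "second" >> notes.txt
git commit -q -am "Extend notes"

# The raw commit object: tree hash, parent hash, author, committer.
git cat-file -p HEAD
```

The parent line is the chain: it names the hash of the previous commit, which in turn names its parent, all the way back to the first commit in the repository.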
The SBOM requirement makes this even more concrete. When you generate a software bill of materials for a project, you are tracing its dependency tree. And that dependency tree is defined by files that live in a git repository. The package dot json. The requirements dot txt. The Cargo dot toml. The go dot mod. Each one is a versioned file with a commit history showing exactly when each dependency was added, who added it, and what version they pinned. The SBOM is not something you bolt on after the fact. It is something you extract from the history that git was already keeping.
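The idea is easy to see on a toy repository. In the sketch below, with invented file contents, a requirements.txt is created and then modified, and git log scoped to that one file recovers exactly when each dependency change landed.

```shell
# Throwaway repository for the example.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

# First commit: pin one dependency.
echo "requests==2.31.0" > requirements.txt
git add requirements.txt
git commit -q -m "Pin requests"

# Second commit: pin another.
echo "urllib3==2.2.1" >> requirements.txt
git commit -q -am "Pin urllib3"

# The history of the dependency file alone: one line per change.
git log --oneline -- requirements.txt
```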
Season one of this series told the story of how Linus Torvalds built git with cryptographic integrity at its core. Every commit is hashed. Every hash depends on every previous hash. The entire history is a chain that cannot be altered without detection. He did this because he needed to trust the Linux kernel's development process, to know that the code arriving from thousands of contributors had not been tampered with along the way. He was solving a trust problem, not a compliance problem.
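That integrity property is concrete, not metaphorical. A git object identifier is just the SHA-1 hash of the object's bytes behind a small header, so it can be reproduced outside git entirely, and changing a single byte of content changes the identifier. A sketch of the equivalence:

```shell
# Hash the six bytes "hello\n" as a blob through git...
git_hash=$(printf 'hello\n' | git hash-object --stdin)

# ...and reproduce it by hand: SHA-1 over "blob <size>\0<content>".
manual_hash=$(printf 'blob 6\0hello\n' | sha1sum | cut -d' ' -f1)

# Both lines print the same forty-character identifier.
echo "$git_hash"
echo "$manual_hash"
```

Commits extend the same trick upward: a commit object contains its parent's hash, so any tampering anywhere in the history changes every hash downstream of it.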
But regulators have a trust problem too. They need to verify that the software in a medical device or a power grid controller was developed according to secure practices. They need proof. And signed git commits, the feature that Season one episode nineteen explored, provide exactly that proof. A commit signed with a GPG or SSH key is a cryptographic attestation. It says: this specific person, verified by this specific key, authored this specific change at this specific time. It is not just a log entry. It is evidence.
GitHub added persistent commit signature verification in November of two thousand twenty-four, allowing verified commits to retain their status even when signing keys are rotated. The timing was not coincidental. The CRA would enter into force the following month.
The feature that almost nobody used is becoming the feature that regulators might require. Signed commits were always available. The tooling was always there. Most developers never bothered because there was no reason to. Now there is a reason. It is called compliance.
Here is the part of the story that is hardest to tell fairly, because both sides are right.
The regulations are necessary. The SolarWinds breach happened because nobody was checking what was inside the software. The xz-utils backdoor, which episode thirty-six of this series covered in detail, succeeded because a lone maintainer was too exhausted to question the person offering help. The internet runs on software that nobody audits, maintained by people nobody pays, with security practices that range from rigorous to nonexistent. Something had to change.
But the regulations are also crushing for small projects. The CRA was designed with Microsoft and Siemens and Samsung in mind. Companies with legal departments and compliance teams and budgets measured in billions. The open source exemption and the steward category were supposed to shield smaller projects from that weight. And for projects under a major foundation, they probably will.
But thousands of useful, widely depended-upon projects do not sit under any foundation. They sit in a single developer's GitHub account. That developer might accept donations. They might do consulting work related to the project. They might have a day job at a company that uses the project commercially. Any of these activities could, depending on how implementation guidance is written, push their project from "non-commercial" into "commercial activity" and trigger the full compliance requirements.
A solo developer with no legal team, no compliance budget, and no foundation backing does not have the resources to generate SBOMs, document cybersecurity policies, file vulnerability reports with European authorities, and undergo conformity assessments. The cost is not just financial. It is time. The hours spent on compliance are hours not spent fixing bugs, reviewing pull requests, or writing the code that made the project valuable in the first place.
The defenders of the regulation say the exemption handles this. The critics say the exemption is ambiguous enough to be useless in practice. Both are partially right, and we will not know who is more right until the first enforcement actions arrive after two thousand twenty-seven.
And then there is the question that nobody in two thousand twenty-one, when Executive Order fourteen-zero-two-eight was signed, could have anticipated. What happens when the code is not written by a person at all?
Episode forty-one of this series explored AI-generated commits. The vibe commit, where a developer describes what they want and an AI writes the code. The pattern is accelerating. GitHub's own data shows Copilot contributing to over a million pull requests per month. AI-authored code is flowing into git repositories at a rate that would have seemed absurd three years ago.
The CRA requires traceability. Who wrote this code? When? Was it reviewed? For human-authored code, git provides clear answers. The author field. The committer field. The signed key. But for AI-generated code, those fields become fiction. The author field says the developer's name. The commit is signed with the developer's key. But the developer did not write the code. They prompted it. They may not have read every line. They clicked merge.
If that code contains a vulnerability, and it will, because AI models produce vulnerable code at measurable rates, who is liable under the CRA? The developer whose name is on the commit? The company that employed them? The AI company whose model generated the code? The platform that trained the model on open source repositories?
The regulation does not answer this question. It was written before AI code generation became widespread. The NIST framework does not address it either. The SSDF version one point two draft acknowledges the existence of AI-generated code but does not assign liability differently. The audit trail that git provides, the very trail that makes git so valuable for compliance, assumes that the author field means what it says. That assumption is eroding.
I want to decide for myself. I am very much against unnecessary rules imposed by society.
Linus Torvalds said that in an interview years before any of this happened. He built git because the Linux kernel needed a tool that respected the developer's autonomy. No central server telling you what you could or could not do. No gatekeeper approving your changes before they existed. Just a distributed system that tracked everything and trusted the humans using it to sort out the rest.
The irony is almost too neat. The tool built to avoid bureaucracy has become the bureaucracy's favorite artifact. The commit history that Linus designed so he could trust kernel contributors is now the audit trail that European regulators will use to verify compliance. The signed commits that almost nobody bothered with are becoming mandatory infrastructure. Git was supposed to be the opposite of process. It was supposed to be freedom.
That was episode forty-two. The regulated world is coming, and git is at its center, not because anyone planned it that way, but because git already does what regulators need. It records who did what, when, and in what order. It chains those records together cryptographically. It stores them in a format that can be audited, searched, and verified. Linus built a tool for tracking patches. Regulators found a compliance engine. And somewhere between the freedom git was designed for and the accountability it is being asked to provide, the future of open source will be decided. Next episode, we follow the money.
Git log with the format flag: percent H, percent a n, percent a e, percent a D, and percent G question mark. It prints every commit hash, author name, author email, and date, then a one-letter code saying whether the commit was signed and whether the signature is valid. Run it on your own repository and you will probably see a long list of unsigned commits followed by, if you are lucky, a few signed ones near the end. That ratio is about to change. Not because developers suddenly care about cryptography, but because regulators are about to care for them.
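For listeners following along, here is a concrete version of that audit using git's %G? signature-status placeholder, sketched on a throwaway repository with invented names so the example is self-contained.

```shell
# Throwaway repository with one unsigned commit.
tmp=$(mktemp -d)
cd "$tmp"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"
echo "data" > file.txt
git add file.txt
git commit -q -m "Unsigned change"

# Hash, author name, author email, author date, signature status.
# %G? prints one letter per commit: G for a good signature, N for none.
git log --format='%H %an %ae %aD %G?'
```

Run the final git log line in any repository of your own and the trailing letters tell the story: mostly N today, and, if the regulators have their way, mostly G tomorrow.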