Every Clone is a Kingdom

Borrowing Books from the Library

Picture a library. Not a great metaphor for software, but bear with me, because this is exactly how developers thought about code for fifteen years. You walk in, you find the book you need, you check it out. While you have it, nobody else can edit it. When you're done, you return it. The library has one copy. The library is the truth.

That was how most version control worked before Git. You checked out code from a central server. The server had the history. The server had the branches. The server decided who could do what. Your local machine had a working copy, a temporary loan, and nothing more. Want to see what happened six months ago? Ask the server. Want to commit your work? Talk to the server. Want to create a branch? The server handles that. Everything of consequence happened on one machine in one location, and if that machine was unreachable, you waited.

In the last episode, we looked at what Git stores: four object types, content-addressed by cryptographic fingerprints, snapshots instead of differences. A beautifully simple storage model. But knowing what Git stores doesn't explain why it changed everything. The real revolution wasn't the data model. It was what happened when you ran one command.

git clone.

When you clone a Git repository, you don't just get a working copy. You don't just get a checkout. You get everything. Every commit, every branch, every tag, every piece of history from the very first snapshot to the most recent. Your machine becomes a complete, self-contained repository. Not a mirror. Not a backup. A full peer, indistinguishable from the one you cloned from.
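If you want to see that for yourself, here is a minimal sketch. It assumes nothing but a Git installation: the directory names upstream and kingdom are made up for illustration, and a local path stands in for the URL you would normally clone from.

```shell
#!/bin/sh
set -e

# Throwaway scratch space; nothing here touches a real project.
work=$(mktemp -d)
cd "$work"

# A stand-in for the repository you would normally reach by URL.
git init -q upstream
cd upstream
for n in 1 2 3; do
  echo "line $n" >> notes.txt
  git add notes.txt
  git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "snapshot $n"
done
cd ..

# Clone it. The local path plays the role of the URL.
git clone -q upstream kingdom

# Count the commits reachable in each repository.
upstream_commits=$(git -C upstream rev-list --count HEAD)
clone_commits=$(git -C kingdom rev-list --count HEAD)
echo "upstream: $upstream_commits commits, clone: $clone_commits commits"
```

Both counts come out the same, because the clone carries the history and not just the latest version of the files.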

Every clone is a kingdom. And in two thousand five, that was a genuinely strange idea.

The Kernel's Web of Trust

To understand why Linus designed Git this way, you have to understand how the Linux kernel is actually built. It's not a project where five hundred developers all push to one server. It never was, even before Git existed. The kernel has a hierarchy, but it's not the corporate kind with org charts and access control lists. It's a web of trust built on personal reputation.

At the bottom are individual developers writing patches. They send their work to a subsystem maintainer, the person responsible for, say, networking drivers, or the USB stack, or the graphics subsystem. That subsystem maintainer reviews the patches, tests them, and if they're good enough, pulls them into their own tree. Above the subsystem maintainers sit a handful of senior maintainers managing broader areas. And at the top sits Linus, who pulls from maybe ten or fifteen people he personally trusts.

The way I work, I only need to trust five, ten, fifteen people. If I have a network of trust that covers those people, I can pull from them.

This is crucial. In a centralized system, trust is binary. Either you have commit access to the server or you don't. And that creates politics. Who gets access? Who decides? In the Linux kernel with its thousands of contributors, giving everyone write access to one server would be chaos. But restricting access to a small group would create a bottleneck.

This whole commit access issue, which some companies are able to ignore by just giving everybody commit access, causes endless hours of politics in most open source projects. If you have a distributed model, it goes away. Everybody has commit access. You can do whatever you want to your project.

Torvalds laid that out in his two thousand seven Google talk. Distributed version control dissolves the whole problem. Every developer has their own complete repository. They can commit, branch, experiment, do whatever they want, without asking anyone's permission. The question shifts from "who is allowed to write" to "whose work do I choose to incorporate." Trust flows through people and reputation, not through server permissions. The networking maintainer trusts a driver developer because they've reviewed their patches for years. Linus trusts the networking maintainer because that person has been solid for a decade. Nobody needs a central authority to assign roles. The hierarchy is social, not technical.

Git was designed around this workflow because this workflow already existed. The kernel community had been operating as a distributed trust network since the early nineties, long before they had tools to match. Linus didn't invent a new way of working. He built a tool that finally fit the way his community already worked.

Committing on a Plane

But the distributed model wasn't just about the kernel's unusual governance. It had a practical consequence that changed how every developer on earth worked, even solo developers with no lieutenants and no hierarchy.

You could work offline.

This sounds unremarkable now. But in two thousand five, a lot of software development happened on laptops carried to coffee shops, airports, and conference halls. If your version control was centralized, losing your internet connection meant losing your ability to commit. You could still edit files, sure. But you couldn't save snapshots of your progress. You couldn't create branches. You couldn't look at history. You couldn't compare your changes against what came before. You were working blind. Linus put it plainly.

You can take a truly distributed source control management system, you can take it on a plane and even if they don't offer wifi, you just continue working. You can do everything you would do even if you were connected to a nice gigabit ethernet directly to the backbone.

With Git, your entire history lives on your machine. Want to search through six months of commits for when a bug was introduced? Instant, no network needed. Want to create five experimental branches and throw away four of them? Go ahead, nobody else is affected. Want to commit every fifteen minutes as you work through a problem? The commits happen locally, at disk speed, with no round-trip to a server. When you eventually reconnect, you push everything at once. Or you don't push at all. That's your choice.
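Here is a rough sketch of that offline loop, under some stated assumptions: the file config.txt, the branch names, and the "bug" are all invented for the example, and the repository deliberately has no remote configured, so nothing in it can reach a network.

```shell
#!/bin/sh
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .

# Small helper so each commit carries an identity without touching global config.
snapshot() {
  git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "$1"
}

echo "timeout = 30" > config.txt
git add config.txt
snapshot "add config"

echo "timeout = 0" > config.txt   # the hypothetical bug
git add config.txt
snapshot "tune timeout"

# No remote exists, so every command below runs purely against local disk.
remotes=$(git remote)

# Branch freely, then throw one experiment away.
git branch experiment-a
git branch experiment-b
git branch -D experiment-a > /dev/null

# Search history for the commit that introduced the suspicious value.
culprit=$(git log -S"timeout = 0" --format=%s)
echo "introduced by: $culprit"
```

The search uses Git's pickaxe option, which reports commits where the number of occurrences of a string changed. Here that points at the second commit.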

The speed difference was staggering. Operations that took seconds or minutes in Subversion, because they required a network round-trip, took milliseconds in Git. Checking the log, comparing branches, switching between features, all of it happened at the speed of local disk access. And as Linus understood from the beginning, speed changes behavior. When committing is instant, you commit more often. When branching is free, you branch for everything. When searching history is fast, you actually search it.

The practical result was that developers who switched to Git started working differently without consciously deciding to. They committed smaller, more frequent changes. They experimented more freely. They used history as a tool instead of treating it as an archive they rarely opened. The tool shaped the behavior.

Origin is Just a Name

Here is where the distributed philosophy gets a little philosophical.

When you run git clone and pull down a copy of a repository, Git sets up a connection back to where you cloned from. It calls that connection "origin." This feels meaningful, like origin is the source, the real repository, the canonical copy. But it isn't. The name is arbitrary. Git picks "origin" the same way it picks a default name for your first branch. It's a sensible default, nothing more.

You can rename it. You can delete it. You can add five more remotes pointing to five different copies of the same project, each managed by a different person, and Git will happily track all of them with git remote add. There is no hierarchy in the data model. No copy is special. No server is blessed. The command git remote just manages a list of bookmarks, addresses where other copies of the repository live.
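To make the bookmark idea concrete, here is a small sketch. The remote names upstream, alice, and bob are hypothetical, and all three point at the same local directory purely to keep the example self-contained.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
cd "$base"

# A stand-in project with a single commit, and a clone of it.
git init -q project
git -C project -c user.name="Example" -c user.email="example@example.com" \
  commit -q --allow-empty -m "first snapshot"
git clone -q project mine
cd mine

# "origin" is only a default name. Rename it.
git remote rename origin upstream

# Add more bookmarks, then drop one.
git remote add alice "$base/project"
git remote add bob "$base/project"
git remote remove bob

names=$(git remote | sort | paste -sd, -)
echo "remotes: $names"

# Delete every remaining bookmark. The history is still all here.
git remote remove alice
git remote remove upstream
commits=$(git rev-list --count HEAD)
echo "commits still available locally: $commits"
```

Removing every remote changes nothing about the repository itself. It only forgets the addresses of the other copies.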

This is not how most teams use Git. Most teams pick one server, call it origin, and treat it as the source of truth. And that's fine. It's a perfectly good convention. But it is a convention. It's a social agreement, not a technical requirement. Git itself has no opinion about which copy matters.

The distinction sounds academic, but it has real consequences. In two thousand seventeen, GitLab accidentally deleted its production database. About six hours of data were lost: issues, merge requests, comments, user accounts, everything that lived only in that database. The Git repositories themselves were untouched, and even if they hadn't been, every developer with a clone was holding a full copy of the code and its history. What could be lost was only what Git never distributed in the first place. That is the real consequence of distributed design: resilience. When every clone is complete, every clone is a backup.
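As an illustration of "every clone is a backup," here is a deliberately simplified sketch of rebuilding a lost server from one developer's copy. The directory names server.git, rebuilt.git, and laptop are assumptions made up for the example. It restores only the branches that exist locally on the laptop. A routine clone keeps other people's branches as remote-tracking references, so a real recovery would take more care than this.

```shell
#!/bin/sh
set -e

base=$(mktemp -d)
cd "$base"

# The "server": a bare repository, plus one developer's working repository.
git init -q --bare server.git
git init -q laptop
cd laptop
for n in 1 2; do
  git -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "snapshot $n"
done
git remote add origin "$base/server.git"
git push -q origin --all

# Disaster: the server copy is gone.
rm -rf "$base/server.git"

# Stand up an empty replacement and repopulate it from the laptop.
git init -q --bare "$base/rebuilt.git"
git remote set-url origin "$base/rebuilt.git"
git push -q origin --all

restored=$(git -C "$base/rebuilt.git" rev-list --count --all)
echo "commits on the rebuilt server: $restored"
```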

Peer to Peer, If You Want It

Git supports something almost nobody uses but almost everyone finds interesting when they learn about it. Two developers can share code directly, machine to machine, with no server in between.

You can add a colleague's laptop as a remote. You can pull their branches over the local network, or over SSH, or technically even from a USB drive someone carried across the room. Git doesn't care where the other repository lives. A server is just a computer that happens to be always on and has a network address. There is nothing in Git's design that requires it. As Linus put it in that same Google talk.
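A sketch of that machine-to-machine exchange, with assumptions stated up front: alice and bob are hypothetical developers, and two directories on one disk stand in for their two laptops. Over SSH the only difference would be the address handed to git remote add.

```shell
#!/bin/sh
set -e

room=$(mktemp -d)
cd "$room"

# Helper: make an empty commit in a given repository with a throwaway identity.
commit_in() {
  git -C "$1" -c user.name="Example" -c user.email="example@example.com" \
    commit -q --allow-empty -m "$2"
}

# Alice starts the project; Bob clones it and works on a branch.
git init -q alice
commit_in alice "shared starting point"
git clone -q alice bob
git -C bob checkout -q -b fix-typo
commit_in bob "bob: fix typo"

# Alice adds Bob's repository directly as a remote. No server involved.
git -C alice remote add bob "$room/bob"
git -C alice fetch -q bob

# Bob's branch is now visible in Alice's repository as bob/fix-typo.
fetched=$(git -C alice log -1 --format=%s bob/fix-typo)
echo "latest commit on bob/fix-typo: $fetched"
```

Alice can now inspect Bob's branch, merge it, or ignore it. Whose work she incorporates is her decision.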

The whole point of being distributed is I don't have to trust you. Remember, distribution means nobody is special.

In practice, almost everyone uses a central server. GitHub, GitLab, Bitbucket, some company's internal server. And that's rational. Servers are convenient. They're always available. They provide a stable address. They offer features like web interfaces and issue trackers that Git itself doesn't have. The peer-to-peer capability is like a fire exit. You hope you never need it, but knowing it's there tells you something important about the building.

What it tells you is that servers are a convenience, not a dependency. Git repositories are first-class objects that live anywhere and talk to anyone. The architecture is radically egalitarian. And this is by design, because Linus built Git for a community of thousands of developers spread across the world, working in different time zones, on different continents, under different organizational structures, and he needed a tool where nobody was a bottleneck and nothing was a single point of failure.

The Tension That Never Resolves

So here is the beautiful contradiction at the heart of modern software development.

Git was designed to be distributed. Every clone is complete. Every copy is equal. No server is required. The architecture encodes a philosophy: decentralization, independence, resilience. Linus built it that way because he fundamentally does not trust centralized authority.

If you're not distributed, you are not worth using. It's that simple.

And yet. Within three years of Git's creation, a company in San Francisco would take this radically distributed tool and build the most centralized platform in the history of software development on top of it. A platform so dominant that losing access to it would disrupt millions of projects overnight. A platform that would become, for all practical purposes, the single point of failure that Git's design was built to prevent.

The code is distributed. The social layer is centralized. The repository can live anywhere. The issues, the pull requests, the discussions, the contributor graphs, the identity itself, all of that lives in one place. Git gives you independence. The platform built on Git creates dependence.

This tension between Git's distributed architecture and the centralized platforms built on top of it never resolves. It can't. Decentralization offers resilience and independence. Centralization offers convenience and community. People want both, and the two pull in opposite directions.

But that story, how a command-line tool designed for independence became the foundation of the most centralized developer platform ever built, is a story for later. For now, just sit with the architecture. Every clone is a kingdom. Every developer has the complete history. Trust flows through people, not permissions. And the command git clone doesn't borrow a book from the library.

It copies the entire library to your desk.

git clone, followed by a URL. One command. What arrives on your machine is not just the latest files and not just a checkout. It is the entire history of a project, every commit anyone ever made, copied to your desk. The server is now optional. That is what distributed means, and it is the single most consequential thing Git changed about how software gets built.