Before the Concrete: A Midnight Walk Through the Machinery

The Thing Nobody Says About Personal Knowledge Bases

The thing nobody tells you about personal knowledge bases is that most of them die of boredom. Not of technical failure. Not of hardware crashes or data loss or provider acquisitions. Boredom. The person who built it got excited for three weeks, set up a folder structure, wrote a few dozen notes, and then life moved on. The notes are still there. Nobody opens them. Nobody updates them. After a year they describe a version of the world that no longer exists. After two years they are archaeology.

Tonight a man in Sweden named Pär and the thing he was talking to, which was a computer pretending reasonably well to be a conversation partner, spent about an hour and a half walking around the edges of a question that sounds boring and is actually load-bearing. The question was, how do you build a knowledge substrate for one person that does not die? And to answer that question they kept running into smaller questions. What is the right size for a menu. What is a database really doing when it guesses at meaning. Why does old fashioned word counting still beat fancy math on some kinds of text. Where does automation end and ritual begin, and which side of that line is the side things survive on.

This episode is a walking tour through the machinery. You are going to learn what all of these pieces are, what they are for, and why they came up in a conversation about keeping a collection of notes alive for decades. There is no conclusion at the end. There is no list of decisions that got made. Pär decides later. This is the part where we look at the materials before any concrete gets poured.

Capture Won. Focus Did Not.

Before any of the technical stuff, you have to know one thing about Pär, because it is the idea that drives everything else. Pär once built two apps for himself. He called them Capture and Focus. Capture was a simple inbox. Anywhere he was, any time of day, he could dump a thought into it. Walking the dog, at a café, half asleep. Capture took the thought. Focus was different. Focus was the thing you sat down in front of in the morning to structure your day. To decide what you were doing. To bring intention to your time.

Capture is still alive. Focus is dead.

Not because Focus was worse. In fact, Focus was a more ambitious piece of software. It tried to do more. It had more opinions. It had a better story about what it was for. But Focus required a ritual. You had to sit down, open it, and engage. And human attention, especially human attention shaped by attention deficit, does not reliably show up for a ritual. It shows up for whatever is easiest. Capture was easiest, always, because it accepted anything without asking questions. So Capture collected years of material and Focus collected dust.

The lesson Pär took from this is the organizing principle for everything we are about to walk through. If the substrate requires ritual to maintain, the substrate will die. If the substrate ingests what you are already doing, without you needing to sit down and tend to it, the substrate lives. This is the rule. Every piece of machinery we discuss after this point is either moving closer to that rule or moving further from it, and that is the axis that decides whether it belongs.

The Waiter Problem

Now we can start with the first piece of machinery, which is a thing called MCP. You will hear it a lot. MCP stands for Model Context Protocol, and the easiest way to think about it is this. Imagine a restaurant. Claude is the chef. You are the diner. But Claude cannot walk out of the kitchen. Claude has to send a waiter. The waiter takes your order, walks over to the database or the calendar or the file system, gets the thing, brings it back to the kitchen. That waiter is an MCP tool.

MCP was invented by Anthropic, the company behind Claude, and it is now being donated to the Linux Foundation as an open standard. Which means it is going to be the thing. Every AI assistant is going to use something like this. It is the protocol by which a language model reaches out of its own head and touches the world.

The problem with waiters, though, is that you can have too many. If your restaurant has three waiters, they each know their tables and the kitchen runs smoothly. If your restaurant has thirty one waiters, two things go wrong. One, the menu for each waiter has to be memorized by the chef, and the chef only has so much memory. Two, when an order comes in, the chef has to pick which waiter to call, and with thirty one options the chef starts calling the wrong waiter. Not always. Just often enough to hurt.

Pär's Director service has thirty one MCP tools right now. That is a lot of waiters. And tonight we were about to add more. Tools to read the lessons library. Tools to read the playbook. Tools to check validation claims. Tools to list open experiments. Five more, maybe six. Which would push the count close to forty. And that is where things get genuinely wobbly.

The Two Costs

The mistake most people make is thinking of tool sprawl as one problem. It is actually two problems sharing a name.

The first problem is load cost. Every MCP tool has to announce itself to Claude at the start of a conversation. The announcement includes the name, what the tool does, what parameters it takes. Each tool's announcement is maybe two hundred words of text. Thirty one tools means about six thousand words of menu sitting in Claude's working memory every single time. You are paying that cost whether you use the tools or not. In the restaurant metaphor, the chef is memorizing every waiter's table assignments before taking a single order. The kitchen is slower because it is full of menus.
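To make load cost concrete, here is a rough Python sketch of what one announcement might look like and how the menu adds up. The field names (name, description, inputSchema) follow the general shape of an MCP tool definition. The tool itself, search_sessions, and its parameters are made up for illustration, and counting words is only a crude stand-in for how a model really counts tokens.

```python
import json

# A hypothetical tool announcement. The field names follow the usual shape
# of an MCP tool definition; the tool and its parameters are invented.
search_sessions = {
    "name": "search_sessions",
    "description": "Search past session transcripts by keyword and date range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Words to look for."},
            "since": {"type": "string", "description": "Earliest date, ISO format."},
            "limit": {"type": "integer", "description": "Maximum number of hits."},
        },
        "required": ["query"],
    },
}


def menu_words(tools):
    """Crude load-cost estimate: words in the serialized announcements."""
    return sum(len(json.dumps(tool).split()) for tool in tools)


one_tool = menu_words([search_sessions])
whole_menu = menu_words([search_sessions] * 31)  # pretend all 31 tools look alike
print(one_tool, whole_menu)
```

The point of the sketch is only that the cost is linear. Every extra waiter adds another announcement, and the whole menu is paid for before the first order is taken.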

There are clever ways to reduce load cost. The newer versions of Claude have something called deferred tool loading, where the chef only memorizes a tool's full details when the tool actually gets called. The older web version of Claude, which is the one you reach through a browser, does not have this yet. So on the web, every tool pays full price.

The second problem is selection cost. This one does not go away with clever tricks. Once you have more than about twenty or twenty five tools, the chef starts calling the wrong waiter. Not because the chef is stupid. Because three tools might all have names that sound like they do the same thing. Query sessions. Search sessions. List recent writes. All three read from the same table with slightly different filters. The chef picks one and it might not be the best one. Or the chef picks the right one but with the wrong parameters. Or the chef does not pick any and asks the diner to clarify.

This matters because the thing Pär wants to build, a knowledge substrate that spans his whole life, could easily reach a hundred possible operations if we are not careful. You cannot have a hundred waiters. So before we can add anything, we have to figure out how to say more with fewer tools.

The Split Nobody Names

Here is a distinction that turned out to be the hinge of the whole evening. There are two kinds of sessions Pär has with Claude. One is through the web. A browser tab. Claude sitting in that tab has no shell, no file system, no way to run a command. Claude is a guest in a hotel room. Everything Claude wants to do has to go through the hotel concierge, and the concierge is the MCP protocol. If Claude wants to know what projects Pär is working on, Claude asks MCP. If Claude wants to read a file, Claude asks MCP. MCP is the only way out.

The other kind of session is Claude Code. This is what Pär is using right now, running in his terminal. Claude Code is a resident of the house. It can open the fridge. It can run a script. It can fetch a web page. It has a full shell, it can write files, it can read directories, it can call any HTTP endpoint directly. Claude Code does not need MCP to do most things. Claude Code can do things Claude.ai web cannot dream of.

And this changes the question completely. Because the waiters we have been counting, all thirty one of them, exist specifically to give the web version things it cannot do on its own. Claude Code does not need a waiter to read a file. Claude Code reads the file. Claude Code does not need a waiter to run a script. Claude Code runs the script.

So the question is not, how many waiters total does this system need. The question is, how many waiters does the hotel guest specifically need. Everything else can be plumbed as a normal HTTP endpoint, a normal script, a normal CLI command. Claude Code will figure it out. The waiter count drops. And the things that needed to be tools because they were also convenient for Claude Code can live better lives as actual command line tools instead.

This is a really important idea. It means the MCP surface should shrink as Pär gets more sophisticated, not grow. The mature end state is a small set of MCP tools that exist specifically because they help the web version, and everything else lives as plain old web requests.

What a Substrate Actually Is

The word substrate has been floating around this whole time. It is worth slowing down on. A substrate is not the same thing as a pile of files. A pile of files is what most people's notes look like. You have some markdown here, some PDFs there, some exported chat logs in a folder somewhere, maybe a few notebooks. When you want to find something, you grep. Or you open each file by hand. Or you give up.

A substrate is different. A substrate knows things about itself. It knows which files exist, what they are about, when they were last touched, who mentioned what inside them. It knows that this note and that note are related. It responds to questions. You do not have to know where something is to find it. You can ask.

The difference between a pile of files and a substrate is not the files. It is the index. The index is the thing that makes the pile queryable. And a good index does more than find exact words. A good index finds meanings. A good index finds relationships. A good index keeps itself up to date without you thinking about it.

So when we talk about building a knowledge substrate for Pär, we are not talking about creating a wiki folder. The wiki folder already exists, scattered across ten places. We are talking about indexing what he already has, and making that index answer when you ask it things, without anyone tending it by hand. The files stay where they are. The substrate is the layer of understanding built on top.

RAG: A Library Card for the Language Model

Which brings us to a term you will hear all over the place. RAG. Three letters, stands for Retrieval Augmented Generation. Sounds complicated. It is actually simple.

A language model has read a lot of text during training. But it cannot read everything, and what it read is frozen in time. Claude does not know what Pär wrote in his notes last week, because Claude was trained months ago. So if you ask Claude a question about Pär's notes, Claude has to either guess, which is bad, or somebody has to hand Claude the relevant notes before Claude answers, which is good. That second thing is RAG.

RAG is basically giving the language model a library card. When a question comes in, before Claude answers, a little system goes and searches the library. It pulls the three or four most relevant books off the shelf. It sets them in front of Claude. Claude reads them. Claude answers the question based on what is in those books.

The question then becomes, how does the little system know which books to pull. And that is where the interesting machinery sits. There are several different ways to search a library. They have different strengths. They have different weaknesses. And the best systems use more than one at a time.

Postgres and the Boring Plumbing

To index Pär's notes, we need somewhere to keep the index. The obvious answer, which a lot of people in this space miss, is Postgres. Postgres is a database. It is also the least glamorous piece of software in modern infrastructure. It has been around for almost thirty years. It is extremely good at its job. It is already running on Pär's server for the Director service we have been talking about. We do not need a new thing. We just need to teach the old thing new tricks.

Postgres can be extended with plugins, which it calls extensions. Two of them matter for this story. The first is the trigram extension, written pg_trgm and usually read aloud as pg underscore trigram. The name comes from the fact that it breaks words up into three letter chunks, called trigrams, and uses those chunks to do fuzzy matching. Trigram search is what lets you type half a word and still find the thing. It is what makes your search bar forgiving about typos. When you look for Järpströmmen and type Järpström, trigram figures out that those are almost the same string and gives you the hit anyway. Not glamorous. Very useful.
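The three letter chunk idea is small enough to sketch in Python. This is an illustration of the principle, not the extension's actual code. The padding with spaces and the shared-over-total ratio imitate how the Postgres trigram extension is documented to behave, and the sketch only handles a single word.

```python
def trigrams(text):
    """Three-letter chunks of a string, lowercased and padded with spaces.

    The padding (two spaces in front, one behind) imitates how the trigram
    extension treats word boundaries. Single words only in this sketch.
    """
    padded = "  " + text.lower() + " "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Shared trigrams divided by all distinct trigrams, from 0.0 to 1.0."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


print(similarity("Järpströmmen", "Järpström"))   # most chunks shared
print(similarity("Järpströmmen", "Stockholm"))   # few or none shared
```

A search bar built on this simply returns everything whose similarity to the typed string clears some threshold, which is why a half-typed word still lands.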

The second plugin is called pgvector. This one is where the modern AI stuff lives. Pgvector teaches Postgres to store something called an embedding. Which is a concept we need to pause on, because it is not obvious.

Embeddings: Coordinates in Meaning Space

Imagine every word, every sentence, every paragraph has an address. Not a street address. An address in a kind of mathematical space where the distance between two addresses tells you how similar the two pieces of text are in meaning. So the sentence "the cat sat on the mat" lives at some coordinates, and the sentence "a feline was resting on the rug" lives at coordinates that are very close to the first, because they mean nearly the same thing. And the sentence "the quarterly earnings report was disappointing" lives far away, because it is about something completely different.

How does any of this work. There are neural networks, trained on enormous amounts of text, whose job is exactly this. You feed in some text, they produce a long list of numbers. Usually between three hundred and several thousand numbers. That list of numbers is the address. It is called an embedding. The neural network has learned, through exposure to everything humans write, how to place text at coordinates that reflect meaning.

The magical consequence is that you can now search by meaning. Someone types a question. You turn the question into an embedding. You look in your database for stored embeddings close to that one. The closest stored embeddings are the texts that mean something similar to the question, regardless of whether they use the same words. This is semantic search. This is what makes RAG feel smart.

Pgvector is the piece of software that lets Postgres store these embeddings and measure distances between them fast. That is its whole job. It is a coordinate system for meaning, bolted onto a boring database.
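Measuring distance between addresses is ordinary arithmetic. Below is a minimal Python sketch using cosine similarity, one common way to compare embeddings. The three-number vectors are invented by hand so the example runs without a model. Real embeddings come from a neural network and are far longer.

```python
import math


def cosine_similarity(a, b):
    """1.0 means same direction (same meaning), near 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Invented three-number "embeddings". Real ones come from a model and have
# hundreds or thousands of numbers.
library = {
    "the cat sat on the mat": [0.9, 0.1, 0.0],
    "a feline was resting on the rug": [0.8, 0.2, 0.1],
    "the quarterly earnings report was disappointing": [0.0, 0.1, 0.9],
}


def nearest(query_vector, k=2):
    """Return the k stored texts whose vectors point closest to the query."""
    ranked = sorted(
        library,
        key=lambda text: cosine_similarity(query_vector, library[text]),
        reverse=True,
    )
    return ranked[:k]


print(nearest([0.85, 0.15, 0.05]))  # pretend embedding of a question about cats
```

The two cat sentences come back and the earnings report does not, even though the query shares no words with any of them. That is the whole trick of semantic search, done here by brute force over three rows.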

The Old Guard Librarian

Before embeddings existed, how did you search for things. You counted words. Specifically, you counted which words appeared in which documents, weighted by how rare those words were across the whole collection. Rare words matter more. If everyone's document mentions the word "the," you cannot distinguish documents by the word "the." If only one document mentions "Järpströmmen," then a query about Järpströmmen should probably return that document.

There are two famous algorithms for doing this. One is called TF-IDF, which stands for term frequency inverse document frequency. Spelled out, that means, how often does this word appear in this document, scaled down by how many documents in the whole collection contain it. The other is called BM25, which is a slightly fancier cousin that handles long documents better and stops rewarding a word for being repeated over and over. Both of them are, at their core, counting words in a clever way.
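Word counting of this kind fits in a few lines of Python. The sketch below is one common textbook variant of TF-IDF, using a logarithm for the rarity weight. The three documents are made up, and BM25's extra handling of document length is deliberately left out.

```python
import math
from collections import Counter

# A made-up three-document library.
docs = {
    "note_a": "the river at järpströmmen was high in the spring",
    "note_b": "the budget meeting was moved to the spring",
    "note_c": "the dog walked by the river",
}

tokenized = {name: text.split() for name, text in docs.items()}


def idf(word):
    """Rare words score high; words found in every document score zero."""
    containing = sum(1 for words in tokenized.values() if word in words)
    if containing == 0:
        return 0.0
    return math.log(len(tokenized) / containing)


def tf_idf_score(query, name):
    """Sum over query words of (count in this document) times (rarity)."""
    counts = Counter(tokenized[name])
    return sum(counts[word] * idf(word) for word in query.split())


def search(query):
    """Document names ordered from best match to worst."""
    return sorted(docs, key=lambda name: tf_idf_score(query, name), reverse=True)


print(search("the järpströmmen"))
```

The word "the" appears in every document, so its rarity weight is zero and it contributes nothing. The rare place name does all the work.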

Now here is the thing. Old fashioned word counting is not glamorous. It does not involve neural networks. It does not involve coordinates in meaning space. It is the kind of algorithm your librarian grandmother would understand. And yet, for certain kinds of text, it beats embeddings. Decisively.

Tonight, Reddit surfaced a piece of work where somebody tested pure TF-IDF against function signatures in eighteen code repositories. Eighty percent hit rate at the top five results. Ninety eight percent reduction in the amount of text the model had to look at. Zero neural network dependencies. The researcher summarized the finding in one sentence, which is worth quoting. "Code identifiers are already the compressed representation. Embedding them actually loses information."

That sentence is worth sitting with. Because it tells you something important. Embeddings are fuzzy on purpose. They blur the difference between synonyms. That blur is a feature when you are searching free text. It is a bug when the exact name of a function is the thing you care about. BM25 and TF-IDF keep the sharp edges. They care about exact words. For some corpora, sharp wins.

This is why the eighty six lessons Pär has in his Lab, which are structured, which use specific technical terms, which refer to exact experiment numbers and exact model names, might actually be better searched by old fashioned word counting than by fancy embeddings. For questions like, "what do we know about cost versus quality tradeoffs," embeddings win. For questions like, "where does experiment fifty seven get cited," word counting wins. Two different tools for two different jobs.

Hybrid Retrieval

Which leads directly to the idea that has been winning in modern RAG systems. Do not pick one. Use both. Run the word counter first, because it is cheap. It filters the library down from thousands of documents to maybe the top thirty that match the query's exact words. Then, on those thirty, run the embeddings search. The embeddings reorder the thirty by meaning similarity. You take the top five. That is your result.

This is called hybrid retrieval. BM25 first pass, embeddings rerank on top K, where K is some small number like twenty or fifty. Two passes. Cheap then expensive. You get the best of both. Exact matches are never thrown away. Semantic matches bubble up through the reranking. And the total cost is manageable because the expensive step only ever runs on a shortlist.
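The two-pass shape can be sketched in Python without committing to any particular scorer. In the sketch below the function hybrid_search, the word-overlap scorer, and the table of invented closeness numbers are all stand-ins. A real system would plug BM25 and an embedding distance into the same two slots.

```python
def hybrid_search(query, documents, lexical_score, semantic_score,
                  shortlist_size=30, final_size=5):
    """Cheap pass over everything, expensive pass over the shortlist only."""
    # Pass one: word-based score on every document, keep the top K.
    shortlist = sorted(
        documents, key=lambda doc: lexical_score(query, doc), reverse=True
    )[:shortlist_size]
    # Pass two: meaning-based score, computed only for the shortlist.
    reranked = sorted(
        shortlist, key=lambda doc: semantic_score(query, doc), reverse=True
    )
    return reranked[:final_size]


# Toy stand-ins so the sketch runs.
def word_overlap(query, doc):
    return len(set(query.lower().split()) & set(doc.lower().split()))


# Invented "closeness in meaning" numbers, in place of real embeddings.
invented_closeness = {
    "cost versus quality tradeoffs in small models": 0.9,
    "quality of coffee at the café": 0.2,
    "walking the dog": 0.1,
    "cost of hosting": 0.6,
}
notes = list(invented_closeness)
semantic_calls = []


def fake_semantic(query, doc):
    semantic_calls.append(doc)   # record how often the expensive step runs
    return invented_closeness[doc]


top = hybrid_search("cost quality", notes, word_overlap, fake_semantic,
                    shortlist_size=3, final_size=2)
print(top)
print(len(semantic_calls), "expensive calls for", len(notes), "notes")
```

The note about walking the dog never reaches the expensive step at all. That is where the savings come from once the library holds thousands of documents instead of four.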

There is also a fancy way to combine the results when both searches run side by side, called reciprocal rank fusion. Sounds complicated. The actual math is, each list gives a document a score of one divided by its rank plus a small constant, and the scores from the lists get added up. Things that rank high in both lists win. Things that are only in one list rank lower. It is the wisdom of crowds for search.
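Reciprocal rank fusion is short enough to write out in Python. In this sketch the constant k is set to 60, a commonly used default, and the two result lists with their lesson names are invented.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists; each appearance adds 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Invented result lists from the two searchers.
by_words = ["lesson_57", "lesson_12", "lesson_03"]
by_meaning = ["lesson_12", "lesson_40", "lesson_57"]

fused = reciprocal_rank_fusion([by_words, by_meaning])
print(fused)
```

The two lessons that appear in both lists end up ahead of the two that appear in only one. Note that the method only looks at positions, never at the raw scores, which is what lets it merge a word counter and an embedding search that measure completely different things.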

The Language Problem of Embeddings

One of the trickier gotchas tonight was about embedding models. We have been talking about embeddings as if they were one universal thing. They are not. There are many different models that produce embeddings, and each one produces them in its own private language. OpenAI's latest embedding model produces coordinates in one meaning space. A model called Voyage produces coordinates in a different meaning space. An open source model called BGE produces coordinates in yet another.

The coordinates from model A cannot be compared with the coordinates from model B. The number one point two in one language might mean something totally different in another language. If you try to measure the distance between an OpenAI embedding and a Voyage embedding, the number you get is nonsense. Total noise. Which means, if you have a library where some of the books are embedded by model A and some by model B, search is broken. It looks like it is working. The results just do not actually make any sense.

You have to pick one. Everything gets embedded by the same model. And crucially, if you ever change your mind, you have to re embed the entire library, because the old embeddings are useless in the new meaning space.
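One cheap safeguard is to store the embedder's name next to every vector and refuse to compare across names. The Python sketch below shows the idea. The class StoredEmbedding, the two helper functions, and the model names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredEmbedding:
    doc_id: str
    model: str      # which embedder produced the vector
    vector: tuple


def comparable(query_model, stored):
    """Only embeddings from the query's own model can be compared."""
    return [e for e in stored if e.model == query_model]


def needs_reembedding(new_model, stored):
    """Everything not made by the new model has to be redone after a switch."""
    return [e.doc_id for e in stored if e.model != new_model]


# Hypothetical model names and vectors.
library = [
    StoredEmbedding("lesson_01", "model-a", (0.1, 0.9)),
    StoredEmbedding("lesson_02", "model-a", (0.4, 0.6)),
    StoredEmbedding("lesson_03", "model-b", (0.7, 0.2)),
]

print([e.doc_id for e in comparable("model-a", library)])
print(needs_reembedding("model-b", library))
```

With the label in place, a mixed library fails loudly, by returning fewer candidates, instead of quietly returning nonsense distances.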

This is not catastrophic. It just means the embedder choice is load bearing. Pick carefully. Pick something you can live with for a long time. For Pär's purposes, OpenAI's cheap embedder is the lazy default. Voyage is more portable. A local neural network running on Pär's own Mac is free but adds a dependency. Each of these has tradeoffs. None of them is wrong. But once you pick, you have to stick.

Cook to Order versus Cook Ahead

There are two philosophies for when to do the expensive work of embedding. One is called ahead of time, abbreviated A O T. The other is called just in time, abbreviated J I T. They are different shapes of the same work, and they fail in different ways.

Ahead of time says, let us embed everything in the library up front. Take a whole night if we need to. Run through every note, every lesson, every session transcript, produce an embedding for each one, store them in the database. Now any question can be answered instantly, because the embeddings are all ready.

This sounds good. It is seductive. It is also fragile. Because the embedding process takes time, and it runs on a schedule, and schedules slip. The cron job that does the embedding might fail silently. The disk might fill up. The model might get updated and now the old embeddings are in the wrong meaning space. You only notice when search gets worse, which is a slow erosion you might not see for months. Ahead of time is what the Reddit thread about the three point five million Wikipedia article project was hitting. Three and a half years of just embedding. Most of it for articles the user would never ask about.

Just in time is different. Just in time says, do not embed everything. Keep a fast text index, something cheap, something that always reflects the current state. When a question comes in, use the fast index to narrow down to a few candidates. Only then, on those few candidates, do the embedding work. Cache the result. Next time that document is needed, the cached embedding is there. Over time, your hot set, the stuff that gets asked about, ends up fully embedded. The cold stuff, the stuff nobody ever cares about, never gets the expensive treatment. You save both time and money, and the system degrades gracefully. If the embedding service goes down, search still works. Worse, but works.
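The just in time shape is essentially a cache in front of the expensive call. Here is a minimal Python sketch of it. The class LazyEmbedder and the stand-in fake_embed function are invented for illustration, and the sketch does not handle a note changing after it was cached, which a real version would need to.

```python
class LazyEmbedder:
    """Embed a document only the first time a query shortlists it."""

    def __init__(self, embed_fn):
        self.embed_fn = embed_fn   # the expensive call, e.g. a remote model
        self.cache = {}            # doc_id -> embedding
        self.expensive_calls = 0

    def get(self, doc_id, text):
        if doc_id not in self.cache:
            self.cache[doc_id] = self.embed_fn(text)
            self.expensive_calls += 1
        return self.cache[doc_id]


def fake_embed(text):
    """Stand-in for a real embedding model: just two crude numbers."""
    return (len(text), text.count(" "))


embedder = LazyEmbedder(fake_embed)
lessons = {f"lesson_{n:02d}": f"text of lesson {n}" for n in range(1, 87)}

# Three queries that keep shortlisting the same handful of lessons.
for shortlist in (["lesson_03", "lesson_57"], ["lesson_57"], ["lesson_03", "lesson_12"]):
    for doc_id in shortlist:
        embedder.get(doc_id, lessons[doc_id])

print(embedder.expensive_calls, "of", len(lessons), "lessons embedded")
```

After three queries only the lessons that were actually asked about have paid for an embedding. The cold ones cost nothing until somebody cares.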

For Pär's substrate, just in time is almost certainly the right shape. Because the Lab has eighty six lessons and he asks about maybe ten of them with any regularity. Embedding the other seventy six up front is wasted work.

The Always On Alarm

Now we circle back to the Capture versus Focus principle, because it applies to how the substrate gets updated. If the substrate only learns about a new note when the nightly batch job runs, that is Focus shaped. You have to wait. You have to sit down and run the job. The system is not responsive. Whereas if the substrate notices immediately, within a few seconds, that a file changed, and updates itself without asking, that is Capture shaped. You write a note, you can query it right away.

The technology for this is called file system watching. On Mac, there is a small program called fswatch, short for file system watch. It sits in the background and listens to the operating system. Every time a file in a directory you told it about gets created, modified, or deleted, fswatch fires a notification. That notification can kick off any command you want. In our case, the command would be, push the file to the server, update the index.

The delay between writing a note and being able to query it drops from hours, which is where git based scheduled scans land, down to about two seconds. Which is basically instant from a human perspective. And crucially, it requires no action from Pär. He just writes. The alarm goes off. The substrate learns. Capture shaped.

There are similar mechanisms on every operating system. Linux has inotify. Windows has something called the file system watcher. The principle is identical. The operating system already knows when files change. You just have to ask it to tell you.
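The watch-and-react loop can be sketched in plain Python. Be aware that this version swaps in a different technique. It polls the directory and compares snapshots, rather than subscribing to operating system notifications the way fswatch or inotify do, so it is slower and cruder. The function names are invented. The shape of the idea is the same, though: detect the change, then hand the path to whatever updates the index.

```python
import os


def snapshot(directory):
    """Map each file path under directory to its last-modified time."""
    state = {}
    for root, _dirs, files in os.walk(directory):
        for name in files:
            path = os.path.join(root, name)
            state[path] = os.stat(path).st_mtime_ns
    return state


def diff(old, new):
    """Compare two snapshots and report what happened in between."""
    created = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    modified = sorted(p for p in new if p in old and new[p] != old[p])
    return created, modified, deleted


def watch_once(directory, previous, on_change):
    """One polling step: call on_change for each created or modified file."""
    current = snapshot(directory)
    created, modified, _deleted = diff(previous, current)
    for path in created + modified:
        on_change(path)   # e.g. push the file to the server, update the index
    return current
```

Calling watch_once in a loop with a two second sleep between rounds gives roughly the latency described above. A notification-based watcher gets the same result without rescanning the directory every time.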

The Party Game

Now we get to the piece that is most different from what came before. Everything up to this point has been about finding things. Typed entity graphs are about something else. They are about understanding how things relate.

Imagine a party. A hundred people in a room. You want to know things about them. You could list each person and what they do. That is what a normal search gives you. You could ask, who at this party works in banking. The search answers. But a richer question is, who at this party used to work with who, and who introduced who, and who is married to whose cousin. That is not a list. That is a web. And getting at it requires that you know, in advance, which kinds of relationships matter.

A typed entity graph is exactly that web. Every note, every project, every person, every API, every experiment, every lesson is a node. Between them are edges. But the edges have types. They are not just "connected." They are "cites," or "uses this API," or "was spawned from," or "contradicts," or "is a version of," or "was written by." Each edge tells you something specific about the relationship.

This is where the magic happens. Because now you can ask questions like, if the DashScope API expires tomorrow, what projects break. You run the query, you follow every edge of type "uses this API" backwards from DashScope, and you get your list of projects. Or you can ask, what did I write about Järpströmmen over the last five years, across my newspaper archive and my chats with Claude. You run the query, you follow every "mentions" edge from the Järpströmmen entity, and you get a unified timeline.
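A typed graph needs very little machinery to demonstrate. In the Python sketch below each edge is a plain triple of source, type, and target. DashScope and Järpströmmen come from the discussion above. Every project, article, and chat name is invented.

```python
# Each edge is (source, edge_type, target). All names except DashScope and
# Järpströmmen are invented for illustration.
edges = [
    ("project_translator", "uses_api", "DashScope"),
    ("project_summarizer", "uses_api", "DashScope"),
    ("project_archive", "uses_api", "OtherAPI"),
    ("article_1998_04", "mentions", "Järpströmmen"),
    ("chat_2024_11", "mentions", "Järpströmmen"),
    ("lesson_12", "cites", "experiment_57"),
]


def incoming(target, edge_type):
    """Follow edges of one type backwards: who points at target this way?"""
    return sorted(src for src, kind, dst in edges
                  if dst == target and kind == edge_type)


print(incoming("DashScope", "uses_api"))      # what breaks if the API expires
print(incoming("Järpströmmen", "mentions"))   # everything that mentions it
```

Both questions are the same operation with a different edge type. In Postgres the list of triples would be a table with three columns, and the function would be a one-line query against it.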

The Vanity of Pretty Graphs

There is a thing that most personal knowledge bases, including the very popular Obsidian, offer. Bidirectional links. You write the name of another note inside this note, and automatically the other note also knows about this one. You can then see a little graph of your notes and their connections. It looks beautiful. Like a constellation of thoughts.

It is also mostly vanity. Because the connections are untyped. All you know is that these two notes are related, somehow. Related how. Why. For what purpose. Unclear. The graph looks pretty. You cannot actually ask it operational questions. If I delete this note, what else gets orphaned. If this API expires, what is affected. The pretty graph does not know.

What we are talking about tonight is different. Typed edges. Every relationship labeled. Because with types, you can ask real questions. Without types, the graph is wallpaper.

The researchers at BrainDB, which we will get to in a minute, call this the difference between memory and a memory bank. A memory is a loose recollection. A memory bank is a structured account. Both matter. But the second one is what lets you do arithmetic on your past.

The Thing Grep Cannot Do

Here is the other advantage of typed graphs. Multi hop queries. Grep, the text search tool, can find all files that mention a word. That is a one hop query. It looks in one place. But some questions require two or three hops. What does the nephew of this person know about this topic. That question involves three entities. A person. The person's nephew. The nephew's knowledge about a topic. Grep cannot assemble that. You would have to do it by hand. Read the first person's file. Figure out who their nephew is. Open the nephew's file. Search for the topic. The language model could do it if you handed it all three files, but figuring out which three files is exactly the problem.

A typed graph does it in one query. Start at the person. Follow the edge of type "is nephew of" to find the nephew. Then from that node, follow edges of type "mentions" that lead to the topic. You get your answer without ever reading the files in full. You just traverse the graph.
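Here is a small Python sketch of that traversal. The people and notes are invented. The sketch also assumes a "wrote" edge between a person and their notes, which is one possible way of modeling what somebody knows and not something settled in the conversation.

```python
from collections import defaultdict

# Invented people and notes. Edge direction reads left to right:
# ("Erik", "is_nephew_of", "Anna") means Erik is Anna's nephew.
edges = [
    ("Erik", "is_nephew_of", "Anna"),
    ("Lars", "is_nephew_of", "Karin"),
    ("Erik", "wrote", "note_hydropower"),
    ("Erik", "wrote", "note_skiing"),
    ("Lars", "wrote", "note_fishing"),
    ("note_hydropower", "mentions", "Järpströmmen"),
    ("note_fishing", "mentions", "Järpströmmen"),
]

outgoing = defaultdict(list)
incoming = defaultdict(list)
for src, kind, dst in edges:
    outgoing[(src, kind)].append(dst)
    incoming[(dst, kind)].append(src)


def nephew_notes_about(person, topic):
    """Three hops: person <- nephew -> notes -> topic."""
    results = []
    for nephew in incoming[(person, "is_nephew_of")]:       # hop 1
        for note in outgoing[(nephew, "wrote")]:            # hop 2
            if topic in outgoing[(note, "mentions")]:       # hop 3
                results.append((nephew, note))
    return results


print(nephew_notes_about("Anna", "Järpströmmen"))
```

No file is ever opened. The answer falls out of three dictionary lookups, and the skiing note is skipped because it has no edge to the topic.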

This is the argument, ultimately, for why a graph earns its complexity. For small collections of notes, where everything you care about is in one document, grep is fine. For collections with real relational structure, where questions span across multiple documents and multiple kinds of relationships, grep cannot assemble the answer. You need the graph.

BrainDB, Stolen Overnight

One of the funniest moments of the night was when I searched Reddit for ideas on this architecture and the very first result, posted the day before, was basically our design. Somebody named dimknaf had shipped a project called BrainDB. It was inspired by an idea from Andrej Karpathy, one of the original OpenAI founders, who had written a short gist titled "LLM wiki." Karpathy's idea was, what if you gave a language model a wiki it could read and write. A persistent memory outside its own head. Structured. Queryable.

BrainDB took that idea and ran with it. Typed entities. Typed edges. Postgres as the backing store, with trigram search and pgvector. Graph traversal up to three hops. Temporal decay, where older entries fade in relevance unless something accesses them, like memory itself. And a little feature that turned out to be the most interesting part. On ingestion, when a new fact arrives, the system runs a quick pass to figure out what existing entities this new fact connects to, and it writes those edges automatically.
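How BrainDB actually computes its decay was not covered, so the following Python sketch is only one plausible way to get the behavior described. It assumes an exponential fade with a half-life, and it assumes that reading an entry resets the clock. The class Entry, both functions, and the ninety day half-life are invented.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    name: str
    base_score: float
    last_access_day: int   # day number of the most recent read


def relevance(entry, today, half_life_days=90):
    """Score halves for every half_life_days since the last access."""
    idle = today - entry.last_access_day
    return entry.base_score * 0.5 ** (idle / half_life_days)


def touch(entry, today):
    """Accessing an entry resets its clock, so it stops fading."""
    entry.last_access_day = today


old_note = Entry("note_on_trigrams", base_score=1.0, last_access_day=0)
print(relevance(old_note, today=180))   # idle for two half-lives
touch(old_note, today=180)
print(relevance(old_note, today=180))   # freshly accessed
```

The entry fades to a quarter of its weight after two idle half-lives and snaps back to full weight the moment something reads it.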

The comments under the Reddit post were exactly our debate tonight. One person said, why are you doing all this, markdown files in a git repository do the same thing. The author said, for small collections yes, but for messy relational stuff with multi hop queries, markdown misses. One person said, the context windows on language models are getting so big that you could just stuff the whole wiki into every conversation. The author said, that is true for small wikis, but it is also expensive and unfocused. Both sides had points. Pär's case spans both. Simple lookups can stay on markdown. Hard relational questions want the graph.

The worthwhile takeaway was that the design is not speculative. Somebody shipped it yesterday. It works. It uses the same components we were converging on. Postgres, trigram, pgvector, typed edges. And it uses them in the same way.

Dream Consolidation

The feature from BrainDB that is worth stealing is that ingestion time edge discovery thing. Let me paint the picture of what it does.

When you write a new note, the system reads it, and while you are off doing other things, it looks at every existing entity in the graph and asks, does this new note mention any of these. Does this new note relate to any of these. If it finds matches, it writes edges automatically. You did not ask it to. You just wrote a note. Later, when you come back and ask a question, the graph already has the connections. They got made while you were not looking.
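A bare-bones version of this can be sketched in Python with plain string matching. A real system would more likely run in the background and use something smarter than exact phrases to spot a mention, so treat the table of known entities, the ingest function, and the note id as illustrative inventions.

```python
import re

# Known entities and the surface forms that count as a mention.
# The entries are invented, apart from names that appear in the text.
known_entities = {
    "DashScope": ["dashscope"],
    "Järpströmmen": ["järpströmmen"],
    "experiment_57": ["experiment 57", "experiment fifty seven"],
}

graph_edges = []   # (source, edge_type, target)


def ingest(note_id, text):
    """On arrival of a note, write a 'mentions' edge for every entity found."""
    lowered = text.lower()
    found = []
    for entity, forms in known_entities.items():
        if any(re.search(r"\b" + re.escape(form) + r"\b", lowered) for form in forms):
            graph_edges.append((note_id, "mentions", entity))
            found.append(entity)
    return found


print(ingest("note_2031", "Reran experiment 57 against DashScope last night."))
print(graph_edges)
```

The writer of the note did nothing except write. The edges exist by the time anyone asks a question.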

This is very much like what the human brain does during sleep. You learn things during the day. While you sleep, the brain goes through them, connecting new information to what you already knew. You wake up and something that was confusing yesterday feels clearer, and you do not quite know why. Because the connecting happened in the background.

For Pär's case, this matters especially because Pär cycles through obsessions. He gets deeply into one topic for weeks. Then he moves on. Then years later he comes back. If the substrate was only as smart as the day he stopped paying attention, it would be static. Frozen in the shape of his last obsession. But if the substrate keeps connecting things while he is off caring about something else, then when he comes back, some of the thinking has already been done. The substrate has been weaving while he slept. This is, bluntly, the feature that matters most for a knowledge system that outlives human attention cycles. Most systems do not have it. The ones that do are the ones that will still be useful in twenty years.

Scopes

Up to this point we have been talking about the substrate as one thing. One library. It is not. There are many libraries, and they are each called a scope. The Lab is one scope. Pär's AI conversation archive is another. The newspaper OCR archive from Årebladet is a third. His emails are a fourth. Capture items are a fifth. Session transcripts are a sixth. And so on.

Each scope has its own shape. Its own rules. Its own privacy level. The Lab is public-ish. Pär writes about experiments, nothing very sensitive. The chat archive has personal stuff in it. The emails have source protection implications if any Årebladet reporting is in there. The scopes are not equal.

The key move is that the query tool does not need to know which scope it is looking in. You can ask a question across all of them at once. You can also restrict a query to just one scope if you want. The scope is a filter, not a fork. One mechanism, many possible filters. This is what makes cross scope queries possible, which is the thing that takes this from being a collection of search indexes to being a real knowledge substrate.
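Here is a rough sketch of what "a filter, not a fork" can look like. It uses Python's built-in sqlite3 as a stand-in for Postgres so that it runs anywhere. The table names, column names, and scope labels are invented for the example, not taken from Pär's setup.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    # Hypothetical schema: every item carries a scope, and all scopes
    # share one entity table.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE entity (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE item (id INTEGER PRIMARY KEY, scope TEXT NOT NULL, body TEXT NOT NULL);
        CREATE TABLE mention (item_id INTEGER REFERENCES item(id),
                              entity_id INTEGER REFERENCES entity(id));
        """
    )
    return conn


def search(conn, term, scope=None):
    """One query path. The scope is only an optional extra WHERE clause."""
    sql = "SELECT id, scope, body FROM item WHERE body LIKE ?"
    params = [f"%{term}%"]
    if scope is not None:
        sql += " AND scope = ?"
        params.append(scope)
    return conn.execute(sql + " ORDER BY id", params).fetchall()


def entities_in_all(conn, scopes):
    """Names of entities mentioned in every one of the given scopes."""
    placeholders = ",".join("?" for _ in scopes)
    sql = f"""
        SELECT e.name
        FROM entity e
        JOIN mention m ON m.entity_id = e.id
        JOIN item i ON i.id = m.item_id
        WHERE i.scope IN ({placeholders})
        GROUP BY e.id
        HAVING COUNT(DISTINCT i.scope) = ?
        ORDER BY e.name
    """
    return [row[0] for row in conn.execute(sql, [*scopes, len(set(scopes))])]
```

Calling search(conn, "X") looks everywhere. Calling search(conn, "X", scope="chat") looks in one place. Same function both times. The cross scope question only works because the mention table points at one shared entity table.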

A cross scope query is something like, tell me everyone who appears in both my newspaper archive and my chat history. Or, what have I discussed with Claude about topic X, what has Årebladet published about X, and what capture items mention X. Those are not things you can do in any of the individual scopes. They require the scopes to share a common entity table, where the same person in the newspaper archive is the same person as the one in the chat log. Which brings us to a famously unsolved problem.

The Name Problem

Imagine three texts. One says "Pär Boman won the award." Another says "P.B. was honored." A third says "Boman, Pär, receives recognition." Are those three texts about the same person. You and I, reading, know they are. A computer does not. The strings are different. Pär Boman and P.B. share nothing but the initials P and B. Pär Boman and Boman Pär have the words in different order. A naive matching system would see them as three different people.

This is called entity resolution, and it is genuinely unsolved. There are fancy techniques. Trigram similarity. Phonetic matching. Hand maintained aliases. Machine learning approaches that use context to disambiguate. None of them are perfect. All of them have false positives, where they merge things that should be separate, and false negatives, where they fail to merge things that are the same.

For Pär's substrate, the pragmatic answer is, start naive. Use trigram similarity to catch the obvious variants, like reordered or slightly misspelled names. It will not catch initials like P.B., so those go in a hand maintained alias list. Accept some false positives. Build a simple tool where, when a false positive is noticed, a human can say, no, these two entities are different. And when a false negative is noticed, a human can say, these two entities are the same. Over time the aliases accumulate. It gets better. It is never perfect. That is the cost of having a unified entity graph across messy real world data.
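As a sketch of that start naive approach, here is a small Python version. The similarity function is loosely modelled on how Postgres trigram matching behaves, but it is a simplification, not the real thing. The Resolver class, the 0.4 threshold, and the name "Per Boman" used in the note below are assumptions for illustration.

```python
import re


def trigrams(s: str) -> set[str]:
    # Lowercase, split into words, pad each word, take 3-character windows.
    grams = set()
    for word in re.findall(r"\w+", s.lower()):
        padded = f"  {word} "
        grams.update(padded[i:i + 3] for i in range(len(padded) - 2))
    return grams


def similarity(a: str, b: str) -> float:
    # Shared trigrams divided by all distinct trigrams (0.0 to 1.0).
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


class Resolver:
    def __init__(self, threshold: float = 0.4):
        self.threshold = threshold      # assumed cutoff, tune on real data
        self.canonical: list[str] = []  # known entity names
        self.aliases: dict[str, str] = {}         # human said: same
        self.distinct: set[frozenset[str]] = set()  # human said: different

    def resolve(self, name: str) -> str:
        if name in self.aliases:
            return self.aliases[name]
        best, best_score = None, 0.0
        for cand in self.canonical:
            if frozenset((name, cand)) in self.distinct:
                continue  # a human already rejected this merge
            score = similarity(name, cand)
            if score > best_score:
                best, best_score = cand, score
        if best is not None and best_score >= self.threshold:
            return best
        self.canonical.append(name)  # no good match, treat as a new entity
        return name

    def mark_same(self, alias: str, canonical: str) -> None:
        self.aliases[alias] = canonical
        if alias != canonical and alias in self.canonical:
            self.canonical.remove(alias)

    def mark_different(self, a: str, b: str) -> None:
        self.distinct.add(frozenset((a, b)))
```

With this sketch, "Boman, Pär" resolves to "Pär Boman" on its own, because word order does not change the trigrams. "P.B." does not, until someone calls mark_same. And a made-up "Per Boman" gets wrongly merged until someone calls mark_different. That is the whole loop. Naive matching, then human corrections that stick.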

Privacy in Layers

Since some scopes are more sensitive than others, and since the substrate is going to be reachable by Claude both in Pär's terminal and in his browser, the question of who can see what gets important fast. The cleanest way to handle this is a simple flag on every item. Visibility local, visibility web, visibility nobody. Default everything to local, which means only sessions running on Pär's own computer can see it. Explicitly opt into web, which means the browser sessions can see it too.

The discipline here is default deny. You do not have to remember to mark something private. Everything is private unless you marked it otherwise. This is the Capture shape of privacy. No ritual of reviewing each note to decide its visibility. The default protects you. You only have to make active decisions when you want to share something.

This is a single field on a database row. It is a tiny amount of code. But it matters enormously because the alternative, which is trying to retroactively figure out what is safe to share, is a nightmare. You write a note about a health issue. You forget you wrote it. Six months later you are in a browser session, with Claude reaching into the substrate, and suddenly the health note surfaces. Default deny prevents that.
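To show how small this really is, here is a sketch in Python. The three visibility values come from the description above. The Item class and the session labels "local" and "web" are names made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    NOBODY = "nobody"
    LOCAL = "local"
    WEB = "web"


@dataclass
class Item:
    body: str
    # Default deny: unless someone opts in, an item never reaches the web.
    visibility: Visibility = Visibility.LOCAL


def visible_to(items: list[Item], session: str) -> list[Item]:
    """Filter items for a session kind (hypothetical labels: 'local', 'web')."""
    if session == "local":
        allowed = {Visibility.LOCAL, Visibility.WEB}
    elif session == "web":
        allowed = {Visibility.WEB}
    else:
        allowed = set()  # unknown session kinds see nothing
    return [item for item in items if item.visibility in allowed]
```

The health note written without a second thought is Item("health note"). It gets the default. A browser session never sees it, and nobody had to remember to protect it.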

The Silent Alarm

Now for the thing that has been haunting this whole session, because it already happened to Pär once. Silent failure. You build a system. It runs. Months pass. Then one day you discover it has been broken the entire time, and nothing ever told you.

This actually happened to Pär's backup system. Forty days of backups were silently failing to upload to the storage service. The system kept saying "backup completed successfully." The local script ran fine. The remote upload did not. Nothing yelled. Because the script was checking, "did my process exit with success," and the process did exit with success. It just did not actually push the file to the remote.

For a substrate with multiple ingestion paths, all of which can silently fail, this is a real danger. File watcher stops. Cron job fails. Webhook endpoint starts returning five hundreds to the sender and nobody notices. Three weeks later you realize you have not seen a new note in three weeks. The substrate is lying to you.

The fix is a very boring piece of engineering called a freshness check. Every ingestion path writes, alongside the data, a timestamp of when the data was written. Every path has a maximum acceptable gap. If the last write was more than, say, twelve hours ago when it should be at most four hours, the system screams. Not a log line. Not an email. An actual push notification or an alert that interrupts Pär's day. Because the alternative is finding out in forty days that you have been talking to a corpse.

This is not glamorous work. It is also the difference between a system that survives and a system that quietly rots. For every ingestion path you add, you add a corresponding canary. It is a promise the system makes. I will tell you when I am not working. If it does not make that promise, do not trust it.
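A freshness check can be sketched in a few lines of Python. The path names and gap values in the usage note are placeholders, and notify stands in for whatever push notification mechanism is actually used. The point is that the check looks at when data last arrived, not at whether a process exited cleanly.

```python
from datetime import datetime, timedelta, timezone


def stale_paths(last_write, max_gap, now=None):
    """Return ingestion paths whose newest data is older than allowed.

    last_write: path name -> datetime of the newest row written by that path
    max_gap:    path name -> timedelta, the longest acceptable silence
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for path, limit in max_gap.items():
        seen = last_write.get(path)
        # A path that has never written anything counts as stale too.
        if seen is None or now - seen > limit:
            stale.append(path)
    return sorted(stale)


def check_and_alert(last_write, max_gap, notify, now=None):
    """Call notify(message) once per stale path and return the stale list."""
    stale = stale_paths(last_write, max_gap, now)
    for path in stale:
        notify(f"ingestion path '{path}' has gone quiet")
    return stale
```

Each path gets an entry in max_gap, say timedelta(hours=4) for a file watcher. Something has to run the check on a schedule, and that something needs its own canary, or the alarm itself can die silently.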

The Substrate Holds What You Do

Here is the idea that, I think, is the most interesting of the night. Most personal knowledge systems are optimized for what you write. You sit down, you type something, the system catches it. But the vast majority of what Pär does in a day is not writing. He is driving. He is talking to Claude. He is pushing code to GitHub. He is publishing articles in Årebladet. He is burning through API credits. He is taking photos. He is sending text messages.

What if the substrate held a record of all of that. Not just the things he typed into a note file. Every meaningful thing he did, recorded as an event, stored in the graph, available to query.

The mechanism for this is called webhooks, or more generally, event ingestion. Each system Pär uses already knows when something happens. GitHub knows when he commits. The API logging service knows when he burns credits. The car knows when he drove. The content publishing system for the newspaper knows when an article went live. These systems can be configured to notify another system when their events happen. They send a little HTTP request. The substrate receives it. Writes it to the event log. Done.

This is transformative, and I do not use that word lightly. Because now the substrate is not just a record of what Pär wrote. It is a record of Pär's life. A searchable one. A queryable one. Which lets it answer questions like, what places have I driven past that also appear in the newspaper archive. What GitHub commits happened on days I was in my home zone versus out. Which of my API providers am I spending the most on, and what projects are those calls for.

And because the event log is timestamped and typed, it becomes the raw material for every future report. Every future podcast. Every future reflection. The Director weekly report is one podcast. You could have a weekly review of your driving. A monthly summary of your API spending. An annual retrospective of everything you worked on. All of these pull from the same event log. None of them need new plumbing. The plumbing already exists. You just pour new queries through it.
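Here is a sketch of that one log, many reports idea in Python. It uses the built-in sqlite3 as a stand-in for the real database. The event table, the kind value "api_call", and the payload fields provider and cost_usd are all invented for the example. A real webhook receiver would call record_event from its HTTP handler.

```python
import json
import sqlite3
from collections import defaultdict


def open_log() -> sqlite3.Connection:
    # Hypothetical event log: timestamped, typed, with a JSON payload.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE event (ts TEXT NOT NULL, source TEXT NOT NULL, "
        "kind TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    return conn


def record_event(conn, ts: str, source: str, kind: str, payload: dict) -> None:
    """What a webhook handler would do: append one row, nothing more."""
    conn.execute(
        "INSERT INTO event VALUES (?, ?, ?, ?)",
        (ts, source, kind, json.dumps(payload)),
    )


def count_by_source(conn, since: str, until: str) -> dict:
    """Report 1: how many events each source produced in [since, until)."""
    rows = conn.execute(
        "SELECT source, COUNT(*) FROM event "
        "WHERE ts >= ? AND ts < ? GROUP BY source ORDER BY source",
        (since, until),
    )
    return dict(rows.fetchall())


def api_spend_by_provider(conn, month: str) -> dict:
    """Report 2: total API cost per provider for a month like '2024-03'."""
    totals = defaultdict(float)
    rows = conn.execute(
        "SELECT payload FROM event WHERE kind = 'api_call' AND ts LIKE ?",
        (month + "%",),
    )
    for (raw,) in rows:
        payload = json.loads(raw)
        totals[payload["provider"]] += payload["cost_usd"]
    return dict(totals)
```

Two reports, one table, no new plumbing. A third report is one more function with one more query. The sketch assumes every timestamp is written in the same ISO format, so that comparing them as text gives the right order.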

The Feature That Survives You

The last idea, and I think the one that matters most in the long run, is what I will call obsession change survival. This is the feature that, to my knowledge, nobody has shipped in a personal knowledge system. It is the feature that specifically addresses the way Pär's attention actually works. Pär gets deeply into something for weeks. Then he moves on. A year or two passes. He comes back.

Most systems, at this point, fail him. They froze on the day he stopped paying attention. The notes are dated. The context is missing. The connections are stale. To come back productively he has to rebuild his model of the topic in his head, which is expensive, which is the thing that keeps people from coming back at all.

A substrate with ingestion time edge discovery, a substrate that keeps weaving connections while you are off doing other things, does not have this problem. While Pär is deep in Årebladet stuff, the substrate is still noticing, in the background, that a thing he wrote last year about a power station connects to a thing he wrote two years ago about a type of concrete. That edge gets written. Years later, when Pär comes back to power stations, he types a question, and the substrate says, here is what you wrote about this, and here are the connections you did not know you had made. Because, in a real sense, they got made while he slept. Or while he was in a different obsession entirely.

This is the substrate remembering for Pär even when Pär is not interested in remembering. It is the fire that does not go out when you stop tending it. It is the garden that keeps some of its order while you are away at sea. It is the feature that lets a personal knowledge system outlive, not just the flashes of attention, but the deeper currents of obsession that carry attention back and forth across years.

Most systems do not do this. They require tending. They punish you for not showing up. A system that does not punish you for being a human being, a human being whose interests shift, whose energy comes and goes, is a rare thing. Worth building. Worth designing for on purpose. Worth spending an hour and a half at one in the morning thinking about the materials for.

The Map, Unpoured

So this is the machinery. MCP as the waiter protocol, with its two costs of loading and selecting. The split between web sessions and Claude Code sessions, which determines what actually needs to be a tool. Postgres as the boring foundation. Trigram for fuzzy matching. Pgvector for storing embeddings. Embeddings as coordinates in a meaning space. BM25 and TF-IDF as the old guard word counters that still win on structured text. Hybrid retrieval that uses both in sequence. The sharp rule that embedding models cannot be mixed. Just in time indexing over ahead of time, because ahead of time breaks quietly. File system watching as the Capture shaped ingestion mechanism that beats scheduled scans. Typed entity graphs as the difference between a pile of files and a queryable substrate. The vanity of pretty bidirectional links versus the operational value of typed edges. Multi hop queries as the thing grep cannot do. BrainDB as evidence that other people are converging on this design right now. Ingestion time edge discovery as the dream consolidation feature. Scopes and cross scope queries as the thing that makes the whole system more than the sum of its parts. Entity resolution as the genuinely unsolved merge problem, handled naively at first and refined over time. Privacy gradients with default deny as the correct handling of sensitive content. Silent failure canaries as the unglamorous but mandatory discipline for any multi path system. Webhook ingestion as the reason the substrate holds what you do, not just what you write. And the obsession change survival feature as the long term reason any of this is worth building at all.

Every one of these is a choice. None of them is free. Some of them are harder than they look. Some of them are easier than they look. The question tonight was not which ones to implement. The question tonight was, do we understand the materials well enough to choose. And now you do. Go back to sleep. Decide later.