This is the practical companion to episode six of Actually, AI: embeddings.
Try something. Open whatever search tool you use at work (Notion, Slack, Google Drive) and type in a question that does not use any of the words in the document you are looking for. If you wrote a memo about moving the London office last year, search for "relocating the UK team." If you have a design doc about making the checkout flow faster, search for "improving purchase speed." If your search tool was built in the last two years, there is a decent chance it finds the right document. If it was built before that, it almost certainly does not.
That is embeddings at work. The old search, keyword search, the kind that has powered everything from library catalogs to email clients since the nineteen seventies, operates on a brutally simple principle: does the document contain the words you typed? If yes, show it. If no, hide it. Keyword search does not know that "relocating" and "moving" mean similar things. It does not know that "purchase speed" and "checkout flow" are about the same problem. It matches strings of characters. Nothing more.
Embedding-based search, which you will hear called semantic search or vector search, does something fundamentally different. It converts your query into a point in a space of meaning. It converts every document into a point in that same space. Then it measures the distance. Documents that are close to your query in meaning, not in spelling, show up first. This is the geometry from the main episode, applied to the mundane task of finding a file.
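That distance measurement can be sketched in a few lines. Everything below is a toy: the four-dimensional vectors are invented numbers standing in for real embeddings, which would come from a model and have hundreds of dimensions. The shape of the computation, though, is the real thing: cosine similarity between a query point and every document point, nearest first.

```python
import math

def cosine_similarity(a, b):
    # Angle-based closeness: 1.0 means pointing the same direction,
    # near 0.0 means unrelated.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional "embeddings" (invented for illustration; real models
# produce hundreds of dimensions).
docs = {
    "memo: moving the London office":   [0.9, 0.1, 0.0, 0.2],
    "design doc: faster checkout flow": [0.1, 0.8, 0.3, 0.0],
    "holiday party planning":           [0.0, 0.1, 0.1, 0.9],
}

query = [0.8, 0.2, 0.1, 0.1]  # pretend embedding of "relocating the UK team"

# Rank documents by closeness in meaning to the query, best match first.
ranked = sorted(docs, key=lambda d: cosine_similarity(query, docs[d]), reverse=True)
print(ranked[0])
```

Note that no string in the query matches any string in the winning document's title. The match happens entirely in the geometry.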
The difference matters more than it sounds. Industry studies of enterprise search have repeatedly estimated that employees spend roughly three and a half hours per week searching for information they know exists somewhere in their organization. That is not a technology problem in the traditional sense. The documents are there. The search engine can see them. It just cannot connect a question phrased one way to an answer phrased another way. Semantic search closes that gap, not perfectly, but dramatically.
If you use Notion AI and ask it a question about your workspace, it does not read every page sequentially. It has already converted your pages into embedding vectors, stored them in a database, and when you ask a question, it converts your question into the same kind of vector and looks for the nearest neighbors. The same principle powers GitHub Copilot when it pulls relevant code from other files in your project. It powers the "find similar" button in Google Photos. It powers every enterprise search product that has added "AI-powered" to its marketing in the last two years.
The pattern underneath all of these is retrieval-augmented generation, or RAG if you want the acronym. The deep dive covered the plumbing in detail, the chunking, the vector database, the cosine similarity search. Here is what it means when you are the one using it. Every time you ask a chatbot a question about your own documents, the tool is quietly running an embedding search first, finding the chunks of your documents that are closest in meaning to your question, and then feeding those chunks to the language model along with your question. The model does not have your documents memorized. It gets them handed to it, fresh, every time you ask. The quality of the answer depends almost entirely on the quality of that initial search. If the embedding search retrieves the wrong chunks, the model confidently answers from the wrong material. If it retrieves the right chunks, the model has what it needs.
This is why the same chatbot can give you a brilliant answer one moment and a useless one the next. The language model did not get dumber between questions. The search step found good material for one query and bad material for the other. When someone tells you "the AI did not understand my question," what they usually mean, without knowing it, is "the embedding search did not retrieve relevant context."
Here is a distinction that changes how you use these tools once you see it.
There are two fundamentally different kinds of search. The first is "find me documents about quarterly revenue." The second is "find me documents like this one," where you point at an existing document and ask for its neighbors. They feel similar but they use embeddings in different ways, and they fail in different ways.
The first kind, query-to-document search, converts your typed question into a vector and looks for nearby document vectors. This works well when your question captures the meaning of what you are looking for. It struggles when you know the topic but not the vocabulary the document uses. If your finance team calls it "ARR" and you search for "yearly recurring revenue," the embedding might bridge that gap. Or it might not. Embeddings are trained on general text, and your organization's private jargon may not be well represented in the model's understanding of meaning.
The second kind, document-to-document search, takes an existing document's vector and finds the nearest other documents. This is what powers "more like this" recommendations, "related articles" suggestions, and duplicate detection. It is often more reliable than query search because a full document contains far more meaning signal than a short question. A five-hundred-word document about server migration carries rich context about infrastructure, timelines, and technical decisions. A three-word query, "server migration plan," carries almost none of that context.
The practical takeaway: when embedding search disappoints you, try giving it more to work with. Instead of typing three words, paste a paragraph. Instead of describing what you want, point at an example of what you want. You are not being lazy. You are giving the geometry more dimensions to work with.
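The document-to-document case can be sketched the same way as query search, with one change: the query vector is an existing document's embedding rather than a typed question's. The corpus and vectors below are invented for illustration.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Invented embeddings for a small corpus.
library = {
    "server migration plan":        [0.9, 0.2, 0.1],
    "datacenter decommission memo": [0.8, 0.3, 0.2],
    "marketing launch brief":       [0.1, 0.9, 0.1],
}

def more_like_this(title, k=1):
    # Document-to-document search: the "query" is an existing document's
    # own embedding, which carries far more signal than a short phrase.
    anchor = library[title]
    others = [t for t in library if t != title]
    return sorted(others, key=lambda t: cosine(anchor, library[t]), reverse=True)[:k]

similar = more_like_this("server migration plan")
print(similar)
```

This is the "paste a paragraph instead of three words" advice in code form: the richer the anchor vector, the better the neighborhood it lands in.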
This is where the elegant math from episode six runs into messy reality. Embedding search finds things that are similar in meaning. It has no concept of "same." And "similar" carries the biases of whoever wrote the text the embedding model was trained on.
The main episode covered the famous result: embeddings trained on Google News encode the stereotype that computer programmers are male and homemakers are female. That finding has a direct practical consequence. If you build a resume search tool powered by embeddings and a recruiter searches for "strong engineering candidate," the system will return resumes that are closer in the embedding space to what "strong engineering candidate" looked like in the training data. If the training data associated engineering strength with male-coded language, words like "dominant," "competitive," "built," the search will quietly prefer resumes that use those words over resumes that use equally valid but differently coded language like "collaborative," "designed," "improved."
Nobody programmed that preference. Nobody even knows it is happening unless they audit the results. The embedding did what it was trained to do: find similar things. The problem is that "similar" was defined by a biased dataset, and the bias is invisible because it lives in geometry, not in rules anyone can read.
This does not mean embedding search is broken. It means you need to understand what "similar" means in context. When you search for "Italian restaurants" and the tool also returns Thai restaurants, that is not a bug. In the embedding space, "Italian restaurant" and "Thai restaurant" are genuinely close because they share the concept of dining, cuisine, and going out. The embedding does not know you want Italian specifically. It knows you want things that are meaning-adjacent to Italian restaurants. Sometimes that is helpful. Sometimes it is noise. The difference depends on whether you wanted precision or exploration.
Embedding search has specific, predictable failure modes. Knowing them makes you a better user of every AI tool that relies on this technology.
Negation is nearly invisible. If you search for "documents that are not about marketing," the embedding model will dutifully encode that sentence into a vector, and that vector will be very close to documents about marketing. The word "marketing" dominates the meaning. The "not" barely moves the point in space. This is not a quirk. It is a structural limitation. Embeddings encode what a sentence is about, not what it explicitly excludes. If you need to exclude something, you almost always need a keyword filter on top of the semantic search.
Recency and specificity get lost. "Latest quarterly report" and "quarterly report from twenty twenty-two" produce similar embeddings because they are about the same topic. The embedding captures the concept of quarterly reports but poorly distinguishes when. Time, version numbers, and specific identifiers are the kind of precise information that keyword search handles well and embedding search handles poorly. The best systems combine both: use embeddings to find the right topic, then use filters to narrow by date, author, or tag.
Short queries are ambiguous. The fewer words you give an embedding model, the less meaning it can encode, and the more the result depends on the model's assumptions about what those words typically mean. "Apple" alone could land near fruit or near technology in the embedding space, and which one you get depends on the training data's balance of those usages. Adding even a few words of context, "apple orchard" versus "apple developer account," dramatically improves the search because it moves your point in space toward the right neighborhood.
Here is something you can do in the next five minutes that will make embeddings tangible instead of theoretical.
Open any AI-powered search tool you have access to. Notion AI, Google's AI overview, Bing Copilot, or if you are technical, a tool like the Hugging Face semantic search demo. Now search for the same thing twice. First, use the exact keywords: the specific term, the name, the jargon. Then search again using a description of what you mean, without any of those keywords. Compare the results.
The keyword search will find documents that contain your words. The semantic search will find documents that are about your meaning. The overlap between those two result sets is the territory where keyword search has always worked fine. The difference, the documents that only the semantic search found, that is the geometry of meaning doing its work. Those are the documents that existed in your organization, relevant to your question, invisible to every search you ran before twenty twenty-three.
And the documents that only the keyword search found, the ones the semantic search missed, those are the edge cases where precision matters more than meaning. Where "ARR" must match "ARR" and not "annual recurring revenue." Where the document number matters more than the topic. Those gaps are why the best search systems use both approaches, and why understanding when to use which makes you more effective than the people who just type three words and hope.
Here is the honest version of what embeddings mean for you, right now, in twenty twenty-six.
Every AI tool you use that involves finding, retrieving, or recommending information is using embeddings under the hood. Every chatbot that answers questions about your documents runs an embedding search before it generates a single word. Every "AI-powered search" product is computing distances in a geometric space of meaning. This is not the future. This is the plumbing of the present.
The plumbing works remarkably well for the task of "find things that are about what I mean, not just what I said." It fails predictably at negation, at precise identifiers, at distinguishing time. It carries the biases of its training data in ways that are invisible until you look for them. And the quality of every AI answer you receive is limited by the quality of the embedding search that selected the context the model was given.
The listener who understands this has a real advantage. Not because they can build a vector database, but because they know why the search worked, why it failed, and what to type differently next time.
That was the practical companion for episode six.