Observable Framework: When the Notebook Becomes the Article

A Man Who Made Charts at the New York Times

In the early two-thousand-tens, a software developer named Mike Bostock worked at the New York Times. His title was Graphics Editor, which is a job description that does not capture what he actually did. What he actually did was build the underlying tools that made the New York Times' interactive journalism look the way it did. The animated maps. The scrolling explanations. The data-driven articles that responded to the reader scrolling through them. The visual style that came to define what serious data journalism looked like for a generation.

Bostock did not invent this style alone. The Times had a tradition of graphic journalism going back to the seventies. The graphics desk was a respected institution. The interactive era required new tools, and Bostock built many of them. But his most lasting contribution was a piece of software he had already released before he arrived, created as a graduate student at Stanford with Jeffrey Heer and Vadim Ogievetsky. It was called D3.

D3 is a JavaScript library for creating interactive data visualizations on the web. It was released as open source in two thousand eleven. It became, very quickly, the dominant tool in the field. Within a few years, a large share of the interactive graphics on major news websites were built with D3 underneath. The aesthetic of those articles, the smooth transitions, the responsive layouts, the way charts could be brought to life with simple interactions, was the D3 aesthetic. Bostock had shaped what data journalism looked like.

[calm]

Bostock left the Times in two thousand fifteen and went on to start a company called Observable, with co-founder Melody Meckfessel. The company's product was a notebook environment for data work, in the same family as Jupyter notebooks but designed specifically for the kind of interactive visualization that D3 had popularized. The notebooks lived on the web. They were collaborative. They were the basis for a community of data journalists, scientists, and analysts who used Observable to build and share work.

In early twenty-twenty-four, the company released a new open source product called Observable Framework. The framework is a tool for taking notebook-style documents and turning them into static websites. The notebook is where the work happens. The framework is what publishes the work. The combination represents a specific philosophy about what data journalism should look like, and it deserves attention because of where that philosophy comes from.

The Notebook as Source of Truth

To understand Observable Framework, you need to understand what a notebook is in this context. A notebook is a document that combines text, code, and visualizations into a single artifact. You can write a paragraph explaining what you are doing. You can write code that loads data and produces a chart. You can write more paragraphs commenting on the chart. The notebook reads top to bottom like an article, but the code is real code that actually executes when you open it.

The promise of the notebook format, going back to the early Jupyter projects in the two-thousand-tens, was that it would unify analysis and presentation. You would do your analysis in the notebook. The same notebook would be the publishable article. There would be no translation step between the work and the output. The work would be the output.

In practice, this promise has been only partially fulfilled. Jupyter notebooks are excellent for analysis but awkward to publish. The output is usually a screenshot or a stripped-down HTML export that loses most of the interactivity of the original. Scientists love notebooks for their own work. Journalists have struggled to make notebooks suitable for public consumption.

Observable Framework addresses this gap directly. The framework takes notebook-style source files, processes them, and produces static HTML pages that preserve the interactivity, the styling, and the data-driven nature of the original. The published page looks like a finished article. The underlying source is the notebook. The two artifacts are coupled more tightly than most publishing systems allow.

How the Pipeline Works

The way Observable Framework works is worth a moment of attention. You write your notebook, which in Framework is a Markdown file. It contains markdown for the prose, JavaScript for the charts, and references to data sources. During a build step, the framework runs small programs called data loaders, which fetch or compute the data and save the results as static files. It then produces an HTML page that includes the prose, the chart code, and the interactivity, with a snapshot of the data shipped alongside. The chart itself is drawn in the reader's browser from that snapshot.
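The build-time data step can be sketched as what Framework calls a data loader: a small program that runs during the build and writes its result to standard output, which the build saves as a static file. What follows is a minimal sketch in TypeScript, not a definitive implementation. The file name, the PermitRow shape, the buildSnapshot helper, and the sample rows are all hypothetical, and the hard-coded rows stand in for whatever a real loader would fetch or query.

```typescript
// Hypothetical data loader sketch, e.g. saved as src/data/permits.json.ts.
// The row shape and the sample rows are illustrative assumptions, not real data.

interface PermitRow {
  county: string;
  status: "active" | "expired";
}

interface Snapshot {
  generatedAt: string; // ISO timestamp of the build, so the page can say "data as of"
  activeByCounty: Record<string, number>;
}

// Count active permits per county and stamp the result with the build time.
function buildSnapshot(rows: PermitRow[], now: Date): Snapshot {
  const activeByCounty: Record<string, number> = {};
  for (const row of rows) {
    if (row.status !== "active") continue;
    activeByCounty[row.county] = (activeByCounty[row.county] ?? 0) + 1;
  }
  return { generatedAt: now.toISOString(), activeByCounty };
}

// Stand-in for the fetch or database query a real loader would run.
const sampleRows: PermitRow[] = [
  { county: "Adams", status: "active" },
  { county: "Adams", status: "expired" },
  { county: "Baker", status: "active" },
  { county: "Adams", status: "active" },
];

// Whatever is written to standard output becomes the contents of the saved file.
console.log(JSON.stringify(buildSnapshot(sampleRows, new Date())));
```

On the page, the saved file would then be read with something like FileAttachment("data/permits.json").json(), with the path adjusted to wherever the loader actually lives. Stamping the snapshot with its build time is a design choice rather than a requirement: it lets the published page state plainly which moment the data reflects.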

This is different from how most modern data dashboards work. Most dashboards have a server that fetches data live, processes it, and serves it to the browser. A site built with Observable Framework does not need one. The build step runs once, produces static HTML, and that HTML is what gets published. The published article does not need an application server to function. It is just a folder of static files that can be hosted anywhere.

The benefit of the static approach is enormous for journalism. Static files are cheap to serve. Static files require very little maintenance. Static files do not break when an underlying database is down. Static files can be archived. The article you publish today has a good chance of still working in twenty years, because the data ships with the published version. The article is a permanent artifact, not a running application.

[serious]

The downside is that the data is frozen at publication time. If you publish an article that shows the current state of mineral permits in your county, that article will continue to show the state at publication time forever. To update the article, you have to re-run the build. This is usually a feature, not a bug. The article is supposed to reflect a specific moment in time. The fact that the moment is fixed is part of what makes the article a journalistic artifact rather than a live dashboard.

For the cases where you do want continually updating data, the framework works well with a build-on-schedule pattern. A scheduler, typically a continuous integration job, runs the build every hour or every day, and the published static site updates with the latest data. The build is automated. The data updates. The published site stays static between builds. The combination gives you most of the freshness of a live dashboard with most of the durability of a static page.

Why Charts Are Different in Observable

Observable has its own charting library, called Observable Plot, that is available by default in the framework. The library is a more focused, more opinionated layer built on top of D3, and its development has been led by the same Mike Bostock who co-created D3 originally. The design philosophy is different. D3 is a low-level library that gives you primitives for building almost any chart imaginable. Plot is a higher-level library that gives you a vocabulary of marks, such as dots, lines, and bars, along with scales and transforms, and asks you to describe your chart in those terms.

The tradeoff matters in practice. With D3, you can build almost any visualization, but you have to know a lot of D3 to do it. With Plot, you can build most common visualizations in a few lines of code, but you are constrained to what Plot's marks and transforms can express. For working journalists, most of whom are not full-time visualization specialists, the constraint is a benefit. Most articles need ordinary charts, well-styled. Plot does ordinary charts well, with very little code. The work happens fast.
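To make the tradeoff concrete, here is a minimal TypeScript sketch of the kind of low-level work a primitives-style library leaves to you: mapping data values to pixel positions and assembling an SVG path by hand. The helper names linearScale and linePath are hypothetical and written out for illustration. They are not D3's API, and the data values are invented. In Plot, a comparable line chart is typically a single call along the lines of Plot.lineY(data, {x: "year", y: "count"}).plot().

```typescript
// Hand-rolled illustration of low-level chart primitives (not D3's actual API).

type Point = { year: number; count: number };

// Map a value from a data domain onto a pixel range, linearly.
function linearScale(domain: [number, number], range: [number, number]) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value: number): number => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

// Build the "d" attribute of an SVG <path> that connects the points in order.
// Assumes at least two distinct years and a nonzero maximum count.
function linePath(points: Point[], width: number, height: number): string {
  const years = points.map((p) => p.year);
  const counts = points.map((p) => p.count);
  const x = linearScale([Math.min(...years), Math.max(...years)], [0, width]);
  // SVG's y axis points down, so the range is flipped to draw larger counts higher.
  const y = linearScale([0, Math.max(...counts)], [height, 0]);
  return points
    .map((p, i) => `${i === 0 ? "M" : "L"}${x(p.year)},${y(p.count)}`)
    .join("");
}

// Illustrative values only.
const permits: Point[] = [
  { year: 2020, count: 10 },
  { year: 2021, count: 20 },
  { year: 2022, count: 40 },
];

console.log(linePath(permits, 200, 100)); // prints "M0,75L100,50L200,0"
```

Even this small sketch leaves out axes, tick labels, margins, and color. Those are the pieces a higher-level library supplies by default, which is where most of the saved time comes from.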

The aesthetic of Plot charts is descended from the New York Times graphics tradition. The defaults are clean. The colors are tasteful. The typography is well-chosen. A chart produced with Plot looks like a chart you would expect to see in a serious publication, with little or no customization required. The defaults reflect taste shaped in part by years on the New York Times graphics desk, distilled into software that anyone can use.

What This Has to Do with Working Journalists

For a one-person newsroom, Observable Framework is one of the cleanest paths from data work to publishable journalism. The pattern is straightforward. You do your analysis in a notebook, which feels like writing an article that happens to include working code. The notebook is your draft. You refine it. When it is ready, you build it. The build produces a folder of static HTML files. You upload the folder to a web host. The article is live.

This pipeline has several specific virtues. The first is that the analysis and the article are the same document. You do not have to redo your analysis when you want to write the article. You do not have to re-export your charts. You do not have to translate between two different tools. The analysis is the article. The article is the analysis.

[calm]

The second virtue is that the article is permanent. The build produces static HTML with all data embedded. The article does not depend on a running server or a working database. If you stop paying for your hosting tomorrow, you can still copy the folder to any other host and the article will work. If you want to archive it on a permanent media archive, you can. The article is durable in a way that most modern interactive journalism is not.

The third virtue is that the article is auditable. The notebook is the source. The build is reproducible. Another reporter, given the same data and the same notebook, can produce the same article. The methodology is transparent. The numbers in the chart can be traced back to the data that produced them. The article shows its work in the same way that DocumentCloud encourages.

The Composition with Everything Else

The pipeline becomes particularly interesting when it is combined with the other tools that have come up in this series. The notebook can load data from a DuckDB database. The DuckDB database can be built from Parquet files that were committed to a git repository by a daily scraper. The scraper might use shot-scraper to handle JavaScript-heavy government websites. The data might be enriched with geocoded coordinates from Nominatim or with entity matches from OpenSanctions. The article might include a designed map produced with QGIS and embedded as a vector image. The article might link to source documents hosted on a viewer modeled after DocumentCloud.

The whole stack works together because the formats are standardized and the tools were each designed by people who cared about the same things. The output of one tool is the input of the next. The result, at the end, is a static HTML article that contains the analysis, the visualization, the sources, and the explanation, all in one durable package, all produced from open source tools, all reproducible from the underlying data.

This is what the modern open data journalism stack looks like in twenty-twenty-six. None of the tools are new. Most of them have been developed over the last decade or so by small groups of people who care about open infrastructure. The combination is genuinely new. The combination is what makes one-person newsrooms competitive with much larger operations.

The Larger Argument

The argument Observable Framework represents, at the level of philosophy, is that journalism should be a single coherent artifact. The work and the publication should not be separate. The data and the article should not be separate. The interactivity and the prose should not be separate. The author and the source should not be separate. Everything should live together, in one document, that is also the source code that produces the published page.

This is a stronger version of the show-your-work argument that DocumentCloud represents. DocumentCloud asks you to link to your sources. Observable Framework makes it practical to publish your sources alongside the article itself. The notebook contains the analysis. When the source is published, the reader can see the analysis. The reader can copy the source and run their own version. The journalism becomes inspectable from the inside.

For most readers, this transparency is invisible. They read the article. They look at the charts. They move on. They never look at the underlying notebook. But the transparency is there. The methodology is auditable. The work can be replicated. The accusation that the journalism is dishonest can be answered by pointing at the source. The article is built on a foundation that cannot be hidden.

[serious]

This is the higher standard, and the tools are now mature enough that adopting it is practical. The static site is fast. The notebook is pleasant to work in. The build pipeline is reliable. The published article is permanent. The whole approach has been refined over years by working journalists at major papers, and the refinements are available to anyone who wants to use them.

The Bostock Lineage

The thing worth saying, as a closing observation, is that the entire trajectory from D3 to Observable to Observable Framework represents one person's career-long working out of a single set of ideas. Mike Bostock has been thinking about how data journalism should work for over fifteen years. The tools have changed. The underlying ideas have become clearer with each iteration. The current state of the open data journalism stack is, in significant part, his ideas refined and re-refined and finally settled into a coherent shape.

This is unusual. Most software is produced by teams. Most tools have multiple owners and competing visions. Observable is also built by a team, but it has had a clear vision from the beginning, because that vision came in large part from one person who has kept refining it. The result is unusually coherent. The pieces fit together because they were shaped by the same mind, working through the same problem, over a long time.

For a working reporter, the practical move is to learn the framework, build a simple article, and see how the pipeline feels. The investment is modest. The skills compound. The articles you produce can be more durable, more transparent, and easier to verify than articles built on stacks that hide the analysis. The longer you work this way, the better the work gets. The notebook becomes the article. The article becomes the journalism. The journalism becomes the record of what mattered, preserved in a form that future readers can still open. That is what good infrastructure makes possible. That is what the open data journalism stack now offers.