gruvor: The Timeline Machine Before the Pretty Picture

Four Days Before Print

Four days before print is a terrible time to discover that your story is not a line. It is a knot. One thread runs through a Swedish aktiebolag. Another runs through Australian company material. A third runs through Bergsstaten and SGU. A fourth runs through Bolagsverket, verklig huvudman, annual reports, late filings, sanctions, names, dates, and documents that do not politely stand in the order you need them.

That is the actual problem with the Energy X92 story right now. Not that there is too little material. The opposite. There is enough material that the article can drown in its own evidence. You have a research stack that already pulls permits, mirrors documents, enriches companies, extracts people, ingests media, checks fact packs, and stores history in parquet snapshots. The machine is not empty. The machine is full. The next job is to stop treating the material as folders and start treating it as time, relationships, and evidence.

The first visual layer should not be made for readers. That sounds backwards, but it is the important bit. The first visual layer should be made for you. It should help you see what happened, when it happened, who was connected to it, and which pieces are hard fact versus still-open holes. Only after that should it become a graphic for Årebladet. Internal truth first. Newspaper clarity second.

The Shape of the Problem

Right now, the gruvor project is already a working research spine. It has DuckDB and GeoParquet underneath, date-partitioned snapshots, current views, permit data, company enrichment, people extraction, media ingestion, and a local FastAPI interface with routes for maps, timelines, companies, people, media, search, and more. That matters because you do not need to invent a separate “graphics project” that will immediately drift away from the evidence.

The danger is building a beautiful diagram by hand. A hand-made diagram feels fast, and for one article it might even work. But the moment a new Bolagsverket reply arrives, or the verklig huvudman check changes, or the ASIC material adds one more date, the graphic becomes a second article draft that has to be manually fact-checked. That is how errors sneak in wearing a nice jacket.

The better move is a derived visualization layer. Not the source of truth. Not another database with its own opinions. A small layer that reads from the existing tables, dossiers, documents, gap files, and media records, then produces clean visual objects: entities, events, edges, and sources. Those four words are enough to make a serious first version.

An entity is a thing: a company, a person, a permit, a document, an authority, a place. An event is something that happened on a date, or around a date, or at a date we only know approximately. An edge is a relationship: someone controls a company, a company applied for a permit, a person signed a document, an entity changed name, two companies are connected through filings or public statements. A source is the proof. Without the source, the event is not an event. It is a note wearing shoes.

That gives you the key principle: every visual object needs evidence attached to it. The graph should not say “connected” as if it is a magic fog. It should say what kind of connection, when it was seen, where the evidence came from, and how confident you are.

Timeline Before Graph

The relationship graph is the sexy one. It will look like journalism. Names, companies, lines, labels, possible networks. It will feel like the investigation has a brain. But the first tool to build is probably the timeline, because timelines catch lies that graphs hide.

A graph can make two things look close even if they happened years apart. A graph can make a stale connection look current. A graph can make a weak source look as strong as a primary document. Time is the antidote. Before asking “who is connected to whom,” the better first question is “what was true on this date?”
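That question is small enough to sketch in Python. This is only an illustration, assuming each relationship record carries a first-seen and last-seen date as ISO strings; the field names and the sample records are invented here and are not the gruvor schema.

```python
from datetime import date

# Hypothetical relationship records. Field names are assumptions for this sketch.
EDGES = [
    {"source": "person:a", "target": "company:x", "type": "board_member_of",
     "first_seen": "2022-03-01", "last_seen": "2023-06-30"},
    {"source": "company:x", "target": "permit:p1", "type": "holds_permit",
     "first_seen": "2023-01-15", "last_seen": None},  # None: no later record yet
]


def edges_as_of(edges, day):
    """Return the edges whose evidence window covers the given day."""
    result = []
    for edge in edges:
        first = date.fromisoformat(edge["first_seen"])
        # An open-ended edge is treated as still standing (an assumption).
        last = date.fromisoformat(edge["last_seen"]) if edge["last_seen"] else date.max
        if first <= day <= last:
            result.append(edge)
    return result
```

Calling edges_as_of(EDGES, date(2022, 6, 1)) returns only the board-member edge. The same graph asked about 2024 returns only the permit edge, which is exactly the difference a timeless graph would hide.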

For Energy X92, I would build a lane-based timeline. One lane for the Swedish company. One for Neu Horizon or the Australian side. One for Bergsstaten and SGU. One for Bolagsverket and verklig huvudman. One for financial and reporting events. One for the article process itself, including requests sent, comments expected, gaps still open, and checks that must happen before print.

This is not just prettier note-taking. It is an editorial safety tool. If you can see that one event depends on a document from May, another on a company filing from April, and a third on a manual BankID check scheduled for May 20, the article stops being a swamp. You know what can be written now, what must be left conditional, and what needs one more pull before publication.

The timeline also lets you handle uncertainty without sounding weak. Some dates are exact. Some are month-level. Some are inferred from documents. Some are still unknown. The visual model should carry that directly. A confirmed date gets one treatment. An approximate date gets another. An unresolved item lives in an undated or pending section instead of being silently dropped.
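One way to carry precision into the view, sketched in Python under stated assumptions: events have a date_precision of day, month, or unknown, plus a lane name. Those field names and the sample events are made up for the example. The output dictionaries use the id, content, group, start, end, type, and className keys that vis-timeline items accept.

```python
from datetime import date


def to_timeline_item(event):
    """Map one event to a vis-timeline style item, or None if it has no date."""
    precision = event["date_precision"]
    if precision == "unknown":
        return None
    item = {
        "id": event["id"],
        "content": event["title"],
        "group": event["lane"],
        "className": "precision-" + precision,  # lets CSS style each treatment
    }
    if precision == "day":
        item["type"] = "point"
        item["start"] = event["date"]
    elif precision == "month":
        # A month-level date becomes a range covering the whole month.
        year, month = (int(part) for part in event["date"].split("-")[:2])
        item["type"] = "range"
        item["start"] = date(year, month, 1).isoformat()
        if month == 12:
            item["end"] = date(year + 1, 1, 1).isoformat()
        else:
            item["end"] = date(year, month + 1, 1).isoformat()
    else:
        raise ValueError("unsupported date precision: " + precision)
    return item


def split_events(events):
    """Return (timeline_items, pending_events) so nothing is silently dropped."""
    items, pending = [], []
    for event in events:
        item = to_timeline_item(event)
        if item is None:
            pending.append(event)
        else:
            items.append(item)
    return items, pending


# Invented sample events, only to show the three treatments.
EVENTS = [
    {"id": "ev1", "title": "Filing received", "lane": "bolagsverket",
     "date": "2024-04-12", "date_precision": "day"},
    {"id": "ev2", "title": "Permit application", "lane": "bergsstaten",
     "date": "2023-12", "date_precision": "month"},
    {"id": "ev3", "title": "Ownership check", "lane": "article",
     "date": None, "date_precision": "unknown"},
]
items, pending = split_events(EVENTS)
```

The pending list is the point of the exercise. An event with no date still has a home, and the page can render it in its own section.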

For the first version, use vis-timeline. It is boring in the best possible way. It can show groups, zooming, ranges, custom event boxes, and click interactions. It works in plain JavaScript. It does not demand a React app, a build step, or three days of “almost there.” It can sit inside the FastAPI and Jinja setup you already have. That matters during article week.

The Graph Without Conspiracy Fog

Once the timeline exists, the graph becomes much safer. A relationship graph should not try to show everything at once. That is how you get conspiracy spaghetti: a massive hairball where the visual feeling is “something is going on,” but the evidence is harder to inspect than before.

The graph should be query-driven. Show me Energy X92 and its direct relationships. Show me people mentioned in decisions and filings. Show me companies connected through ownership, control, roles, documents, or public permit activity. Show me one neighborhood at a time, with source-backed edges.
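A one-neighborhood query does not need a graph library. The sketch below assumes edges are plain dictionaries with source, target, type, and source_ids fields; those names and the sample data are illustrative only.

```python
def one_hop(edges, center, edge_types=None):
    """Return (node_ids, edges) for the direct, source-backed neighborhood."""
    kept = [
        edge for edge in edges
        if center in (edge["source"], edge["target"])
        and (edge_types is None or edge["type"] in edge_types)
        and edge.get("source_ids")  # edges without evidence are not drawn
    ]
    nodes = {center}
    for edge in kept:
        nodes.update((edge["source"], edge["target"]))
    return nodes, kept


# Invented sample edges.
EDGES = [
    {"source": "company:x", "target": "permit:p1", "type": "applied_for",
     "source_ids": ["src1"]},
    {"source": "person:a", "target": "company:x", "type": "signed",
     "source_ids": ["src2"]},
    {"source": "person:b", "target": "company:y", "type": "signed",
     "source_ids": ["src3"]},
    {"source": "person:c", "target": "company:x", "type": "same_address_as",
     "source_ids": []},
]
nodes, kept = one_hop(EDGES, "company:x")
```

Here person:c never reaches the screen, because the only edge to it has no source attached. Passing edge_types narrows the view further, for example to permit activity only.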

Every edge needs a verb. Not just a line. A line with no verb is suspicion as graphic design. A good edge says “is verklig huvudman of,” “applied for,” “holds permit,” “signed,” “witnessed,” “mentioned in,” “renamed to,” “same address as,” “same email as,” or “related party according to this document.” If you cannot name the edge type, the edge probably should not be drawn.
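The verb rule can be enforced at the point where edges are created. A hedged Python sketch: the identifier spellings below are one possible encoding of the verbs listed above, and make_edge is a hypothetical helper, not an existing gruvor function.

```python
from enum import Enum


class EdgeType(str, Enum):
    """Closed list of allowed verbs. The spellings are assumptions."""
    IS_VERKLIG_HUVUDMAN_OF = "is_verklig_huvudman_of"
    APPLIED_FOR = "applied_for"
    HOLDS_PERMIT = "holds_permit"
    SIGNED = "signed"
    WITNESSED = "witnessed"
    MENTIONED_IN = "mentioned_in"
    RENAMED_TO = "renamed_to"
    SAME_ADDRESS_AS = "same_address_as"
    SAME_EMAIL_AS = "same_email_as"
    RELATED_PARTY = "related_party"


def make_edge(source, target, edge_type, source_ids):
    """Build an edge record, refusing unnamed verbs and unsourced claims."""
    verb = EdgeType(edge_type)  # raises ValueError for an unknown verb
    if not source_ids:
        raise ValueError("edge has no source: %s -> %s" % (source, target))
    return {
        "source": source,
        "target": target,
        "type": verb.value,
        "label": verb.value.replace("_", " "),
        "source_ids": list(source_ids),
    }
```

A call such as make_edge("person:a", "company:x", "connected_to", ["src1"]) fails immediately, which is the desired behavior. Adding a new verb becomes a deliberate edit to the list, not a side effect of drawing a line.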

This is where Cytoscape.js is the best first choice. It is open source, browser-based, good at graph visualization, and strong enough for interaction and filtering. It can live in your current stack without becoming a new religion. You can click a node, highlight neighbors, filter by edge type, and show source snippets in a side panel. It is serious enough for internal use and not so heavy that the tool becomes the story.

Sigma.js with Graphology is also worth knowing about. That pairing is better when graphs get larger and rendering performance matters. Sigma handles the WebGL drawing. Graphology handles the graph model and algorithms. But I would not start there for the article. Start with Cytoscape because the first need is not ten thousand nodes. The first need is evidence you can inspect without losing your place.

Graphviz is the other useful tool, but for a different job. It is bad as an exploratory workbench and excellent for deterministic print diagrams. Once you know the five or twelve nodes you want in the newspaper, Graphviz can make a clean, reproducible diagram. But do not use it to think. Use it to publish a selected thought.
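Publishing a selected thought can be as plain as writing DOT text from a short list. A minimal Python sketch, assuming the chosen nodes arrive as an id-to-label mapping and the edges as (source, target, verb) tuples; the function name and the sample ids are invented.

```python
def to_dot(nodes, edges):
    """Render a small labeled digraph as DOT text with stable ordering."""

    def quote(text):
        # DOT double-quoted strings need backslashes and quotes escaped.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    lines = ["digraph case {", "  rankdir=LR;", "  node [shape=box];"]
    # Sorting makes the output identical no matter how the input was ordered.
    for node_id, label in sorted(nodes.items()):
        lines.append("  %s [label=%s];" % (quote(node_id), quote(label)))
    for source, target, verb in sorted(edges):
        lines.append("  %s -> %s [label=%s];" % (quote(source), quote(target), quote(verb)))
    lines.append("}")
    return "\n".join(lines)


# Invented sample selection.
NODES = {"company:x": "Company X", "permit:p1": "Permit P1"}
EDGES = [("company:x", "permit:p1", "applied for")]
dot_text = to_dot(NODES, EDGES)
```

Save the text as case.dot and run dot -Tsvg case.dot -o case.svg to get the SVG. Because the ordering is fixed, the same selection produces the same file, so a diff of the DOT text shows exactly what changed between drafts.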

The Map as a Time Machine

The map is already part of the gruvor world, and that is a huge advantage. You have permit geometry, protected area overlays, reindeer area overlaps, and the discipline of keeping coordinate systems straight. The missing piece is not “make a map.” It is “make the map obey time.”

A normal permit map shows where. A better permit map shows where and when. The reader needs to understand that permits and applications are not static decoration. They arrive, change, expire, overlap, and belong to specific actors at specific moments. Internally, you need the same thing, because a map without time can make old and current claims look equally alive.

The workbench version should combine three panels. A timeline on one side or below. A MapLibre map on the main canvas. A detail panel for the selected event, permit, company, or source. Click an event and the map highlights the relevant permit area. Click a permit and the timeline filters to events about that permit and its owner. Click a company and the graph opens around that company. Move the time window and the map only shows what was active or relevant in that range.
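The time window is an interval-overlap test followed by a layer filter. The Python sketch below assumes permits carry valid_from and valid_to dates and that map features expose a permit_id property; both are assumptions. The returned list follows the MapLibre expression form for an "in" filter, which the page could pass to the map's setFilter call.

```python
from datetime import date

# Invented sample permits. None means no known end date.
PERMITS = [
    {"id": "permit-1", "valid_from": "2021-05-01", "valid_to": "2024-05-01"},
    {"id": "permit-2", "valid_from": "2024-09-01", "valid_to": None},
]


def active_permit_ids(permits, window_start, window_end):
    """Return ids of permits whose validity overlaps the selected window."""
    active = []
    for permit in permits:
        start = date.fromisoformat(permit["valid_from"])
        end = date.fromisoformat(permit["valid_to"]) if permit["valid_to"] else date.max
        # Two intervals overlap when each one starts before the other ends.
        if start <= window_end and end >= window_start:
            active.append(permit["id"])
    return active


def maplibre_filter(permit_ids, prop="permit_id"):
    """Build a MapLibre filter expression matching only the given ids."""
    return ["in", ["get", prop], ["literal", list(permit_ids)]]


ids = active_permit_ids(PERMITS, date(2024, 1, 1), date(2024, 6, 30))
layer_filter = maplibre_filter(ids)
```

Served as JSON, the filter keeps the time logic on the Python side where it can be tested, and leaves the browser with one job: apply it.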

This is the point where the system stops being a pile of features and becomes a thinking tool. Timeline, graph, and map should not be separate toys. They should be three views into the same evidence.

For the eventual article graphic, the map should be simpler. Maybe one map showing the local permit areas. Maybe a before-and-after if timing matters. Maybe a map plus a small timeline strip. But the internal version should allow mess, filtering, and inspection. The public version should cut hard.

The Small Data Layer

The good news is that the data model can be small. This is not a case for a grand ontology named after a minor Greek god. You need four derived files or tables per case: entities, events, edges, and sources.

An entity record needs an identifier, a type, a label, a canonical name, and optional jurisdiction or organization number. A person, a company, a permit, and a document can all be entities. The point is not philosophical purity. The point is that the timeline and graph can point to the same objects.

An event record needs a date, maybe an end date, date precision, event type, title, summary, primary entity, related entities, source identifiers, confidence, article relevance, and maybe a gap identifier. If the event comes from a verified source, say so. If it is a manual note, say so. If it is unresolved, do not smuggle it in as fact.

An edge record needs a source entity, a target entity, an edge type, a label, source identifiers, confidence, and optional first-seen and last-seen dates. That last bit is important because relationships are not always permanent. Someone can be connected in one filing and not in another. A company can rename. An owner record can go stale. Time belongs on the edge too.

A source record can point back into media items, manual drop-ins, dossier paths, article files, PDFs, or known public documents. The source record is what makes clicking useful. A graph without source clicks is decoration. A graph with source clicks is a research instrument.
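Written down as Python dataclasses, the four records stay small. This is a sketch of the fields described above, not a schema decision: the attribute names, the default values, and the kind and ref fields on Source are my assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    id: str
    type: str                      # company, person, permit, document, ...
    label: str
    canonical_name: str
    jurisdiction: Optional[str] = None
    org_number: Optional[str] = None


@dataclass
class Event:
    id: str
    event_type: str
    title: str
    summary: str
    primary_entity: str
    date: Optional[str] = None     # ISO string, or None while unresolved
    end_date: Optional[str] = None
    date_precision: str = "unknown"
    related_entities: List[str] = field(default_factory=list)
    source_ids: List[str] = field(default_factory=list)
    confidence: str = "unverified"
    article_relevance: Optional[str] = None
    gap_id: Optional[str] = None


@dataclass
class Edge:
    source: str
    target: str
    edge_type: str
    label: str
    source_ids: List[str] = field(default_factory=list)
    confidence: str = "unverified"
    first_seen: Optional[str] = None
    last_seen: Optional[str] = None


@dataclass
class Source:
    id: str
    kind: str                      # media item, manual drop-in, dossier, PDF, ...
    ref: str                       # path or locator back into the research stack
    title: str = ""


# Invented example: an unresolved event defaults to unknown and unverified.
example = Event(id="ev1", event_type="filing", title="Annual report filed",
                summary="", primary_entity="company:x")
record = asdict(example)
```

The defaults do some of the editorial work. An event created without a date or a source starts life as unknown and unverified, so it has to be promoted to fact on purpose.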

The first command can be brutally simple: build visualization data for the Energy X92 case and write JSON files under the knowledge base. That keeps the workbench read-only and safe. Later, if it proves useful, it can become a proper generated visualization mart inside DuckDB.

The Build Slice

Here is the slice I would actually build first. Add a command called gruvor viz build, with a case slug. For now, the only case slug that matters is Energy X92. The command reads from the existing current views, company tables, permit tables, people tables, media items, decision participants, manual drop-ins, and article files where reasonable. Then it writes entities, events, edges, and sources as JSON.

Do not change the parquet schemas. Do not disturb the article’s existing fact-pack verifier. Do not turn this into an editing interface. The first version should be read-only, generated, and disposable. If the generated visualization is wrong, fix the generator or the source data, not the JSON by hand.
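The write side of that command can be a few lines. A sketch under assumptions: the function name, the energy-x92 slug, and the one-directory-per-case layout are all hypothetical, and the step that actually reads the current views is left out. The demo writes into a temporary directory so nothing real is touched.

```python
import json
import tempfile
from pathlib import Path

KINDS = ("entities", "events", "edges", "sources")


def write_viz_json(case_slug, tables, out_root):
    """Write one JSON file per kind under out_root/case_slug and return the paths."""
    missing = [kind for kind in KINDS if kind not in tables]
    if missing:
        raise KeyError("missing tables: " + ", ".join(missing))
    case_dir = Path(out_root) / case_slug
    case_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for kind in KINDS:
        path = case_dir / (kind + ".json")
        # sort_keys keeps the output stable, so regenerated files diff cleanly.
        text = json.dumps(tables[kind], ensure_ascii=False, indent=2, sort_keys=True)
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


# Demo with invented data, written to a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    written = write_viz_json(
        "energy-x92",
        {"entities": [], "events": [{"id": "ev1", "title": "Årsredovisning"}],
         "edges": [], "sources": []},
        tmp,
    )
    names = [path.name for path in written]
    reloaded = json.loads(written[1].read_text(encoding="utf-8"))
```

Because the files are regenerated in full on every run, deleting the output directory is always safe. That is what disposable means in practice.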

Then add two page routes: one shows the timeline, one shows the graph. Add four API routes that return the generated JSON, one each for entities, events, edges, and sources. Use the existing FastAPI, Jinja, and plain JavaScript pattern. No build step. No new frontend system. No “we just need Vite.” That way lies ten thousand tiny cuts.
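Leaving the FastAPI wiring aside, the JSON routes could share one small read-only loader. The sketch uses only the standard library; the function name, the slug pattern, and the directory layout are assumptions for illustration.

```python
import json
import re
import tempfile
from pathlib import Path

KINDS = {"entities", "events", "edges", "sources"}
SLUG = re.compile(r"^[a-z0-9][a-z0-9-]*$")  # assumed slug shape


def load_viz(root, case_slug, kind):
    """Load one generated JSON file, refusing anything outside the known set."""
    if kind not in KINDS:
        raise KeyError("unknown kind: " + kind)
    if not SLUG.match(case_slug):
        # Rejects slashes and dots, so a slug cannot walk out of the root.
        raise ValueError("bad case slug: " + case_slug)
    path = Path(root) / case_slug / (kind + ".json")
    return json.loads(path.read_text(encoding="utf-8"))


# Demo against a throwaway directory with one invented file.
with tempfile.TemporaryDirectory() as tmp:
    case_dir = Path(tmp) / "energy-x92"
    case_dir.mkdir()
    (case_dir / "events.json").write_text('[{"id": "ev1"}]', encoding="utf-8")
    loaded = load_viz(tmp, "energy-x92", "events")
```

Each route handler then becomes a one-line call that returns the loaded data. The loader never writes, which keeps the workbench read-only by construction.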

The timeline route uses vis-timeline and groups events into lanes. The graph route uses Cytoscape.js and styles nodes by entity type and edges by relation type. Both views should include a side panel. When you click an item, the side panel shows the title, summary, confidence, related entities, and source references.
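For the graph route, the server-side step is a reshaping of entities and edges into the element list Cytoscape.js reads, where each element has a group, a data object, and classes. A hedged Python sketch: the input field names and sample records are assumptions.

```python
def to_cytoscape(entities, edges):
    """Convert entity and edge records into a Cytoscape.js element list."""
    known = {entity["id"] for entity in entities}
    elements = [
        {
            "group": "nodes",
            "data": {"id": entity["id"], "label": entity["label"]},
            "classes": entity["type"],  # the stylesheet can target .company, .person
        }
        for entity in entities
    ]
    for index, edge in enumerate(edges):
        if edge["source"] not in known or edge["target"] not in known:
            raise ValueError("edge points at an unknown entity: %r" % (edge,))
        elements.append({
            "group": "edges",
            "data": {
                "id": "edge-%d" % index,
                "source": edge["source"],
                "target": edge["target"],
                "label": edge["label"],
                "source_ids": edge["source_ids"],  # read by the side panel on click
            },
            "classes": edge["type"],
        })
    return elements


# Invented sample records.
ENTITIES = [
    {"id": "company:x", "label": "Company X", "type": "company"},
    {"id": "permit:p1", "label": "Permit P1", "type": "permit"},
]
EDGES = [
    {"source": "company:x", "target": "permit:p1", "type": "applied_for",
     "label": "applied for", "source_ids": ["src1"]},
]
elements = to_cytoscape(ENTITIES, EDGES)
```

Putting the source identifiers inside each element's data means the click handler needs no second lookup to fill the side panel.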

Add tests that are almost annoying in their simplicity. No event without a source unless it is explicitly marked as a manual note. No edge without an edge type. No graph response without nodes and edges. No timeline response without groups. No unknown dates silently dropped. These are the little tests that prevent the tool from becoming pretty nonsense.
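Those five rules fit in one checker that a test can assert is empty. A Python sketch, assuming a manual note is flagged through confidence set to manual_note and that the timeline payload has groups, items, and pending keys; those conventions are mine, not the project's.

```python
def check_viz(events, edges, graph, timeline):
    """Return a list of rule violations. An empty list means the data passes."""
    problems = []
    for event in events:
        if not event.get("source_ids") and event.get("confidence") != "manual_note":
            problems.append("event %s has no source" % event["id"])
    for edge in edges:
        if not edge.get("type"):
            problems.append("edge %s has no edge type" % edge["id"])
    if not graph.get("nodes") or not graph.get("edges"):
        problems.append("graph response is missing nodes or edges")
    if not timeline.get("groups"):
        problems.append("timeline response has no groups")
    # Every event must land either on the timeline or in the pending section.
    placed = {item["id"] for item in timeline.get("items", [])}
    placed |= {item["id"] for item in timeline.get("pending", [])}
    for event in events:
        if event["id"] not in placed:
            problems.append("event %s was dropped from the timeline" % event["id"])
    return problems


# Invented data that satisfies all five rules.
GOOD_EVENTS = [
    {"id": "ev1", "source_ids": ["s1"], "confidence": "verified"},
    {"id": "ev2", "source_ids": [], "confidence": "manual_note"},
]
GOOD_EDGES = [{"id": "e1", "type": "applied_for"}]
GOOD_GRAPH = {"nodes": [{"id": "n1"}], "edges": [{"id": "e1"}]}
GOOD_TIMELINE = {"groups": [{"id": "lane1"}], "items": [{"id": "ev1"}],
                 "pending": [{"id": "ev2"}]}
result = check_viz(GOOD_EVENTS, GOOD_EDGES, GOOD_GRAPH, GOOD_TIMELINE)
```

A single test then reads assert check_viz(...) == [], and a failure prints the list of broken rules by name.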

The Article Graphics Later

Once the internal workbench is useful, article graphics become a selection problem, not a creation problem. That is exactly where you want to be. You should not be designing the final graphic while still wondering what the story is. You should use the workbench to find the story shape, then export a simplified version.

There are probably three graphics that could matter for the Årebladet article. The first is a compliance timeline: company creation, verklig huvudman status, annual report timing, sanction fee, Bolagsverket comment, and current status. That is likely the clearest visual for readers because it explains the hook without requiring them to understand mining law first.

The second is a map of the permit geography. This answers the local question: where is this in relation to people, places, water, reindeer areas, protected areas, or familiar geography? The map does not need to explain every corporate detail. It needs to make the local consequence visible.

The third is a small relationship diagram. Not a giant network. A small one. Energy X92, relevant people, Neu Horizon or related companies if source-backed, Bergsstaten or permits, and one or two document nodes if needed. The diagram should show only the relationships the article actually uses. Everything else can stay in the internal workbench.

This is the hard editorial discipline: the internal graph may have fifty interesting nodes. The article graphic may need eight. The reader does not owe us patience just because the investigation was complicated.

For exports, start with Playwright screenshots. They are good enough for internal review and maybe web use. For print, you may later want SVG. Graphviz can generate deterministic SVG for relationship diagrams. MapLibre can be captured at high resolution or replaced by a QGIS-rendered final map when cartographic quality matters. The point is not to solve final print typography now. The point is to make sure the data structure can support it when needed.

The Options in Plain English

There are three practical paths.

The safest path is to add a generated timeline first. It uses vis-timeline, reads from generated JSON, and gives you lanes, filtering, source clicks, and a way to see article gaps in time. It is the fastest route to clarity. It is also the least likely to break anything before deadline.

The second path is the relationship workbench. It uses Cytoscape.js, generated nodes and edges, and source-backed side panels. It is more visually satisfying and probably more fun, but it has a higher risk of becoming a rabbit hole before the article is done. Build it, but do not let it outrank the timeline this week.

The third path is the publication graphics path. That means selected SVG diagrams, print-focused maps, and a clean timeline graphic for the article. This is the end of the chain, not the start. If you begin here, you risk drawing the wrong story beautifully.

The trap is trying to build a perfect universal investigation visualizer. Do not. Build a small Energy X92 visualizer that happens to have the right shape for future cases. The system can generalize later. Right now, it needs to help one article survive contact with evidence.

The Strange Bonus

There is one underrated bonus in this approach. It creates a memory of the investigation as an object. Not just a folder full of documents and a final article, but a structured account of what was known, when it was known, where it came from, and what remained open.

That matters because this story may not end with one article. There may be follow-ups. Other companies. Other permits. Other municipalities. Cross-newspaper cooperation. A future gruvkartor.se layer. If the Energy X92 visualization layer is built as a derived case model, it can become the pattern for every later investigation.

It also matches the way your stack already works. Snapshots preserve history. Fact-pack verification catches claims. Dossiers collect company-specific material. The visualization layer should not fight that. It should expose it. It should make the existing discipline visible.

The most important design choice is honesty about uncertainty. Mark the confirmed facts. Mark the inferred links. Mark the stale records. Mark the open questions. Mark the things waiting for BankID, ASIC, Bolagsverket, or a human phone call. Investigative graphics get dangerous when they make uncertainty disappear.

And that is why this should begin as a workbench, not a poster. The first job is not to impress the reader. The first job is to stop the investigator from being fooled by his own pile of true things.

What I Would Ship First

If I were steering the coding work, I would ship the first version in this order. First, the visualization data generator for Energy X92. Second, the timeline route. Third, source-click side panels. Fourth, the Cytoscape graph. Fifth, the map time filter. Sixth, export.

The first demo should answer simple questions. What happened in order? Which events are source-backed? Which events are pending? Which people and companies appear in the same documents? Which permit areas are tied to which company events? Which article gaps are still alive?
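The same-documents question is a small grouping exercise. A Python sketch, assuming mentions are available as (entity id, document id) pairs; the sample pairs are invented.

```python
from collections import defaultdict
from itertools import combinations


def co_mentions(mentions):
    """Map each pair of entities to the documents that mention both."""
    by_document = defaultdict(set)
    for entity_id, document_id in mentions:
        by_document[document_id].add(entity_id)
    pairs = defaultdict(list)
    for document_id in sorted(by_document):
        # Sorting gives each pair one canonical order, so (a, b) never also appears as (b, a).
        for first, second in combinations(sorted(by_document[document_id]), 2):
            pairs[(first, second)].append(document_id)
    return dict(pairs)


# Invented sample mentions.
MENTIONS = [
    ("person:a", "doc1"), ("company:x", "doc1"),
    ("person:a", "doc2"), ("company:x", "doc2"), ("company:y", "doc2"),
]
shared = co_mentions(MENTIONS)
```

Appearing in the same document is weak evidence on its own, so the result is better treated as a list of leads to check than as edges to draw.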

If the tool answers those questions, it is already useful. If it also looks cool, wonderful. But useful comes first. Pretty comes later. This is not because pretty is bad. Pretty is powerful. Pretty is dangerous when it arrives before the structure is true.

The right first graphic for your own brain is the lane timeline. The right first graph is the one-hop Energy X92 relationship graph with strict edge labels. The right first map is a permit map controlled by the selected timeline event. The right first export is a screenshot, not a design system.

Then, when the article is closer to locked, you choose the reader graphics from the evidence. A compliance timeline for the hook. A local map for consequence. A small relationship diagram for context. Three graphics, each doing one job.

That is the whole thing. Turn the project from a document machine into a time-and-relationship machine. Do it without changing the source of truth. Keep every visual object tied to evidence. Build for your own clarity first. Then make the reader version smaller, cleaner, and much less impressed with itself.