Scaleway, Part Three: Where The Catalog Gets Strange

What We're In For

We've covered the floor and the plumbing. Storage and compute. Networking and containers and databases. The boring layers, the layers that make a cloud feel like a cloud.

This episode covers the strange end. The graphics processing unit instances that drive Mistral's training runs. The European-hosted generative artificial intelligence application programming interfaces. The custom-built supercomputers that Kyutai used to train Moshi. The security primitives that you actually want. The Internet of Things hub. Transactional email. Cost monitoring and environmental footprint tracking. And then, at the end, the return to quantum, where things have changed substantially since the brief flirtation Pär had with the platform some time ago.

The angle on this episode is different from the first two. The first two were what could you reasonably adopt. This one is what's actually possible. The far-fetched gets real airtime.

Generative APIs, The Boring Part First

Scaleway runs a generative artificial intelligence application programming interface service. The boring description is, you call an endpoint with a prompt, you get a response. The endpoint is hosted in Paris. The pricing is per token. The free tier covers the first million tokens, which is genuinely enough to evaluate a model on your own data.

The interesting description is that the application programming interface is OpenAI-compatible, which means anything you've already written against OpenAI will work against Scaleway by swapping the base U R L. The OpenAI library, LangChain, every tool that speaks the OpenAI dialect. You change one configuration line. The model running on the other side is no longer OpenAI's. The data path no longer touches American infrastructure.

The catalog of models is larger than you'd expect. As of recent reviews of their documentation, the serverless catalog includes Qwen three point five at three hundred and ninety-seven billion parameters, released by Alibaba as a frontier reasoning model in February of this year. Gemma three at twenty-seven billion. Pixtral for vision. Holo two at thirty billion for graphical user interface understanding. Nemotron at seventy billion from NVIDIA. DeepSeek R one distilled into Llama at both seventy and eight billion. Llama three point one at eight billion and Llama three point three at seventy billion. The OpenAI-released open-weight g p t oss model at one hundred and twenty billion. Mistral Small at twenty-four billion.

These are real production-grade open-weight models. The pricing is competitive with the major providers. The data is processed in Paris. Scaleway publicly commits to not collecting, reading, or analyzing the contents of prompts or outputs. The billing recently changed from per-million-token slices to per-thousand-token slices, which is small but matters for small experiments. You can actually run a hundred-token query and pay for what you used rather than rounding up to a million.

For Pärception, which sells artificial intelligence consulting to Swedish clients and increasingly answers questions about which model to use for what, this matters. The honest answer for many use cases is no longer ChatGPT. The honest answer is, an open-weight model running on European infrastructure, called through an OpenAI-compatible application programming interface. Scaleway is exactly that.

The Generative APIs Dedicated Deployment option is the heavier version of the same thing. Used to be called Managed Inference, renamed earlier this year to align with the rest of the generative artificial intelligence product line. Instead of shared serverless capacity, you get dedicated graphics processing unit instances running your chosen model. Predictable throughput. Lower latency. Higher minimum cost. The right shape for workloads with consistent traffic. Worth knowing it exists, though for sporadic Pärception consulting workloads, serverless is the better fit.

There's also a Batches application programming interface. You submit a batch of requests in one go. They're processed asynchronously over up to twenty-four hours. The discount is fifty percent. For workloads where speed is not required, this is the cheapest way to run large model inference on Scaleway. Generating descriptions for the full LifeLab photo archive would be a textbook batch job.

The Graphics Processing Unit Catalog

Now the metal underneath the artificial intelligence services. Scaleway's graphics processing unit instance catalog.

The L four graphics processing unit instance is the entry tier. NVIDIA's L four is a card aimed at inference workloads. Modest performance, modest power consumption, modest hourly price. The right answer for serving a small model or running batch inference where speed isn't critical. The kind of instance you spin up for a weekend of experiments and tear down on Monday.

The L forty S graphics processing unit instance is the step up. Faster than the L four, considerably cheaper than an H one hundred on the P C I e form factor. Scaleway markets this explicitly as the universal artificial intelligence instance for the next generation of applications. For a Pärception client who wants to run a custom-fine-tuned model on dedicated infrastructure without spending H one hundred money, this is the answer.

The H one hundred graphics processing unit instance is the heavy lifter. NVIDIA's H one hundred is the card that trained most of the large language models you've heard of. Scaleway offers it in two form factors. The P C I e form factor is cheaper but slightly slower. The S X M form factor is the high-bandwidth interconnected version used for distributed training. For serious model training work, S X M is the choice.

The B three hundred S X M graphics processing unit instance is the new one. NVIDIA's B three hundred is Blackwell architecture, the generation that follows Hopper. Scaleway describes this as pushing the boundaries of performance. The translation is, this is the most expensive, fastest hardware in the catalog, and it's new. If Pär ever decided to do a genuinely large training run, the kind of run where you need eight or sixteen graphics processing units talking to each other through high-bandwidth interconnects for several days, this is the hardware.

There's also a Render graphics processing unit instance line using Tesla P one hundred cards. These are older. The pricing reflects it. For workloads that don't need current-generation features, the older cards are dramatically cheaper. For an experimental rendering pipeline or a hobby project that just needs some graphics processing unit power, this is the underused tier.

The bigger picture across this catalog is that Scaleway has built out a complete graphics processing unit story for one main reason. The European generative artificial intelligence companies needed somewhere European to train. Mistral has used Scaleway clusters. Kyutai has used Scaleway clusters. The story of European artificial intelligence sovereignty in the last two years runs through this hardware in Paris. Knowing that is part of why the offering is taken seriously.

The Custom-Built Clusters

This is where the catalog goes from interesting hourly billing to actually call sales.

Scaleway operates custom-built clusters. These are supercomputers, in the engineering sense of the word. Hundreds to thousands of graphics processing units, connected by high-bandwidth interconnects, sitting in a single building.

The cluster that trained Mistral's foundation models in two thousand and twenty-three and onwards is a Scaleway cluster. The cluster that trained Kyutai's Moshi voice model is a Scaleway cluster. These are not theoretical capabilities. The most prominent European generative artificial intelligence companies of recent years have used Scaleway infrastructure for the training runs that produced their flagship models.

For Pär to actually reach this tier would be, in the strict sense, far-fetched. You don't accidentally train a foundation model. The financial commitments are large. The engineering work is large. The point of mentioning the cluster service is different. It's that the capability exists in the same European cloud account where Pärception's invoices already arrive. The same vendor relationship that handles a fifty euro per month virtual machine can, in principle, handle the rental of a thousand-graphics-processing-unit cluster for a week. The leap is not technical. The leap is whether you have an idea worth a six-figure training run.

The On Demand Cluster is the more accessible version of the same idea. Rather than committing to a custom-built cluster for years, you rent a cluster of thirty-two to one thousand or more graphics processing units for as long as you need it. A week. A month. The kind of duration that fits an actual research project or a fine-tuning run for a serious application.

The far-fetched version of this for Pär would be the systematic LoRA training program at scale. The current FLUX LoRA work runs character training jobs on RunPod, one or two at a time. The On Demand Cluster version would be, rent thirty-two graphics processing units for a weekend, run two hundred LoRA training experiments in parallel, build a search across hyperparameters, data subsets, base model variants, all in a single weekend that would otherwise take months on a single graphics processing unit. The output is a catalog of trained LoRAs covering a parameter space that's currently being explored one cell at a time.

Whether that's worth the cost is a real question. Whether it's possible is not.

Security: I A M, Secret Manager, Key Manager

Identity and Access Management, abbreviated as I A M, is the part of every cloud account that controls who can do what. Right now, on a one-person Scaleway account, I A M is probably, Pär can do everything. That's appropriate.

The moment a second person enters the picture, even temporarily, I A M becomes relevant. If Pär hired a contractor for two weeks to help with a specific deployment, that contractor should have access to exactly the resources they need and nothing else. I A M is what makes that fine-grained access possible. You define a policy. You attach it to the contractor's identity. The policy says, this person can read and write the FLUX training bucket but cannot touch the production virtual machine. When the contract ends, you revoke the policy. Clean.

The same applies to automation. If a serverless function needs to write to an object storage bucket, you don't give that function root access to the entire account. You give it a policy that says, this function can write to this bucket and read from this database, and nothing else. When something goes wrong with the function, the explosion radius is small. The principle is called least privilege, and it's the thing every security engineer wishes their less-careful colleagues understood better.

The Secret Manager is where you put credentials. Application programming interface keys. Database passwords. Third-party service tokens. The naïve version of credential management is to put them in environment variables, which means they live on every machine in plain text. The Secret Manager version is to store them centrally, encrypted, and have your application fetch them when it starts. Rotating a credential becomes a single update in one place rather than a search across every machine.

The Key Manager is for cryptographic keys. The keys that encrypt your data. The keys that sign your tokens. The keys that authenticate hardware. Most applications don't think about these as a separate category from secrets, but they should. A leaked password can be changed. A leaked cryptographic key requires rotation of every piece of data the key was used to encrypt or sign. That's a much bigger problem.

For Pärception today, the Secret Manager is the most immediately useful of the three. The next time you find yourself updating a database password and then chasing down every file where it's hard-coded, the Secret Manager is the answer to that pain. The pain is real. The fix is cheap. The migration is a one-evening project.

Cockpit, Audit Trail, And The Slow Build To Observability

Cockpit is Scaleway's name for their observability product. The thing that collects logs, metrics, and traces from your infrastructure and lets you query them. The thing you reach for when something is wrong and you need to find out why.

Today, on a single virtual machine, observability is probably "secure shell into the box and tail the log files." That works until it doesn't. It stops working the day there's a problem you can't reproduce live and you need to look at history. It stops working the day there are multiple machines and the relevant log lines are scattered across them. It stops working the day the machine is so broken you can't secure-shell in to read the logs.

Cockpit collects the logs centrally. When something goes wrong, you query the central store. The history is there even if the machine is gone. The lines from different machines are joined into one view. The application metrics, the system metrics, the trace data showing how a single request flowed through your infrastructure, all in one place.

Audit Trail is the security-focused version of the same idea. Audit Trail records who did what across your Scaleway account. Who created a virtual machine. Who deleted a bucket. Who rotated a credential. For a one-person account, this is mostly an attic record. For an account that any other human touches, this is the answer to what happened, when, and who.

Both products are inexpensive. Both are the kind of thing you don't adopt until you wish you had. The shape that fits Pärception is probably this. Enable Cockpit before you adopt anything that runs across multiple machines. Enable Audit Trail before you add a second human to the account. Neither is urgent. Both are cheaper to enable in advance than to retrofit during an incident.

Cost Manager And The Environmental Footprint Calculator

Two products at the management corner that often get ignored.

Cost Manager is the tool for tracking what you're spending. Self-explanatory. Worth enabling because the alternative is finding out at month-end. For an attention-deficit-aware operator who would rather not have a billing surprise in late August during the summer edition rush, watching costs day by day rather than reading the monthly invoice is the better shape. You set thresholds. You get notifications. The financial side stops being a surprise category.

The Environmental Footprint Calculator is more interesting. Scaleway reports the carbon intensity of the actual infrastructure your workloads are running on. The Paris data center, in particular the building called D C five, runs on a power usage effectiveness of one point one six, compared to an industry average around one point five five. That means thirty to fifty percent less energy used per unit of compute compared to a typical data center.

For Pärception, which writes about technology choices in a small Swedish village and presumably cares about the energy footprint of the artificial intelligence work, this matters. The same generative artificial intelligence query, run on Scaleway's Paris infrastructure, produces meaningfully less carbon than the same query run on a hyperscaler with a hotter data center. The Environmental Footprint Calculator gives you the actual numbers per workload. Useful for clients who want to know. Useful for your own internal sense of what's happening. Useful as the kind of footnote that lands well in an Årebladet article about local technology choices.

Transactional Emails And The Boring Things That Matter

Transactional Email is the service for sending mail from applications. Not marketing mail. Mail that says your order has shipped, or click here to verify your email address, or the script you scheduled has completed.

For Pärception, the obvious use is across all the small projects that occasionally need to send mail. PärCel order confirmations. Article submission acknowledgments. Pärkit alert emails. Right now this is probably handled with a mix of small libraries and the application server's outgoing mail capability, which works until it doesn't, which is usually the day a major mail provider decides your virtual machine's address is suspicious and starts dropping your mail into spam folders.

Transactional Email runs through Scaleway's hardened mail infrastructure. The deliverability is significantly better than what you get from sending mail directly from a small virtual machine. The cost is small. The annoyance reduction is real. The morning you don't have to debug why a customer didn't get their order confirmation is a morning you get to spend on actual work.

The Internet of Things Hub And The Pinkserver Connection

The Internet of Things Hub is the Scaleway service for connecting devices to the cloud. The shape is the standard one. You have devices in the field. The devices send data. The hub receives the data. The hub routes the data to whichever downstream service should process it. The devices authenticate with certificates. The communication uses standard protocols like M Q T T, which stands for Message Queuing Telemetry Transport.

For Pär, this is the interesting one. Pinkserver, the Raspberry Pi sitting in Kall, currently talks to the Scaleway virtual machine over the public internet using whatever specific protocols each piece of automation needs. The Internet of Things Hub would be the cleaner architecture. Pinkserver authenticates with the Hub. Every piece of telemetry from pinkserver flows through the Hub. The Hub routes it onward to Pärkit's storage, to alerting, to whatever downstream consumer wants the data.

The same shape would apply to the BMW i three's telemetry, currently coming into Pärkit through whatever mechanism Pär has built. The Internet of Things Hub is the cleaner architecture for that pipeline too. The car sends data to the Hub. The Hub routes it into Pärkit's storage. The car never talks to your application server directly. If the application server is down for maintenance, the data still arrives at the Hub, queued, waiting for the storage layer to come back.

Whether to migrate any of this is the question. The answer for today is probably no. The current arrangement works. The point of the tour is that the cleaner architecture exists in the same console as the rest of your infrastructure. The day pinkserver acquires three more siblings, the Internet of Things Hub starts looking like the right answer rather than something to think about later.

Quantum, Revisited

Now the return.

The quantum experiment Pär did some time ago was probably during a window when Scaleway's Quantum as a Service was a much earlier product. A few qubits of access. Some emulation. Interesting toy. Not particularly applicable to anything Pär was working on.

The current state is considerably different. Scaleway has built out their quantum platform substantially in the last year. As of this year they aggregate quantum processors from multiple European manufacturers. Pasqal contributes neutral-atom hardware, integrated as of December last year. Quandela contributes photonic hardware based on Linear Optical Quantum Computing. Alpine Quantum Technologies, headquartered in Innsbruck, contributes a trapped-ion system called I B E X Q one with twelve qubits, available Tuesdays and Wednesdays from ten in the morning to five in the afternoon Central European Time. And I Q M contributes superconducting transmon systems including Garnet at twenty qubits, Sirius at sixteen qubits, and Emerald at fifty-four qubits.

This is not a toy anymore. This is a meaningful catalog of European quantum hardware accessible through one cloud platform, with one billing relationship, through standard software development kits. Qiskit. Cirq. Pulser. Perceval. PennyLane. And as of NVIDIA's March announcement this year, NVIDIA's CUDA dash Q runtime is now fully compatible. You write a CUDA dash Q kernel. You choose Scaleway as your execution backend. The same code runs on graphics processing unit emulation up to thirty-eight qubits across eight Blackwell graphics processing units, or on actual quantum hardware from any of the partner manufacturers. The application programming interface is the same. The hardware varies.

The pricing model is pay per shot for real hardware, pay per hour for emulation, with optional dedicated booking. The barrier to running an actual quantum program on actual quantum hardware is now the same as the barrier to running a Python script on a virtual machine. You write the program. You point it at the platform. You get the result.

Now the far-fetched application for Pärception.

The honest answer is that there is no current business reason for Pär to run quantum workloads. The applications that need quantum advantage today are in optimization, simulation of quantum systems for chemistry research, and certain kinds of machine learning preprocessing where the algorithm has a known quantum speedup. None of these is on any current Pärception roadmap.

But the unhelpful answer ignores what Pärception actually is. Pärception sells artificial intelligence consulting to small Swedish clients. The competitive question for a small consulting practice is, what can you do that the big consultancies cannot. One credible answer is, I can call a real quantum processor from Paris in five minutes. Not because the client's problem needs it. Because the client's curiosity wants it. A demonstration that runs a small variational algorithm on actual quantum hardware, returns the result in a couple of minutes, and explains what just happened, is a genuinely memorable thing for a client meeting.

The cost is small. Quantum as a Service is billed per shot for real hardware and per hour for emulation, with no commitment. The development time is small. The platform speaks every major quantum software development kit, including PennyLane which has direct support for quantum machine learning. A weekend of work produces a demonstration. The demonstration is a permanent asset. The client conversation in which Pär opens a terminal, runs a quantum program on a real trapped-ion processor in Innsbruck, and shows the result graph on screen, is a conversation the client remembers a year later.

There's also the longer-running curiosity angle. The neutral-atom hardware from Pasqal is genuinely new technology. The photonic hardware from Quandela operates at room temperature, which is unusual for quantum. The trapped-ion hardware from Alpine has the longest coherence times of any quantum modality currently on offer. Each one is a different physics. The exQalibur emulator runs photonic simulations on Blackwell graphics processing units. Each platform makes a different shape of program easy. Trying each of them is a kind of literacy in a technology that's about to become more relevant.

This is the kind of thing PärPod is for. A series of episodes, one per hardware platform, where Pär runs a small program, watches what happens, and reports back. Episode one, the photonic emulator and the strange feeling of computing with light. Episode two, the neutral atoms and the way Pasqal arranges them in an actual physical lattice that maps to your problem. Episode three, the trapped ions and what it means that the qubits are individual atoms held in place by lasers. Episode four, the superconducting transmons that everyone has heard of, where the cold parts of the system are colder than space.

Whether any client of Pärception ever asks for quantum work is unknown. Whether the literacy gained from running these experiments pays off is unknown. The cost of the experiments is small enough that the calculation is simply, is this interesting enough to do. For Pär, who already pays a Scaleway invoice every month, the marginal cost of doing this is genuinely low. The marginal interest is high.

The Far-Fetched List

A short final tour of services we haven't fully placed, with the far-fetched applications spelled out plainly.

Apple Mac mini in the cloud, covered in episode one, has a far-fetched use for the PärPod render farm. Five Mac minis spun up overnight, each processing a queue of episodes, all coordinated through a serverless job dispatcher. Not needed today. Available tomorrow if needed. The economics work out the day you've batched a quarter's worth of audio production into a single overnight run.

RISC five bare metal, also covered earlier. The far-fetched use is documenting an unusual training run on an unusual architecture and writing about it. The audience for the story of training a small image generation model on RISC five hardware is small but extraordinarily loyal. The article ages well. The novelty doesn't depreciate. Tutorials on doing unusual things on unusual hardware tend to be among the most cited content on the open internet for years.

The On Demand Cluster has a far-fetched use that's bigger than the current LoRA work. Two hundred parallel character training runs across a hyperparameter grid in a single weekend. The output is a catalog. The catalog becomes a Pärception offering. The thing other people can't easily replicate because the upfront cost of doing the search was non-trivial and the resulting catalog is the asset.

The Custom-Built Cluster has the most far-fetched use of all. Pärception trains its own small language model. Not large. Just enough to claim the title. A model specifically trained on Swedish-language text, fine-tuned for the kind of work Pärception does. The model is then offered through Scaleway's Generative APIs Dedicated Deployment as a custom endpoint that clients can call. The marketing copy writes itself. The first Swedish-language language model trained in Kall. The actual training is done in Paris, but the conceptual ownership is in Kall, and that's the part that matters for the brand voice. The cost is significant. The reputational return, for a small consulting practice that wants to look serious about its own technology, could be worth it.

The Topics and Events service has a far-fetched use as the spine of a distributed Pärkit. Right now Pärkit is a single Postgres on a single machine. The far-fetched version is a network of small data producers all publishing to Scaleway Topics, with multiple consumers subscribing to different subsets. The car publishes to a car topic. Pinkserver publishes to a home topic. Apple Health publishes to a body topic. Pärkit subscribes to all of them, but so does a separate analytics service. So does a separate alerting service. So does a separate archival service that copies everything to Glacier. Each consumer is independent. Each can be replaced without affecting the others. The architecture stops being a single Postgres and becomes a small distributed system.

The Kafka clusters take this even further. Kafka is what you reach for when the topics are not just events but a permanent record. Every reading from every source, archived in order, replayable from any point in history. The Pärkit equivalent would be a single Kafka cluster that holds every piece of data Pär has ever collected from any source, with multiple consumers including the current Postgres, the future ClickHouse warehouse, and whatever else gets added in the next decade. The Kafka cluster becomes the source of truth. Everything else becomes a materialized view.

The Web Application Firewall has a far-fetched use for media protection. Årebladet, as a publication that occasionally writes about local controversies, could in theory attract scraping bots or worse. The Web Application Firewall is the layer that catches that traffic before it touches the application server. Today it's not needed. The day Årebladet writes a piece that goes regional, it might be.

These are the far-fetched ones. They sit there. They wait.

What This Series Was For

This was the map.

Three episodes ago, the only Scaleway services in active use were object storage, virtual machines, and a faint memory of quantum. Now the rest of the catalog has names.

Object storage has a cold-tier sibling called Glacier, perfect for the LifeLab archive of past selves. Virtual machines come in four flavors plus bare metal plus Mac mini plus the curiosity of RISC five. Networking has private clouds, peering, load balancers, edge caching, and a Web Application Firewall. Containers come in three serverless flavors plus Kubernetes Kapsule plus the multi-cloud Kubernetes Kosmos. Databases come managed in three flavors plus a serverless option plus the brand-new ClickHouse warehouse. Data has Spark and Kafka and NATS and a data orchestrator. Mail, Internet of Things, identity, secret management, and key management all have first-class services. Cost and environmental footprint are visible from one console. Observability through Cockpit. Audit trail for compliance.

And quantum has multiple physical platforms from multiple European manufacturers, accessible from one application programming interface, with CUDA dash Q integration that makes the same code run on graphics processing unit emulation or real quantum hardware depending on a single configuration flag.

The point was never to adopt everything. The point was to know what's there. Most cloud bills get out of hand because someone forgot a managed service existed and rebuilt it on a virtual machine. Most cloud bills stay reasonable because someone remembered.

The Scaleway map is now in your head. The summer edition still ships first. Napkincast follows. Everything else waits.

End of series.