Postgres in the Wild: What Production Actually Looks Like

A Server in Paris

This is episode eleven of The Vibecoder's Guide to Postgres.

There is a server in a Scaleway datacenter just outside Paris. It has eight gigabytes of RAM, twenty-two gigabytes of disk, and it runs eight PostgreSQL databases. A news aggregation system that serves three different websites. A music service. A productivity suite with four separate tools sharing one database. An event platform. A multi-tenant e-commerce storefront. A photographer's digital goods shop. A newsletter system. An institutional memory archive with vector embeddings. Eight databases, each with its own user, each with its own backup schedule, all managed by one person with AI assistance.

That is not a fantasy setup. That is a real server, running right now, serving real users. And the person who runs it is not a database administrator. He is a vibecoder. He builds things with AI writing most of the code, and Postgres keeps it all running underneath.

[calm] Today we are going to tour that server. Not the theory of production Postgres, not the best practices from a conference talk at a company with a hundred engineers. The actual, messy, practical reality of what it looks like when one person runs eight databases on a single VPS. The configuration files, the monitoring, the things that matter and the things that do not.

Because there is a gap between tutorial Postgres and production Postgres. The tutorials stop at CREATE TABLE and SELECT. Production starts at "why is my disk full" and "what does this setting actually do." If you are a vibecoder thinking about running your own Postgres, this episode is the bridge.

The Great Debate: Managed or Self-Hosted

Before we tour the server, we need to address the elephant in the room. And no, not that elephant. The question every vibecoder faces eventually.

Should you run your own Postgres at all?

The honest answer is: probably not, at least not at first. If you are building your first project and you want a Postgres database, Supabase will give you one in about ninety seconds. Sign up, create a project, and you have a fully managed PostgreSQL instance with authentication, real-time subscriptions, and an auto-generated API on top. The Pro plan runs twenty-five dollars a month. Neon does something similar but with a serverless twist, where your database scales to zero when nobody is using it, so you pay nothing during quiet hours. Railway gives you a Postgres instance as part of their deployment platform. All of these are real PostgreSQL underneath, not some proprietary thing wearing a Postgres mask.

For an early-stage project, this is the smart choice. You are paying someone else to handle backups, upgrades, monitoring, security patches, and that three AM phone call we talked about in episode ten. You get to focus on building your application instead of learning how to tune postgresql dot conf.

So why would anyone run their own? Sounds like the managed services have it covered.

Three reasons. Cost, control, and learning. Cost is the big one. Managed Postgres pricing works great when you are small, but the markup at scale is somewhere between five and twenty times what you would pay for the same compute and storage if you ran it yourself. A Scaleway VPS with eight gigs of RAM costs about twelve euros a month. That same amount of RAM on a managed database service might cost you sixty to a hundred dollars. When you are running eight databases, the math gets brutal fast.

Control matters too. On a managed service, you cannot tune the operating system. You cannot choose your filesystem. You often cannot install arbitrary extensions. You are renting an apartment. Self-hosting is owning a house. More work, more freedom.

And then there is learning. You cannot truly understand Postgres if you have never opened postgresql dot conf. Every managed service is a black box. The whole point of this series is to crack those boxes open.

The server in Paris runs self-hosted Postgres because of all three. Twelve euros a month for eight databases, full control over every setting, and the person running it understands exactly what is happening under the hood. With AI assistance, the operational burden is surprisingly manageable. Ask Claude to write your backup script, your monitoring queries, your configuration changes. The vibecoder advantage is that "hard to set up" problems stop being hard when you have an AI pair programming with you.

Alright. Let's go look at the server.

The Server Tour

The server is a Scaleway dev one L instance. That is their mid-range developer VPS. Eight gigabytes of RAM, two cores, twenty-two gigabytes of NVMe storage. It runs Ubuntu twenty-four oh four LTS, which is a rock-solid base for Postgres. PostgreSQL sixteen, installed from the official PostgreSQL apt repository, not the Ubuntu default, because the official repo always has the latest point releases.

On that server, besides Postgres, there is Redis, nginx as a reverse proxy, over twenty services running on numbered ports, a Docker container running an uptime monitor, and another running a WhatsApp API bridge. Postgres shares the eight gigabytes of RAM with all of this. That matters. If you are reading tuning advice that assumes Postgres owns the entire server, you need to cut those numbers in half or more.

Here are the eight databases. Live handles news aggregation for three websites, scraping and categorizing articles around the clock. High write volume, lots of inserts, moderate reads. TTPAnotis is a music service. Moderate everything. Parkit is the most complex, a productivity suite where four different tools, a capture inbox, a focus timer, a calendar, and a monitoring hub, all share one database, each owning its own tables but using real foreign keys across service boundaries. Partypar runs events. Parcel is a multi-tenant e-commerce platform integrated with a print-on-demand service. Parpixel sells digital photography goods. Listmonk is a self-hosted newsletter system. And archive is the interesting one, an institutional memory service that uses pgvector for semantic search across embeddings.

Eight databases, eight users, one Postgres instance. That is not unusual for a small VPS. Postgres handles multiple databases on a single instance extremely well. Each database is fully isolated. A bad query in the news aggregation database does not lock up the newsletter system. Well, it should not, anyway. If it uses up all the shared memory, that is a different story. But under normal operation, isolation is solid.

Now let's open the configuration files and see what actually matters.

The Five Settings That Matter

PostgreSQL ships with a configuration file called postgresql dot conf. It has hundreds of settings. Hundreds. If you open it for the first time, you will immediately feel the urge to close it again. Here is the good news. On a small VPS, exactly five settings make a meaningful difference. Everything else can stay at the defaults until you have a very specific reason to change it.

The first is shared buffers. This is the amount of RAM that Postgres reserves for caching table and index data in memory. [surprised] The default is a laughable one hundred twenty-eight megabytes. On a server with eight gigs of RAM, that means Postgres is caching almost nothing, constantly going back to disk for data it just read a moment ago. The standard advice is twenty-five percent of total RAM. On a dedicated database server with eight gigs, you would set this to two gigabytes. But our server is not dedicated. It is running twenty other services. So something more conservative, like five hundred twelve megabytes to one gigabyte, makes more sense. You want to leave enough RAM for the operating system's filesystem cache, which does its own caching that complements what Postgres does. Setting shared buffers too high actually hurts performance because you are starving that filesystem cache.

The second is effective cache size. This one confuses people because it does not actually allocate any memory. It is a hint to the query planner. You are telling Postgres, "hey, between your shared buffers and the operating system's disk cache, roughly this much data is probably already in memory." On an eight gigabyte server, something like four to six gigabytes is reasonable. If you set this too low, the planner might avoid index scans because it assumes reading from disk is expensive. Set it too high and the planner gets overconfident. For a small VPS, setting it to about half your total RAM is a safe conservative choice.

The third is work mem. This is where vibecoders and AI assistants get into trouble, because the name is misleading. It sounds like a global memory limit. It is not. It is the amount of memory each individual sort or hash operation can use, per query, per connection. If you have a query that does three sorts and you have twenty connections, that is potentially sixty times whatever you set work mem to. [worried] Set it to two hundred fifty-six megabytes because some blog post said "more is better" and suddenly a busy moment eats all your RAM and the operating system's out-of-memory killer shows up. On a small VPS with low traffic, thirty-two to sixty-four megabytes is reasonable. On a busy server, sixteen megabytes or less. When you ask AI to tune your postgresql dot conf, this is the setting it most often gets wrong, because it suggests values appropriate for a dedicated hundred-twenty-eight-gig database server, not your twelve-euro VPS.

The fourth is maintenance work mem. This is memory reserved for maintenance operations like VACUUM, CREATE INDEX, and ALTER TABLE. Unlike work mem, only a few of these run at the same time, so you can set it higher. Two hundred fifty-six megabytes is generous on our server. This makes vacuuming faster, index creation faster, and it is safe because you are never going to have thirty simultaneous maintenance operations.

The fifth is max connections. The default is one hundred, and that is probably fine for most small servers. Here is the thing people do not realize: each PostgreSQL connection spawns a dedicated backend process that uses five to ten megabytes of RAM even when idle. One hundred idle connections can eat a full gigabyte of RAM doing absolutely nothing. On the server in Paris, the actual concurrent connection count rarely exceeds ten. Eight services connecting to their respective databases, plus the occasional backup process. One hundred is generous headroom. If you were running a web application with hundreds of concurrent users, you would need a connection pooler, which we will talk about later.

There is a handy tool called PGTune. You go to the website, enter your server specs, RAM, CPU count, disk type, expected workload, and it spits out recommended values for all of these settings. It is a great starting point. But remember, it does not know that you are running twenty other services on the same box. It does not know your actual query patterns. It gives you a floor, not a ceiling. The vibecoder move is to start with PGTune's suggestions, cut shared buffers if you are sharing the server, and then monitor for a week before changing anything else.

One more thing. Changing shared buffers and max connections requires a full Postgres restart. The others just need a configuration reload. On a small server with low traffic, a restart takes a few seconds and nobody notices. On a busy production system, it is a bigger deal. Plan accordingly.

The Gatekeeper: pg hba dot conf

If postgresql dot conf is about performance, pg hba dot conf is about security. HBA stands for host-based authentication, and this file controls the single most important question in database security: who is allowed to connect?

The file is a list of rules, and Postgres reads them from top to bottom. First match wins. Each rule has five parts: the connection type, which database, which user, what network address, and the authentication method. That is it. Five fields per line, evaluated in order.

On the server in Paris, the configuration is simple because all connections come from localhost. The applications run on the same server as the database. No remote connections at all. So the rules look something like this. Local connections over Unix sockets use peer authentication, which means the operating system user has to match the database user. If you are logged in as the "live" system user and connecting to the "live" database as the "live" role, peer authentication checks that the operating system confirms you are who you say you are. No password needed, no password to leak.

For TCP connections from localhost, the rules require scram-sha-two-fifty-six authentication. This is the modern password method where a challenge-response protocol proves you know the password without actually sending it over the connection. It replaced the older MD five method, which had known weaknesses. If you are setting up Postgres for the first time in twenty twenty-six, use scram everywhere.

Here is what the file does not contain, and this is the important part: there are no rules allowing connections from external IP addresses. None. The listen addresses setting in postgresql dot conf is set to localhost only. The server does not even accept TCP connections on its public network interface. This is the simplest and strongest security posture for a setup where all your applications run on the same machine. If your database never speaks to the outside world, an attacker would need to be on the server already to connect to it.

If you do need remote connections, say your application runs on a different server, then you need to be much more careful. Use hostssl rules instead of host rules, which means connections must use SSL encryption. Restrict the address column to the specific IP addresses that need access, not a wildcard. And use scram authentication on top of that. The combination of encrypted transport plus strong authentication plus IP restriction is the minimum for remote database access.

[worried] The number of production Postgres instances I have seen with trust authentication enabled on all addresses is genuinely frightening. Trust means no password required. From anywhere.

Yeah. Do not do that. Ever.

One Role Per Database

Remember episode three, when we talked about how SQL is a language of permissions? Here is where that matters in production.

On the server in Paris, every database has its own role. The live database has a role called "live." TTPAnotis has a role called "ttpanotis." Parkit has "parkit." Each role can only connect to its own database and has permissions only on the tables it needs. This is the principle of least privilege. If someone finds a SQL injection vulnerability in the e-commerce application, the attacker gets access to the parcel database. They do not get access to the newsletter subscribers, the productivity suite data, or the news aggregation system.

Setting this up is straightforward. When you create a new database, you create a matching role at the same time. Grant connect on the database to that role. Grant usage on the schemas the application needs. Grant the appropriate permissions on the tables, usually all permissions for the application's own tables, read-only for any shared reference tables. Revoke the public schema's default permissions so other roles cannot accidentally create objects or read data.

The vibecoder shortcut is to let AI generate these commands. Tell it your database name, your role name, and what the application needs to do, and it will produce the correct GRANT and REVOKE statements. This is one area where AI is reliably good, because the permission model is well-documented and deterministic. Just make sure the role does not end up with SUPERUSER privileges. That is the database equivalent of running everything as root.

Do not share roles across databases unless you have a specific reason. The convenience of one role with access to everything is not worth the security risk. It takes thirty seconds to create a new role, and it might save you from a disaster.

Rabbit Hole: VACUUM and the Dead Tuple Problem

This next section goes deep into one of the most unique things about Postgres. If you just want the practical takeaway, skip ahead to the next chapter. The short version is: Postgres has a garbage collector called autovacuum that runs automatically, and you should not turn it off. But if you want to understand why Postgres needs garbage collection at all, and why ignoring it can literally shut down your database, stay with me.

Back in episode seven, we talked about MVCC, multi-version concurrency control. This is how Postgres handles multiple users reading and writing at the same time without them stepping on each other. When you update a row, Postgres does not overwrite the old version. It creates a new version of the row and marks the old one as "dead." When you delete a row, same thing. The row is not actually removed. It is marked as dead. The dead version stays physically on disk, taking up space, invisible to new transactions but very much still there.

[serious] These dead rows are called dead tuples. And they accumulate. A table that gets a thousand updates a day generates a thousand dead tuples a day. After a year, there could be three hundred sixty-five thousand dead versions of rows sitting in the table, all invisible, all wasting space, all slowing down sequential scans because Postgres still has to skip over them when reading through the table.

This is where VACUUM comes in. The VACUUM command walks through a table, finds the dead tuples, and marks that space as reusable. It does not return the space to the operating system, that is what VACUUM FULL does and it takes an exclusive lock that blocks everyone, but it marks those dead tuple slots as available for future inserts and updates. It is garbage collection, exactly like what Java or Go does for memory, except Postgres does it for rows on disk.

Now here is the part that keeps database administrators up at night. Every transaction in Postgres gets a transaction ID, which is a thirty-two-bit integer. That means there are about four billion possible IDs. When Postgres has used all four billion, the IDs wrap around and start over. At any given moment, Postgres considers two billion IDs to be in the past and two billion to be in the future. VACUUM does not just clean up dead tuples. It also "freezes" old transaction IDs, essentially marking them as "definitely in the past" so they do not count against the wraparound limit.

[worried] If VACUUM falls behind, if it cannot freeze old transaction IDs fast enough, the database approaches a critical threshold. And when that threshold is reached, Postgres takes the nuclear option. <break time="1s"/> It shuts down all write operations. The database becomes read-only. Not because of a bug, not because of a crash, but as a deliberate safety measure to prevent data corruption. The only way to recover is an offline, single-user VACUUM that can take hours or days depending on how big your database is.

This is why autovacuum exists and why you never, ever turn it off. Autovacuum is a background daemon that watches every table and runs VACUUM automatically when dead tuples accumulate beyond a threshold. The default trigger is when a table has more dead tuples than fifty plus twenty percent of its live rows. For a table with ten thousand live rows, autovacuum kicks in at around two thousand fifty dead tuples.

The anti-wraparound autovacuum is even more aggressive. When any table's oldest unfrozen transaction ID reaches two hundred million transactions, a special high-priority vacuum kicks in. You can see it in pg stat activity, labeled "to prevent wraparound." If you try to cancel it, it immediately restarts. Postgres does not trust you to make this decision. It will vacuum your table whether you like it or not, because the alternative is a database shutdown.

[sigh] I once inherited a Postgres server where someone had disabled autovacuum on a high-traffic table because "it was slowing down inserts." That table had fourteen million dead tuples and a transaction age of one point eight billion. [worried] We were three weeks from a forced shutdown.

On the server in Paris, autovacuum runs quietly in the background, doing its job without any special configuration. For eight databases with moderate traffic, the defaults are fine. You can check if it is keeping up by looking at pg stat user tables, which shows the dead tuple count and the last vacuum time for every table. If you see a table with millions of dead tuples that has not been vacuumed in days, something is wrong.

Watching the Watchers: Monitoring

You do not need a fancy monitoring stack for a small Postgres server. You need three things: a way to know if it is up, a way to know if it is struggling, and a way to know if disks are filling up.

For "is it up," the server in Paris uses Uptime Kuma, a self-hosted monitoring tool that pings every service at a regular interval and sends an alert if something goes down. It checks the Postgres port along with every web service. Simple, effective, and free.

For "is it struggling," Postgres has built-in statistics views that tell you everything. The most useful is pg stat activity. This view shows you one row per connection: what query each connection is running, how long it has been running, whether it is active or idle, and what it is waiting for. When something feels slow, the first thing you do is check pg stat activity for long-running queries. A query that has been running for thirty minutes when it usually takes two seconds is your culprit.

The second most useful view is pg stat user tables. This shows you, for every table, how many sequential scans versus index scans it has done, how many live and dead tuples it has, and when it was last vacuumed and analyzed. This is your early warning system. If a table suddenly shows a lot more sequential scans than index scans, you might need an index. If dead tuples are climbing while the last vacuum timestamp is days ago, autovacuum might be struggling.

There is also a simple query you should run once in a while: the cache hit ratio. This tells you what percentage of data reads are satisfied from memory versus disk. On a well-tuned server, this should be above ninety-nine percent. If it drops below ninety-five, your shared buffers might be too small, or your working set has outgrown your RAM. On the server in Paris, the cache hit ratio hovers around ninety-nine point six percent, which means the database is almost entirely in memory.

The vibecoder approach to monitoring is practical. You do not need Datadog or Grafana for a small server. A cron job that runs a few SQL queries against the statistics views, logs the results, and alerts you if something looks wrong is plenty. That cron job can be written by AI in about two minutes. Ask it to check active connections, long-running queries, dead tuple counts, and disk usage. Have it send you a notification if any of those cross a threshold.

Do You Need a Connection Pooler?

At some point, someone will tell you that you need PgBouncer. PgBouncer is a connection pooler that sits between your applications and Postgres, maintaining a small pool of real database connections and multiplexing your application connections onto them.

Here is when you actually need it: when you have more application connections than Postgres can comfortably handle. Remember, each Postgres connection is a full operating system process using five to ten megabytes of RAM. If you have a web application with a hundred concurrent users, each holding a database connection, that is a gigabyte of RAM just for idle connections. A connection pooler lets those hundred users share twenty actual Postgres connections by grabbing one when they need it and releasing it when they are done.

On the server in Paris, there is no connection pooler. Eight services, each maintaining one or two connections to its database. Total concurrent connections: maybe ten to fifteen at peak. Postgres handles this without breaking a sweat. The overhead is under a hundred megabytes. Adding PgBouncer would add complexity for zero benefit.

You need a connection pooler when you have serverless functions, because each function invocation might try to open a new connection and serverless can mean hundreds of concurrent invocations. You need one when you have a web framework that opens a connection per request instead of using a connection pool. You need one when max connections is getting close to its limit and connections are spending most of their time idle.

You do not need one for a small server with a handful of services. Do not let premature optimization convince you to add infrastructure you do not need. When you start seeing connection count warnings in your monitoring, that is when you install PgBouncer. Not before.

The rule of thumb I use is simple. If your peak connection count is under half of max connections, you are fine. [calm] If it is regularly hitting seventy or eighty percent, start thinking about pooling.

The Vibecoder's Production Checklist

Let's wrap this up with the practical list. If you are about to run Postgres on your own server, here is what actually matters, in order of importance.

Backups. We covered this in episode ten, but it bears repeating. Before you put a single row of real data in your database, you need automated backups that you have tested by actually restoring from them. The server in Paris does daily pg dump to S three for every database, with hourly backups for the critical ones. Test your restores. A backup you have never restored from is just a prayer.

Security. Set listen addresses to localhost unless you have a specific reason to accept remote connections. Use scram-sha-two-fifty-six authentication. One role per database, least privilege, no superuser for applications. Keep your pg hba dot conf tight.

Tuning. Set shared buffers to something reasonable for your server, probably five hundred twelve megabytes to two gigabytes depending on what else is sharing the RAM. Set effective cache size to about half your total RAM. Keep work mem conservative, sixteen to sixty-four megabytes. Leave max connections at the default unless you know you need more.

Autovacuum. Leave it on. Leave the defaults alone. Check pg stat user tables occasionally to make sure it is keeping up. If a specific table has a very unusual write pattern, millions of updates per day, you might need to tune the per-table autovacuum settings. But for most workloads, the defaults are fine.

Monitoring. At minimum, know when it goes down and have a way to check for long-running queries and dead tuple buildup. A cron job with a few SQL queries is enough to start.

Updates. Subscribe to the PostgreSQL security mailing list. When a point release comes out, apply it. On Ubuntu, this is apt update and apt upgrade. Point releases are backwards compatible and often fix real security issues.

[slow] That is the whole list. Six things. Not sixty. Not six hundred. Six things that cover ninety-five percent of what a vibecoder running a small production Postgres needs to worry about. The remaining five percent is edge cases that you will learn about when you hit them, and when you do, you will have the foundation to understand what is going wrong.

The Extension Bazaar Is Calling

You are running Postgres in production. You have your configuration tuned, your roles locked down, your backups tested, and your monitoring in place. You have eight databases on one server and they are all humming along. But you are only using a fraction of what Postgres can do.

Remember that archive database with the vector embeddings? That uses an extension called pgvector. And pgvector is just one item in a bazaar of extensions that make Postgres do things no other database can match. Geospatial queries. Time-series data. Job scheduling. Full-text search in dozens of languages. Graph traversal. Machine learning inference.

[excited] Next time, we are visiting the extension bazaar. The community, the ecosystem, and the addons that make Postgres not just a database, but a platform. See you there.