Scaling PostgreSQL: Lessons from 800 Million Users
PostgreSQL scales. Seriously. The OpenAI case proves it at planetary scale.
The Context
OpenAI runs ChatGPT — 800 million active users — with PostgreSQL as its primary database. Not a custom distributed database. Not a NoSQL system designed for scale. PostgreSQL.
A single primary PostgreSQL node with over 50 read replicas on Azure handles millions of queries per second for one of the most used services on the planet. No sharding. No exotic databases.
This should give pause to anyone considering abandoning PostgreSQL because it "doesn't scale."
The Architecture
OpenAI's approach is deceptively simple:
| Component | Detail |
|---|---|
| Primary | Single Azure PostgreSQL Flexible Server node |
| Read Replicas | 50+ geographically distributed replicas |
| Connection Pooling | PgBouncer in front of every instance |
| Sharding | None |
| Average Latency | < 5ms (after optimizations) |
Before and After PgBouncer
Latency dropped from ~50ms to under 5ms simply by introducing connection pooling with PgBouncer. Not an architectural rewrite. Not a database change. A proxy layer.
If you don't have PgBouncer (or equivalent) in front of your production PostgreSQL, you're leaving free performance on the table. It's the single change with the best cost-to-benefit ratio in any PostgreSQL deployment.
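The pattern is straightforward to reproduce. Here is a minimal sketch of a transaction-pooling PgBouncer configuration; the host, database name, and pool sizes are illustrative placeholders, not OpenAI's actual settings:

```ini
; pgbouncer.ini — minimal sketch with illustrative values
[databases]
appdb = host=10.0.0.5 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
auth_type = scram-sha-256
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction      ; server connection is released after each transaction
default_pool_size = 20       ; server connections per (user, database) pair
max_client_conn = 5000       ; thousands of clients share a few server connections
```

Transaction pooling is what delivers the big win: thousands of short-lived application connections multiplex onto a small, warm set of server connections, so PostgreSQL never pays the per-connection backend cost on the hot path.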
The Real Bottleneck: Writes
Reads scale horizontally with replicas. Add nodes, distribute load. Problem solved.
Writes don't. Everything goes through the primary. That's physics, not a bug.
OpenAI's Write Strategies
01 — Eliminate unnecessary writes
Before optimizing, they removed: application writes that weren't needed, logs that could go elsewhere, updates that could be lazy.
The best optimization is not doing the work.
02 — Lazy writes with controlled backfill
Instead of writing synchronously, they implemented deferred writes with rate-controlled backfill. This eliminates the write traffic spikes that cause the worst problems.
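The idea can be sketched in a few lines: enqueue writes instead of executing them synchronously, then drain the queue in batches at a bounded rate. This is an illustrative sketch, not OpenAI's implementation; the class and parameter names are made up.

```python
import time
from collections import deque

class LazyWriteBuffer:
    """Deferred writes with rate-controlled backfill (illustrative sketch).

    Writes are enqueued instead of hitting the primary synchronously; a
    drain loop flushes them in batches at a bounded rate, smoothing the
    traffic spikes that hurt the primary most.
    """

    def __init__(self, flush_fn, max_writes_per_sec=100, batch_size=10):
        self.queue = deque()
        self.flush_fn = flush_fn                    # e.g. executes a batched UPSERT
        self.batch_size = batch_size
        self.interval = batch_size / max_writes_per_sec
        self._last_flush = 0.0

    def write(self, row):
        self.queue.append(row)                      # O(1), no database round-trip

    def drain_once(self, now=None):
        """Flush at most one batch, respecting the configured rate. Returns rows flushed."""
        now = time.monotonic() if now is None else now
        if now - self._last_flush < self.interval or not self.queue:
            return 0
        batch = [self.queue.popleft()
                 for _ in range(min(self.batch_size, len(self.queue)))]
        self.flush_fn(batch)
        self._last_flush = now
        return len(batch)
```

A background task calls `drain_once` in a loop; under a write spike the queue grows, but the primary only ever sees the steady configured rate.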
03 — Offload write-heavy workloads
Workloads with a natural sharding key generating extreme write volumes were migrated to dedicated systems. PostgreSQL for the core, specialized systems for the exception.
04 — Multi-level rate limiting
Rate limiting at the application, connection, and query digest levels. Not a single control point, but defense in depth against resource exhaustion.
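A layered limiter can be sketched with stacked token buckets, one per level; a query runs only if every level admits it. The levels and limits below are illustrative, not OpenAI's actual values.

```python
import time

class TokenBucket:
    """Classic token bucket: refills continuously, each call consumes one token."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.updated = 0.0

    def allow(self, now):
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

class LayeredLimiter:
    """Defense in depth: a query must pass every level to run.

    Note the sketch consumes tokens from earlier levels even when a later
    level rejects; a production limiter might check-then-commit instead.
    """
    def __init__(self):
        self.levels = {
            "app":    TokenBucket(rate=1000, capacity=1000),  # whole application
            "conn":   TokenBucket(rate=50,   capacity=50),    # per connection
            "digest": TokenBucket(rate=5,    capacity=5),     # per query shape
        }

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        return all(b.allow(now) for b in self.levels.values())
```

The per-digest level is the interesting one: it stops a single hot query shape from exhausting the primary even when the application-wide budget still has headroom.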
Architectural Considerations
Why Not Sharding?
Sharding introduces enormous operational complexity: cross-shard queries, distributed transactions, rebalancing, routing layers. OpenAI demonstrated that with the right optimizations, a single primary can handle loads most engineers would consider impossible.
| Sharding | Single Primary + Replicas |
|---|---|
| High operational complexity | Operationally simple |
| Expensive cross-shard queries | All queries on one node |
| Fragile distributed transactions | Native ACID transactions |
| Non-trivial rebalancing | Scale-up + read replicas |
Sharding makes sense when writes exceed the capacity of a single node. But that moment arrives much later than you think — and OpenAI is proof.
MVCC: The Price of Concurrency
PostgreSQL uses Multi-Version Concurrency Control. Every update creates a new row version. This means:
- Table bloat: dead rows occupying space
- Index bloat: indexes pointing to obsolete rows
- Complex autovacuum: the garbage collector needs careful tuning
- Growing WAL traffic: every replica streams the full WAL, so each added replica costs more network bandwidth
Autovacuum tuning is one of the most underestimated activities in PostgreSQL management. The defaults are conservative. At scale, they need aggressive revision.
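In practice that means per-table overrides for the hottest tables plus monitoring of dead-tuple accumulation. The table name and values below are illustrative; the right numbers depend on your workload.

```sql
-- Illustrative per-table autovacuum overrides for a hot, frequently updated table.
ALTER TABLE events SET (
  autovacuum_vacuum_scale_factor = 0.01,   -- vacuum after ~1% dead rows (default 0.2)
  autovacuum_analyze_scale_factor = 0.005, -- keep planner stats fresh (default 0.1)
  autovacuum_vacuum_cost_limit = 2000      -- let vacuum work faster (inherits 200 by default)
);

-- Verify the tuning is keeping up with the write rate.
SELECT relname, n_dead_tup, n_live_tup, last_autovacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;
```

If `n_dead_tup` keeps climbing between autovacuum runs, the settings are still too conservative for that table.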
Read Replicas: Not All Created Equal
A crucial pattern adopted by OpenAI: traffic segregation by priority.
Not all queries are created equal. A user waiting for a real-time response has different priority than an analytics job.
| Traffic Type | Replica Pool | Characteristics |
|---|---|---|
| High-priority | Dedicated replicas | Minimal latency, no interference |
| Low-priority | Shared replicas | Analytics, batch jobs, reports |
| Long-running | Isolated replicas | Queries > 1s moved here |
This prevents a 30-second analytics query from blocking real-time reads. Simple, but too many teams put everything on the same replicas.
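The routing logic itself is trivial; the discipline is what matters. A sketch of the pattern, with made-up pool names and thresholds:

```python
def pick_replica_pool(priority, est_runtime_s, pools=None):
    """Route a read to a replica pool by priority and expected runtime.

    Sketch of the traffic-segregation pattern; pool names and the 1s
    threshold are illustrative, not OpenAI's actual topology.
    """
    pools = pools or {
        "high":     ["replica-hi-1", "replica-hi-2"],  # latency-sensitive user reads
        "low":      ["replica-lo-1"],                  # analytics, batch jobs, reports
        "isolated": ["replica-long-1"],                # anything expected to run > 1s
    }
    if est_runtime_s > 1.0:
        return pools["isolated"]   # long-running queries never touch the hot replicas
    return pools["high"] if priority == "high" else pools["low"]
```

The key property: no configuration of the low-priority workload can ever degrade the high-priority pool, because they share no replicas.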
Schema Migration: The Minefield
Production migrations on high-traffic databases are where things break. OpenAI's approach:
- Lightweight operations only: no full table rewrites in production
- 5-second DDL timeout: if it doesn't complete in 5s, abort
- Indexes always CONCURRENTLY: never block reads for an index build
- Slow queries moved to replicas: queries > 1s get migrated to avoid blocking migrations
Never run ALTER TABLE ... ADD COLUMN ... DEFAULT ... on a production table with millions of rows when it forces a full table rewrite (before PostgreSQL 11 any DEFAULT did; since 11, only volatile defaults such as random() still do). Always ADD COLUMN nullable, then backfill separately.
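The safe sequence looks roughly like this; the table and column names are illustrative, and note that CREATE INDEX CONCURRENTLY cannot run inside a transaction block:

```sql
-- Sketch of a lock-safe migration on a hot table.
SET lock_timeout = '5s';                  -- abort DDL that can't get its lock quickly

-- Step 1: metadata-only change, no table rewrite.
ALTER TABLE users ADD COLUMN plan text;   -- nullable, no DEFAULT

-- Step 2: backfill in small, rate-controlled batches from the application, e.g.
-- UPDATE users SET plan = 'free' WHERE id BETWEEN $1 AND $2;

-- Step 3: build the index without blocking reads or writes.
CREATE INDEX CONCURRENTLY idx_users_plan ON users (plan);
```

The lock_timeout mirrors the 5-second DDL rule above: a migration that cannot acquire its lock promptly fails fast instead of queueing behind long-running queries and blocking everything after it.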
Incidents and Lessons
The Redis Cascade Failure
A Redis outage caused a cascade collapse of the entire system. PostgreSQL wasn't the culprit, but the unmanaged dependency was.
Lesson: every external dependency is a single point of failure if you don't have circuit breakers and fallbacks.
The WALSender Bug
A bug where high CPU triggered a spin-loop in WALSender, preventing WAL transmission to replicas. The lag persisted even after CPU normalized.
Lesson: replication lag monitoring isn't optional. And you need alerting on anomalies, not just thresholds.
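The baseline check is a standard primary-side query against pg_stat_replication (PostgreSQL 10+); what to alert on is up to you, but byte-level and time-level lag per replica is the minimum:

```sql
-- On the primary: per-replica lag as seen by the WAL sender.
SELECT application_name, state,
       write_lag, flush_lag, replay_lag,
       pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS replay_bytes_behind
FROM pg_stat_replication;
```

A lag that stays flat while traffic changes, as in the WALSender incident, is exactly the kind of anomaly a pure threshold alert misses.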
What PostgreSQL Is Missing (According to OpenAI)
These are real feature requests OpenAI has brought to the PostgreSQL community:
- Index disabling: ability to mark an index as invalid without dropping it, monitor the impact, then decide
- Latency percentiles: native P95/P99 in pg_stat_statements, not just averages
- DDL history tracking: a schema change history accessible via query
- Wait event semantics: sessions "active" with wait event "ClientRead" for hours — confusing semantics
- Heuristic defaults: auto-tuned parameters based on detected CPU/RAM/disk
These aren't complaints. They're concrete feedback from someone operating PostgreSQL at a scale few reach. If you work on the PostgreSQL ecosystem, these are the areas where contribution has the highest impact.
My Takeaways
PostgreSQL Is Enough. Almost Always.
The OpenAI case demolishes the argument that "PostgreSQL doesn't scale." If it scales for 800 million ChatGPT users, it scales for your project. The problem is never PostgreSQL — it's how you use it.
"The best technology is boring technology. PostgreSQL is gloriously boring. And that's why it works."
Complexity Is a Choice, Not a Requirement
OpenAI could have chosen a custom distributed database. They could have sharded from day one. Instead they chose the simplest approach that could work and optimized from there.
That's engineering. Not the newest technology, but the simplest solution that solves the problem.
Connection Pooling Is Not Optional
If there's one thing to take away from this article: PgBouncer in production. Always. The difference between 50ms and 5ms isn't an optimization — it's a category change.
Monitoring Decides Everything
OpenAI had a single Sev0 incident attributable to PostgreSQL in nine months. Not because PostgreSQL is magic, but because they invested in observability: replication lag, query performance, connection states, WAL volume.
You can't optimize what you don't measure.
Sharding Is the Last Resort
Too many architectures start with sharding "just in case." OpenAI demonstrates that the cost of sharding's operational complexity almost always outweighs the benefit, up to scales that 99.99% of projects will never reach.
Operational Takeaways
For those scaling PostgreSQL today:
| Priority | Action | Impact |
|---|---|---|
| 1 | Introduce PgBouncer | 10x latency reduction |
| 2 | Segregate traffic on dedicated replicas | Eliminate workload interference |
| 3 | Aggressive autovacuum tuning | Prevent bloat and degradation |
| 4 | Timeouts on DDL and queries | Prevent lock chains |
| 5 | Monitor replication lag and WAL | Early warning on degradation |
| 6 | Lazy writes for spike control | Smooth write traffic |
| 7 | Shard only when everything else isn't enough | Complexity as last resort |