PlanetScale Tech Blog

PlanetScale Tech Blog · Jul 1, 2025

Why it matters: PlanetScale is bringing its proven reliability and performance expertise from the MySQL world to Postgres. By leveraging NVMe-backed infrastructure and a custom proxy layer, it offers a high-performance, scalable alternative to traditional cloud Postgres providers.

PlanetScale has launched a Postgres hosting service, now generally available, built on its proprietary operator and Metal infrastructure.
The platform features high availability with automatic failovers, query buffering, and integrated connection pooling via PgBouncer.
Built on Postgres v17, it supports online imports from version 13+ and provides zero-downtime version updates.
Performance benchmarks claim it outperforms major competitors like Amazon Aurora and AlloyDB, even with half the resources, thanks to locally attached NVMe SSDs.
A new project called Neki is under development to bring Vitess-style horizontal scaling and sharding to Postgres, architected from first principles.

#data #sre #dist

PlanetScale Tech Blog · Jul 1, 2025

Benchmarking Postgres

Why it matters: PlanetScale's entry into the Postgres market with a focus on high-performance 'Metal' instances provides engineers with a new managed database option. Their transparent benchmarking methodology helps teams evaluate latency and throughput trade-offs across major cloud providers.

PlanetScale has launched its Postgres offering, utilizing an internal tool called Telescope to conduct standardized performance benchmarks.
The benchmarking methodology focuses on three benchmarks: query-path latency, a TPC-C-like OLTP workload, and sysbench read-only performance.
PlanetScale Metal was tested on M-320 instances (i8g-based, 4 vCPUs, 32 GB RAM) with NVMe SSD storage for high-performance production workloads.
Comparisons were made against major providers including Amazon Aurora, Google AlloyDB, and Supabase, often giving competitors a resource advantage in CPU or RAM.
All tests were conducted within the same cloud regions (us-east-1 or us-central1) to minimize external network variables and ensure fairness.
The results claim PlanetScale Metal significantly outperforms other managed Postgres vendors in both throughput and latency metrics.
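
Latency and throughput results like those above are usually reported as percentiles and queries per second. The following is a minimal sketch of that arithmetic, not Telescope's implementation; the function names and the nearest-rank percentile method are assumptions made for illustration.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < p <= 100:
        raise ValueError("p must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(p * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def throughput_qps(num_queries, elapsed_seconds):
    """Average queries per second over a measurement window."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return num_queries / elapsed_seconds


if __name__ == "__main__":
    # Hypothetical latency samples in milliseconds, not measured data.
    latencies_ms = [1.2, 0.9, 1.1, 5.0, 1.0, 0.8, 1.3, 1.1, 0.9, 12.0]
    print("p50:", percentile(latencies_ms, 50))
    print("p99:", percentile(latencies_ms, 99))
    print("qps:", throughput_qps(len(latencies_ms), 0.5))
```

Tail percentiles such as p99 are reported alongside the median because a small share of slow queries can dominate user-visible latency while leaving the average almost unchanged.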

#data #sre #finops

PlanetScale Tech Blog · Apr 29, 2025

Announcing Vitess 22

Why it matters: This release significantly improves database scalability and reliability by optimizing query planning and cluster management. Engineers benefit from reduced latency, lower memory overhead, and more robust automated recovery tools, making large-scale MySQL deployments easier to maintain.

Vitess 22.0.0 introduces a new 6-month release cadence with significant performance and observability upgrades.
Query serving improvements include cached prepared statements as raw SQL and GA support for sharded views and atomic distributed transactions.
VTOrc now features stalled-disk recovery, improved errant GTID discovery, and a semi-sync monitor to prevent replication blocks.
The Kubernetes Operator v2.15.0 adds support for K8s 1.32 and migrates automated backups to VTBackup pods to preserve serving capacity.
Performance optimizations in gRPC and AST normalization resulted in a ~3% QPS boost and up to 13% reduction in memory allocations.

#data #dist #sre

PlanetScale Tech Blog · Mar 25, 2025

PlanetScale vectors is now GA

Why it matters: This release enables engineers to integrate high-performance vector search directly into their existing MySQL workflows. By supporting indexes larger than RAM and maintaining ACID compliance, it eliminates the need for a separate, specialized vector database for AI-driven applications.

PlanetScale vectors are now GA, featuring 2x query performance and 8x better memory efficiency since the open beta.
Vectors are treated as first-class MySQL citizens, supporting standard RDBMS features like JOINs, WHERE clauses, and ACID-compliant transactions.
The implementation uses a SPANN-based index design, allowing vector indexes to remain performant even when they are 6x larger than available RAM.
It supports Euclidean (L2), inner product, and cosine distance metrics for vectors of up to 16,383 dimensions.
The feature is integrated with PlanetScale's developer workflow, including database branching, deploy requests, and cross-region replicas.
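
The three distance metrics named above can be sketched in a few lines. This is plain Python that illustrates the math only; it is not PlanetScale's SQL interface, and the function names are assumptions.

```python
import math


def _check_same_length(a, b):
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")


def l2_distance(a, b):
    """Euclidean (L2) distance: straight-line distance between two points."""
    _check_same_length(a, b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def inner_product(a, b):
    """Inner (dot) product: larger values mean more similar vectors."""
    _check_same_length(a, b)
    return sum(x * y for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 minus cosine similarity: depends on the angle between vectors, not their length."""
    norm_a = math.sqrt(inner_product(a, a))
    norm_b = math.sqrt(inner_product(b, b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - inner_product(a, b) / (norm_a * norm_b)


if __name__ == "__main__":
    print(l2_distance([0, 0], [3, 4]))
    print(inner_product([1, 2, 3], [4, 5, 6]))
    print(cosine_distance([1, 0], [0, 1]))
```

Cosine distance ignores magnitude, which suits embeddings whose length carries no meaning, while L2 and inner product are both sensitive to vector length.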

#data #mlp #dist

PlanetScale Tech Blog · Mar 20, 2025

Faster interpreters in Go: Catching up with C++

Why it matters: This article demonstrates how architectural shifts from AST interpreters to bytecode VMs can yield C++ level performance in Go. It provides a blueprint for building high-performance, maintainable evaluation engines for distributed systems where native push-down isn't always possible.

Vitess replaced its original AST-based SQL evaluator with a custom bytecode Virtual Machine written natively in Go.
The new VM achieves performance parity with MySQL's C++ evaluation code while remaining more maintainable than the previous Go interpreter.
The engine handles complex SQL expressions that cannot be pushed down to underlying MySQL shards, such as post-aggregation filtering in HAVING clauses.
By moving from recursive AST walking to a bytecode-based approach, the system reduces instruction dispatch overhead and improves execution efficiency.
Extensive fuzzing and compatibility testing against MySQL led to the discovery and fixing of several upstream bugs in the original MySQL engine.
The implementation demonstrates that high-level languages like Go can match C++ performance for specialized interpreter tasks through optimized architecture.
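
The difference between recursive AST walking and a compiled instruction stream can be illustrated with a toy expression language. The sketch below is a generic stack-based design written in Python for brevity. It is not Vitess's Go implementation, and the node shapes and opcode names are invented for the example.

```python
# Toy AST: ("lit", value), ("add", left, right), ("mul", left, right)

def eval_ast(node):
    """Tree-walking evaluation: one recursive call per node on every execution."""
    kind = node[0]
    if kind == "lit":
        return node[1]
    if kind not in ("add", "mul"):
        raise ValueError(f"unknown node kind: {kind}")
    left, right = eval_ast(node[1]), eval_ast(node[2])
    return left + right if kind == "add" else left * right


PUSH, ADD, MUL = range(3)  # hypothetical opcodes


def compile_ast(node, code=None):
    """Flatten the tree once into a linear list of (opcode, argument) pairs."""
    if code is None:
        code = []
    kind = node[0]
    if kind == "lit":
        code.append((PUSH, node[1]))
    elif kind in ("add", "mul"):
        compile_ast(node[1], code)
        compile_ast(node[2], code)
        code.append((ADD if kind == "add" else MUL, None))
    else:
        raise ValueError(f"unknown node kind: {kind}")
    return code


def run(code):
    """Execute the instruction list with an operand stack and a flat loop."""
    stack = []
    for op, arg in code:
        if op == PUSH:
            stack.append(arg)
        else:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right if op == ADD else left * right)
    return stack.pop()


if __name__ == "__main__":
    expr = ("add", ("lit", 1), ("mul", ("lit", 2), ("lit", 3)))  # 1 + 2 * 3
    program = compile_ast(expr)  # compile once, run many times
    print(eval_ast(expr), run(program))
```

The compile step runs once per expression, so the per-row cost is a flat loop over instructions instead of a recursive traversal that re-inspects node types each time.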

#dist #data

PlanetScale Tech Blog · Mar 18, 2025

The Real Failure Rate of EBS

Why it matters: Engineers often rely on cloud SLAs without realizing that partial failure or latency spikes can cause total system downtime. Understanding EBS's real-world performance variance is critical for building resilient distributed databases that require consistent throughput.

EBS performance degradation is a frequent partial failure mode where latency spikes (200 ms or more) effectively block database operations.
AWS gp3/gp2 SLAs allow for up to 1% of the time (about 87.6 hours/year) during which performance drops below 90% of provisioned IOPS.
In distributed systems with many shards, the probability of at least one volume experiencing degradation at any given time approaches 100%.
Higher-tier io2 volumes are not immune to these issues and can suffer from correlated failures within a single availability zone.
PlanetScale mitigates these risks through automated monitoring of latency and idle percentages, triggering zero-downtime reparenting.
The ultimate solution for consistent performance is moving from network-attached storage to local NVMe storage.
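
The claim that a large fleet almost always has at least one degraded volume follows from basic probability. The sketch below assumes each volume is degraded 1% of the time, independently of the others. Independence is a simplifying assumption (the summary itself notes that failures can be correlated), and the volume counts are illustrative rather than taken from the article.

```python
def prob_any_degraded(num_volumes, p_degraded=0.01):
    """Probability that at least one of num_volumes is degraded at a given moment.

    Assumes independent volumes, each degraded with probability p_degraded:
    P(at least one) = 1 - (1 - p) ** n
    """
    if num_volumes < 0:
        raise ValueError("num_volumes must be non-negative")
    if not 0 <= p_degraded <= 1:
        raise ValueError("p_degraded must be between 0 and 1")
    return 1.0 - (1.0 - p_degraded) ** num_volumes


if __name__ == "__main__":
    # Hypothetical fleet sizes, chosen only to show how fast the probability grows.
    for n in (1, 10, 100, 500, 1000):
        print(f"{n:>5} volumes: {prob_any_degraded(n):.4%}")
```

Because the healthy-fleet probability shrinks geometrically with each added volume, even a small per-volume degradation rate becomes a near-constant condition at scale.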

#sre #dist #data

PlanetScale Tech Blog · Mar 13, 2025

IO devices and latency

Why it matters: Understanding the physical limitations of storage media helps engineers optimize database performance. Choosing local NVMe over network-attached storage eliminates latency bottlenecks and provides the high IOPS necessary for modern, high-traffic transactional workloads.

Non-volatile storage has evolved from tape drives to HDDs and SSDs, with each generation significantly reducing latency by minimizing or eliminating mechanical movement.
Tape storage remains relevant for low-cost, long-term archival due to high durability and density, despite extremely high random access latency.
Hard Disk Drives (HDDs) improved performance over tape using spinning platters, but still suffer from mechanical seek time and rotational latency during random IO.
Solid State Drives (SSDs) and NVMe eliminate mechanical parts, allowing for near-instantaneous random access and massive parallelization of IO operations.
Network-attached storage introduces significant latency compared to locally attached NVMe drives, which offer superior IOPS and consistent performance for databases.
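
Two small calculations make the latency points above concrete: the average rotational latency of a spinning disk, and the ceiling on serial (one request at a time) IOPS implied by a given per-request latency. This is a back-of-the-envelope sketch; the spindle speed and latency values in the example are illustrative assumptions, not figures from the article.

```python
def avg_rotational_latency_ms(rpm):
    """Average wait for the target sector: half a revolution, in milliseconds."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    ms_per_revolution = 60_000 / rpm
    return ms_per_revolution / 2


def max_serial_iops(latency_ms):
    """Upper bound on operations per second when requests are issued one at a time."""
    if latency_ms <= 0:
        raise ValueError("latency_ms must be positive")
    return 1000.0 / latency_ms


if __name__ == "__main__":
    # Illustrative values only.
    print("7200 rpm disk, avg rotational latency (ms):", avg_rotational_latency_ms(7200))
    for label, latency_ms in (("5 ms per IO", 5.0), ("1 ms per IO", 1.0), ("0.1 ms per IO", 0.1)):
        print(label, "->", max_serial_iops(latency_ms), "serial IOPS at most")
```

Devices that serve many requests in parallel can exceed the serial bound, which is why the parallelism of SSDs and NVMe matters as much as their lower per-request latency.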

#data #sre

PlanetScale Tech Blog · Mar 11, 2025

Announcing PlanetScale Metal

Why it matters: PlanetScale Metal significantly improves database performance and cost-efficiency by leveraging local NVMe storage. It allows engineers to scale relational workloads with lower latency and predictable costs compared to traditional cloud-managed database services like Amazon Aurora.

PlanetScale Metal introduces new node instances powered by locally attached NVMe SSDs on AWS and GCP.
The architecture provides unlimited I/O on all M-series cluster types, removing traditional cloud storage bottlenecks.
Production benchmarks show up to a 65% reduction in p99 query latency compared to previous configurations.
Metal offers significant cost efficiency, with reported savings of up to 53% over Amazon Aurora.
The service is now in general availability, having already processed over 5 trillion queries across 5 petabytes of data.

#data #finops #sre

PlanetScale Tech Blog · Mar 11, 2025

PlanetScale Metal: There’s no replacement for displacement

Why it matters: Engineers facing I/O bottlenecks can achieve massive performance gains and lower latency by bypassing network-attached storage. PlanetScale Metal demonstrates that using local NVMe with robust replication provides superior OLTP performance and cost-efficiency without sacrificing durability.

PlanetScale Metal replaces network-attached storage like Amazon EBS with local NVMe drives to eliminate network latency and throughput bottlenecks.
Local NVMe provides an order of magnitude more IOPS, with i4i instances reaching up to 400,000 random read IOPS compared to 40,000 for standard EBS configurations.
The architecture reduces 99th percentile query latency by moving storage closer to compute, in one case dropping from 9ms to 4ms for a million-QPS workload.
Durability is achieved through semi-synchronous, row-based MySQL replication across three availability zones rather than relying on cloud provider block-level replication.
The solution is cost-neutral or cheaper than equivalent Aurora or EBS configurations because high-performance network storage often carries significant price premiums.
Automated backup restoration and testing processes ensure high availability and data integrity even in the event of local hardware failure.

#data #sre #dist

PlanetScale Tech Blog · Mar 11, 2025

Upgrading Query Insights to Metal

Why it matters: Moving write-heavy, I/O-sensitive workloads from network-attached storage to local NVMe significantly reduces latency and increases throughput without complex architectural changes. This highlights the performance benefits of local NVMe over cloud block storage for high-scale databases.

PlanetScale migrated its Query Insights database from EBS-backed instances to PlanetScale Metal to address high I/O latency sensitivity.
The workload involves processing 10,000 write operations per second across 8 MySQL shards using a Kafka-based ingestion pipeline.
The ingestion system utilizes 32 consumer processes and 800 concurrent threads to coalesce and write telemetry data.
Upgrading the busiest shard to Metal resulted in an immediate and significant decrease in query latency across all percentiles.
Post-migration, the busiest shard outperformed the remaining shards on standard infrastructure by a substantial margin.
The upgrade reduced Kafka consumer backlogs and provided additional capacity for future volume growth without requiring architectural changes.
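
Coalescing, as mentioned in the ingestion description above, means merging many incoming events into fewer database writes. The sketch below shows one generic way to do that by aggregating per key before writing. The event shape, key names, and function name are hypothetical and do not describe PlanetScale's actual pipeline.

```python
from collections import defaultdict


def coalesce(events):
    """Merge (key, duration_ms) events into one (count, total_ms) record per key.

    Each resulting record can then be written with a single statement
    instead of one write per event.
    """
    totals = defaultdict(lambda: [0, 0.0])
    for key, duration_ms in events:
        entry = totals[key]
        entry[0] += 1
        entry[1] += duration_ms
    return {key: (count, total_ms) for key, (count, total_ms) in totals.items()}


if __name__ == "__main__":
    # Hypothetical telemetry events: (query fingerprint, execution time in ms).
    batch = [("q1", 2.0), ("q2", 5.0), ("q1", 3.0), ("q1", 1.0)]
    for key, (count, total_ms) in coalesce(batch).items():
        print(key, count, total_ms)
```

Fewer, larger writes reduce the number of round trips to storage, which matters most when each write pays a fixed I/O latency cost.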

#data #sre
