Deep Dive on Amazon Aurora and Amazon RDS for PostgreSQL Architecture and Features
Source: Dev.to
Introduction
If you’re considering migrating your self‑hosted PostgreSQL database or transitioning your commercial databases to PostgreSQL on AWS, you’ll need to choose the database service that best aligns with your requirements. AWS offers two managed PostgreSQL database options:
- Amazon Aurora PostgreSQL‑Compatible Edition
- Amazon Relational Database Service (Amazon RDS) for PostgreSQL
This post delves into the architecture and features of Aurora PostgreSQL and RDS PostgreSQL, analyzing their performance, scalability, failover capabilities, storage options, high availability, and disaster recovery mechanisms.
Overview
Both Aurora PostgreSQL and RDS for PostgreSQL are fully managed PostgreSQL database services that offer:
- A range of DB instance classes
- Multiple PostgreSQL‑compatible versions
- Managed backups and point‑in‑time recovery (PITR)
- Replication and monitoring
- Multi‑AZ support
- Storage auto‑scaling
Key Differences
Aurora PostgreSQL uses a high‑performance storage subsystem customized for fast distributed storage. The underlying storage grows automatically in segments of 10 GiB, up to 128 TiB. Aurora improves upon PostgreSQL for massive throughput and highly concurrent workloads.
RDS for PostgreSQL supports up to 64 TiB of storage and uses Amazon Elastic Block Store (Amazon EBS) volumes for database and log storage. RDS manages PostgreSQL installation, upgrades, storage management, replication, and backups.
Architecture Comparison
Aurora PostgreSQL Architecture
- Single virtual cluster volume supported by storage nodes using locally attached SSDs
- Data automatically replicated across three Availability Zones
- Shared storage model for writer and readers
RDS PostgreSQL Architecture
- Classic Multi‑AZ with single standby instance
- Multi‑AZ DB cluster with two readable standby DB instances (semi‑synchronous replication)
- Multi‑AZ DB cluster instances placed in three separate Availability Zones, with the readable standbys adding read capacity
Feature Comparison Table
| Feature | Aurora PostgreSQL | RDS for PostgreSQL |
|---|---|---|
| Maximum Storage | 128 TiB | 64 TiB |
| Storage Type | Custom distributed storage (locally attached SSDs) | Amazon EBS (gp2/gp3, io1/io2) |
| Storage Growth | Automatic in 10 GiB increments | Auto‑scaling in steps of 10 GiB or 10% of current storage, whichever is greater |
| Storage Reduction | Automatic when data deleted | Manual |
| IOPS Limitation | No limitation based on storage size | Depends on storage type and size |
| I/O Charges | Separate (I/O‑Optimized available) | Included with storage type |
| Read Replicas | Up to 15 Aurora readers | Up to 155 read replicas (5 per instance, 3 levels of cascading) |
| Cross‑Region Replicas | Aurora Global Database | 5 cross‑Region read replicas |
| Typical Replica Lag | Few hundred milliseconds | Few seconds (optimal) to minutes (high load) |
| Backup Type | Continuous and incremental | Daily full + continuous WAL archiving |
| Backup Performance Impact | None | Slight impact on single‑AZ deployments |
| PITR Restore Time | Fast (incremental nature) | Slower (restore full + replay WALs) |
| Failover Time (Multi‑AZ) | 30 seconds (DNS: 10‑15 s, Recovery: 3‑10 s) | 1‑2 minutes (includes crash recovery) |
| Crash Recovery | Immediate (on‑demand parallel replay) | Depends on checkpoint interval (default 5 min) |
| Multi‑AZ Options | Single configuration | One standby or two readable standbys |
| Write Latency (Multi‑AZ) | Standard | Up to 2× lower commit latency with two standbys |
| Replication Method | Shared storage | PostgreSQL streaming replication |
| Write Impact on Replicas | Negligible | Significant (processes transaction logs) |
| Data Replication | 6 copies across 3 AZs | Synchronous to standby, async to replicas |
| Serverless Option | Aurora Serverless v2 | Not available |
| Fast Database Cloning | Yes | No (snapshot restore only) |
| Query Plan Management | Yes (QPM) | Not available |
| Cluster Cache Management | Yes (warm cache failover) | Not available |
| Machine Learning Integration | Yes (native SQL) | Not available |
Detailed Feature Analysis
Storage
Aurora PostgreSQL Storage
- Single virtual cluster volume supported by storage nodes using locally attached SSDs
- Automatic growth in 10 GiB increments up to 128 TiB
- Dynamic reduction when data is deleted
- Six copies of the data replicated automatically across three Availability Zones
- No IOPS limitation based on storage size (may need to scale DB instance)
- Separate I/O charges applied per usage
- I/O‑Optimized configuration provides up to 40 % cost savings when I/O spend exceeds 25 % of Aurora database spend
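The 25% rule of thumb above can be written as a quick check. This is a minimal sketch, not an AWS pricing calculator: the function names and the sample bill are hypothetical, and the only input taken from the text is the 25% threshold.

```python
def io_share(io_spend: float, total_aurora_spend: float) -> float:
    """Fraction of total Aurora spend that goes to I/O charges."""
    if total_aurora_spend <= 0:
        raise ValueError("total_aurora_spend must be positive")
    return io_spend / total_aurora_spend


def worth_evaluating_io_optimized(io_spend: float,
                                  total_aurora_spend: float,
                                  threshold: float = 0.25) -> bool:
    """True when the I/O share exceeds the 25% threshold quoted above."""
    return io_share(io_spend, total_aurora_spend) > threshold


if __name__ == "__main__":
    # Hypothetical monthly bill: 400 USD of I/O charges out of 1,200 USD total.
    print(worth_evaluating_io_optimized(400.0, 1200.0))  # True (share is about 33%)
```

A result of `True` only suggests that I/O‑Optimized is worth pricing out; the actual saving depends on the workload and current pricing.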
RDS for PostgreSQL Storage
- Amazon EBS SSD‑based storage types:
  - General Purpose SSD (gp2): 3 IOPS per provisioned GiB, with bursts up to 3,000 IOPS
  - General Purpose SSD (gp3): performance provisioned independently of size; baseline of 3,000 IOPS and 125 MiB/s for volumes under 400 GiB
  - Provisioned IOPS (io1, io2): 1,000–256,000 IOPS range
- Storage auto‑scaling in chunks of 10 GiB or 10 % of current storage (whichever is greater)
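The gp2 sizing rule and the auto‑scaling step described above come down to simple arithmetic. The sketch below uses only the figures quoted in this section (3 IOPS per GiB, 10 GiB or 10%). The function names are hypothetical, and the model deliberately ignores any service‑side minimums, maximums, or other scaling inputs.

```python
def gp2_baseline_iops(allocated_gib: int) -> int:
    """Baseline IOPS for gp2 using the 3 IOPS per GiB rule quoted above.

    Simplification: ignores any service-side floor or ceiling.
    """
    return 3 * allocated_gib


def autoscale_increment_gib(allocated_gib: int) -> float:
    """One auto-scaling step: the greater of 10 GiB or 10% of current storage."""
    return max(10.0, allocated_gib / 10)


if __name__ == "__main__":
    print(gp2_baseline_iops(200))        # 600
    print(autoscale_increment_gib(50))   # 10.0 (the 10 GiB floor applies)
    print(autoscale_increment_gib(500))  # 50.0 (10% of 500 GiB)
```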
Backup
Aurora PostgreSQL Backup
- Continuous and incremental automated backups
- No performance impact or interruption during backups
- Fast PITR due to incremental nature
- Restore time depends on volume size and transaction log count
RDS for PostgreSQL Backup
- Daily automated backups during the defined backup window
- Slight performance impact on single‑AZ deployments when backup initiates
- Continuous WAL archiving
- PITR process: restore full backup + replay WALs to the desired point in time
- Slower for write‑intensive workloads (long WAL replay time)
- Tip: Frequent manual snapshots reduce PITR duration
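To see why more frequent snapshots shorten PITR, consider how much WAL has to be replayed after the base restore. The following is a simplified model, assuming the restore starts from the latest snapshot taken at or before the target time. The function name and the timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def wal_replay_window(snapshots: list[datetime], target: datetime) -> timedelta:
    """Span of WAL to replay: target time minus the latest snapshot at or before it."""
    candidates = [s for s in snapshots if s <= target]
    if not candidates:
        raise ValueError("no snapshot precedes the restore target")
    return target - max(candidates)


if __name__ == "__main__":
    target = datetime(2024, 1, 1, 23, 0)
    daily_only = [datetime(2024, 1, 1, 3, 0)]
    with_manual = daily_only + [datetime(2024, 1, 1, 12, 0),
                                datetime(2024, 1, 1, 20, 0)]
    print(wal_replay_window(daily_only, target))   # 20:00:00
    print(wal_replay_window(with_manual, target))  # 3:00:00
```

In this model the extra snapshots cut the replay window from 20 hours to 3 hours. Actual restore time also depends on how much WAL the workload generated in that window.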
Scalability
Aurora PostgreSQL Scalability
- Up to 15 readers for read scaling and high availability
- Shared storage model minimizes impact of high write workloads on replication
- Minimal replica lag (few hundred milliseconds, occasionally up to 60 s)
- Auto‑restart of readers if lag exceeds threshold
- Write capacity limited by single writer instance
RDS for PostgreSQL Scalability
- Up to 155 read replicas (5 per instance, 3 cascading levels)
- Cascading architecture reduces overhead on source instance
- Progressive replication lag with more intermediaries in cascade
- Read replica promotion to standalone instances
- 5 cross‑Region read replicas available
- Streaming replication via PostgreSQL WAL records
- Higher replica lag risk with high write activity or mismatched storage/instance class
- Two readable standbys in Multi‑AZ three‑AZ deployment serve both HA and scalability
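The figure of 155 read replicas follows from the cascade: 5 replicas on the source, 5 on each of those, and 5 on each of the second level. A short worked check (the function name is hypothetical):

```python
def max_read_replicas(per_instance: int = 5, levels: int = 3) -> int:
    """Total replicas when every instance at each level feeds `per_instance` replicas."""
    return sum(per_instance ** level for level in range(1, levels + 1))


if __name__ == "__main__":
    print(max_read_replicas())  # 155 = 5 + 25 + 125
```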
Crash Recovery
Aurora PostgreSQL
- No traditional checkpoints; the storage system applies log records directly
- Parallel and asynchronous redo record replay per storage segment
- Immediate availability after a crash
RDS for PostgreSQL
- Replays transaction logs since the last checkpoint (default: 5 minutes apart)
- Checkpoint process writes dirty pages from memory to storage
- Trade‑off: frequent checkpoints reduce recovery time but may increase I/O load
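The checkpoint trade‑off can be illustrated with a first‑order estimate: the WAL that accumulates over one checkpoint interval is roughly the write rate multiplied by the interval. This is only a sketch under a steady‑rate assumption. It ignores how checkpoints are spread over time and any full‑page‑write overhead, and the function name and the 2 MiB/s rate are hypothetical.

```python
def wal_per_interval_mib(wal_rate_mib_per_s: float,
                         checkpoint_interval_s: float = 300.0) -> float:
    """WAL generated over one checkpoint interval at a steady write rate."""
    return wal_rate_mib_per_s * checkpoint_interval_s


if __name__ == "__main__":
    # Default 5-minute interval versus a 1-minute interval at 2 MiB/s of WAL.
    print(wal_per_interval_mib(2.0))        # 600.0
    print(wal_per_interval_mib(2.0, 60.0))  # 120.0
```

The shorter interval leaves less WAL to replay after a crash, at the cost of flushing dirty pages more often.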
Failover
Aurora PostgreSQL Failover
- Typical failover time: ~30 seconds
  - DNS propagation: 10‑15 seconds
  - Recovery: 3‑10 seconds (runs in parallel with DNS propagation)
- Automatic promotion of a reader to primary
RDS for PostgreSQL Failover
- Typical failover time: 1‑2 minutes (includes DNS propagation and crash recovery)
- Depends on crash recovery time, DNS propagation, and TTL settings
- Multi‑AZ DB cluster deployments with two readable standbys can fail over in under 35 seconds and offer up to 2× lower transaction commit latency than a single‑standby Multi‑AZ deployment
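Because a failover takes anywhere from about 30 seconds to 2 minutes, client code usually needs to retry connections rather than fail on the first error. Below is a minimal sketch of a retry loop with exponential backoff and an overall deadline. `connect` is a hypothetical callable standing in for whatever your database driver provides, and the default timings are assumptions chosen to cover the failover ranges quoted above.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def connect_with_retry(connect: Callable[[], T],
                       deadline_s: float = 120.0,
                       initial_delay_s: float = 1.0,
                       max_delay_s: float = 10.0,
                       sleep: Callable[[float], None] = time.sleep,
                       clock: Callable[[], float] = time.monotonic) -> T:
    """Call `connect` until it succeeds or the deadline would be exceeded."""
    start = clock()
    delay = initial_delay_s
    while True:
        try:
            return connect()
        except ConnectionError:
            # Give up if waiting once more would push us past the deadline.
            if clock() - start + delay > deadline_s:
                raise
            sleep(delay)
            delay = min(delay * 2, max_delay_s)  # exponential backoff, capped
```

The 120‑second default matches the upper end of the 1‑2 minute range for a single‑standby deployment; a shorter deadline may be reasonable for Aurora or for two‑standby Multi‑AZ clusters. The `sleep` and `clock` parameters exist only so the loop can be tested without real waiting.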
RDS Proxy Benefits
Both services support Amazon RDS Proxy for:
- Connection pooling and sharing
- Faster failover recovery
- Automatic connection to the new primary instance
- Maintaining idle connections during failover
High Availability and Disaster Recovery