
Postmortem for Aurora Postgres Migration November 23 2022

Blog post from RevenueCat

Post Details
Company: RevenueCat
Author: Guillermo Pérez
Word Count: 2,787
Language: English
Summary

On November 23, 2022, a migration from AWS Aurora Postgres 10.x to 14.x caused significant performance degradation and severely affected backend systems. The cause was inefficient query planning: ANALYZE had not been run on the largest tables after the migration, so the query planner lacked fresh statistics for them. The migration, planned because support for Aurora Postgres 10.x was about to end, used a new approach based on database replication to keep data consistent. Despite extensive preparation, the transition resulted in a temporary failure of backend systems and degraded the user experience, especially for new purchases, which had trouble unlocking entitlements. The complexity of the system produced cascading failures, which made the root cause hard to identify, and the incident took several hours to resolve. Mitigation included adjusting query plans, manually running ANALYZE on key tables, and temporarily suspending incoming Apple webhook requests to help the system recover. The post-migration assessment highlighted the importance of timing, communication, and thorough testing, particularly of write operations, in preventing similar incidents.
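The remediation step of running ANALYZE manually on key tables can be illustrated with a small sketch. This is not RevenueCat's tooling: the `TableInfo` type, the helper names, and the table names and sizes are all hypothetical, and ordering the largest tables first is an assumption made here because the summary singles out the largest tables. In practice the sizes might come from a catalog query using `pg_total_relation_size`, and the generated statements would be run through whatever client the team already uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical description of a table: schema, name, and on-disk size."""
    schema: str
    name: str
    size_bytes: int


def quote_ident(identifier: str) -> str:
    """Quote a Postgres identifier, doubling any embedded double quotes."""
    return '"' + identifier.replace('"', '""') + '"'


def analyze_statements(tables, min_size_bytes=0):
    """Build one ANALYZE statement per table, largest table first.

    Tables smaller than min_size_bytes are skipped. Largest-first ordering
    is an assumption of this sketch, not something the post prescribes.
    """
    selected = [t for t in tables if t.size_bytes >= min_size_bytes]
    selected.sort(key=lambda t: t.size_bytes, reverse=True)
    return [
        f"ANALYZE {quote_ident(t.schema)}.{quote_ident(t.name)};"
        for t in selected
    ]


if __name__ == "__main__":
    # Table names and sizes below are invented for illustration only.
    example_tables = [
        TableInfo("public", "receipts", 900_000_000_000),
        TableInfo("public", "subscribers", 400_000_000_000),
        TableInfo("public", "settings", 5_000_000),
    ]
    for statement in analyze_statements(example_tables, min_size_bytes=1_000_000_000):
        print(statement)
```

Generating the statements separately from executing them keeps the list reviewable before a cutover, so the ANALYZE step can be added to a migration checklist and not left to autovacuum.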