How We Oops-Proofed Infrastructure Deletion on Railway

Blog post from Railway

Post Details
Company: Railway
Date Published: -
Author: Mahmoud Abdelwahab
Word Count: 1,914
Language: -
Hacker News Points: -
Summary

Railway offers a 48-hour grace period during which deletions of production resources can be undone, reducing the risk of accidental data loss during infrastructure management, particularly when using tools like Terraform or Kubernetes. The feature is built on Temporal, a workflow engine for reliable, stateful processes. Temporal's Workflows, Activities, and Signals support the operations involved, such as delaying volume deletions and automatically retrying steps that fail. A deletion passes through several stages, including authorization checks, committing the patch, and a delayed deletion workflow, all coordinated through Temporal so that deletions are safe, observable, and consistent across layers. Once the grace period ends, a cleanup process removes the resources both physically and from the database, leaving nothing behind. With this safeguard, Railway aims to give developers a more forgiving infrastructure environment and to reduce the stress and potential damage of accidental deletions.