Counting to 3 with a new builder processing 50M+ monthly builds
Blog post from Railway
Railway's move from Docker buildx to its standalone Builder v3 system reworked its build infrastructure for efficiency and scale. Builds originally ran on GCP VMs, which brought high egress costs, no way to contain noisy neighbors, and poor resource utilization. Builder v3 addresses these problems by running builds in microVMs on bare-metal hosts under a more efficient scheduling system, raising capacity to 66,000 builds per hour at peak. Getting there meant solving problems in network configuration and resource isolation, and redesigning the build process for reliability and performance. Railway also cut build times by removing unnecessary layers from the build architecture and optimizing metadata handling, and the system as a whole is now more stable and efficient. The long-term goal is a buildless infrastructure that makes deployments more efficient still.
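
To make the idea of scheduling isolated builds onto bare-metal hosts concrete, here is a minimal sketch of resource-aware placement. It is not Railway's scheduler: the post summary gives no implementation details, so the `Host` class, the `place_build` function, the CPU and memory figures, and the least-loaded policy are all assumptions chosen for illustration. The point is only that a scheduler tracking per-host reservations can refuse to oversubscribe a machine, which is one way to limit noisy-neighbor effects.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Host:
    """Hypothetical bare-metal host with fixed CPU and memory capacity."""
    name: str
    cpu_total: int
    mem_total_mb: int
    cpu_used: int = 0
    mem_used_mb: int = 0
    builds: List[str] = field(default_factory=list)

    def fits(self, cpu: int, mem_mb: int) -> bool:
        # A build fits only if both reservations stay within capacity.
        return (self.cpu_used + cpu <= self.cpu_total
                and self.mem_used_mb + mem_mb <= self.mem_total_mb)

    def load(self) -> float:
        # Use the more constrained resource as the host's load figure.
        return max(self.cpu_used / self.cpu_total,
                   self.mem_used_mb / self.mem_total_mb)


def place_build(hosts: List[Host], build_id: str,
                cpu: int, mem_mb: int) -> Optional[Host]:
    """Reserve resources on the least-loaded host that can fit the build.

    Returns the chosen host, or None if no host has room (the caller
    would queue the build in that case). Ties are broken by host name
    so that placement is deterministic.
    """
    candidates = [h for h in hosts if h.fits(cpu, mem_mb)]
    if not candidates:
        return None
    target = min(candidates, key=lambda h: (h.load(), h.name))
    target.cpu_used += cpu
    target.mem_used_mb += mem_mb
    target.builds.append(build_id)
    return target


if __name__ == "__main__":
    # Two assumed hosts, each with 8 CPUs and 16 GiB of memory.
    hosts = [Host("host-a", 8, 16384), Host("host-b", 8, 16384)]
    for i in range(5):
        chosen = place_build(hosts, f"build-{i}", cpu=4, mem_mb=8192)
        print(f"build-{i}:", chosen.name if chosen else "queued")
```

With these assumed numbers, each host holds two builds, so the first four builds alternate between the two hosts and the fifth is left for a queue. A real scheduler would also need to release reservations when builds finish and account for disk and network, which this sketch leaves out.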