Company
Date Published
Author
Christian Schwarz
Word count
2795
Language
English
Hacker News points
2

Summary

Neon's Pageserver suffered a severe regression in cold start latency after a change intended to improve stability. The root cause was inefficient closing of file descriptors, a step performed as part of the defense-in-depth strategy against compromised walredo processes. The fix replaced the `close_fds` crate with a direct `close_range` system call, which sped up process creation and reduced latency. The incident highlights the value of dynamic tracing at runtime and the need to weigh the trade-offs among process creation primitives.
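
To make the idea of issuing `close_range` directly more concrete, here is a minimal Python sketch. It is not the Pageserver's Rust code. It only illustrates the general technique of closing a whole range of descriptors with one system call instead of one `close` per descriptor. The helper name `close_fd_range`, the hard-coded syscall number 436 (believed to be correct for x86_64 and aarch64 Linux), and the fallback behavior are all assumptions made for illustration.

```python
import ctypes
import errno
import os
import sys

# Assumption: close_range is syscall 436 on x86_64 and aarch64 Linux.
SYS_CLOSE_RANGE = 436


def close_fd_range(first: int, last: int) -> str:
    """Close every descriptor in [first, last] (hypothetical helper).

    Returns "close_range" if the single syscall succeeded, or "fallback"
    if the descriptors had to be closed through os.closerange instead.
    """
    if sys.platform.startswith("linux"):
        libc = ctypes.CDLL(None, use_errno=True)
        libc.syscall.restype = ctypes.c_long
        ctypes.set_errno(0)
        rc = libc.syscall(
            ctypes.c_long(SYS_CLOSE_RANGE),
            ctypes.c_uint(first),
            ctypes.c_uint(last),
            ctypes.c_uint(0),  # flags: 0 means "just close them"
        )
        if rc == 0:
            return "close_range"
        err = ctypes.get_errno()
        # ENOSYS: kernel too old. EPERM: some sandboxes filter the syscall.
        if err not in (errno.ENOSYS, errno.EPERM):
            raise OSError(err, os.strerror(err))
    # Fallback: clamp the upper bound to the process limit, then close
    # the range through the standard library.
    limit = os.sysconf("SC_OPEN_MAX")
    os.closerange(first, min(last, limit - 1) + 1)
    return "fallback"
```

In a child process that is about to `exec` an untrusted helper, a call such as `close_fd_range(3, 0xFFFFFFFF)` would drop everything except stdin, stdout, and stderr. The cost of the single-syscall path does not grow with the descriptor limit, whereas a loop over every possible descriptor number does.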