Company
Date Published
Author
Adam Prout
Word count
1885
Language
English
Hacker News points
1

Summary

S2's bottomless design separates storage from compute, which lowers cost, improves performance, and adds elasticity. Transactions are committed to the tail of the log on local disk, and the log is replicated to other nodes for durability. Newly committed columnstore files are then pushed to blob storage asynchronously, as soon as possible after the commit. Hot data stays cached on local disk for use by queries, and cold data is removed from local disk. The design has several advantages: small write transactions incur no extra latency compared with S2 running without blob storage; new replicas and hosts can be spun up quickly by pulling data from blob storage; and the blob store acts as an extra layer of durability with point-in-time recovery. It also has disadvantages: durability and availability are not separated in S2 today, and relying on blob storage for cross-region high availability means that some data held on local disk may be lost during a region outage. Overall, the bottomless design improves elasticity at lower cost by keeping history in the blob store and using faster local disks for cached data.
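
The write path in the summary can be illustrated with a toy model. This is only a sketch under stated assumptions, not S2's implementation: `ToyNode`, its methods, the in-memory dict standing in for blob storage, and the LRU eviction policy are all hypothetical, and the upload step that a real system would run in the background is called explicitly here.

```python
from collections import OrderedDict


class ToyNode:
    """Toy model of the summarized write path. All names are hypothetical."""

    def __init__(self, replicas, blob_store, cache_capacity):
        self.log = []                      # tail of the log on local disk
        self.replicas = replicas           # lists standing in for replica logs
        self.blob_store = blob_store       # dict standing in for blob storage
        self.local_files = OrderedDict()   # local files, least recently used first
        self.pending_upload = []           # committed files not yet in blob storage
        self.cache_capacity = cache_capacity

    def commit(self, file_name, data):
        # Commit = append to the local log and replicate it.
        # Blob storage is not touched here, so it adds no commit latency.
        self.log.append(file_name)
        for replica in self.replicas:
            replica.append(file_name)
        self.local_files[file_name] = data
        self.pending_upload.append(file_name)

    def upload_pending(self):
        # Asynchronous in a real system; invoked by hand in this sketch.
        while self.pending_upload:
            name = self.pending_upload.pop(0)
            self.blob_store[name] = self.local_files[name]
        self.evict_cold()

    def evict_cold(self):
        # Assumed policy: drop least recently used files, but only those
        # that already exist in blob storage.
        for name in list(self.local_files):
            if len(self.local_files) <= self.cache_capacity:
                break
            if name in self.blob_store:
                del self.local_files[name]

    def read(self, file_name):
        if file_name not in self.local_files:
            # Cache miss: pull the cold file back from blob storage.
            self.local_files[file_name] = self.blob_store[file_name]
        self.local_files.move_to_end(file_name)  # mark as hot
        data = self.local_files[file_name]
        self.evict_cold()
        return data


replica_log = []
blob = {}
node = ToyNode(replicas=[replica_log], blob_store=blob, cache_capacity=2)
for i in (1, 2, 3):
    node.commit(f"file{i}", f"rows-{i}")
files_in_blob_before_upload = len(blob)
node.upload_pending()
cached_after_upload = list(node.local_files)
value_read = node.read("file1")
cached_after_read = list(node.local_files)
```

In this sketch a file cannot be evicted until its upload has finished, which is one simple way to keep the local disk from holding the only copy of cold data longer than necessary. Files committed but not yet uploaded exist only in the local and replica logs, which mirrors the region-outage caveat in the summary.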