Company
Date Published
Author
MongoDB
Word count
2097
Language
English
Hacker News points
None

Summary

The MongoDB Backup Service's client-side agent transparently streams a replica set's updates to the MMS cloud for backup, relying on MongoDB's document model to keep the amount of code small. Its primary job is to read data from customer replica sets, batch it for network transmission, and send it to the ingestion service in the MMS cloud, which validates it for correctness and processes it further. The two sides form a client-server architecture in which the ingestion service handles any batching and pipelining beyond what the agent does itself.

The agent needed an expressive language with strong concurrency support. Java was the initial choice, but the project ultimately settled on Go because it compiles to native binaries, has strong idioms, has an excellent MongoDB driver, and supports concurrency at the language level. The agent accesses collections as an ordinary client, uses resources efficiently, and keeps administrative hassle to a minimum. When a replica set fails over or the oplog rolls over, the agent recovers by using the MMS cloud's copy of the client's oplog.

The agent pipeline consists of three goroutines: one tails the oplog, one slices and compresses the entries, and one ships the compressed slices to ingestion. Initial sync captures a snapshot of a new replica set, while the clustershot feature captures a chunk-consistent snapshot of a sharded cluster by writing tokens into the oplog to create a synchronization point.