
Scaling monorepo maintenance

Blog post from GitHub

Post Details
Company: GitHub
Date Published:
Author: Taylor Blau
Word Count: 4,710
Language: English
Hacker News Points: -
Summary

GitHub changed how it maintains and repacks some of its largest and fastest-growing Git repositories. Its maintenance job traditionally repacked an entire repository into a single packfile, which was costly in time and resources, especially for large repositories. The replacement rests on two building blocks: multi-pack indexes, which allow efficient object lookups across multiple packs, and multi-pack bitmaps, which extend reachability bitmaps beyond a single pack. On top of these, a geometric repacking strategy distributes objects across multiple packfiles and concentrates the work on recently added objects, which shortens repack times and reduces how often a full repository repack is needed. The changes significantly reduce CPU time and repack duration, and the improvements are being contributed to the open-source Git project for future releases.
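The summary does not spell out how a geometric strategy decides which packs to combine, so the following is only a minimal sketch of one plausible rule, not GitHub's or Git's actual code. It assumes packs are compared by object count and that the goal is for each kept pack to hold at least `factor` times as many objects as the next-smaller one; the function name `geometric_split` and the default `factor=2` are hypothetical choices for illustration.

```python
def geometric_split(pack_sizes, factor=2):
    """Split packs into (rolled_up, kept) under an assumed geometric rule.

    pack_sizes: object counts, one per pack (hypothetical input shape).
    Packs in rolled_up would be combined into one new pack; packs in
    kept are left untouched.
    """
    sizes = sorted(pack_sizes)
    n = len(sizes)

    # Walk from the largest pack downward and find the first place where
    # a pack is NOT at least `factor` times its smaller neighbour.
    split = 0
    for i in range(n - 1, 0, -1):
        if sizes[i] < factor * sizes[i - 1]:
            split = i
            break

    # The packs already form a geometric progression: nothing to do.
    if split == 0:
        return [], sizes

    # Everything below the split is combined. The combined pack may now
    # be too large relative to the next pack, so keep absorbing packs
    # until the progression holds again.
    total = sum(sizes[:split])
    while split < n and sizes[split] < factor * total:
        total += sizes[split]
        split += 1

    return sizes[:split], sizes[split:]


if __name__ == "__main__":
    rolled_up, kept = geometric_split([100, 10, 3, 2])
    print("roll up:", rolled_up, "keep:", kept)
```

With the sample input, only the two smallest packs are combined and the two largest are left alone, which is the behavior the summary describes: the work concentrates on small, recently added packs while large, older packs are rarely rewritten. Git itself exposes a geometric mode as `git repack --geometric=<factor>`; the sketch above is not claimed to match that implementation line for line.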