Store and update namespace statistics in a performant manner
Blog post from GitLab
Managing storage space on large GitLab instances, such as GitLab.com, is difficult because nothing restricts storage-consuming items like wikis, LFS objects, artifacts, and packages. To address this, the proposal creates an ActiveRecord model that tracks storage statistics for root namespaces and updates them whenever a related project changes. The hard part is doing those updates efficiently, because GitLab.com is already burdened by frequent and costly database queries. Several approaches were considered, including PostgreSQL materialized views, Common Table Expressions, Redis storage, and direct tagging of namespaces, but each had significant drawbacks. The chosen solution, Attempt E, updates namespace storage statistics asynchronously through Sidekiq jobs, so the updates do not lengthen existing transactions. Although complex, this approach avoids the need for a background migration and is compatible with both PostgreSQL and MySQL. It was rolled out gradually under a feature flag on GitLab.com: first tested in a staging environment, then enabled for the gitlab-org group, and finally enabled globally. Performance is monitored through Sidekiq dashboards, and so far the jobs have run successfully without significant issues, paving the way for future enforcement of storage limits.
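
The asynchronous pattern described above can be illustrated with a minimal plain-Ruby sketch. It is not GitLab's implementation: `Project`, `NamespaceStatisticsSketch`, `project_changed`, and `run_pending_jobs` are hypothetical names, an in-memory set stands in for the Sidekiq queue, and a hash stands in for the namespace statistics table. The idea shown is only that a project change marks its root namespace as needing a refresh, and the costly aggregation runs later, once per namespace.

```ruby
require "set"

# Hypothetical in-memory stand-in for a project row (not GitLab's model).
Project = Struct.new(:id, :root_namespace_id, :storage_size)

class NamespaceStatisticsSketch
  attr_reader :totals

  def initialize(projects)
    @projects = projects
    @pending = Set.new     # root namespace ids waiting for a refresh
    @totals = Hash.new(0)  # assumed stand-in for the statistics table
  end

  # Called when a project's storage changes. It only records that the root
  # namespace is stale, so the caller's "transaction" stays short.
  def project_changed(project, new_size)
    project.storage_size = new_size
    @pending.add(project.root_namespace_id)
  end

  # Stand-in for the background job: recompute each stale root namespace
  # once, no matter how many of its projects changed in the meantime.
  def run_pending_jobs
    processed = @pending.size
    @pending.each do |namespace_id|
      @totals[namespace_id] = @projects
        .select { |p| p.root_namespace_id == namespace_id }
        .sum(&:storage_size)
    end
    @pending.clear
    processed
  end
end
```

In this sketch, two changes to projects under the same root namespace result in a single recalculation when `run_pending_jobs` is called, and the stored total is stale until then. That delay is the trade-off an asynchronous design accepts in exchange for keeping the original update cheap.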