
Store and update namespace statistics in a performant manner

Blog post from GitLab

Post Details
Company: GitLab
Date Published:
Author: Mayra Cabrera
Word Count: 1,793
Company Posts That Month: 27
Language: English
Hacker News Points: -
Post removed?: No
Summary

On large GitLab instances such as GitLab.com, managing storage is difficult because nothing limits how much space wikis, LFS objects, artifacts, and packages can consume. The post proposes an ActiveRecord model that tracks storage statistics for root namespaces and is updated whenever a project under that namespace changes. The hard part is doing those updates efficiently, because GitLab.com is already burdened by frequent and expensive database queries.

Several approaches were considered: PostgreSQL materialized views, Common Table Expressions, storing the statistics in Redis, and tagging namespaces directly. Each had significant drawbacks. The chosen solution, Attempt E, updates namespace storage statistics asynchronously in Sidekiq jobs, so the statistics are refreshed without lengthening the transaction that changed the project. Although more complex, this approach avoids the need for a background migration and is compatible with both PostgreSQL and MySQL.

The change was rolled out gradually on GitLab.com behind a feature flag: it was first tested in a staging environment, then enabled for the gitlab-org group, and finally rolled out globally. Performance is monitored through Sidekiq dashboards, and so far the jobs have run without significant issues, paving the way for future enforcement of storage limits.
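The asynchronous idea behind Attempt E can be illustrated with a small in-memory sketch. This is not GitLab's implementation (which is Ruby on Rails with Sidekiq); the class `StatisticsStore`, its methods, and the deduplicating `pending` set are all hypothetical names chosen for illustration. The sketch assumes that a project change only records its own size and schedules a refresh for the root namespace, and that a separate worker step recomputes the total later, outside the original write.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Project:
    id: int
    root_namespace_id: int
    storage_size: int = 0  # bytes used by this project


@dataclass
class StatisticsStore:
    """Hypothetical stand-in for the database plus a job queue."""
    projects: dict = field(default_factory=dict)     # project id -> Project
    root_totals: dict = field(default_factory=dict)  # root namespace id -> bytes
    pending: set = field(default_factory=set)        # root ids awaiting a refresh
    queue: deque = field(default_factory=deque)      # stand-in for queued jobs

    def update_project(self, project_id, root_namespace_id, storage_size):
        """Write path: save the project's size and only schedule a refresh."""
        self.projects[project_id] = Project(project_id, root_namespace_id, storage_size)
        # Assumption: at most one refresh is queued per root namespace at a time.
        if root_namespace_id not in self.pending:
            self.pending.add(root_namespace_id)
            self.queue.append(root_namespace_id)

    def run_pending_jobs(self):
        """Worker path: recompute each scheduled root namespace total once."""
        processed = 0
        while self.queue:
            root_id = self.queue.popleft()
            self.pending.discard(root_id)
            self.root_totals[root_id] = sum(
                p.storage_size
                for p in self.projects.values()
                if p.root_namespace_id == root_id
            )
            processed += 1
        return processed


store = StatisticsStore()
store.update_project(1, root_namespace_id=10, storage_size=500)
store.update_project(2, root_namespace_id=10, storage_size=250)
store.update_project(3, root_namespace_id=20, storage_size=100)
jobs_run = store.run_pending_jobs()
```

In this sketch three project updates result in only two recalculations, one per root namespace, and the totals stay stale until the worker step runs. That trade-off (cheaper writes in exchange for eventually consistent totals) is the general shape of an asynchronous update, under the assumptions stated above.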
