In this blog post, the authors describe how they built a custom load balancer to improve the performance and scalability of their infrastructure. The need arose because the Google Cloud Load Balancer (GCLB) distributed requests unevenly across machines, leaving some overloaded and others underutilized. The authors built a proof of concept using subsetting, which they describe as a simple concept that routes each request to the machine that is least loaded at that moment. They compared its results with GCLB's and were surprised by the differences. After building a simulator to test different strategies and tweaks, they deployed the custom load balancer in production and found that resource utilization became significantly more uniform and spurious errors dropped. The authors are now working on adding vertical scalability and are excited about the potential of their new primitive.
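The least-loaded placement idea mentioned above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the authors' implementation or simulator: the function names (`pick_least_loaded`, `simulate`, `spread`) are hypothetical, load is modeled as a simple count of assigned requests, and requests never complete. Random placement is included purely as a baseline for comparison; it is not a model of GCLB.

```python
import random


def pick_least_loaded(loads):
    """Return the index of the machine with the smallest load (first one on ties)."""
    return min(range(len(loads)), key=loads.__getitem__)


def simulate(num_machines, num_requests, policy, seed=0):
    """Assign requests one at a time and return the final per-machine load.

    Assumption: a request adds one unit of load and never finishes,
    so this only shows how evenly each policy spreads work.
    """
    rng = random.Random(seed)
    loads = [0] * num_machines
    for _ in range(num_requests):
        if policy == "least_loaded":
            target = pick_least_loaded(loads)
        elif policy == "random":
            target = rng.randrange(num_machines)
        else:
            raise ValueError(f"unknown policy: {policy}")
        loads[target] += 1
    return loads


def spread(loads):
    """Gap between the busiest and the idlest machine."""
    return max(loads) - min(loads)


if __name__ == "__main__":
    for policy in ("least_loaded", "random"):
        loads = simulate(num_machines=10, num_requests=1000, policy=policy)
        print(policy, "spread:", spread(loads))
```

In this simplified model, least-loaded placement of 1000 requests over 10 machines gives every machine exactly 100 requests, while random placement generally leaves a gap between the busiest and idlest machine. A realistic comparison would also need request completion times and stale load information, which this sketch leaves out.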