Scale-to-Zero is now in public preview. It is a serverless infrastructure feature that automatically scales GPU and CPU workloads up and down with traffic, lowering cost and simplifying resource management. Combined with autoscaling, it lets applications "sleep" when idle and "wake" on incoming requests, which suits compute-intensive workloads such as inference and multi-tenant SaaS deployments. Usage is billed per second, and inactive services incur no charges. Scale-to-Zero supports global deployments with no additional fees for extra regions, and it addresses cold-start concerns with a startup time of 1 to 5 seconds, which is expected to improve further. Users configure the feature through the control panel or the CLI: a service scales down to zero when inactive and scales back up automatically when demand returns, making it a flexible, controllable, and globally customizable way to manage serverless applications.
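
To make the per-second billing and sleep/wake behavior concrete, here is a minimal sketch of how billed time might be estimated for a service that sleeps after a period of inactivity. Everything in it is an assumption for illustration: the function names, the `idle_timeout_s` parameter, and the price are hypothetical, the 1 to 5 second startup time is ignored, and the platform's actual idle detection and pricing are not described in the text above.

```python
def billed_seconds(request_times, idle_timeout_s):
    """Estimate how many seconds a service is awake (and therefore billed).

    Assumed model (hypothetical): the service wakes on a request and goes
    back to sleep once `idle_timeout_s` seconds pass with no new request.
    """
    total = 0
    window_start = None  # when the current awake period began
    window_end = None    # when the service would fall asleep again

    for t in sorted(request_times):
        if window_end is None or t > window_end:
            # The service was asleep: close the previous awake period, then wake.
            if window_end is not None:
                total += window_end - window_start
            window_start = t
        # Every request pushes the sleep deadline forward.
        window_end = t + idle_timeout_s

    if window_end is not None:
        total += window_end - window_start
    return total


def cost(seconds, price_per_second):
    """Per-second billing: pay only for the seconds the service is awake."""
    return seconds * price_per_second


# Three requests in one hour, with a hypothetical 60-second idle timeout.
awake = billed_seconds([0, 10, 400], idle_timeout_s=60)
print(awake)  # 130

# Hypothetical price, used only to compare against an always-on service.
PRICE = 0.0005
print(cost(awake, PRICE) < cost(3600, PRICE))  # True
```

In this model the service is billed for 130 seconds rather than the full 3,600 seconds of the hour, which is where the savings for bursty or low-traffic workloads would come from.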