Neon experienced two significant outages in May 2025 in its AWS us-east-1 region. Both stemmed from IP allocation failures, caused primarily by a misconfiguration of the AWS CNI plugin, which allocates networking resources for Kubernetes Pods. The disruption began when a periodic job in Neon's Control Plane failed to terminate idle databases; the number of active Compute Pods surged unexpectedly and exhausted the available IP addresses in the VPC subnets. Initial configuration changes intended to free up IPs did not resolve the issue, because the AWS CNI plugin by default holds IPs in a cooldown state and allocates them inefficiently. A subsequent attempt to stabilize the system by modifying the WARM_IP_TARGET setting inadvertently made things worse: the new configuration produced IP allocation errors that prevented new Pods from starting. The investigation found that the plugin did not handle IPs the way the team expected, particularly when WARM_IP_TARGET was set to values that conflicted with the underlying IP management logic. The errors eventually subsided once idle compute terminations were temporarily halted, which allowed IPs to be reallocated. Neon has since submitted a pull request to AWS CNI to address these issues and has published its findings to help other teams avoid similar problems.
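To make the IP-exhaustion mechanism more concrete, here is a rough back-of-the-envelope sketch in Python. It is not the AWS CNI plugin's actual algorithm, and none of the numbers come from Neon's incident. It assumes a simplified model in which a node either keeps whole spare ENIs attached (the `WARM_ENI_TARGET` setting, which is not discussed in the summary above) or keeps a fixed number of spare IPs (`WARM_IP_TARGET`), plus any IPs still sitting in cooldown. The function names, subnet size, and Pod counts are all hypothetical.

```python
import math


def ips_held_with_warm_eni(pods, ips_per_eni, warm_eni_target=1):
    """Simplified: a node reserves every IP on each in-use ENI plus spare ENIs."""
    enis_in_use = math.ceil(pods / ips_per_eni)
    return (enis_in_use + warm_eni_target) * ips_per_eni


def ips_held_with_warm_ip(pods, warm_ip_target, cooling_down=0):
    """Simplified: a node reserves one IP per Pod, a spare buffer, and cooling IPs."""
    return pods + warm_ip_target + cooling_down


def subnet_free_ips(subnet_size, per_node_held):
    """IPs left in the subnet after every node has taken its reservation."""
    return subnet_size - sum(per_node_held)


# Hypothetical cluster: three nodes running 5, 12, and 0 Pods in a 256-IP subnet.
nodes_pods = [5, 12, 0]
eni_mode = [ips_held_with_warm_eni(p, ips_per_eni=10) for p in nodes_pods]
ip_mode = [ips_held_with_warm_ip(p, warm_ip_target=2) for p in nodes_pods]

print("free IPs, whole-ENI warm pool:", subnet_free_ips(256, eni_mode))
print("free IPs, per-IP warm pool:   ", subnet_free_ips(256, ip_mode))
```

Under these assumptions, keeping whole ENIs warm reserves far more addresses than the Pods actually use, which is one way a surge in Pods could drain a subnet faster than the Pod count alone suggests. The `cooling_down` parameter shows where IPs held in cooldown would add to that total.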