Surpassing 10Gb/s over Tailscale
Blog post from Tailscale
Tailscale has made a series of performance improvements to wireguard-go, its userspace WireGuard implementation, that raise client throughput on Linux. With these changes, Tailscale surpasses 10Gb/s on bare metal Linux and outperforms the in-kernel WireGuard implementation in the tests reported. The gains come from UDP segmentation offload, UDP generic receive offload, and checksum optimizations, which together reduce the CPU cycles spent per byte. The changes are available in the unstable Tailscale client release and will ship in Tailscale v1.40. Testing covered several systems, including AWS instances and bare metal servers, and some hardware configurations also benefited from hardware UDP segmentation offload. The post notes that benchmark results vary by system: throughput rose by roughly 35% in some configurations and nearly doubled in others.
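To make the "checksum optimizations" point more concrete, here is a minimal sketch in Go of one generic way to cut the per-byte cost of the Internet checksum (RFC 1071): reading 8 bytes per loop iteration instead of 2. This is an illustration under stated assumptions, not wireguard-go's actual code; the function names checksum16, checksumWide, and fold are hypothetical and do not come from the post.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fold reduces a wide one's-complement accumulator to 16 bits
// and returns its complement, as RFC 1071 describes.
func fold(sum uint64) uint16 {
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return ^uint16(sum)
}

// checksum16 is the straightforward version: one 16-bit word per iteration.
// (Hypothetical helper for illustration.)
func checksum16(b []byte) uint16 {
	var sum uint64
	for len(b) >= 2 {
		sum += uint64(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		sum += uint64(b[0]) << 8 // pad the odd trailing byte with zero
	}
	return fold(sum)
}

// checksumWide reads 8 bytes per iteration and adds the two 32-bit halves
// to a 64-bit accumulator. Because 2^16 is congruent to 1 modulo 0xffff,
// folding the wide sum gives the same result as summing 16-bit words.
// (Hypothetical helper for illustration.)
func checksumWide(b []byte) uint16 {
	var sum uint64
	for len(b) >= 8 {
		w := binary.BigEndian.Uint64(b)
		sum += (w >> 32) + (w & 0xffffffff)
		b = b[8:]
	}
	if len(b) >= 4 {
		sum += uint64(binary.BigEndian.Uint32(b))
		b = b[4:]
	}
	if len(b) >= 2 {
		sum += uint64(binary.BigEndian.Uint16(b))
		b = b[2:]
	}
	if len(b) == 1 {
		sum += uint64(b[0]) << 8
	}
	return fold(sum)
}

func main() {
	// Example bytes from RFC 1071, section 3.
	data := []byte{0x00, 0x01, 0xf2, 0x03, 0xf4, 0xf5, 0xf6, 0xf7}
	fmt.Printf("%#x %#x\n", checksum16(data), checksumWide(data))
}
```

Both functions return the same value for any input; the wide version simply does fewer loop iterations per byte, which is the kind of saving that matters once segmentation offload and receive offload hand the userspace code much larger buffers to checksum.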