End-to-End SMS Provider Testing: It's How We Ensure SMS Alerts Are Delivered
Blog post from PagerDuty
PagerDuty runs End-to-End SMS Provider Testing to catch delays and outages from third-party carriers before they affect customer alerts, even when the carriers' status pages report full availability. The system continuously sends test SMS messages through each of PagerDuty's providers to a set of Android phones on different mobile carrier networks and measures how long each message takes to arrive. A provider is considered degraded when delivery latency exceeds three minutes or when multiple test messages go missing; the team is then alerted so it can replace that provider and keep customer alerts flowing. Adjusting provider priority is currently a manual step, and PagerDuty plans to automate it with a probabilistic model to reduce the noise of failure alerts and let the team focus on solving issues. According to the post, this testing and automation have made the service more reliable and given the team deeper insight into the performance of the systems it connects to, which improves the experience for users.
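To make the degradation rule concrete, here is a minimal Python sketch of how such a check might look. It is not PagerDuty's implementation. The three-minute latency limit comes from the post; everything else is an assumption for illustration, including the names `ProbeResult` and `is_degraded`, the provider and carrier labels, and the reading of "multiple missed messages" as two or more.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# From the post: delivery latency over three minutes counts as degraded.
LATENCY_LIMIT_SECONDS = 180.0
# Assumption: "multiple missed messages" is read here as two or more.
MISSED_LIMIT = 2


@dataclass(frozen=True)
class ProbeResult:
    """One test SMS sent through a provider to a phone on a carrier (hypothetical type)."""
    provider: str
    carrier: str
    # Seconds between sending and arrival; None means the message never arrived.
    latency_seconds: Optional[float]


def is_degraded(results: Sequence[ProbeResult]) -> bool:
    """Return True if any probe was too slow or too many probes went missing."""
    missed = sum(1 for r in results if r.latency_seconds is None)
    too_slow = any(
        r.latency_seconds is not None and r.latency_seconds > LATENCY_LIMIT_SECONDS
        for r in results
    )
    return too_slow or missed >= MISSED_LIMIT


if __name__ == "__main__":
    recent = [
        ProbeResult("provider-a", "carrier-x", 12.0),
        ProbeResult("provider-a", "carrier-y", 240.0),  # over three minutes
    ]
    if is_degraded(recent):
        print("provider-a looks degraded; alert the team")
```

A single missed message does not trip this check on its own, which matches the post's wording that it takes multiple misses (or one slow delivery) to flag a provider.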