Tracking outliers in A/B testing: When one apple spoils the barrel
Blog post from Statsig
A/B testing analysts frequently face skewed distributions and outliers in key performance indicators (KPIs), particularly revenue metrics. Outliers inflate variance, which reduces statistical power and can distort test results. This piece explores the problems outliers cause, chiefly a higher probability of Type II errors in hypothesis testing, where a real effect goes undetected. Methods for identifying outliers include visual tools such as boxplots and statistical techniques such as Z-scores and the interquartile range (IQR). Handling outliers requires care: the options include data transformations and winsorization, and the article recommends winsorization because it keeps every observation in the dataset while limiting the influence of extreme values on variance. Winsorization caps extreme values at predefined percentiles, which helps maintain statistical power and makes test outcomes more reliable. The article underscores that managing outliers improves the sensitivity of A/B tests, and it supports this with simulations showing higher statistical power on winsorized datasets.
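The detection rules and the capping step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the article's own code: the function names, the 1.5 × IQR fence, the Z-score threshold of 3, the one-sided cap at the 99th percentile, and the synthetic lognormal revenue data are all assumptions chosen for the example.

```python
import numpy as np


def iqr_outlier_mask(x, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, q3 = np.percentile(x, [25, 75])
    iqr = q3 - q1
    return (x < q1 - k * iqr) | (x > q3 + k * iqr)


def zscore_outlier_mask(x, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    z = (x - x.mean()) / x.std(ddof=1)
    return np.abs(z) > threshold


def winsorize(x, lower_pct=0.0, upper_pct=99.0):
    """Cap values at the given percentiles; no observation is dropped."""
    lo, hi = np.percentile(x, [lower_pct, upper_pct])
    return np.clip(x, lo, hi)


# Synthetic, skewed "revenue per user" data (hypothetical, for illustration only).
rng = np.random.default_rng(0)
revenue = rng.lognormal(mean=3.0, sigma=1.0, size=10_000)
revenue[:5] = 50_000.0  # a handful of extreme purchases

capped = winsorize(revenue)

print("flagged by IQR rule:    ", int(iqr_outlier_mask(revenue).sum()))
print("flagged by Z-score rule:", int(zscore_outlier_mask(revenue).sum()))
print("variance before capping:", revenue.var(ddof=1))
print("variance after capping: ", capped.var(ddof=1))
```

In an A/B test, one reasonable choice is to compute the cap from the pooled data of both groups and apply the same threshold to each, so that the capping itself does not treat the groups differently. SciPy also ships a ready-made `scipy.stats.mstats.winsorize`, which takes the fraction to cap on each tail rather than percentile values.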