Lie #5 — Vendor Benchmarks Measure Real-World Performance
Blog post from Starburst
Vendor-provided benchmarks are often used to demonstrate the performance of database systems, but they can be misleading because they rarely reflect real-world production environments. Standard suites such as TPC-H and TPC-DS assume ideal conditions: data that is already optimized and centrally located, with none of the complexity or cost of preparing and moving it. As data strategies evolve toward distributed architectures like data lakes and data meshes, these traditional benchmark metrics fail to capture the true cost/performance balance users actually experience.

Real-world workloads are also diverse. Concurrent queries of varying sizes compete for system resources and degrade performance in ways a single-stream benchmark never shows. Vendors can further skew results through practices like aggressive caching, which are not feasible in large-scale, real-world scenarios.

Organizations are therefore encouraged to conduct their own performance tests, against their own workloads, to obtain vendor-neutral results that better reflect their specific production needs. Starburst, with its data source agnostic capabilities, advocates this approach and supports the use of open data formats and diverse architecture choices to optimize performance and cost-effectiveness.
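To make the "run your own tests" advice concrete, here is a minimal sketch of a do-it-yourself benchmark that fires a mixed workload of small and large queries concurrently and reports latency percentiles rather than a single best-case number. It uses SQLite purely as a stand-in engine; the table name, row counts, and query mix are illustrative assumptions, not anything from the post — in practice you would point the same harness at your own system with queries sampled from production.

```python
import sqlite3
import statistics
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

def make_test_db(path, rows=10_000):
    """Build a small stand-in dataset (illustrative schema, not from the post)."""
    con = sqlite3.connect(path)
    con.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    con.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        ((f"r{i % 8}", float(i % 1000)) for i in range(rows)),
    )
    con.commit()
    con.close()

def timed_query(path, sql):
    """Open a fresh connection per worker and time one query end to end."""
    con = sqlite3.connect(path)
    start = time.perf_counter()
    con.execute(sql).fetchall()
    elapsed = time.perf_counter() - start
    con.close()
    return elapsed

def run_benchmark(path, workload, concurrency=8):
    """Run the whole workload concurrently; report median and tail latency."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda sql: timed_query(path, sql), workload))
    return {
        "p50": statistics.median(latencies),
        "p95": sorted(latencies)[int(0.95 * len(latencies))],
    }

if __name__ == "__main__":
    db = tempfile.NamedTemporaryFile(suffix=".db", delete=False).name
    make_test_db(db)
    # Mixed workload: many small lookups interleaved with a few large scans,
    # the kind of concurrency single-stream vendor benchmarks leave out.
    workload = (
        ["SELECT COUNT(*) FROM orders WHERE amount > 500"] * 20
        + ["SELECT region, SUM(amount) FROM orders GROUP BY region"] * 4
    )
    print(run_benchmark(db, workload))
```

The key design point is that the p95 figure under concurrency, not the p50 of an isolated warm-cache run, is what your users will feel in production.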