When "Performance" Means Two Different Things
Blog post from Pinecone
In AI applications, the term "performance" covers two distinct concepts: infrastructure performance (latency, throughput, and cost) and result quality (accuracy, precision, and relevance). Confusing the two can cause misalignment between teams and leave stakeholder expectations unmet. Infrastructure performance describes how fast, how scalable, and how expensive the system is to run, while result quality describes how well the retrieval pipeline serves the user. The article stresses that both must be measured and optimized independently, because optimizing only one can degrade the other: in a case study from the post, a cost reduction improved the infrastructure numbers but led to a decline in result quality.
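To make the distinction concrete, here is a minimal sketch of reporting the two kinds of performance side by side for the same set of queries. It is not taken from the post: the `search` callable, the labeled `relevant_by_query` mapping, and the choice of p95 latency and recall@k as the representative metrics are all assumptions made for illustration.

```python
import math
import time
from typing import Callable, Mapping, Sequence


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile of a non-empty sequence."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def recall_at_k(retrieved: Sequence[str], relevant: Sequence[str], k: int) -> float:
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def evaluate(
    search: Callable[[str], Sequence[str]],           # hypothetical retrieval function
    queries: Sequence[str],
    relevant_by_query: Mapping[str, Sequence[str]],   # hypothetical labeled ground truth
    k: int = 3,
) -> dict:
    """Measure infrastructure performance and result quality in one pass,
    but report them as separate numbers."""
    latencies_ms = []
    recalls = []
    for query in queries:
        start = time.perf_counter()
        results = search(query)
        latencies_ms.append((time.perf_counter() - start) * 1000)
        recalls.append(recall_at_k(results, relevant_by_query[query], k))
    return {
        "p95_latency_ms": percentile(latencies_ms, 95),     # infrastructure metric
        "mean_recall_at_k": sum(recalls) / len(recalls),    # result-quality metric
    }
```

Keeping the two numbers in the same report, rather than folding them into one score, means a change that lowers latency or cost while also lowering recall shows up as a regression instead of passing as a pure improvement.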