Up and to the left! How Martian Uses Routing to Push the Pareto Frontier
Blog post from Martian
The text explores multi-objective optimization, opening with an egg-selection analogy before turning to Large Language Models (LLMs) and the trade-off between their cost and quality. It introduces the Pareto Frontier, the set of options that no alternative beats on both cost and quality at once, and explains that the frontier expands when quality improves or cost falls. The text argues that routing each request to the LLM best suited to it can surpass the existing Pareto Frontier, delivering outcomes that no single model offers. Doing so requires understanding each LLM's individual strengths so that requests reach the most suitable model, and the text outlines several methods for gaining that understanding. It concludes with examples of successful implementations, such as a customer help chat system that significantly reduced errors and costs, illustrating how these strategies can reshape the optimization landscape for LLMs.
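To make the idea concrete, here is a minimal Python sketch of a cost/quality Pareto frontier and of how a router can land "up and to the left" of it. Everything in it is an illustrative assumption rather than Martian's method or data: the model names, costs, per-request-type quality scores, traffic mix, and the helper functions `dominates`, `pareto_frontier`, `single_model_point`, and `routed_point` are all hypothetical. The router is also assumed to know each request's type in advance, which a real system would have to predict.

```python
from typing import Dict, List, NamedTuple


class Point(NamedTuple):
    name: str
    cost: float     # hypothetical average cost per request
    quality: float  # hypothetical average quality score in [0, 1]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b on both axes and strictly better on one."""
    no_worse = a.cost <= b.cost and a.quality >= b.quality
    strictly_better = a.cost < b.cost or a.quality > b.quality
    return no_worse and strictly_better


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates, sorted by cost."""
    kept = [p for p in points if not any(dominates(o, p) for o in points)]
    return sorted(kept, key=lambda p: p.cost)


# Hypothetical numbers, chosen only to illustrate the effect.
COST: Dict[str, float] = {"small": 1.0, "medium": 4.0, "large": 10.0}
QUALITY: Dict[str, Dict[str, float]] = {
    "small": {"easy": 0.9, "hard": 0.3},
    "medium": {"easy": 0.9, "hard": 0.6},
    "large": {"easy": 0.9, "hard": 0.9},
}
TRAFFIC: Dict[str, float] = {"easy": 0.5, "hard": 0.5}  # share of each request type


def single_model_point(name: str) -> Point:
    """Average cost and quality when every request goes to one model."""
    quality = sum(TRAFFIC[t] * QUALITY[name][t] for t in TRAFFIC)
    return Point(name, COST[name], quality)


def routed_point(route: Dict[str, str]) -> Point:
    """Average cost and quality when each request type goes to its assigned model."""
    cost = sum(TRAFFIC[t] * COST[route[t]] for t in TRAFFIC)
    quality = sum(TRAFFIC[t] * QUALITY[route[t]][t] for t in TRAFFIC)
    return Point("router", cost, quality)


singles = [single_model_point(name) for name in COST]
router = routed_point({"easy": "small", "hard": "large"})

print("single-model frontier:", pareto_frontier(singles))
print("frontier with router: ", pareto_frontier(singles + [router]))
```

In this toy setup the router matches the quality of the most expensive model at a lower average cost, because easy requests are answered equally well by the cheapest model. The routed point therefore dominates a point on the single-model frontier, which is the sense in which routing pushes the frontier outward.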