LangSmith has introduced a feature for comparing the metrics of logged traces grouped by tag or metadata, extending its monitoring capabilities for LLM applications. Users can mark different versions of an application with unique identifiers and view their performance side by side. Traces can be logged with custom tags and metadata through LangChain or the LangSmith SDK/API, which is what makes this analysis possible. A case study shows LangSmith being used to evaluate several LLM providers in a chatbot application, using the "llm" metadata key to compare metrics such as latency and time-to-first-token. This supports data-driven decisions about model efficiency and user satisfaction. The grouping feature also applies to other scenarios, such as A/B testing or analyzing performance metrics by specific metadata categories to improve the user experience.
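
To make the grouping idea concrete, here is a minimal sketch of what comparing metrics by a metadata key amounts to. It does not call LangSmith and is not its implementation. The trace records, the field names `latency_s` and `first_token_s`, the provider names, and the helper `group_metrics` are all hypothetical, and the numbers are made-up sample values rather than measurements. Only the "llm" metadata key comes from the case study.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical trace records. Field names and values are made up
# for illustration; they are not LangSmith's trace schema.
traces = [
    {"metadata": {"llm": "provider_a"}, "latency_s": 1.2, "first_token_s": 0.30},
    {"metadata": {"llm": "provider_a"}, "latency_s": 1.6, "first_token_s": 0.40},
    {"metadata": {"llm": "provider_b"}, "latency_s": 2.0, "first_token_s": 0.25},
    {"metadata": {"llm": "provider_b"}, "latency_s": 2.4, "first_token_s": 0.35},
]


def group_metrics(traces, key):
    """Group traces by the value of one metadata key and average their metrics."""
    groups = defaultdict(list)
    for trace in traces:
        value = trace.get("metadata", {}).get(key)
        if value is None:
            continue  # skip traces that were logged without this key
        groups[value].append(trace)

    summary = {}
    for value, runs in groups.items():
        summary[value] = {
            "count": len(runs),
            "mean_latency_s": mean(r["latency_s"] for r in runs),
            "mean_first_token_s": mean(r["first_token_s"] for r in runs),
        }
    return summary


summary = group_metrics(traces, "llm")
for name, stats in sorted(summary.items()):
    print(name, stats)
```

Passing a different key, for example a hypothetical "variant" key for an A/B test, would produce the same side-by-side summary for that dimension. Traces that lack the chosen key are left out of every group.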