The text explores approaches to handling high-cardinality categorical values in large language model (LLM) applications, focusing on use cases that require structured output, such as query analysis. It highlights how difficult it is for an LLM to produce the correct value from a large set of possibilities, particularly when the categorical values are not ones the model inherently recognizes. The document details the strategies tested to improve the accuracy and efficiency of query analysis: stuffing all values into the prompt context, filtering candidate values before the LLM call (pre-LLM filtering), and correcting the model's output after the call (post-LLM selection), using embedding similarity and n-gram similarity to filter and select valid names, as sketched below. The results indicate that post-LLM selection using embedding similarity offers the best balance of accuracy, speed, and cost. The study emphasizes the need for further benchmarking on the larger datasets typical of enterprise systems, which often involve millions of possible values.
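
To make the pre-LLM filtering step concrete, here is a minimal sketch in Python. The source names only "n-gram similarity"; the specific choice of character trigrams with Jaccard overlap, and the helper names `ngrams` and `shortlist`, are assumptions for illustration, not the study's exact implementation.

```python
# Pre-LLM filtering via character n-gram similarity: keep only the
# valid values most similar to the user query, and stuff just that
# shortlist into the prompt instead of the full high-cardinality set.
def ngrams(text: str, n: int = 3) -> set[str]:
    """Character n-grams of a string (trigrams by default, an assumption)."""
    text = text.lower()
    return {text[i:i + n] for i in range(max(len(text) - n + 1, 1))}

def shortlist(query: str, valid_values: list[str], k: int = 10) -> list[str]:
    """Return the k valid values with the highest n-gram overlap with the query."""
    q = ngrams(query)

    def jaccard(value: str) -> float:
        g = ngrams(value)
        return len(q & g) / len(q | g)

    return sorted(valid_values, key=jaccard, reverse=True)[:k]
```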
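The best-performing approach, post-LLM selection via embedding similarity, can be sketched as follows: let the LLM answer freely, then snap its (possibly misspelled or hallucinated) value to the nearest valid value in embedding space. The `embed` callable is an assumption standing in for any text-embedding function; the source does not specify an embedding model.

```python
# Post-LLM selection via embedding similarity: a minimal sketch.
# Assumption (not from the source): `embed` is any text-embedding
# function returning a fixed-length vector; valid-value embeddings
# are computed once up front and reused across queries.
from typing import Callable, Sequence
import numpy as np

def build_index(valid_values: Sequence[str],
                embed: Callable[[str], np.ndarray]) -> np.ndarray:
    """Embed every valid categorical value once, normalizing each row
    so that a plain dot product equals cosine similarity."""
    matrix = np.stack([embed(v) for v in valid_values])
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)

def correct_value(llm_output: str,
                  valid_values: Sequence[str],
                  index: np.ndarray,
                  embed: Callable[[str], np.ndarray]) -> str:
    """Replace the LLM's output with the most similar valid value."""
    if llm_output in valid_values:  # already valid: keep it as-is
        return llm_output
    q = embed(llm_output)
    q = q / np.linalg.norm(q)
    scores = index @ q  # cosine similarity to every valid value
    return valid_values[int(np.argmax(scores))]
```

A nice property of this design, consistent with the study's finding on speed and cost, is that the expensive work (embedding the full value set) happens once at indexing time, while each query adds only a single extra embedding call and a matrix-vector product.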