Industries such as legal services, healthcare, biomedicine, technology, and financial services rely on specialized jargon and complex concepts, so models often need fine-tuning or pre-training on custom data to capture the nuances required for accurate comprehension. Cohere's customers highly appreciate its fine-tuning capabilities for Rerank, which underscores the importance of evaluating semantic search systems. The primary criterion for assessing these systems is how relevant the search results are to the query. Evaluation can start with an informal human "smell test" but should ideally progress to numeric evaluation using established metrics such as accuracy@n, recall@n, or nDCG. For example, accuracy@n measures how many of the top n results are relevant to the query, which requires a labeled dataset that records the relevant results for each query. Jay Alammar's book "Hands-On Large Language Models" provides further insight into evaluating embedding-based search.
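To make the accuracy@n description above concrete, here is a minimal Python sketch. It follows the definition given in the text (the share of the top n results that are labeled relevant), which some references call precision@n instead. The function names, document IDs, and relevance labels are hypothetical and chosen only for illustration; they do not come from Cohere's API or from the book. A simple recall@n is included for comparison.

```python
from typing import Sequence, Set


def accuracy_at_n(ranked_ids: Sequence[str], relevant_ids: Set[str], n: int) -> float:
    """Fraction of the top-n ranked results that are labeled relevant.

    Assumption: the score is divided by n even if fewer than n results
    were returned, so missing results count as misses.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    top_n = ranked_ids[:n]
    hits = sum(1 for doc_id in top_n if doc_id in relevant_ids)
    return hits / n


def recall_at_n(ranked_ids: Sequence[str], relevant_ids: Set[str], n: int) -> float:
    """Fraction of all labeled-relevant documents that appear in the top n."""
    if not relevant_ids:
        return 0.0  # no labels for this query, so nothing can be recalled
    top_n = set(ranked_ids[:n])
    return len(top_n & relevant_ids) / len(relevant_ids)


# Hypothetical labeled example: one query, the system's ranking,
# and the set of documents a human marked as relevant.
ranked = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d3", "d8"}

acc = accuracy_at_n(ranked, relevant, n=3)
rec = recall_at_n(ranked, relevant, n=3)
print(f"accuracy@3 = {acc:.3f}, recall@3 = {rec:.3f}")
```

In practice, these per-query scores would be averaged over every query in the labeled dataset to give a single number for comparing a base model against a fine-tuned one.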