Pinecone Assistant is a managed service for building AI assistants that give high-quality responses to knowledge-intensive tasks over private data. To evaluate its performance, the developers created new metrics, because existing ones often fail to capture the nuances of generated answers. Unsupervised metrics, such as those in the RAGAS library, aligned poorly with human judgment, which prompted a supervised approach that scores each answer for correctness and completeness. The protocol assesses a generated answer against extracted atomic facts, which significantly improved alignment with human evaluation and reduced false positives. The evaluation used datasets from several domains, including FinanceBench and Open Australian Legal, and found that Pinecone Assistant outperforms OpenAI's solutions on correctness, completeness, and answer alignment. Future plans include expanding the datasets and automating the production of benchmark results to reduce reliance on costly ground-truth collection.
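The fact-based protocol described above can be illustrated with a minimal sketch. The summary does not give the exact formulas, so everything here is an assumption made for illustration: the three verdict labels, the `score_answer` helper, the definition of correctness as supported facts over addressed facts, completeness as supported facts over all facts, and alignment as their harmonic mean. In a real pipeline the per-fact verdicts would come from a judge (human or model); here they are supplied directly.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    """Hypothetical per-fact labels; the source does not name them."""
    SUPPORTED = "supported"        # the answer states the fact
    CONTRADICTED = "contradicted"  # the answer conflicts with the fact
    MISSING = "missing"            # the answer does not address the fact


@dataclass(frozen=True)
class Scores:
    correctness: float
    completeness: float
    alignment: float


def score_answer(verdicts: list[Verdict]) -> Scores:
    """Aggregate per-fact verdicts into three scores (assumed formulas)."""
    supported = sum(v is Verdict.SUPPORTED for v in verdicts)
    contradicted = sum(v is Verdict.CONTRADICTED for v in verdicts)
    total = len(verdicts)
    addressed = supported + contradicted

    # Assumption: correctness ignores facts the answer never mentions.
    correctness = supported / addressed if addressed else 0.0
    # Assumption: completeness is the share of all facts the answer covers.
    completeness = supported / total if total else 0.0
    # Assumption: alignment is the harmonic mean of the two scores.
    denom = correctness + completeness
    alignment = 2 * correctness * completeness / denom if denom else 0.0
    return Scores(correctness, completeness, alignment)


if __name__ == "__main__":
    example = [
        Verdict.SUPPORTED,
        Verdict.SUPPORTED,
        Verdict.CONTRADICTED,
        Verdict.MISSING,
    ]
    print(score_answer(example))
```

Under these assumed definitions, an answer that supports two of four facts, contradicts one, and omits one gets a correctness of 2/3 and a completeness of 1/2. Keeping the two scores separate shows whether an answer lost points for stating something wrong or for leaving something out.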