Confident AI on HN

29 posts with 1+ points since 2023

Hacker News Posts
Title | Points | Comments | Date
Unit Test LlamaIndex with DeepEval | 35 | -- | 2023-08-28
Tackling the Weaknesses of BertScore | 9 | -- | 2023-08-16
Everything I know about LLM evaluation metrics | 7 | -- | 2024-01-24
Best Practices for Unit Testing RAG Systems in Prod | 4 | -- | 2024-02-06
YC helped us raise our seed round in 5 days | 4 | -- | 2025-03-20
We Replaced Pinecone with PGVector | 3 | -- | 2023-11-01
How to evaluate multi-turn LLM chatbots | 3 | -- | 2024-10-08
I used QAG to implement an LLM text summarization evals | 3 | -- | 2023-12-19
Auto-Evaluation of LLMs with DeepEval | 2 | -- | 2023-09-01
DeepEval GuardRails – AI Alignment | 2 | -- | 2023-09-30
Test for LLM Hallucinations | 2 | -- | 2023-08-31
Framework for evaluating LLM outputs with ML models | 2 | -- | 2023-08-25
How to test LLM is non-toxic before pushing to prod | 2 | -- | 2023-08-22
How to build your own LLM evaluation framework | 2 | -- | 2024-04-15
AI Agent Evaluation: The Definitive Guide to Testing AI Agents | 2 | -- | 2025-10-19
Testing for Image Similarity with DeepEval | 1 | -- | 2023-10-02
Evaluating LLMs for Lawyers | 1 | -- | 2023-09-25
How to Evaluate LangChain QA Retrieval | 1 | -- | 2023-09-23
PDB Support for DeepEval | 1 | -- | 2023-09-07
Test for Bias After Finetuning LLMs | 1 | -- | 2023-09-02
Measure Answer Relevancy of LLMs | 1 | -- | 2023-09-02
Testing Rank Similarity for Rag | 1 | -- | 2023-08-26
We wrote a comprehensive guide on LLM security | 1 | -- | 2024-08-20
How to generate synthetic data using SOTA data evolution methods | 1 | -- | 2024-05-21
Overview of All Major LLM Benchmarks | 1 | -- | 2024-03-22
Best practices I learnt from helping health tech enterprise test LLMs | 1 | -- | 2024-02-27
What Is RAG? (With Examples) | 1 | -- | 2023-12-01
Be confident about your LLM stack | 1 | -- | 2023-08-15
The Complete LLM Evaluation Playbook: How To Run LLM Evals That Matter | 1 | -- | 2025-06-16