| OWASP Top 10 2025 for LLM Applications: What’s new? Risks, and Mitigation Techniques | Kritin Vongthongsri | Jan 19, 2025 | 3590 | - |
| The People's Choice of Top LLM Evaluation Tools in 2025 | Jeffrey Ip | Jan 18, 2025 | 1829 | - |
| LLM Guardrails: The Ultimate Guide to Safeguard LLM Systems | Jeffrey Ip | Jan 26, 2025 | 3024 | - |
| LLM Agent Evaluation: Assessing Tool Use, Task Completion, Agentic Reasoning, and More | Kritin Vongthongsri | Jan 31, 2025 | 2702 | - |
| How I Built Deterministic LLM Evaluation Metrics for DeepEval | Jeffrey Ip | Feb 09, 2025 | 2335 | - |
| How I raised Confident AI's $2.2M seed round in 5 days | Jeffrey Ip | Mar 20, 2025 | 1962 | 4 |
| Top LLM Evaluators for Testing LLM Systems at Scale | Jeffrey Ip | Apr 22, 2025 | 3227 | - |
| The G-Eval Guide to LLM Evaluation: Simply Explained | Kritin Vongthongsri | Apr 30, 2025 | 3925 | - |
| The Ultimate LLM Evaluation Playbook: Why It Didn't Work For You | Jeffrey Ip | May 03, 2025 | 3973 | - |
| RAG Evaluation Metrics: Assessing Answer Relevancy, Faithfulness, Contextual Relevancy, And More | Jeffrey Ip | Jun 04, 2025 | 2552 | - |
| LLM Arena-as-a-Judge: LLM-Evals for Comparison-Based Regression Testing | Deep | Aug 30, 2025 | 2299 | - |
| Top LangSmith Alternatives and Competitors, Compared | Jeffrey Ip | Sep 02, 2025 | 3106 | - |
| Confident AI vs OpenLayer: Head-to-Head Comparison | Jeffrey Ip | Aug 29, 2025 | 2460 | - |
| AI Agent Evaluation: The Definitive Guide to Testing AI Agents | Jeffrey Ip | Oct 08, 2025 | 5729 | - |