Reinforcement learning

Reinforcement and preference learning for AI models

Mentions Over Time
Total Mentions: 2,374
Posts with Topic: 819
Companies Mentioning: 29
MoM Change: -0.8%
Historical Data
Month Mentions Posts Companies MoM Change
Mar 2026 121 52 29 -0.8%
Feb 2026 122 54 33 -15.3%
Jan 2026 144 50 25 +9.1%
Dec 2025 132 49 26 -54.9%
Nov 2025 293 55 27 +98.0%
Oct 2025 148 53 22 +32.1%
Sep 2025 112 29 18 +14.3%
Aug 2025 98 39 26 -35.9%
Jul 2025 153 52 26 +34.2%
Jun 2025 114 37 24 -26.9%
May 2025 156 85 24 -17.0%
Apr 2025 188 89 21 -13.4%
Mar 2025 217 54 34 +40.9%
Feb 2025 154 45 28 +5.5%
Jan 2025 146 29 15 +239.5%
Dec 2024 43 28 16 +30.3%
Nov 2024 33 19 15 --
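
The MoM Change column appears to be the percentage change in Mentions relative to the previous month, rounded to one decimal place. The page does not state this formula, so treat it as an assumption. The sketch below recomputes the column from the Mentions figures in the table. The names `mentions`, `mom_change`, and `format_mom` are illustrative and not part of any published API.

```python
# Mentions per month, oldest first, copied from the Historical Data table.
mentions = [
    ("Nov 2024", 33), ("Dec 2024", 43), ("Jan 2025", 146), ("Feb 2025", 154),
    ("Mar 2025", 217), ("Apr 2025", 188), ("May 2025", 156), ("Jun 2025", 114),
    ("Jul 2025", 153), ("Aug 2025", 98), ("Sep 2025", 112), ("Oct 2025", 148),
    ("Nov 2025", 293), ("Dec 2025", 132), ("Jan 2026", 144), ("Feb 2026", 122),
    ("Mar 2026", 121),
]


def mom_change(previous, current):
    """Percentage change from the previous month to the current one (assumed formula)."""
    return (current - previous) / previous * 100


def format_mom(rows):
    """Return (month, formatted MoM change) pairs; the first month has no baseline."""
    result = []
    for i, (month, count) in enumerate(rows):
        if i == 0:
            result.append((month, "--"))
        else:
            previous_count = rows[i - 1][1]
            result.append((month, f"{mom_change(previous_count, count):+.1f}%"))
    return result


if __name__ == "__main__":
    # Print newest month first, matching the order used in the table.
    for month, change in reversed(format_mom(mentions)):
        print(month, change)
    print("Total mentions:", sum(count for _, count in mentions))
```

Under this assumption the recomputed values agree with the table, and the monthly Mentions figures add up to the Total Mentions shown at the top of the page.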
Recent Blog Posts 50 posts
Date Company Title Mentions
2026-04-06 Fireworks AI [staged] Introducing The Inference Fabric: Own Your AI 4
2026-04-03 AssemblyAI Node.js voice agent with AssemblyAI Universal-3 Pro Streaming 2
2026-04-03 Deepinfra Qwen3.5 9B API Benchmarks: Latency, Throughput & Cost 1
2026-04-03 Deepinfra Qwen3 Coder 480B A35B API Benchmarks: Latency & Cost 1
2026-04-03 Deepinfra MiniMax-M2.5 API Benchmarks: Latency, Throughput & Cost 1
2026-04-03 Deepinfra DeepSeek V3.2 API Benchmarks: Latency, Throughput & Cost 1
2026-04-03 Deepinfra Qwen3.5 122B A10B API Benchmarks: Latency, Throughput & Cost 1
2026-04-03 Fireworks AI Scaling and Optimizing Frontier Model Training 1
2026-04-01 AssemblyAI LiveKit voice agent with AssemblyAI Universal-3 Pro Streaming 1
2026-04-01 Baseten Open-source LLM training is a mess. Here is how it all works. 2
2026-04-01 Fireworks AI The Fine-Tuning Bottleneck Isn't the Algorithm 2
2026-04-01 Together AI Aurora 2
2026-04-01 HuggingFace Holo3: Breaking the Computer Use Frontier 2
2026-04-01 Temporal AI reliability is a decade-old problem. And we’re still only solving half of it 1
2026-03-31 Cohere Ensemble and Cohere building the first RCM‑native healthcare LLM 1
2026-03-31 Contentful What is agentic architecture? The new way to automate your workflow 2
2026-03-30 testRigor Can You Trust an AI That Can’t Explain Its Decisions? A Guide to Explainable AI… 1
2026-03-27 Deepchecks 7 Top Enterprise Generative AI Tools for Fine-Tuning 4
2026-03-27 Cursor A technical report on Composer 2 2
2026-03-26 StackHawk Best AI Pentesting Tools in 2026: Top Picks Compared 1
2026-03-25 AI21 Labs Stride and prejudice: How a 32-bit overflow corrupted a CUDA kernel (and stayed… 2
2026-03-25 Browserbase Introducing BrowserEnv: Train browser agents on real websites 4
2026-03-24 Vantage Cursor's Composer 2: What It Means for Your AI Coding Costs 1
2026-03-21 Vercel Making Turborepo 96% faster with agents, sandboxes, and humans 1
2026-03-20 Roboflow DeepSeek Vision Models 1
2026-03-19 Cursor Introducing Composer 2 2
2026-03-19 Deepchecks Human Feedback vs. Synthetic Feedback in LLM Precision 1
2026-03-19 Harness CI Pipeline Optimization Guide for Platform Engineering Lead 1
2026-03-18 Bright Data Web Data for AI Agents: 6 Use Cases and the Benchmarks That Tell You Which Tool… 1
2026-03-18 Together AI Mamba-3 1
2026-03-17 AI21 Labs Mind the gap 1
2026-03-17 Galileo What MT-Bench and Chatbot Arena Reveal About Most LLM Judges 1
2026-03-17 Prem AI Which LLM Alignment Method? RLHF vs DPO vs KTO Tradeoffs Explained 31
2026-03-17 Cursor Training Composer for longer horizons 2
2026-03-17 Galileo What MT-Bench and Chatbot Arena Reveal About Most LLM Judges 1
2026-03-17 HuggingFace Holotron-12B - High Throughput Computer Use Agent 1
2026-03-17 HuggingFace Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI 1
2026-03-17 HuggingFace State of Open Source on Hugging Face: Spring 2026 1
2026-03-17 Prem AI How to Generate Synthetic Training Data for LLM Fine-Tuning (2026 Guide) 1
2026-03-17 WorkOS Prompt injection attacks: What are they and how to defend against them 1
2026-03-16 Prem AI Air-Gapped AI Fine-Tuning: How to Train Custom LLMs Without Internet Access 2
2026-03-16 HuggingFace Expanding the Alpamayo Open Platform for Developing Reasoning AVs Across Models… 1
2026-03-16 Lambda Lambda at NVIDIA GTC 2026: building the Superintelligence Cloud 2
2026-03-16 LangChain LangChain Announces Enterprise Agentic AI Platform Built with NVIDIA 1
2026-03-16 Redpanda Redpanda pushes the envelope on NVIDIA Vera 1
2026-03-16 Roboflow Gemini 3 Guide: Master Google’s Deep Think Model in Roboflow 2
2026-03-12 Unsloth What are RL environments and how to build them 5
2026-03-12 Nanonets Are OpenAI and Google intentionally downgrading their models? 1
2026-03-12 Prem AI Reasoning Models Explained: OpenAI o1/o3 vs DeepSeek R1 vs QwQ-32B 3
2026-03-11 Zilliz Top 10 Context Engineering Techniques You Should Know for Production RAG 1