Reinforcement learning

Reinforcement and preference learning for AI models

Mentions Over Time
Total Mentions: 4,380
Posts with Topic: 1,219
Companies Mentioning: 115
YoY Change: +74.8%
Historical Data
Year  Mentions  Posts  Companies  YoY Change
2025  1,912     617    115        +74.8%
2024  1,094     263    77         +1.2%
2023  1,081     238    61         +532.2%
2022  171       69     27         +40.2%
2021  122       32     10         +238.9%
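The page does not say how YoY Change is calculated. A minimal sketch, assuming it is the percentage change in Mentions relative to the previous year: under that assumption the figures for 2022 through 2025 can be reproduced from the table. The function name `yoy_change` is hypothetical, and the 2021 figure cannot be checked because 2020 is not listed.

```python
# Mentions per year, copied from the Historical Data table.
mentions = {2021: 122, 2022: 171, 2023: 1081, 2024: 1094, 2025: 1912}


def yoy_change(current: int, previous: int) -> float:
    """Percentage change from the previous year, rounded to one decimal.

    Assumption: this is how the page derives its YoY Change column.
    """
    return round((current - previous) / previous * 100, 1)


# Only years whose previous year is also in the table can be computed.
yoy = {
    year: yoy_change(mentions[year], mentions[year - 1])
    for year in sorted(mentions)
    if year - 1 in mentions
}

for year, pct in yoy.items():
    print(f"{year}: {pct:+.1f}%")
```

The same data also accounts for the Total Mentions figure: the five yearly values sum to 4,380.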
Recent Blog Posts (50 posts)
Date        Company       Title  Mentions
2026-04-06  Fireworks AI  [staged] Introducing The Inference Fabric: Own Your AI  4
2026-04-03  AssemblyAI    Node.js voice agent with AssemblyAI Universal-3 Pro Streaming  2
2026-04-03  Deepinfra     Qwen3.5 9B API Benchmarks: Latency, Throughput & Cost  1
2026-04-03  Deepinfra     Qwen3 Coder 480B A35B API Benchmarks: Latency & Cost  1
2026-04-03  Deepinfra     MiniMax-M2.5 API Benchmarks: Latency, Throughput & Cost  1
2026-04-03  Deepinfra     DeepSeek V3.2 API Benchmarks: Latency, Throughput & Cost  1
2026-04-03  Deepinfra     Qwen3.5 122B A10B API Benchmarks: Latency, Throughput & Cost  1
2026-04-03  Fireworks AI  Scaling and Optimizing Frontier Model Training  1
2026-04-01  AssemblyAI    LiveKit voice agent with AssemblyAI Universal-3 Pro Streaming  1
2026-04-01  Baseten       Open-source LLM training is a mess. Here is how it all works.  2
2026-04-01  Fireworks AI  The Fine-Tuning Bottleneck Isn't the Algorithm  2
2026-04-01  Together AI   Aurora  2
2026-04-01  HuggingFace   Holo3: Breaking the Computer Use Frontier  2
2026-04-01  Temporal      AI reliability is a decade-old problem. And we’re still only solving half of it  1
2026-03-31  Cohere        Ensemble and Cohere building the first RCM‑native healthcare LLM  1
2026-03-31  Contentful    What is agentic architecture? The new way to automate your workflow  2
2026-03-30  testRigor     Can You Trust an AI That Can’t Explain Its Decisions? A Guide to Explainable AI…  1
2026-03-27  Deepchecks    7 Top Enterprise Generative AI Tools for Fine-Tuning  4
2026-03-27  Cursor        A technical report on Composer 2  2
2026-03-26  StackHawk     Best AI Pentesting Tools in 2026: Top Picks Compared  1
2026-03-25  AI21 Labs     Stride and prejudice: How a 32-bit overflow corrupted a CUDA kernel (and stayed…  2
2026-03-25  Browserbase   Introducing BrowserEnv: Train browser agents on real websites  4
2026-03-24  Vantage       Cursor's Composer 2: What It Means for Your AI Coding Costs  1
2026-03-21  Vercel        Making Turborepo 96% faster with agents, sandboxes, and humans  1
2026-03-20  Roboflow      DeepSeek Vision Models  1
2026-03-19  Cursor        Introducing Composer 2  2
2026-03-19  Deepchecks    Human Feedback vs. Synthetic Feedback in LLM Precision  1
2026-03-19  Harness       CI Pipeline Optimization Guide for Platform Engineering Lead  1
2026-03-18  Bright Data   Web Data for AI Agents: 6 Use Cases and the Benchmarks That Tell You Which Tool…  1
2026-03-18  Together AI   Mamba-3  1
2026-03-17  AI21 Labs     Mind the gap  1
2026-03-17  Galileo       What MT-Bench and Chatbot Arena Reveal About Most LLM Judges  1
2026-03-17  Prem AI       Which LLM Alignment Method? RLHF vs DPO vs KTO Tradeoffs Explained  31
2026-03-17  Cursor        Training Composer for longer horizons  2
2026-03-17  Galileo       What MT-Bench and Chatbot Arena Reveal About Most LLM Judges  1
2026-03-17  HuggingFace   Holotron-12B - High Throughput Computer Use Agent  1
2026-03-17  HuggingFace   Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI  1
2026-03-17  HuggingFace   State of Open Source on Hugging Face: Spring 2026  1
2026-03-17  Prem AI       How to Generate Synthetic Training Data for LLM Fine-Tuning (2026 Guide)  1
2026-03-17  WorkOS        Prompt injection attacks: What are they and how to defend against them  1
2026-03-16  Prem AI       Air-Gapped AI Fine-Tuning: How to Train Custom LLMs Without Internet Access  2
2026-03-16  HuggingFace   Expanding the Alpamayo Open Platform for Developing Reasoning AVs Across Models…  1
2026-03-16  Lambda        Lambda at NVIDIA GTC 2026: building the Superintelligence Cloud  2
2026-03-16  LangChain     LangChain Announces Enterprise Agentic AI Platform Built with NVIDIA  1
2026-03-16  Redpanda      Redpanda pushes the envelope on NVIDIA Vera  1
2026-03-16  Roboflow      Gemini 3 Guide: Master Google’s Deep Think Model in Roboflow  2
2026-03-12  Unsloth       What are RL environments and how to build them  5
2026-03-12  Nanonets      Are OpenAI and Google intentionally downgrading their models?  1
2026-03-12  Prem AI       Reasoning Models Explained: OpenAI o1/o3 vs DeepSeek R1 vs QwQ-32B  3
2026-03-11  Zilliz        Top 10 Context Engineering Techniques You Should Know for Production RAG  1