Reinforcement learning

Reinforcement and preference learning for AI models

Tracked keywords: reinforcement learning, reward-based learning, rlhf, agent learning via feedback, preference learning, fine-tuning loops
Mentions Over Time
Total Mentions: 553
Posts with Topic: 227
Companies Mentioning: 4
WoW Change: -10.0%
Historical Data
Week | Mentions | Posts | Companies | WoW Change
Apr 13, 2026 | 9 | 6 | 4 | -10.0%
Apr 06, 2026 | 10 | 5 | 4 | -54.5%
Mar 30, 2026 | 22 | 16 | 10 | +57.1%
Mar 23, 2026 | 14 | 6 | 6 | -74.5%
Mar 16, 2026 | 55 | 21 | 14 | +129.2%
Mar 09, 2026 | 24 | 12 | 9 | +14.3%
Mar 02, 2026 | 21 | 8 | 6 | -12.5%
Feb 23, 2026 | 24 | 15 | 11 | -31.4%
Feb 16, 2026 | 35 | 8 | 6 | -16.7%
Feb 09, 2026 | 42 | 16 | 14 | +100.0%
Feb 02, 2026 | 21 | 15 | 12 | 0%
Jan 26, 2026 | 21 | 10 | 10 | 0%
Jan 19, 2026 | 21 | 9 | 8 | -56.3%
Jan 12, 2026 | 48 | 11 | 9 | +14.3%
Jan 05, 2026 | 42 | 18 | 11 | +50.0%
Dec 29, 2025 | 28 | 4 | 2 | +47.4%
Dec 22, 2025 | 19 | 5 | 4 | -42.4%
Dec 15, 2025 | 33 | 13 | 12 | -10.8%
Dec 08, 2025 | 37 | 16 | 10 | +37.0%
Dec 01, 2025 | 27 | 13 | 9 | --
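The page does not state how the WoW Change column is derived. The figures in the Historical Data table are consistent with a week-over-week percentage change in the Mentions column, rounded half-up to one decimal place. The sketch below works under that assumption; the function name `wow_change` and the use of `Decimal` are illustrative choices, not part of the source.

```python
from decimal import Decimal, ROUND_HALF_UP
from typing import Optional


def wow_change(current: int, previous: Optional[int]) -> str:
    """Return the week-over-week change in mentions as a signed percentage string.

    Assumption (not stated on the page): change = (current - previous) / previous,
    rounded half-up to one decimal. With no prior week, return "--" as the table does.
    """
    if previous is None or previous == 0:
        return "--"
    pct = (Decimal(current) - Decimal(previous)) / Decimal(previous) * 100
    pct = pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)
    if pct == 0:
        return "0%"  # the table shows an unsigned "0%" for flat weeks
    return f"{pct:+.1f}%"


# Mentions for the four most recent weeks in the table, newest first.
mentions = [("Apr 13, 2026", 9), ("Apr 06, 2026", 10),
            ("Mar 30, 2026", 22), ("Mar 23, 2026", 14)]

# Pair each week with the week before it and print the computed change.
for (week, cur), (_, prev) in zip(mentions, mentions[1:]):
    print(week, wow_change(cur, prev))
```

`Decimal` with `ROUND_HALF_UP` is used instead of float formatting because the Jan 19, 2026 row (21 mentions after 48) works out to exactly -56.25%, which float formatting would render as -56.2% rather than the -56.3% shown in the table.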
Recent Blog Posts (50 posts)
Date | Company | Title | Mentions
2026-04-17 | Neo4j | Build AI Agents That Make Better Decisions on GCP with Neo4j | 1
2026-04-16 | HuggingFace | Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents | 4
2026-04-14 | Acceldata | Data Agents vs Traditional Monitoring Tools Comparison | 1
2026-04-14 | Arize | Building smarter AI agents: architecture, evals, and lessons from the field | 1
2026-04-14 | HuggingFace | Nucleus-Image: Scaling Text-to-Image with Sparse Mixture of Experts | 1
2026-04-13 | HuggingFace | Releasing LiteCoder-Terminal-SFT | 1
2026-04-09 | Anyscale | How Notion cuts embedding costs by 80% and other stories on scaling AI with Ray… | 1
2026-04-08 | AssemblyAI | Edge cases in transcription: Offline mode, partial audio files and API limits | 2
2026-04-07 | Fireworks AI | Own Your AI: Fireworks Training Preview | 2
2026-04-07 | Socket | Microsoft Releases Open Source Toolkit for AI Agent Runtime Security | 1
2026-04-06 | Fireworks AI | [staged] Introducing The Inference Fabric: Own Your AI | 4
2026-04-03 | AssemblyAI | Node.js voice agent with AssemblyAI Universal-3 Pro Streaming | 2
2026-04-03 | Deepinfra | Qwen3.5 9B API Benchmarks: Latency, Throughput & Cost | 1
2026-04-03 | Deepinfra | Qwen3 Coder 480B A35B API Benchmarks: Latency & Cost | 1
2026-04-03 | Deepinfra | MiniMax-M2.5 API Benchmarks: Latency, Throughput & Cost | 1
2026-04-03 | Deepinfra | DeepSeek V3.2 API Benchmarks: Latency, Throughput & Cost | 1
2026-04-03 | Deepinfra | Qwen3.5 122B A10B API Benchmarks: Latency, Throughput & Cost | 1
2026-04-03 | Fireworks AI | Scaling and Optimizing Frontier Model Training | 1
2026-04-03 | Acceldata | Automated Data Governance Through Machine-Executable Policy Logic | 1
2026-04-01 | AssemblyAI | LiveKit voice agent with AssemblyAI Universal-3 Pro Streaming | 1
2026-04-01 | Baseten | Open-source LLM training is a mess. Here is how it all works. | 2
2026-04-01 | Fireworks AI | The Fine-Tuning Bottleneck Isn't the Algorithm | 2
2026-04-01 | Together AI | Aurora | 2
2026-04-01 | HuggingFace | Holo3: Breaking the Computer Use Frontier | 2
2026-04-01 | Temporal | AI reliability is a decade-old problem. And we’re still only solving half of it | 1
2026-03-31 | Cohere | Ensemble and Cohere building the first RCM‑native healthcare LLM | 1
2026-03-31 | Contentful | What is agentic architecture? The new way to automate your workflow | 2
2026-03-30 | testRigor | Can You Trust an AI That Can’t Explain Its Decisions? A Guide to Explainable AI… | 1
2026-03-27 | Deepchecks | 7 Top Enterprise Generative AI Tools for Fine-Tuning | 4
2026-03-27 | Cursor | A technical report on Composer 2 | 2
2026-03-26 | StackHawk | Best AI Pentesting Tools in 2026: Top Picks Compared | 1
2026-03-25 | AI21 Labs | Stride and prejudice: How a 32-bit overflow corrupted a CUDA kernel (and stayed… | 2
2026-03-25 | Browserbase | Introducing BrowserEnv: Train browser agents on real websites | 4
2026-03-24 | Vantage | Cursor's Composer 2: What It Means for Your AI Coding Costs | 1
2026-03-21 | Vercel | Making Turborepo 96% faster with agents, sandboxes, and humans | 1
2026-03-20 | Roboflow | DeepSeek Vision Models | 1
2026-03-19 | Cursor | Introducing Composer 2 | 2
2026-03-19 | Deepchecks | Human Feedback vs. Synthetic Feedback in LLM Precision | 1
2026-03-19 | Harness | CI Pipeline Optimization Guide for Platform Engineering Lead | 1
2026-03-18 | Bright Data | Web Data for AI Agents: 6 Use Cases and the Benchmarks That Tell You Which Tool… | 1
2026-03-18 | Together AI | Mamba-3 | 1
2026-03-17 | AI21 Labs | Mind the gap | 1
2026-03-17 | Galileo | What MT-Bench and Chatbot Arena Reveal About Most LLM Judges | 1
2026-03-17 | Prem AI | Which LLM Alignment Method? RLHF vs DPO vs KTO Tradeoffs Explained | 31
2026-03-17 | Cursor | Training Composer for longer horizons | 2
2026-03-17 | Galileo | What MT-Bench and Chatbot Arena Reveal About Most LLM Judges | 1
2026-03-17 | HuggingFace | Holotron-12B - High Throughput Computer Use Agent | 1
2026-03-17 | HuggingFace | Nemotron 3 Nano 4B: A Compact Hybrid Model for Efficient Local AI | 1
2026-03-17 | HuggingFace | State of Open Source on Hugging Face: Spring 2026 | 1
2026-03-17 | Prem AI | How to Generate Synthetic Training Data for LLM Fine-Tuning (2026 Guide) | 1