Reinforcement learning

Reinforcement and preference learning for AI models

Tracked keywords: reinforcement learning, reward-based learning, rlhf, agent learning via feedback, preference learning, fine-tuning loops
Mentions Over Time
Total Mentions: 685
Posts with Topic: 278
Companies Mentioning: 6
WoW Change: +53.8%
Historical Data
Week | Mentions | Posts | Companies | WoW Change
May 25, 2026 | 20 | 8 | 6 | +53.8%
May 18, 2026 | 13 | 6 | 6 | -45.8%
May 11, 2026 | 24 | 12 | 6 | +26.3%
May 04, 2026 | 19 | 7 | 6 | -40.6%
Apr 27, 2026 | 32 | 8 | 5 | +33.3%
Apr 20, 2026 | 24 | 10 | 10 | +166.7%
Apr 13, 2026 | 9 | 6 | 4 | -10.0%
Apr 06, 2026 | 10 | 5 | 4 | -54.5%
Mar 30, 2026 | 22 | 16 | 10 | +57.1%
Mar 23, 2026 | 14 | 6 | 6 | -74.5%
Mar 16, 2026 | 55 | 21 | 14 | +129.2%
Mar 09, 2026 | 24 | 12 | 9 | +14.3%
Mar 02, 2026 | 21 | 8 | 6 | -12.5%
Feb 23, 2026 | 24 | 15 | 11 | -31.4%
Feb 16, 2026 | 35 | 8 | 6 | -16.7%
Feb 09, 2026 | 42 | 16 | 14 | +100.0%
Feb 02, 2026 | 21 | 15 | 12 | 0%
Jan 26, 2026 | 21 | 10 | 10 | 0%
Jan 19, 2026 | 21 | 9 | 8 | -56.3%
Jan 12, 2026 | 48 | 11 | 9 | +14.3%
Jan 05, 2026 | 42 | 18 | 11 | +50.0%
Dec 29, 2025 | 28 | 4 | 2 | +47.4%
Dec 22, 2025 | 19 | 5 | 4 | -42.4%
Dec 15, 2025 | 33 | 13 | 12 | -10.8%
Dec 08, 2025 | 37 | 16 | 10 | +37.0%
Dec 01, 2025 | 27 | 13 | 9 | --
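
The WoW Change figures above appear consistent with a plain percent change in weekly mentions relative to the prior week. The page does not state the formula, so the sketch below is an inference from the table values, and the function name `wow_change` is hypothetical.

```python
from typing import Optional


def wow_change(current: int, previous: Optional[int]) -> Optional[float]:
    """Percent change in mentions versus the prior week, rounded to 1 decimal.

    Assumption: WoW Change = (current - previous) / previous * 100.
    Returns None when there is no usable baseline (shown as "--" in the table).
    """
    if previous is None or previous == 0:
        return None
    return round((current - previous) / previous * 100, 1)


# Spot checks against rows of the Historical Data table.
print(wow_change(20, 13))    # May 25 vs May 18
print(wow_change(13, 24))    # May 18 vs May 11
print(wow_change(27, None))  # Dec 01 has no prior week in the table
```

Under this assumption, the earliest week has no baseline, which matches the "--" shown for Dec 01, 2025.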
Recent Blog Posts (50 posts)
Date | Company | Title | Mentions
2026-05-28 | Google Cloud | How the community trained Gemma to "Think" with Tunix and TPUs | 5
2026-05-27 | AssemblyAI | Building a voice agent: the full production timeline for both approaches | 2
2026-05-26 | AssemblyAI | What's new in Universal-3 Pro: smarter code-switching, faster turnaround, and b… | 1
2026-05-26 | Northflank | Top CoreWeave Sandbox alternatives for AI agent workloads in 2026 | 3
2026-05-26 | Firecrawl | Reduce LLM & Agent Hallucinations With Real-Time Web Search | 5
2026-05-25 | Deepinfra | NVIDIA Nemotron 3 Super on DeepInfra: 120B MoE Model | 2
2026-05-25 | HuggingFace | Harness, Scaffold, and the AI Agent Terms Worth Getting Right | 1
2026-05-25 | HuggingFace | Should we use genetics instead of system prompts for AI Agents & Personas? | 1
2026-05-22 | HuggingFace | Eight Days in China: What I Learned from the AI Labs, Robotics Startups and Aca… | 2
2026-05-21 | AssemblyAI | How I built a voice agent without writing (or understanding) any code | 1
2026-05-21 | Modal | Modal's Series C: Raising $355M at a $4.65B valuation | 4
2026-05-20 | Cerebrium | Deploying DeepSeek-R1: A Guide to a Serverless, High-Performaning OpenAI-Compat… | 1
2026-05-19 | Bugcrowd | AI benchmarking report: Measuring the exploitation ladder for AI models | 4
2026-05-18 | Cursor | Introducing Composer 2.5 | 1
2026-05-15 | HuggingFace | Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Fronti… | 1
2026-05-15 | PromptLayer | LLM-as-a-Judge: How Do You Know If Your AI Is Actually Good? | 1
2026-05-14 | Anyscale | Architecting Data Pipelines for Multimodal Datasets at Scale | 1
2026-05-14 | HuggingFace | Unlocking asynchronicity in continuous batching | 1
2026-05-14 | Anyscale | Introducing the Anyscale Agent Skill for LLM Post-Training | 9
2026-05-14 | LangChain | Introducing LangChain Labs | 1
2026-05-14 | LangChain | Everything we shipped at Interrupt | 1
2026-05-13 | Exa | How Search Quality Shapes RL Outcomes | 2
2026-05-13 | LangChain | LangSmith LLM Gateway: runtime governance built into the agent lifecycle | 1
2026-05-11 | HuggingFace | Building Blocks for Foundation Model Training and Inference on AWS | 4
2026-05-11 | HuggingFace | Two Years of Local AI on a Laptop: When Open Models Outpaced Moore's Law | 1
2026-05-11 | Unsloth | Unsloth Joins PyTorch Ecosystem | 1
2026-05-09 | HuggingFace | "OncoAgent: A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology C… | 1
2026-05-08 | Baseten | Introducing the Baseten Loops SDK | 1
2026-05-07 | Anyscale | AI agents on Ray Serve: Single to multi-agent architecture | 1
2026-05-07 | HuggingFace | QVAC MedPsy: State-of-the-Art Medical and Healthcare Language Models for Edge D… | 7
2026-05-07 | Marqo | Marqo vs Cimulate: Independent AI-Native Search vs Salesforce-Locked Discovery | 3
2026-05-05 | Confluent | Integrating AI Into Apache Kafka Architectures: Patterns and Best Practices | 2
2026-05-05 | Northflank | AI Sandbox pricing comparison (2026) | 1
2026-05-04 | JetBrains | PyTorch vs. TensorFlow: Choosing the Right Framework in 2026 | The PyCharm Blog | 4
2026-05-04 | Northflank | GPU sandboxes: isolation models and platform support in 2026 | 2
2026-05-02 | Acceldata | Enterprise Data Agents vs Traditional Monitoring Tools | 1
2026-04-30 | AssemblyAI | Build a voice agent with a chained STT-LLM-TTS architecture | 1
2026-04-29 | HuggingFace | Granite 4.1 LLMs: How They’re Built | 15
2026-04-28 | Galileo | Scaling Judge Compute: The Next Frontier in AI Evaluation | 1
2026-04-28 | Galileo | Why LLM Judges Disagree With Your Experts — and How to Fix It | 7
2026-04-28 | HuggingFace | Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence f… | 2
2026-04-27 | HuggingFace | OpenRA-RL: An Open Platform for AI Agents in Real-Time Strategy Games | 4
2026-04-27 | Couchbase | What Is an AI-Powered Recommendation Engine? | 1
2026-04-24 | Fireworks AI | How we fixed prompt injection for all models on Fireworks | 1
2026-04-24 | Eden AI | Top Free Generative AI APIs, Open Source models, and tools | 3
2026-04-24 | Agora | From Dark Matter to Voice AI: Deepgram’s Journey to Speech Recognition | 2
2026-04-24 | Anyscale | Introducing Vision-Language Reinforcement Learning in SkyRL | 7
2026-04-24 | Datadog | Introducing ARFBench: A time series question-answering benchmark based on real … | 1
2026-04-23 | Redis | Human in the loop: Why your production AI systems need human oversight | 3
2026-04-23 | CopilotKit | Generative UI Spectrum: How Agents Now Ship Their Own Interfaces | 3