Reinforcement learning

Reinforcement and preference learning for AI models

Tracked keywords: reinforcement learning, reward-based learning, rlhf, agent learning via feedback, preference learning, fine-tuning loops
Mentions Over Time
Total Mentions: 652
Posts with Topic: 264
Companies Mentioning: 6
WoW Change: +26.3%
Historical Data
Week Mentions Posts Companies WoW Change
May 11, 2026 24 12 6 +26.3%
May 04, 2026 19 7 6 -40.6%
Apr 27, 2026 32 8 5 +33.3%
Apr 20, 2026 24 10 10 +166.7%
Apr 13, 2026 9 6 4 -10.0%
Apr 06, 2026 10 5 4 -54.5%
Mar 30, 2026 22 16 10 +57.1%
Mar 23, 2026 14 6 6 -74.5%
Mar 16, 2026 55 21 14 +129.2%
Mar 09, 2026 24 12 9 +14.3%
Mar 02, 2026 21 8 6 -12.5%
Feb 23, 2026 24 15 11 -31.4%
Feb 16, 2026 35 8 6 -16.7%
Feb 09, 2026 42 16 14 +100.0%
Feb 02, 2026 21 15 12 0%
Jan 26, 2026 21 10 10 0%
Jan 19, 2026 21 9 8 -56.3%
Jan 12, 2026 48 11 9 +14.3%
Jan 05, 2026 42 18 11 +50.0%
Dec 29, 2025 28 4 2 +47.4%
Dec 22, 2025 19 5 4 -42.4%
Dec 15, 2025 33 13 12 -10.8%
Dec 08, 2025 37 16 10 +37.0%
Dec 01, 2025 27 13 9 --
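The page does not state how the WoW Change column is derived, but every row in the table above is consistent with the percentage change in weekly mentions relative to the previous week, rounded to one decimal place. A minimal Python sketch under that assumption follows; the function name `wow_change` and the half-up rounding rule are illustrative choices, not taken from the source.

```python
from decimal import Decimal, ROUND_HALF_UP


def wow_change(current: int, previous: int) -> Decimal:
    """Percentage change in mentions versus the previous week.

    Assumed formula: (current - previous) / previous * 100,
    rounded half-up to one decimal place.
    """
    if previous == 0:
        raise ValueError("previous week has no mentions; change is undefined")
    pct = (Decimal(current) - Decimal(previous)) / Decimal(previous) * 100
    return pct.quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


# Mention counts copied from the table, newest week first.
weeks = [
    ("May 11, 2026", 24),
    ("May 04, 2026", 19),
    ("Apr 27, 2026", 32),
    ("Apr 20, 2026", 24),
    ("Apr 13, 2026", 9),
]

# Compare each week with the one that precedes it in time
# (the next entry in this newest-first list).
for (week, mentions), (_, prev_mentions) in zip(weeks, weeks[1:]):
    print(f"{week}: {wow_change(mentions, prev_mentions):+}%")
```

The earliest week in the table (Dec 01, 2025) has no preceding week to compare against, which matches its "--" entry.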
Recent Blog Posts (50 posts)
Date Company Title Mentions
2026-05-15 HuggingFace Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Fronti… 1
2026-05-15 PromptLayer LLM-as-a-Judge: How Do You Know If Your AI Is Actually Good? 1
2026-05-14 Anyscale Architecting Data Pipelines for Multimodal Datasets at Scale 1
2026-05-14 HuggingFace Unlocking asynchronicity in continuous batching 1
2026-05-14 Anyscale Introducing the Anyscale Agent Skill for LLM Post-Training 9
2026-05-14 LangChain Introducing LangChain Labs 1
2026-05-14 LangChain Everything we shipped at Interrupt 1
2026-05-13 Exa How Search Quality Shapes RL Outcomes 2
2026-05-13 LangChain LangSmith LLM Gateway: runtime governance built into the agent lifecycle 1
2026-05-11 HuggingFace Building Blocks for Foundation Model Training and Inference on AWS 4
2026-05-11 HuggingFace Two Years of Local AI on a Laptop: When Open Models Outpaced Moore's Law 1
2026-05-11 Unsloth Unsloth Joins PyTorch Ecosystem 1
2026-05-09 HuggingFace "OncoAgent: A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology C… 1
2026-05-08 Baseten Introducing the Baseten Loops SDK 1
2026-05-07 Anyscale AI agents on Ray Serve: Single to multi-agent architecture 1
2026-05-07 HuggingFace QVAC MedPsy: State-of-the-Art Medical and Healthcare Language Models for Edge D… 7
2026-05-07 Marqo Marqo vs Cimulate: Independent AI-Native Search vs Salesforce-Locked Discovery 3
2026-05-05 Confluent Integrating AI Into Apache Kafka Architectures: Patterns and Best Practices 2
2026-05-04 JetBrains PyTorch vs. TensorFlow: Choosing the Right Framework in 2026 | The PyCharm Blog 4
2026-05-02 Acceldata Enterprise Data Agents vs Traditional Monitoring Tools 1
2026-04-30 AssemblyAI Build a voice agent with a chained STT-LLM-TTS architecture 1
2026-04-29 HuggingFace Granite 4.1 LLMs: How They’re Built 15
2026-04-28 Galileo Scaling Judge Compute: The Next Frontier in AI Evaluation 1
2026-04-28 Galileo Why LLM Judges Disagree With Your Experts — and How to Fix It 7
2026-04-28 HuggingFace Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence f… 2
2026-04-27 HuggingFace OpenRA-RL: An Open Platform for AI Agents in Real-Time Strategy Games 4
2026-04-27 Couchbase What Is an AI-Powered Recommendation Engine? 1
2026-04-24 Eden AI Top Free Generative AI APIs, Open Source models, and tools 3
2026-04-24 Agora From Dark Matter to Voice AI: Deepgram’s Journey to Speech Recognition 2
2026-04-24 Anyscale Introducing Vision-Language Reinforcement Learning in SkyRL 7
2026-04-24 Datadog Introducing ARFBench: A time series question-answering benchmark based on real … 1
2026-04-23 Fireworks AI How we fixed prompt injection for all models on Fireworks 1
2026-04-23 Redis Human in the loop: Why your production AI systems need human oversight 3
2026-04-23 CopilotKit Generative UI Spectrum: How Agents Now Ship Their Own Interfaces 3
2026-04-21 Together AI Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding 1
2026-04-21 Cursor Cursor partners with SpaceX on model training 1
2026-04-21 HuggingFace RL: A Structured Human Action & Intent Dataset for Physical AI and World Models 4
2026-04-20 WorkOS The OWASP Top 10 for LLM applications: What developers shipping AI features nee… 1
2026-04-17 Neo4j Build AI Agents That Make Better Decisions on GCP with Neo4j 1
2026-04-16 HuggingFace Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents 4
2026-04-16 Google Cloud MaxText Expands Post-Training Capabilities: Introducing SFT and RL on Single-Ho… 3
2026-04-14 Acceldata Data Agents vs Traditional Monitoring Tools Comparison 1
2026-04-14 Arize Building smarter AI agents: architecture, evals, and lessons from the field 1
2026-04-14 HuggingFace Nucleus-Image: Scaling Text-to-Image with Sparse Mixture of Experts 1
2026-04-13 HuggingFace Releasing LiteCoder-Terminal-SFT 1
2026-04-09 Anyscale How Notion cuts embedding costs by 80% and other stories on scaling AI with Ray… 1
2026-04-08 AssemblyAI Edge cases in transcription: Offline mode, partial audio files and API limits 2
2026-04-07 Fireworks AI Own Your AI: Fireworks Training Preview 2
2026-04-07 Socket Microsoft Releases Open Source Toolkit for AI Agent Runtime Security 1