Reinforcement learning

Reinforcement and preference learning for AI models

Mentions Over Time
Total Mentions: 652
Posts with Topic: 264
Companies Mentioning: 6
WoW Change: +26.3%
Historical Data
Week          Mentions  Posts  Companies  WoW Change
May 11, 2026  24        12     6          +26.3%
May 04, 2026  19        7      6          -40.6%
Apr 27, 2026  32        8      5          +33.3%
Apr 20, 2026  24        10     10         +166.7%
Apr 13, 2026  9         6      4          -10.0%
Apr 06, 2026  10        5      4          -54.5%
Mar 30, 2026  22        16     10         +57.1%
Mar 23, 2026  14        6      6          -74.5%
Mar 16, 2026  55        21     14         +129.2%
Mar 09, 2026  24        12     9          +14.3%
Mar 02, 2026  21        8      6          -12.5%
Feb 23, 2026  24        15     11         -31.4%
Feb 16, 2026  35        8      6          -16.7%
Feb 09, 2026  42        16     14         +100.0%
Feb 02, 2026  21        15     12         0%
Jan 26, 2026  21        10     10         0%
Jan 19, 2026  21        9      8          -56.3%
Jan 12, 2026  48        11     9          +14.3%
Jan 05, 2026  42        18     11         +50.0%
Dec 29, 2025  28        4      2          +47.4%
Dec 22, 2025  19        5      4          -42.4%
Dec 15, 2025  33        13     12         -10.8%
Dec 08, 2025  37        16     10         +37.0%
Dec 01, 2025  27        13     9          --
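
The WoW Change figures in the table appear to match the percent change in weekly mentions relative to the previous week. That is an inference from the numbers, not something the page states. A minimal Python sketch under that assumption follows; the function name `wow_change` and the `--` placeholder for a missing prior week are illustrative choices, not part of the source.

```python
from typing import Optional


def wow_change(current: int, previous: Optional[int]) -> str:
    """Format the week-over-week change in mentions as a signed percentage.

    Assumes WoW change = (current - previous) / previous * 100, which is
    inferred from the table rather than documented by the page.
    """
    # No prior week (or a zero baseline) means the change is undefined.
    if previous is None or previous == 0:
        return "--"
    pct = (current - previous) / previous * 100
    # The table shows an unchanged week as a bare "0%".
    if pct == 0:
        return "0%"
    return f"{pct:+.1f}%"


# Mentions for the four most recent weeks in the table, newest first.
mentions = [24, 19, 32, 24]
for current, previous in zip(mentions, mentions[1:]):
    print(wow_change(current, previous))
# prints +26.3%, -40.6%, +33.3% (one per line)
```

Exact ties at the second decimal place may round differently than on the page, so treat the formatting as approximate.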
Recent Blog Posts (50 posts)
Date Company Title Mentions
2026-05-15 HuggingFace Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Fronti… 1
2026-05-15 PromptLayer LLM-as-a-Judge: How Do You Know If Your AI Is Actually Good? 1
2026-05-14 Anyscale Architecting Data Pipelines for Multimodal Datasets at Scale 1
2026-05-14 HuggingFace Unlocking asynchronicity in continuous batching 1
2026-05-14 Anyscale Introducing the Anyscale Agent Skill for LLM Post-Training 9
2026-05-14 LangChain Introducing LangChain Labs 1
2026-05-14 LangChain Everything we shipped at Interrupt 1
2026-05-13 Exa How Search Quality Shapes RL Outcomes 2
2026-05-13 LangChain LangSmith LLM Gateway: runtime governance built into the agent lifecycle 1
2026-05-11 HuggingFace Building Blocks for Foundation Model Training and Inference on AWS 4
2026-05-11 HuggingFace Two Years of Local AI on a Laptop: When Open Models Outpaced Moore's Law 1
2026-05-11 Unsloth Unsloth Joins PyTorch Ecosystem 1
2026-05-09 HuggingFace "OncoAgent: A Dual-Tier Multi-Agent Framework for Privacy-Preserving Oncology C… 1
2026-05-08 Baseten Introducing the Baseten Loops SDK 1
2026-05-07 Anyscale AI agents on Ray Serve: Single to multi-agent architecture 1
2026-05-07 HuggingFace QVAC MedPsy: State-of-the-Art Medical and Healthcare Language Models for Edge D… 7
2026-05-07 Marqo Marqo vs Cimulate: Independent AI-Native Search vs Salesforce-Locked Discovery 3
2026-05-05 Confluent Integrating AI Into Apache Kafka Architectures: Patterns and Best Practices 2
2026-05-04 JetBrains PyTorch vs. TensorFlow: Choosing the Right Framework in 2026 | The PyCharm Blog 4
2026-05-02 Acceldata Enterprise Data Agents vs Traditional Monitoring Tools 1
2026-04-30 AssemblyAI Build a voice agent with a chained STT-LLM-TTS architecture 1
2026-04-29 HuggingFace Granite 4.1 LLMs: How They’re Built 15
2026-04-28 Galileo Scaling Judge Compute: The Next Frontier in AI Evaluation 1
2026-04-28 Galileo Why LLM Judges Disagree With Your Experts — and How to Fix It 7
2026-04-28 HuggingFace Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence f… 2
2026-04-27 HuggingFace OpenRA-RL: An Open Platform for AI Agents in Real-Time Strategy Games 4
2026-04-27 Couchbase What Is an AI-Powered Recommendation Engine? 1
2026-04-24 Eden AI Top Free Generative AI APIs, Open Source models, and tools 3
2026-04-24 Agora From Dark Matter to Voice AI: Deepgram’s Journey to Speech Recognition 2
2026-04-24 Anyscale Introducing Vision-Language Reinforcement Learning in SkyRL 7
2026-04-24 Datadog Introducing ARFBench: A time series question-answering benchmark based on real … 1
2026-04-23 Fireworks AI How we fixed prompt injection for all models on Fireworks 1
2026-04-23 Redis Human in the loop: Why your production AI systems need human oversight 3
2026-04-23 CopilotKit Generative UI Spectrum: How Agents Now Ship Their Own Interfaces 3
2026-04-21 Together AI Accelerate RL rollouts by up to 50% with distribution-aware speculative decoding 1
2026-04-21 Cursor Cursor partners with SpaceX on model training 1
2026-04-21 HuggingFace RL: A Structured Human Action & Intent Dataset for Physical AI and World Models 4
2026-04-20 WorkOS The OWASP Top 10 for LLM applications: What developers shipping AI features nee… 1
2026-04-17 Neo4j Build AI Agents That Make Better Decisions on GCP with Neo4j 1
2026-04-17 Neo4j Build AI Agents That Make Better Decisions on GCP with Neo4j 1
2026-04-16 HuggingFace Ecom-RLVE: Adaptive Verifiable Environments for E-Commerce Conversational Agents 4
2026-04-16 Google Cloud MaxText Expands Post-Training Capabilities: Introducing SFT and RL on Single-Ho… 3
2026-04-14 Acceldata Data Agents vs Traditional Monitoring Tools Comparison 1
2026-04-14 Arize Building smarter AI agents: architecture, evals, and lessons from the field 1
2026-04-14 HuggingFace Nucleus-Image: Scaling Text-to-Image with Sparse Mixture of Experts 1
2026-04-13 HuggingFace Releasing LiteCoder-Terminal-SFT 1
2026-04-09 Anyscale How Notion cuts embedding costs by 80% and other stories on scaling AI with Ray… 1
2026-04-08 AssemblyAI Edge cases in transcription: Offline mode, partial audio files and API limits 2
2026-04-07 Fireworks AI Own Your AI: Fireworks Training Preview 2
2026-04-07 Socket Microsoft Releases Open Source Toolkit for AI Agent Runtime Security 1