DeepMath: A lightweight math reasoning Agent with SmolAgents

Post Details

Company

Hugging Face

Date Published

Dec. 4, 2025

Author

Daniel Fleischer, Moshe Berchansky, and Moshe Wasserblat

Word Count

1,123

Company Posts That Month

48

Language

-

Hacker News Points

-

Source URL

huggingface.co/blog/intel-deepmath

Summary

DeepMath is a math reasoning agent developed by the Intel AI Software Group to make mathematical problem-solving in large language models (LLMs) more accurate and more efficient. It is built on the Qwen3-4B Thinking model and fine-tuned with Group Relative Policy Optimization (GRPO). Instead of writing out long chains of arithmetic, the model emits concise Python snippets for intermediate steps, which are executed in a secure sandbox; this reduces output length by up to 66% and often improves accuracy. Training focuses on offloading deterministic computation to the executor and on encouraging concise, computation-driven reasoning, with GRPO rewarding both correctness and brevity. Evaluated on MATH500, AIME, HMMT, and HLE, DeepMath shows the benefit of pairing an LLM with a small code executor: a more interpretable and accurate math-solving agent that does not require massive models or extensive external tools.
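The idea of GRPO rewarding both correctness and brevity can be illustrated with a small sketch. This is not the reward used by the DeepMath authors; the `reward` function, its `length_weight` and `max_tokens` parameters, and the sample data are hypothetical and chosen only for illustration. The second function shows the group-relative step that gives GRPO its name: rewards for several sampled answers to the same prompt are standardized against that group's mean and standard deviation.

```python
from statistics import mean, pstdev


def reward(is_correct: bool, num_tokens: int,
           max_tokens: int = 4096, length_weight: float = 0.2) -> float:
    """Hypothetical reward: 1.0 for a correct answer, minus a length penalty.

    The penalty grows linearly with output length and is capped at
    length_weight, so a correct answer always outscores an incorrect one.
    """
    correctness = 1.0 if is_correct else 0.0
    length_penalty = length_weight * min(num_tokens, max_tokens) / max_tokens
    return correctness - length_penalty


def group_relative_advantages(rewards: list[float], eps: float = 1e-6) -> list[float]:
    """Standardize rewards within one group of samples for the same prompt."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Four sampled answers to one prompt: (is_correct, output length in tokens).
# These values are made up for the example.
group = [(True, 1024), (True, 3072), (False, 512), (False, 4096)]
rewards = [reward(correct, tokens) for correct, tokens in group]
advantages = group_relative_advantages(rewards)

for (correct, tokens), adv in zip(group, advantages):
    print(f"correct={correct!s:5} tokens={tokens:4d} advantage={adv:+.2f}")
```

Under these assumptions, the short correct answer receives the largest advantage and the long incorrect one the smallest, so the policy update favors answers that are both right and concise.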

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
LLM	8	3,775	638	202	-32%
AI Model Fine-tuning	3	603	116	61	+8%
Reinforcement learning	1	132	49	26	-55%