Company:
Date Published:
Author: Parth Shisode
Word count: 1253
Language: English
Hacker News points: None

Summary

Peter Belcak, an AI researcher at NVIDIA, argues in his paper that small language models (SLMs) are the future of agentic AI because of their efficiency and cost-effectiveness on the specific tasks that agentic systems perform. The paper defines SLMs as models with fewer than 10 billion parameters and contends that they can be as effective as large language models (LLMs) on tasks such as tool-calling, structured reasoning, and code orchestration, while offering significant cost and efficiency benefits. Belcak suggests a pragmatic migration workflow: first map an agent's tasks with a large model, then introduce specialized SLMs for specific jobs and fine-tune them iteratively for quality and efficiency. The paper emphasizes that heterogeneous systems, which combine SLMs and LLMs, are best suited to achieving strong results in agentic AI, particularly in environments where resource optimization is crucial. It also highlights the potential of SLMs to enable edge and on-device deployments, improving privacy and reducing latency for light-duty language components while reserving more complex tasks for larger models.
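
To make the heterogeneous-system idea concrete, here is a minimal sketch of how an agent might route tasks between a small and a large model. The model names, the TASK_ROUTES table, and the dispatch_task() and call_model() helpers are illustrative assumptions for this summary, not APIs or code from Belcak's paper.

```python
# Sketch of heterogeneous SLM/LLM routing, assuming light-duty jobs
# (tool-calling, structured output) go to a fine-tuned SLM and
# open-ended reasoning falls back to a larger model.

from dataclasses import dataclass
from typing import Callable

@dataclass
class AgentTask:
    kind: str    # e.g. "tool_call", "structured_output", "open_ended_reasoning"
    prompt: str

# Hypothetical routing table; model identifiers are placeholders.
TASK_ROUTES = {
    "tool_call": "slm-7b-toolcalling",       # assumed fine-tuned SLM
    "structured_output": "slm-7b-json",      # assumed fine-tuned SLM
    "open_ended_reasoning": "llm-frontier",  # assumed large hosted model
}

def dispatch_task(task: AgentTask, call_model: Callable[[str, str], str]) -> str:
    """Route a task to the cheapest model expected to handle it well."""
    model = TASK_ROUTES.get(task.kind, "llm-frontier")  # default to the LLM
    return call_model(model, task.prompt)

if __name__ == "__main__":
    # Stub backend so the sketch runs without any inference service.
    def fake_call_model(model: str, prompt: str) -> str:
        return f"[{model}] handled: {prompt[:40]}"

    print(dispatch_task(AgentTask("tool_call", "look up weather in Paris"), fake_call_model))
    print(dispatch_task(AgentTask("open_ended_reasoning", "plan a research report"), fake_call_model))
```

In a real deployment, the routing table would be derived from the task-mapping step described above, with each SLM fine-tuned on the traces collected while the large model handled that job.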