Training Smart Turn on the NVIDIA DGX Spark™

Post Details

Company

Daily

Date Published

Feb. 10, 2026

Author

Marcus

Word Count

960

Language

English

Hacker News Points

-

Source URL

www.daily.co/blog/training-smart-turn-on-the-nvidia-dgx-spark

Summary

The NVIDIA DGX Spark is a compact AI supercomputer built for AI inference and training. Its Arm CPU and NVIDIA Blackwell CUDA cores share 128GB of unified memory, which lets it handle larger models than typical consumer GPUs can. The article describes training the open-source Smart Turn model on the DGX Spark, a job previously run on x86_64 devices, and notes that certain library dependencies had to be compiled for the Spark's Arm architecture. Training performance on the Spark, with batch sizes adjusted to take advantage of its large memory, was comparable to that of traditional GPUs such as the NVIDIA L4 and RTX 5060 Ti. Smart Turn is a small model and is not memory-limited, but the DGX Spark's unified memory offers significant advantages for training larger models and more demanding configurations. The existing training scripts needed only minimal changes, and the process is expected to become even simpler with future software updates.
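
The two adjustments the summary mentions, detecting an Arm host and raising the batch size to use more memory, could be sketched roughly as follows. This is only an illustration, not the article's training script: the function names, the linear scaling rule, the headroom factor, and every number in the usage example are hypothetical, since the summary does not state the batch sizes or memory figures actually used.

```python
import platform
from typing import Optional


def is_arm64(machine: Optional[str] = None) -> bool:
    """Return True when the host reports a 64-bit Arm architecture.

    Pass `machine` explicitly to check a known string; otherwise the
    value reported by the running interpreter is used.
    """
    name = (machine if machine is not None else platform.machine()).lower()
    return name in {"aarch64", "arm64"}


def scale_batch_size(
    base_batch: int,
    base_mem_gb: float,
    target_mem_gb: float,
    headroom: float = 0.8,
) -> int:
    """Scale a batch size linearly with available memory.

    Assumption: memory use grows roughly linearly with batch size.
    `headroom` keeps a fraction of the target memory free for the OS
    and other processes, which matters when CPU and GPU share memory.
    """
    if base_batch <= 0 or base_mem_gb <= 0 or target_mem_gb <= 0:
        raise ValueError("batch size and memory sizes must be positive")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in the range (0, 1]")
    scaled = int(base_batch * (target_mem_gb / base_mem_gb) * headroom)
    return max(1, scaled)


if __name__ == "__main__":
    # Hypothetical values: a batch of 32 tuned for a 16GB card,
    # rescaled for 128GB of unified memory.
    if is_arm64():
        print("Arm host: some dependencies may need building from source")
    print("suggested batch size:", scale_batch_size(32, 16, 128))
```

A larger batch does not by itself speed up training, and because Smart Turn is not memory-limited, a rule like this matters mostly for bigger models; any scaled value would still need to be validated against real memory usage and convergence behaviour.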