NVIDIA DGX Spark performance

Post Details

Company

Ollama

Date Published

Oct. 23, 2025

Author

-

Word Count

400

Company Posts That Month

6

Language

-

Hacker News Points

-

Source URL

ollama.com/blog/nvidia-spark-performance

Summary

Ollama benchmarked the NVIDIA DGX Spark on the latest firmware with Ollama version 0.12.6, covering OpenAI's gpt-oss models as well as gemma3, llama3.1, and deepseek-r1. Each test asked the model to summarize "A Tale of Two Cities" with a 500-token limit, caching disabled, and temperature set to 0, and each test was repeated ten times. Token processing speeds varied with model size and quantization level. The gpt-oss models from OpenAI were run through Ollama, which retains the intended BF16 attention layers. The DGX Spark firmware can be updated through the DGX Dashboard or the CLI, which requires an Ubuntu distribution upgrade, after which Ollama can be installed to run models. OpenAI's Codex can also be installed and used with Ollama. The DGX Spark's substantial VRAM capacity lets it run large models such as gpt-oss-120b.
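
The methodology above can be sketched in Python as a rough illustration, not the script Ollama actually used. It assumes the timing fields that Ollama's `/api/generate` endpoint returns (`prompt_eval_count`, `prompt_eval_duration`, `eval_count`, `eval_duration`, with durations in nanoseconds) and assumes the 500-token limit applies to generated output via the `num_predict` option. The helper names and the sample numbers are hypothetical, and the step that disables caching is not shown.

```python
from statistics import mean


def build_request(model: str, prompt: str) -> dict:
    """Build a JSON body for Ollama's POST /api/generate endpoint."""
    return {
        "model": model,
        "prompt": prompt,
        "stream": False,  # ask for a single response object with timing fields
        "options": {
            "temperature": 0,    # temperature set to zero, as in the tests
            "num_predict": 500,  # assumption: the 500-token limit caps output
        },
    }


def tokens_per_second(count: int, duration_ns: int) -> float:
    """Convert a token count and a nanosecond duration to tokens per second."""
    return count / (duration_ns / 1e9)


def summarize_runs(responses: list[dict]) -> dict:
    """Average prompt-processing and generation speed over repeated runs."""
    prefill = [
        tokens_per_second(r["prompt_eval_count"], r["prompt_eval_duration"])
        for r in responses
    ]
    decode = [
        tokens_per_second(r["eval_count"], r["eval_duration"])
        for r in responses
    ]
    return {"prefill_tps": mean(prefill), "decode_tps": mean(decode)}


# Made-up response values for illustration only, not measured results.
sample_response = {
    "prompt_eval_count": 2000,
    "prompt_eval_duration": 1_000_000_000,  # 1 s in nanoseconds
    "eval_count": 500,
    "eval_duration": 10_000_000_000,  # 10 s in nanoseconds
}
result = summarize_runs([sample_response] * 10)  # ten repetitions
```

In a real run, each body from `build_request` would be sent to a local Ollama server ten times and the ten response objects passed to `summarize_runs`.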
