
Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action

Blog post from HuggingFace

Post Details
Company: HuggingFace
Date Published: -
Author: Asawaree and Atharva Joshi
Word Count: 1,960
Language: -
Hacker News Points: -
Summary

NVIDIA Cosmos 3, now available on Hugging Face, is a world foundation model (WFM) for physical AI that unifies world generation, physical reasoning, and action generation in a single omni-model. Built on a Mixture-of-Transformers architecture, it consolidates capabilities previously handled by separate models: generating realistic video worlds, reasoning about physical properties, and predicting future sequences. It targets robotics, autonomous vehicles, and smart spaces, where a system must understand and simulate complex physical environments. The model comes in two versions: Cosmos 3 Nano, optimized for efficient inference, and Cosmos 3 Super, intended for large-scale synthetic data generation and research. Integration with the Hugging Face Diffusers library lets Cosmos 3 slot into existing pipelines, and the model supports various input and output modalities. The launch is accompanied by Synthetic Data Generation datasets and post-training resources for training and evaluating physical AI systems.