Ollama and OpenAI have collaborated to introduce two open-weight models, gpt-oss-20B and gpt-oss-120B, aimed at local chat and a range of developer use cases, including reasoning and agentic tasks. The models support agentic capabilities such as function calling and web browsing, offer configurable reasoning effort, and can be fine-tuned for specific use cases. They are released under the permissive Apache 2.0 license, which allows free experimentation and commercial deployment. Both models are quantized in the MXFP4 format to reduce their memory footprint: the smaller model runs on systems with 16GB of memory, and the larger model fits on a single 80GB GPU. Ollama has developed new kernels to support this format. NVIDIA and Ollama are also working together to optimize performance on NVIDIA GeForce RTX and RTX PRO GPUs. The gpt-oss models can be accessed through the Ollama app or from the terminal.
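
To make the memory figures concrete, here is a rough back-of-the-envelope sketch of the weight footprint under MXFP4. It relies on several assumptions that are not stated in the text above: that MXFP4 stores 4-bit elements in blocks of 32 sharing one 8-bit scale (about 4.25 bits per parameter), that every parameter is quantized this way, and that the parameter counts are the nominal 20 billion and 120 billion implied by the model names. It also ignores activations, the KV cache, and runtime overhead, so real usage will differ.

```python
# Rough MXFP4 weight-footprint estimate (a sketch, not a measurement).
# Assumed layout: 4-bit elements, blocks of 32, one shared 8-bit scale per block.
ELEMENT_BITS = 4
SCALE_BITS = 8
BLOCK_SIZE = 32


def mxfp4_bits_per_param() -> float:
    """Average storage cost per parameter, including the shared block scale."""
    return ELEMENT_BITS + SCALE_BITS / BLOCK_SIZE


def weight_footprint_gb(num_params: float) -> float:
    """Weight storage in decimal gigabytes, assuming every parameter is MXFP4."""
    total_bits = num_params * mxfp4_bits_per_param()
    return total_bits / 8 / 1e9


if __name__ == "__main__":
    # Nominal parameter counts taken from the model names (an assumption),
    # paired with the memory budgets quoted in the text.
    models = [
        ("gpt-oss-20B", 20e9, 16),
        ("gpt-oss-120B", 120e9, 80),
    ]
    for name, params, budget_gb in models:
        size_gb = weight_footprint_gb(params)
        print(f"{name}: ~{size_gb:.1f} GB of weights vs. a {budget_gb} GB budget")
```

Under these assumptions the weights alone come to roughly 10.6 GB and 63.8 GB, which leaves some headroom within the 16GB and 80GB budgets for everything the estimate leaves out.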