Company
Date Published
Author
Clarifai
Word count
819
Language
English
Hacker News points
None

Summary

OpenAI has introduced the GPT-OSS series, comprising the gpt-oss-120b and gpt-oss-20b models, released under the Apache 2.0 license and designed for advanced reasoning, tool use, and agentic workflows. The models use a Mixture of Experts design with an extended context length of 131K tokens, and quantization allows them to run on a single 80 GB GPU. Benchmarks on NVIDIA B200 and H100 GPUs showed the B200 outperforming the H100 in several scenarios, delivering up to 15 times faster inference than a single H100 with lower power consumption and reduced complexity. Clarifai has launched the Developer Plan at a promotional price, enabling users to run these models locally and access them through a public API. The model library has expanded with new additions such as GPT-5 and Qwen3-Coder, and Ollama support has been integrated, making it easy to download and run open-source models on local machines. Clarifai has also improved its Python SDK, workflow pricing visibility, and model comparison in the Playground, making model deployment across various environments more efficient and flexible.
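
To see why quantization matters for the single-GPU claim, a rough back-of-the-envelope estimate of weight memory may help. The sketch below is illustrative only: the parameter count is the nominal figure implied by the "120b" in the model name, the 4.25 bits-per-parameter figure is an assumed stand-in for a roughly 4-bit format with scaling overhead, and the helper `weight_memory_gb` is hypothetical. It also ignores the KV cache, activations, and any layers kept at higher precision, so real usage would be higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


# Assumption: nominal parameter count taken from the "120b" in the model name.
NOMINAL_PARAMS = 120e9
GPU_MEMORY_GB = 80  # single-GPU capacity mentioned in the summary

# Assumption: illustrative precisions; 4.25 bits approximates a 4-bit
# format plus per-block scale overhead, not a confirmed specification.
PRECISIONS = [("16-bit", 16.0), ("8-bit", 8.0), ("~4-bit", 4.25)]

for label, bits in PRECISIONS:
    gb = weight_memory_gb(NOMINAL_PARAMS, bits)
    verdict = "fits" if gb < GPU_MEMORY_GB else "does not fit"
    print(f"{label}: about {gb:.1f} GB of weights, {verdict} in {GPU_MEMORY_GB} GB")
```

Under these assumptions, only the roughly 4-bit case leaves the weights below 80 GB, which is consistent with the summary's point that quantization is what makes single-GPU deployment feasible.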