
Run OpenAI's new GPT-OSS (open-source) model on Northflank

Blog post from Northflank

Post Details
Company: Northflank
Author: Will Stewart
Word Count: 1,156
Language: English
Summary

OpenAI has released GPT-OSS, its first open-weight large language model family since GPT-2, available under an Apache 2.0 license. The family has two models, gpt-oss-20b and gpt-oss-120b, both designed for efficient inference and strong reasoning. They are supported in Hugging Face Transformers and use a Mixture-of-Experts architecture with 4-bit quantization to reduce memory use. gpt-oss-20b is the fast, accessible option and fits on a single 16GB GPU; gpt-oss-120b performs better on complex tasks and calls for a multi-GPU setup, which can be deployed on Northflank. Northflank provides a one-click template, so users can self-host either model without setting up infrastructure and keep control over latency, cost, and privacy, with no rate limits. The release is a marked shift from closed models such as GPT-3 and GPT-4: developers can run the models locally or on their own infrastructure while keeping high performance and transparent deployment costs.
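The memory claims in the summary can be sanity-checked with rough arithmetic. The sketch below is an illustration, not a figure from the post: it assumes the nominal parameter counts implied by the model names (20 billion and 120 billion), treats every weight as stored at the same bit width, and ignores the KV cache, activations, and any layers kept at higher precision. The helper names `weight_memory_gb` and `fits_on_gpu` are hypothetical.

```python
# Rough estimate of weight memory at a given quantization width.
# Assumption: parameter counts are the nominal sizes from the model names,
# not exact counts, and all weights share one bit width.
NOMINAL_PARAMS = {"gpt-oss-20b": 20e9, "gpt-oss-120b": 120e9}


def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * bits_per_param / 8 / 1e9


def fits_on_gpu(model: str, gpu_memory_gb: float, bits_per_param: float = 4.0) -> bool:
    """True if the weights alone fit in the given GPU memory (no runtime overhead)."""
    return weight_memory_gb(NOMINAL_PARAMS[model], bits_per_param) <= gpu_memory_gb


if __name__ == "__main__":
    for name, params in NOMINAL_PARAMS.items():
        for bits in (16.0, 4.0):
            size = weight_memory_gb(params, bits)
            print(f"{name} at {bits:.0f}-bit: ~{size:.0f} GB of weights")
```

Under these assumptions, 4-bit weights for the smaller model come to roughly 10 GB, which is consistent with the single 16GB GPU claim, while the larger model's weights alone are several times the size of a 16GB card. Real deployments need extra headroom for the KV cache and activations.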