In August 2025, OpenAI quietly did something it hasn’t done in over five years — it gave the world a free, downloadable GPT.
Called GPT-OSS, this “open-weight” model comes in two sizes — a lighter 20B version that can run on laptops or cloud servers, and a 120B powerhouse for enterprise-level work. Unlike ChatGPT, GPT-OSS runs entirely on your own infrastructure, keeping your data private while letting you customise the model for your exact needs.
In this guide, you’ll learn exactly how to download GPT-OSS, set it up, and put it to work — whether you’re a developer, startup, or business leader. We’ll also cover benchmarks, monetisation opportunities, and why this release could reshape how companies use AI in 2025.
"One of the things that is unique about open models is that people can run them locally. People can run them behind their own firewall, on their own infrastructure," OpenAI co-founder Greg Brockman
An open-weight model is a large language model (LLM) whose trained parameters (“weights”) are released to the public. This allows anyone to:
This contrasts with closed models like ChatGPT or Claude, where the model runs on the provider’s servers and you access it only via an API (Application Programming Interface), sending your data to a black box.
GPT‑OSS is OpenAI’s new family of open-weight large language models, released under the Apache 2.0 license. That means you can run them locally, customise them, and use them commercially.
There are two versions: gpt-oss-20b, a lighter model that can run on a high-VRAM laptop or workstation, and gpt-oss-120b, a larger model aimed at data-centre GPUs.
Unlike GPT‑4 or GPT-3.5, you don’t need to send any data to OpenAI. You can download the models and run them behind your firewall.
While GPT-4 and ChatGPT are powerful, they’re locked behind OpenAI’s servers and usage fees. GPT-OSS changes the game:
For companies seeking a GPT-4 alternative with full control and no vendor lock-in, GPT-OSS is a strong contender.
You can download GPT-OSS directly from OpenAI’s official channels — the openai/gpt-oss GitHub repository and the model pages on Hugging Face:
Steps:
(Tip: Search “download OpenAI GPT” to find official release notes and mirrors.)
Whether you want to run GPT-OSS locally or on a cloud server, the setup process is straightforward:
Local Deployment (Windows/Mac/Linux)
Cloud Deployment (AWS, Azure, GCP)
This makes GPT-OSS one of the easiest self-hosted AI models for 2025.
This walkthrough is kept simple and non-technical, so someone with no coding experience can follow it to install and run GPT-OSS locally.
Tip: Start with 20B unless you have a very powerful PC or workstation.
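To make that tip concrete, here is a small sizing helper. It is a hypothetical sketch, not an official tool: it assumes roughly 0.5 bytes per parameter at 4-bit quantization and leaves ~50% headroom for activations and the KV cache, so treat the numbers as rough guidance only.

```python
def recommended_model(memory_gb: float) -> str:
    """Suggest a GPT-OSS variant for the available (V)RAM, in GB.

    Illustrative arithmetic: at 4-bit quantization, weights need roughly
    params * 0.5 bytes, plus headroom for activations and KV cache.
    """
    weights_20b_gb = 20e9 * 0.5 / 1e9    # ~10 GB of weights at 4-bit
    weights_120b_gb = 120e9 * 0.5 / 1e9  # ~60 GB of weights at 4-bit
    headroom = 1.5  # leave ~50% extra for activations / KV cache

    if memory_gb >= weights_120b_gb * headroom:
        return "gpt-oss-120b"
    if memory_gb >= weights_20b_gb * headroom:
        return "gpt-oss-20b"
    return "neither (consider a cloud instance)"

print(recommended_model(16))  # a typical modern laptop
print(recommended_model(96))  # a workstation or data-centre GPU
```

A 16 GB laptop lands on the 20B model, which matches the tip above; the 120B model only becomes comfortable at data-centre memory sizes.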
You have three ways to run GPT-OSS locally.
The easiest options are Ollama or LM Studio (both work on Mac and Windows).
Extra: Ollama has an optional web search function (requires a free Ollama account). This feature may be slow right now because the model just launched.
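Once Ollama is serving the model, you can script it from Python. A minimal sketch, assuming Ollama’s default local endpoint at `http://localhost:11434/api/chat` and the model tag `gpt-oss:20b` — verify the exact tag on your install with `ollama list`:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/chat"  # Ollama's default endpoint
MODEL_TAG = "gpt-oss:20b"  # assumption: check `ollama list` for your tag

def build_request(prompt: str) -> dict:
    """Assemble a chat request body for Ollama's /api/chat endpoint."""
    return {
        "model": MODEL_TAG,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # one complete response instead of streamed chunks
    }

def ask(prompt: str) -> str:
    """Send the prompt to the local Ollama server (requires Ollama running)."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["message"]["content"]

# Usage, once `ollama run gpt-oss:20b` has pulled the model:
#   print(ask("Summarise GPT-OSS in one sentence."))
```

Because everything stays on `localhost`, no prompt or response ever leaves your machine — which is the whole point of running GPT-OSS locally.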
"In the long term, open source will be more cost-effective... because you're not paying for the additional cost of IP and development." — Andrew Jardine, Hugging Face
For business AI use cases, GPT-OSS allows deep customisation and cost savings.
Knolli lets you turn GPT-OSS into a monetisable AI co-pilot:
Creators and companies can earn revenue by offering specialised GPT-OSS-powered tools to their audiences.
✅ Quantization (e.g., MXFP4, QLoRA) makes big models usable on smaller GPUs.
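The quantization claim is easy to sanity-check with arithmetic: weight memory is roughly parameter count times bytes per parameter. A rough sketch (weights only — runtime overhead for activations and the KV cache comes on top):

```python
def weight_memory_gb(params_billions: float, bits_per_param: float) -> float:
    """Approximate memory for model weights alone, in gigabytes."""
    return params_billions * 1e9 * bits_per_param / 8 / 1e9

for params in (20, 120):
    for label, bits in (("fp16", 16), ("8-bit", 8), ("4-bit (MXFP4-like)", 4)):
        print(f"{params}B @ {label}: ~{weight_memory_gb(params, bits):.0f} GB")
```

At fp16 the 120B model’s weights alone would need ~240 GB; at 4-bit that drops to ~60 GB, which is why quantization is what makes these models fit on far smaller hardware.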
OpenAI released GPT‑OSS without a monetization plan — no upsells, no hosted version. Why?
Some speculate it’s also a hedge against regulation: if they release weights, they sidestep closed-model scrutiny.
You can fine-tune a model using:
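Whichever toolkit you pick, the reason parameter-efficient methods such as LoRA/QLoRA are feasible is simple arithmetic: instead of updating a full d×k weight matrix, LoRA trains two low-rank factors of shape d×r and r×k, adding only r·(d+k) trainable parameters per matrix. A sketch of the savings, using illustrative layer sizes rather than GPT-OSS’s actual architecture:

```python
def lora_params(d: int, k: int, r: int) -> int:
    """Trainable parameters LoRA adds for one d x k weight matrix at rank r."""
    return r * (d + k)

d = k = 4096   # a typical transformer projection size (illustrative)
r = 16         # a commonly used LoRA rank

full = d * k
lora = lora_params(d, k, r)
print(f"full fine-tune: {full:,} params per matrix")
print(f"LoRA rank {r}:  {lora:,} params ({100 * lora / full:.2f}% of full)")
```

Training well under 1% of the parameters per matrix is what lets a single consumer GPU fine-tune a model of this size.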
GPT‑OSS changes the game. You don’t need to rent AI anymore; you can own it.
This is the dawn of a new phase where startups, nonprofits, governments, and solo developers can:
Open-weight AI models (20B & 120B) released by OpenAI under the Apache 2.0 license.
Open-weight = model weights are released.
Open-source = code and sometimes data are released too.
All open-source models are open-weight, but not vice versa.
Yes, OpenAI released it under Apache 2.0, which is highly permissive. You can:
Meta’s Llama family is the most prominent open-weight alternative, with Llama 3 and its successors for general use and Code Llama for coding tasks.
Yes, if:
Yes — no internet, no OpenAI account needed.
No — but it approaches OpenAI’s o4-mini on many benchmarks, beats GPT-3.5 on some, and is open.
Yes, you only pay for compute.
Yes, both the 20B and 120B versions are free to download and run.
From OpenAI’s official GitHub repository or its Hugging Face model pages (see the download section above).
For many use cases, yes, especially when privacy, cost, and customisation matter.
The 20B model can run on a high-VRAM laptop or a small cloud instance; the 120B requires data-centre GPUs.