Foundation model pricing
GPT-5.4 series
| Model | Input (per million tokens) | Cached input (per million tokens) | Output (per million tokens) |
|---|---|---|---|
| GPT-5.4 (<272k context length) Global | $2.50 | $0.25 | $15 |
| GPT-5.4 (>272k context length) Global | $5 | $0.50 | $22.50 |
| GPT-5.4 Pro (<272k context length) Global | $30 | — | $180 |
| GPT-5.4 Pro (>272k context length) Global | $60 | — | $270 |
| GPT-5.4 Mini Global | $0.75 | $0.075 | $4.50 |
GPT-5.3 series
| Model | Input (per million tokens) | Cached input (per million tokens) | Output (per million tokens) |
|---|---|---|---|
| GPT-5.3 Codex Global | $1.75 | $0.175 | $14 |
GPT-5.2 series
| Model | Input (per million tokens) | Cached input (per million tokens) | Output (per million tokens) |
|---|---|---|---|
| GPT-5.2 Codex Global | $1.75 | $0.175 | $14 |
Gemini 3.1
| Model | Type | ≤200K input (per million tokens) | >200K input (per million tokens) | ≤200K cache (per million tokens) | >200K cache (per million tokens) |
|---|---|---|---|---|---|
| Gemini 3.1 Pro Preview | Input (text, image, video, audio) | $2 | $4 | $0.2 | $0.4 |
| Gemini 3.1 Pro Preview | Text output (response and reasoning) | $12 | $18 | — | — |
| Gemini 3.1 Flash Lite Preview | Input (text, image, video) | $0.25 | $0.25 | $0.025 | $0.025 |
| Gemini 3.1 Flash Lite Preview | Input (audio) | $0.5 | $0.5 | $0.05 | $0.05 |
| Gemini 3.1 Flash Lite Preview | Text output (response and reasoning) | $1.5 | $1.5 | — | — |
Gemini 3
| Model | Type | ≤200K input (per million tokens) | >200K input (per million tokens) | ≤200K cache (per million tokens) | >200K cache (per million tokens) |
|---|---|---|---|---|---|
| Gemini 3 Flash Preview | Input (text, image, video) | $0.5 | $0.5 | $0.05 | $0.05 |
| Gemini 3 Flash Preview | Input (audio) | $1 | $1 | $0.1 | $0.1 |
| Gemini 3 Flash Preview | Text output (response and reasoning) | $3 | $3 | — | — |
Gemini 2.5
| Model | Type | ≤200K input (per million tokens) | >200K input (per million tokens) | ≤200K cache (per million tokens) | >200K cache (per million tokens) |
|---|---|---|---|---|---|
| Gemini 2.5 Flash | Input (text, image, video) | $0.3 | $0.3 | $0.03 | $0.03 |
| Gemini 2.5 Flash | Input (audio) | $1 | $1 | $0.1 | $0.1 |
| Gemini 2.5 Flash | Text output (response and reasoning) | $2.5 | $2.5 | — | — |
| Gemini 2.5 Flash Lite | Input (text, image, video) | $0.1 | $0.1 | $0.01 | $0.01 |
| Gemini 2.5 Flash Lite | Input (audio) | $0.3 | $0.3 | $0.03 | $0.03 |
| Gemini 2.5 Flash Lite | Text output (response and reasoning) | $0.4 | $0.4 | — | — |
Note
- GPT-5.4 and GPT-5.4 Pro use tiered pricing, split at 272K input tokens per request; GPT-5.4 Mini and the other GPT models are flat-rate.
- Gemini 3.1 Pro Preview uses tiered pricing, split at 200K input tokens per request; the other Gemini models are flat-rate, so their price is repeated in both columns for ease of comparison. There is no daily limit, and the only other restriction is each model's native context window.
- Cached Input pricing applies when prompt prefixes hit the automatic cache.
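
The notes above can be turned into a rough per-request cost estimate. The sketch below is illustrative only: the names `Rates`, `TieredModel`, and `request_cost` are hypothetical, and it assumes that (1) the tier is chosen by the request's total input tokens, cached and uncached together, (2) the whole request is billed at that tier's rates rather than marginally, and (3) a request exactly at the threshold falls in the lower tier. The tables do not settle these points, so confirm them against your invoice before relying on the numbers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rates:
    """Prices in USD per million tokens."""
    input: float
    output: float
    cached_input: Optional[float] = None  # None where the table lists no cached rate


@dataclass(frozen=True)
class TieredModel:
    threshold: int  # input tokens per request at which the split occurs
    low: Rates      # rates at or below the threshold (assumption: inclusive)
    high: Rates     # rates above the threshold


# Rates copied from the tables above.
GPT_5_4 = TieredModel(
    threshold=272_000,
    low=Rates(input=2.50, cached_input=0.25, output=15.00),
    high=Rates(input=5.00, cached_input=0.50, output=22.50),
)
GEMINI_3_1_PRO = TieredModel(
    threshold=200_000,
    low=Rates(input=2.00, cached_input=0.20, output=12.00),
    high=Rates(input=4.00, cached_input=0.40, output=18.00),
)


def request_cost(model: TieredModel, input_tokens: int, output_tokens: int,
                 cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request under the assumptions stated above."""
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    # Assumption: total input size (cached + uncached) selects the tier.
    rates = model.low if input_tokens <= model.threshold else model.high
    if cached_tokens and rates.cached_input is None:
        raise ValueError("no cached-input rate is listed for this model")
    uncached_tokens = input_tokens - cached_tokens
    total = uncached_tokens * rates.input + output_tokens * rates.output
    if cached_tokens:
        total += cached_tokens * rates.cached_input
    return total / 1_000_000


if __name__ == "__main__":
    # 100K input (60K of it served from cache), 2K output, lower tier.
    print(f"{request_cost(GPT_5_4, 100_000, 2_000, cached_tokens=60_000):.3f}")  # 0.145
    # 300K input, 2K output, upper tier.
    print(f"{request_cost(GPT_5_4, 300_000, 2_000):.3f}")  # 1.545
```

For a flat-rate model, the same function works if `low` and `high` are given identical rates, which mirrors how the Gemini tables repeat the price in both columns.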
