GPT-4o Audio: Pricing, Context Window & Benchmarks
by OpenAI
The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs are currently not supported. Audio tokens are priced at $40 per million input and $80 per million output audio tokens.
What you can do with GPT-4o Audio
Everyday Q&A and clear explanations
Writing help (emails, posts, summaries)
Idea generation and brainstorming
Learning support with step-by-step guidance
Benchmarks not available
This model isn't listed on Artificial Analysis yet. Showing OpenRouter specs below.
| Metric | Value |
|---|---|
| Provider | OpenAI |
| Context Window | 128,000 tokens |
| Input Price | $2.50/1M tokens |
| Output Price | $10.00/1M tokens |
| Release Date | Aug 15, 2025 |
| Modalities | audio, text |
| Capabilities | Audio Input, Audio Output |
Compare GPT-4o Audio to other models
See how it stacks up on price, quality, and overall performance.
Frequently asked questions
What is GPT-4o Audio good for?
Use GPT-4o Audio for everyday tasks like writing, summarizing, brainstorming, and getting clear explanations.
How much does GPT-4o Audio cost?
Pricing is based on usage. Current rates are $2.50/1M tokens for input and $10.00/1M tokens for output.
Can I try GPT-4o Audio for free?
Yes. You can start a chat instantly and test the model before deciding on a plan.
Does GPT-4o Audio support images or audio?
GPT-4o Audio focuses on text-based tasks.
Similar models
Pricing, context, and capability data are sourced from OpenRouter.
Compare Models
Select a model to compare with GPT-4o Audio