OpenAI: GPT Audio

The gpt-audio model is OpenAI's first generally available audio model. The new snapshot features an upgraded decoder that produces more natural-sounding voices and keeps voice output more consistent. Audio is priced at $32 per million input tokens and $64 per million output tokens.

Providers: 1
Released: Jan 20, 2026
Input Modalities: audio, text
Output Modalities: audio, text

Available Providers (1)

Provider      Model ID          Input Cost   Output Cost   Context  Max Output
Kilo Gateway  openai/gpt-audio  $2.50/MTok   $10.00/MTok   128K     16.4K
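
The listing gives two sets of rates: the audio rates in the description ($32 and $64 per million tokens) and the provider rates in the table ($2.50 and $10.00 per million tokens). As a rough sketch of how a per-request cost could be estimated from them, the snippet below assumes the table rates apply to text tokens (the page does not say so explicitly) and that each modality is billed at its own rate. The helper name `estimate_cost` and the token counts are hypothetical and used only for illustration.

```python
# Rates in USD per million tokens, copied from the listing above.
# Assumption: the table's $2.50 / $10.00 figures are text-token rates;
# the page does not state this explicitly.
RATES_PER_MTOK = {
    "audio": {"input": 32.00, "output": 64.00},
    "text": {"input": 2.50, "output": 10.00},
}


def estimate_cost(modality: str, input_tokens: int, output_tokens: int) -> float:
    """Return an estimated cost in USD for one request (hypothetical helper)."""
    rates = RATES_PER_MTOK[modality]
    total = input_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


# Example token counts are made up for illustration.
audio_cost = estimate_cost("audio", input_tokens=50_000, output_tokens=20_000)
text_cost = estimate_cost("text", input_tokens=50_000, output_tokens=20_000)
print(f"audio: ${audio_cost:.2f}, text: ${text_cost:.3f}")
```

Under these assumptions, the same token counts cost far more as audio than as text, so it may be worth estimating the two modalities separately when budgeting.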

Capabilities

Reasoning
Tool Calling
Attachments
Open Weights
Structured Output