OpenAI launches new voice intelligence features in its API

Published: May 7, 2026 at 06:24 PM EDT
2 min read
Source: TechCrunch

New Voice Models

GPT‑Realtime‑2

  • A voice model that generates realistic speech and can hold a spoken conversation with users.
  • Incorporates GPT‑5‑class reasoning to handle more complex user requests, improving on its predecessor (GPT‑Realtime‑1.5).

GPT‑Realtime‑Translate

  • Provides real‑time translation services that “keep pace” with the user in a conversational flow.
  • Supports 70+ input languages (languages it can understand) and 13 output languages (languages it can speak).

GPT‑Realtime‑Whisper

  • Offers live speech‑to‑text capabilities, capturing spoken words as interactions occur.

“Together, the models we are launching move real‑time audio from simple call‑and‑response toward voice interfaces that can actually do work: listen, reason, translate, transcribe, and take action as a conversation unfolds,” the company said.

Potential Use Cases

The new models could be useful for:

  • Customer service
  • Education platforms
  • Media production
  • Event management
  • Creator platforms
  • Other applications that benefit from real‑time voice interaction

Safety Measures

OpenAI has implemented guardrails to prevent misuse, such as spam, fraud, or other forms of online abuse. Specific triggers can halt conversations that violate the company’s harmful‑content guidelines.

Availability and Pricing

All new voice models are available through OpenAI’s Realtime API.

  • GPT‑Realtime‑Translate and GPT‑Realtime‑Whisper are billed by the minute.
  • GPT‑Realtime‑2 is billed based on token consumption.
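
To illustrate how the two billing schemes differ, here is a minimal Python sketch of a cost estimate. The article gives no prices, so every rate below is a made-up placeholder, and the function names are hypothetical. The sketch also assumes minute-based billing is proportional to audio duration, with no rounding up to whole minutes, which the source does not confirm.

```python
# Rough cost estimate for the two billing schemes described above.
# All rates are hypothetical placeholders, NOT OpenAI's actual prices.

def minute_billed_cost(audio_seconds: float, usd_per_minute: float) -> float:
    """Cost for a minute-billed model (e.g. translation or transcription).

    Assumes proportional billing with no rounding to whole minutes.
    """
    if audio_seconds < 0 or usd_per_minute < 0:
        raise ValueError("duration and rate must be non-negative")
    return audio_seconds / 60.0 * usd_per_minute


def token_billed_cost(tokens: int, usd_per_million_tokens: float) -> float:
    """Cost for a token-billed model, given total tokens consumed."""
    if tokens < 0 or usd_per_million_tokens < 0:
        raise ValueError("token count and rate must be non-negative")
    return tokens / 1_000_000 * usd_per_million_tokens


if __name__ == "__main__":
    # Placeholder inputs: a 90-second call and 500,000 tokens.
    print(f"minute-billed: ${minute_billed_cost(90, 0.10):.2f}")
    print(f"token-billed:  ${token_billed_cost(500_000, 4.00):.2f}")
```

The practical difference is that minute-billed cost depends only on how long the audio runs, while token-billed cost grows with how much the model processes and generates, so a short but reasoning-heavy exchange could cost more than its duration suggests.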