Real-time voice with Cerewro and OpenAI Realtime

Cerewro's Voice tab opens a bidirectional audio session with OpenAI's Realtime API. You speak to the AI through the microphone, hear natural spoken responses, and can run tools such as commands, file creation, or web searches after confirming them verbally.

How the voice session works

  1. Click the "Voice" tab in Cerewro
  2. Connection is established with OpenAI Realtime API via WebSocket
  3. Browser microphone captures your voice in real time
  4. OpenAI converts audio to text and generates the response
  5. Response is played as audio through your speaker
  6. If the AI proposes executing a tool, you can confirm verbally
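
The steps above map onto a small set of JSON events exchanged over the WebSocket. What follows is a minimal Python sketch, not Cerewro's actual code. The endpoint URL, the `OpenAI-Beta` header, and the event names follow OpenAI's beta Realtime interface and may differ in newer API versions, so check them against the current reference. The helper names (`session_url`, `auth_headers`, `audio_append_event`, `handle_server_event`) are hypothetical.

```python
import base64
import json
from typing import Dict, Optional

# Assumptions: URL shape and event names are taken from OpenAI's beta Realtime
# interface; the model name comes from the requirement note in this article.
REALTIME_URL = "wss://api.openai.com/v1/realtime"
MODEL = "gpt-4o-realtime-preview"


def session_url(model: str = MODEL) -> str:
    """Step 2: the WebSocket URL the client connects to."""
    return f"{REALTIME_URL}?model={model}"


def auth_headers(api_key: str) -> Dict[str, str]:
    """Step 2: headers sent with the WebSocket handshake (beta interface)."""
    return {"Authorization": f"Bearer {api_key}", "OpenAI-Beta": "realtime=v1"}


def audio_append_event(pcm16_chunk: bytes) -> str:
    """Step 3: wrap one chunk of raw microphone audio as a client event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm16_chunk).decode("ascii"),
    })


def handle_server_event(raw: str, playback: bytearray) -> Optional[str]:
    """Steps 4-6: route one server event.

    Audio deltas are decoded and appended to the playback buffer (step 5).
    If the model finished a function-call item, the tool name is returned so
    the caller can ask for verbal confirmation (step 6). Otherwise None.
    """
    event = json.loads(raw)
    kind = event.get("type")
    if kind == "response.audio.delta":
        playback.extend(base64.b64decode(event["delta"]))
        return None
    if kind == "response.output_item.done":
        item = event.get("item", {})
        if item.get("type") == "function_call":
            return item.get("name")
    return None
```

The helpers only build and parse messages, so they can be used with any WebSocket client: send `audio_append_event(chunk)` for each microphone chunk and pass every received message to `handle_server_event`.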

Voice interaction example

[YOU]: "How much free space is left on drive C?"
[CEREWRO VOICE]: "Let me check... Drive C has 47 GB free out of 512 GB total."
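
An exchange like the one above involves two pieces: deciding whether the user's spoken reply confirms the proposed tool, and running the tool to produce the answer. The sketch below illustrates one possible approach in Python under stated assumptions. The phrase lists, the function names, and the reply wording are all hypothetical and are not taken from Cerewro.

```python
import re
import shutil
from typing import Callable, Optional

# Hypothetical phrase lists; a real system would likely need more variants.
AFFIRMATIVE = ("yes", "yeah", "sure", "ok", "okay", "confirm", "go ahead")
NEGATIVE = ("no", "nope", "cancel", "stop", "don't")


def parse_confirmation(transcript: str) -> Optional[bool]:
    """Return True/False for a clear yes/no, or None if the reply is unclear."""
    text = " ".join(re.findall(r"[a-z']+", transcript.lower()))

    def starts_with(phrases) -> bool:
        return any(text == p or text.startswith(p + " ") for p in phrases)

    if starts_with(NEGATIVE):
        return False
    if starts_with(AFFIRMATIVE):
        return True
    return None


def describe_free_space(free_bytes: int, total_bytes: int, drive: str = "C") -> str:
    """Format a spoken-style answer from raw byte counts."""
    gb = 1024 ** 3
    return f"Drive {drive} has {free_bytes // gb} GB free out of {total_bytes // gb} GB total."


def free_space_tool(path: str, drive: str = "C") -> str:
    """Example tool: query the disk that holds `path`."""
    usage = shutil.disk_usage(path)
    return describe_free_space(usage.free, usage.total, drive)


def run_if_confirmed(transcript: str, tool: Callable[[], str]) -> str:
    """Run the tool only after an explicit verbal yes."""
    decision = parse_confirmation(transcript)
    if decision is None:
        return "Please say yes or no."
    return tool() if decision else "Cancelled."
```

For instance, `run_if_confirmed("Yes, go ahead.", lambda: free_space_tool("C:\\"))` would run the disk query on Windows, while an unclear reply asks the user again rather than guessing.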

Voice interface advantages

Advantage       Description
Hands-free      Ideal when you're working on another process
Speed           Faster than typing complex queries
Natural         You can interrupt and rephrase like a real conversation
Accessibility   Easier for users with typing difficulties

Requirement: The voice interface needs microphone permission in the browser and an OpenAI API key, configured in your Cerewro installation, with access to the gpt-4o-realtime-preview model.
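
One way to verify the API key part of this requirement outside Cerewro is to ask OpenAI's models endpoint whether the key can see the model. The Python sketch below is an illustration under assumptions: the function names are hypothetical, and how Cerewro itself stores or validates the key is not described here. Microphone permission has to be checked in the browser and is not covered.

```python
import urllib.error
import urllib.request

MODEL = "gpt-4o-realtime-preview"


def model_check_request(api_key: str, model: str = MODEL) -> urllib.request.Request:
    """Build a GET request for OpenAI's 'retrieve model' endpoint."""
    return urllib.request.Request(
        f"https://api.openai.com/v1/models/{model}",
        headers={"Authorization": f"Bearer {api_key}"},
    )


def has_realtime_access(api_key: str, model: str = MODEL) -> bool:
    """Return True if the key can retrieve the model, False on an HTTP error.

    Network failures (urllib.error.URLError) are not caught here, so the
    caller can tell "no access" apart from "could not reach the API".
    """
    try:
        with urllib.request.urlopen(model_check_request(api_key, model), timeout=10) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # Typically 401 (bad key) or 404 (model not visible to this key).
        return False
```

A typical call would be `has_realtime_access(os.environ["OPENAI_API_KEY"])`, assuming the key is also available in that conventional environment variable.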