Real-time voice with Cerewro and OpenAI Realtime

Cerewro's Voice tab opens a bidirectional audio session with OpenAI's Realtime API. You speak to the AI through the microphone, hear natural spoken responses, and can run tools such as commands, file creation, or web searches after confirming verbally.

How the voice session works

1. Click the "Voice" tab in Cerewro.
2. Cerewro connects to the OpenAI Realtime API over a WebSocket.
3. The browser captures your voice from the microphone in real time.
4. The Realtime model interprets your speech and generates a response.
5. The response is played back as audio through your speaker.
6. If the AI proposes running a tool, you can confirm verbally.
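
The steps above can be sketched as the messages that travel over the WebSocket. This is a minimal Python sketch under stated assumptions, not Cerewro's actual implementation: the endpoint, the OpenAI-Beta header, and the event names (input_audio_buffer.append, response.audio.delta) follow OpenAI's Realtime beta documentation and should be checked against the current API reference, and all function names are hypothetical.

```python
import base64
import json

MODEL = "gpt-4o-realtime-preview"  # model named in this article's requirement


def realtime_url(model: str = MODEL) -> str:
    """WebSocket endpoint for a Realtime session (assumed from OpenAI's beta docs)."""
    return f"wss://api.openai.com/v1/realtime?model={model}"


def auth_headers(api_key: str) -> dict:
    """Headers sent when opening the WebSocket (beta header is an assumption)."""
    return {"Authorization": f"Bearer {api_key}", "OpenAI-Beta": "realtime=v1"}


def audio_append_event(pcm_chunk: bytes) -> str:
    """Wrap a chunk of captured microphone audio as a JSON client event."""
    return json.dumps({
        "type": "input_audio_buffer.append",
        "audio": base64.b64encode(pcm_chunk).decode("ascii"),
    })


def audio_from_server_event(raw: str) -> bytes:
    """Return the audio bytes carried by a server event, or b'' for other events."""
    event = json.loads(raw)
    if event.get("type") == "response.audio.delta":
        return base64.b64decode(event["delta"])
    return b""
```

A client would send one audio_append_event per captured chunk and pass each incoming message through audio_from_server_event, queuing any non-empty result for playback.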

Voice interaction example

[YOU]: "How much free space is left on drive C?"
[CEREWRO VOICE]: "Let me check... Drive C has 47 GB free out of 512 GB total."
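
Behind an exchange like this, the assistant requests a tool and the tool runs only once you have confirmed. The sketch below shows one way such a gate could work, assuming the confirmation arrives as a text transcript. The keyword lists and function names are hypothetical; Cerewro's real confirmation logic is not documented here.

```python
import re

# Hypothetical keyword lists; a refusal always wins over a confirmation.
DENY = re.compile(r"\b(no|nope|cancel|stop|don't)\b")
CONFIRM = re.compile(r"\b(yes|yeah|sure|confirm|go ahead|do it)\b")


def classify_confirmation(transcript: str) -> str:
    """Classify a spoken reply as 'confirm', 'deny', or 'unclear'."""
    text = transcript.lower()
    if DENY.search(text):
        return "deny"
    if CONFIRM.search(text):
        return "confirm"
    return "unclear"


def describe_free_space(free_bytes: int, total_bytes: int, drive: str = "C") -> str:
    """Format a disk-usage result as a sentence suitable for speech."""
    gb = 10**9
    return (f"Drive {drive} has {free_bytes // gb} GB free "
            f"out of {total_bytes // gb} GB total.")


def run_if_confirmed(transcript: str, tool, *args):
    """Run the tool only on an explicit confirmation; otherwise return None."""
    if classify_confirmation(transcript) == "confirm":
        return tool(*args)
    return None
```

In a real tool, the byte counts could come from Python's shutil.disk_usage, which returns total, used, and free space for a path. Treating anything other than a clear "yes" as a non-confirmation keeps an ambiguous reply from triggering a command.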

Voice interface advantages

Advantage	Description
Hands-free	Ideal when your hands or attention are on another task
Speed	Faster than typing complex queries
Natural	You can interrupt and rephrase, as in a real conversation
Accessibility	Easier for users with typing difficulties

Requirement: The browser must grant microphone access, and the Cerewro installation must be configured with an OpenAI API key that has access to the gpt-4o-realtime-preview model.
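
These requirements can be expressed as a pre-flight check run before the session opens. The following is a minimal sketch, not Cerewro's own code: the function name and parameters are hypothetical, the microphone states ("granted", "denied", "prompt") are those used by the browser Permissions API, and how Cerewro stores the key or learns which models it can reach is an assumption.

```python
from typing import Iterable, List, Optional

REQUIRED_MODEL = "gpt-4o-realtime-preview"


def missing_requirements(
    mic_permission: str,
    api_key: Optional[str],
    available_models: Iterable[str],
) -> List[str]:
    """Return a list of unmet requirements; an empty list means voice can start."""
    problems = []
    if mic_permission != "granted":
        problems.append("microphone access not granted")
    if not api_key:
        problems.append("OpenAI API key not configured")
    elif REQUIRED_MODEL not in set(available_models):
        # Only meaningful once a key exists, hence the elif.
        problems.append(f"API key lacks access to {REQUIRED_MODEL}")
    return problems
```

Returning every problem at once, rather than stopping at the first, lets the interface show the user everything that needs fixing in a single message.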