Voice Calls
The Karibu Voice API lets you place outbound voice calls to phone numbers. When the recipient answers, they hear either a pre-recorded audio file or speech synthesized from text. All voice endpoints require your X-API-Key header for authentication.
Overview
You can initiate voice calls in three ways:
| Method | Endpoint | Use when |
|---|---|---|
| Audio from URL | POST /v1/voice/calls/audio | You have a public URL to an audio file (MP3, WAV). |
| Audio upload | POST /v1/voice/calls/audio/upload | You have the audio file on disk and want to upload it. |
| Text-to-speech (TTS) | POST /v1/voice/calls/tts | You want the system to speak a text message. |
Use E.164 or national format for phone numbers (e.g. 255788344348 for Tanzania).
1. Voice call with audio from URL
Use this when your audio is already hosted at a public URL (e.g. on your CDN or in an S3 bucket with public read access). The API fetches the file and plays it to the recipient when they answer.
Request
- Method: POST /v1/voice/calls/audio
- Headers: X-API-Key (required), Content-Type: application/json
- Body:
  - receiver_number (string, required): E.164 or national number to call (e.g. 255788344348)
  - audio_url (string, required): Public URL of the audio file (e.g. MP3, WAV)
Example
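A minimal Python sketch of this request, using only the standard library. The host in BASE_URL is a placeholder and the function names are illustrative; neither comes from the Karibu documentation. The response format is not described in this section, so the sketch returns the raw response bytes.

```python
import json
import urllib.request

# Assumption: placeholder host. Substitute the base URL for your Karibu account.
BASE_URL = "https://api.example.com"


def build_audio_url_call(api_key: str, receiver_number: str, audio_url: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/voice/calls/audio request."""
    payload = {"receiver_number": receiver_number, "audio_url": audio_url}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voice/calls/audio",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> bytes:
    """Send the request and return the raw response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read()
```

To place a call, pass the built request to send, for example send(build_audio_url_call(my_key, "255788344348", "https://cdn.example.com/announcement.mp3")), where the audio URL is again a placeholder.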
When to use
- Announcements or pre-recorded messages already on a server
- Audio that you update by changing the file at the same URL
- No need to send the file in the request body
2. Voice call with uploaded audio
Use this when you have the audio file locally (e.g. generated by your app or uploaded by a user). You send the file as multipart form data; it is stored and then played to the recipient when they answer.
Request
- Method: POST /v1/voice/calls/audio/upload
- Headers: X-API-Key (required). Do not set Content-Type; the client will set it to multipart/form-data with the correct boundary.
- Body (multipart/form-data):
  - receiver_number (string, required): E.164 or national number to call
  - file (file, required): Audio file (MP3 or WAV) to play to the recipient
Example
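A minimal Python sketch of the upload request, using only the standard library. Python's standard library has no multipart encoder, so this sketch assembles the body by hand and therefore has to set Content-Type itself, boundary included. With an HTTP client that encodes multipart for you, leave the header unset. BASE_URL is a placeholder host and build_upload_call is an illustrative name; neither comes from the Karibu documentation.

```python
import mimetypes
import urllib.request
import uuid
from pathlib import Path

# Assumption: placeholder host. Substitute the base URL for your Karibu account.
BASE_URL = "https://api.example.com"


def build_upload_call(api_key: str, receiver_number: str, audio_path: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/voice/calls/audio/upload request."""
    path = Path(audio_path)
    boundary = uuid.uuid4().hex  # random token that separates the form parts
    file_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    crlf = b"\r\n"
    body = b"".join([
        # Part 1: the receiver_number text field.
        f"--{boundary}".encode("ascii"), crlf,
        b'Content-Disposition: form-data; name="receiver_number"', crlf, crlf,
        receiver_number.encode("utf-8"), crlf,
        # Part 2: the audio file itself.
        f"--{boundary}".encode("ascii"), crlf,
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"'.encode("utf-8"), crlf,
        f"Content-Type: {file_type}".encode("ascii"), crlf, crlf,
        path.read_bytes(), crlf,
        # Closing boundary.
        f"--{boundary}--".encode("ascii"), crlf,
    ])
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voice/calls/audio/upload",
        data=body,
        headers={
            "X-API-Key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
        method="POST",
    )
```

The returned request can be sent with urllib.request.urlopen. The sketch reads the whole file into memory, which is reasonable for short recordings but not for very large files.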
When to use
- Audio is generated or stored on your server
- You don’t have a public URL for the file
- One-off or dynamic recordings (e.g. per-user messages)
3. Voice call with text-to-speech (TTS)
Use this when you want the system to speak a text message to the recipient. No audio file is needed; the text is synthesized to speech when they answer.
Request
- Method: POST /v1/voice/calls/tts
- Headers: X-API-Key (required), Content-Type: application/json
- Body:
  - receiver_number (string, required): E.164 or national number to call
  - text (string, required): The message to be spoken (at least one character)
Example
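A minimal Python sketch of the TTS request, using only the standard library. It checks the one documented constraint (text must contain at least one character) before building the request. BASE_URL is a placeholder host and build_tts_call is an illustrative name; neither comes from the Karibu documentation.

```python
import json
import urllib.request

# Assumption: placeholder host. Substitute the base URL for your Karibu account.
BASE_URL = "https://api.example.com"


def build_tts_call(api_key: str, receiver_number: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/voice/calls/tts request."""
    if len(text) < 1:
        # The endpoint requires at least one character, so fail before any network call.
        raise ValueError("text must contain at least one character")
    payload = {"receiver_number": receiver_number, "text": text}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/voice/calls/tts",
        data=json.dumps(payload).encode("utf-8"),
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```

The returned request can be sent with urllib.request.urlopen. Because the text is built per call, this is a natural place to interpolate order numbers, names, or appointment times.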
When to use
- Notifications or alerts (e.g. order confirmation, reminders)
- Dynamic content that changes per call
- No need to record or host audio files
Authentication
All voice endpoints require the X-API-Key header with a valid API key. Use the same key you use for other Karibu API calls (workspace, messages, OTP, etc.).
Phone number format
- Use E.164 or national format (e.g. 255788344348 for Tanzania).
- Include the country code; no leading + is required in the request body.
- Validate numbers before calling to avoid failed or wrong-destination calls.
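The validation step can be sketched as a small normalizer that runs before any call is placed. This is a loose client-side sanity check, not Karibu's own validation logic: the function name is illustrative, and the accepted length (8 to 15 digits, no leading zero) is an assumption based on the 15-digit maximum in E.164. It only accepts numbers that already include the country code.

```python
import re

# Assumption: 8 to 15 digits with a non-zero first digit (E.164 allows at most 15).
_DIGITS_WITH_COUNTRY_CODE = re.compile(r"[1-9]\d{7,14}")


def normalize_receiver_number(raw: str) -> str:
    """Return the number as bare digits with country code, or raise ValueError."""
    # Drop common visual separators: spaces, hyphens, parentheses.
    cleaned = re.sub(r"[\s\-()]", "", raw)
    # A leading + is not required in the request body, so strip it if present.
    if cleaned.startswith("+"):
        cleaned = cleaned[1:]
    if not _DIGITS_WITH_COUNTRY_CODE.fullmatch(cleaned):
        raise ValueError(f"not a valid number with country code: {raw!r}")
    return cleaned
```

For example, normalize_receiver_number("+255 788 344 348") returns "255788344348", while a local-style number such as "0788344348" is rejected because it lacks the country code.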
Webhooks for voice events
To receive call status or events (e.g. answered, completed, failed), create a webhook with service type voice for your developer app. See Webhooks for creating and managing webhooks. The voice endpoints initiate the call; webhooks deliver event notifications to your URL.
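A minimal sketch of a receiving endpoint, using Python's built-in HTTP server. The payload schema is not described in this section, so the sketch assumes only that events arrive as HTTP POST requests with a JSON object body, and it logs the event without reading any specific field. The handler name, port, and response codes are illustrative choices, not documented Karibu behaviour.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def parse_voice_event(raw_body: bytes) -> dict:
    """Decode a webhook body, assuming it is a JSON object."""
    event = json.loads(raw_body.decode("utf-8"))
    if not isinstance(event, dict):
        raise ValueError("expected a JSON object")
    return event


class VoiceWebhookHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        try:
            event = parse_voice_event(self.rfile.read(length))
        except ValueError:
            # Covers malformed JSON, bad encoding, and non-object bodies.
            self.send_response(400)
            self.end_headers()
            return
        # Field names are not documented here, so just log the whole event.
        print("voice event:", event)
        self.send_response(204)
        self.end_headers()


def serve(port: int = 8080) -> None:
    """Run the receiver until interrupted."""
    HTTPServer(("", port), VoiceWebhookHandler).serve_forever()
```

Calling serve() starts the receiver locally. In production you would put it behind HTTPS and verify that each request really comes from Karibu, using whatever mechanism the Webhooks documentation specifies.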
Summary
| Need | Endpoint | Key parameters |
|---|---|---|
| Play audio from a URL | POST /v1/voice/calls/audio | receiver_number, audio_url |
| Upload and play audio file | POST /v1/voice/calls/audio/upload | receiver_number, file (multipart) |
| Speak text to the user | POST /v1/voice/calls/tts | receiver_number, text |