POST
https://platform.perso.ai/api/speech/v1/text-to-speech/{voice_id}/stream
Convert text to speech with streaming response for real-time playback.
Prerequisites
You must have at least one cloned voice. Use Create Voice (POST /v1/voices/add) to create a voice, then retrieve its voice_id from List Voices (GET /v2/voices).
Authenticate using the PersoPlatform-APIKey header. The ElevenLabs-compatible xi-api-key header is also accepted:
--header 'xi-api-key: <your-api-key>'
URL Path Parameters
Replace {voice_id} in the URL with your voice ID. Retrieve available voice IDs from List Voices (GET /v2/voices).
POST /v1/text-to-speech/{voice_id}/stream
Request Parameters
| Parameter | Required | Description |
|---|---|---|
| text | Yes | Text to convert to speech |
| model_id | No | Default: perso_multilingual_v1 |
Query Parameters
| Parameter | Required | Description |
|---|---|---|
| output_format | Yes | Audio format for streaming. Default: wav_24000 (see formats below) |
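The path, body, and query parameters above can be combined as in the following minimal Python sketch, which uses only the standard library. The helper name build_stream_request is hypothetical and not part of any official SDK. The endpoint URL, header name, and defaults are taken from this page. Omitting model_id from the body is assumed to leave the server-side default in effect.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode
from urllib.request import Request

# Base URL as documented on this page.
BASE_URL = "https://platform.perso.ai/api/speech/v1/text-to-speech"


def build_stream_request(
    voice_id: str,
    text: str,
    api_key: str,
    output_format: str = "wav_24000",
    model_id: Optional[str] = None,
) -> Request:
    """Hypothetical helper: assemble the POST request without sending it."""
    # voice_id is a URL path parameter; output_format is a query parameter.
    query = urlencode({"output_format": output_format})
    url = f"{BASE_URL}/{quote(voice_id, safe='')}/stream?{query}"

    # text is required; model_id is only sent when explicitly provided
    # (assumption: the server then falls back to its documented default).
    body = {"text": text}
    if model_id is not None:
        body["model_id"] = model_id

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "PersoPlatform-APIKey": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

The returned object can be passed to urllib.request.urlopen to send the request. Building it separately makes the URL and body easy to inspect before any network call is made.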
How It Works
Audio chunks are streamed as they are generated, enabling:
- Lower latency for first audio
- Real-time playback during generation
- Reduced memory usage for long text
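To illustrate the points above, here is a minimal Python sketch of consuming a streamed response chunk by chunk instead of buffering the whole file. The function names iter_audio_chunks and save_stream are hypothetical, and the chunk size of 4096 bytes is an arbitrary choice. The sketch works on any file-like object with a read method, so it is demonstrated with an in-memory stand-in rather than a live API call.

```python
import io
from typing import BinaryIO, Iterator


def iter_audio_chunks(response: BinaryIO, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield audio bytes as they arrive, at most chunk_size bytes at a time."""
    while True:
        chunk = response.read(chunk_size)
        if not chunk:  # empty bytes means the stream has ended
            return
        yield chunk


def save_stream(response: BinaryIO, out: BinaryIO, chunk_size: int = 4096) -> int:
    """Copy a streamed response to out and return the number of bytes written.

    Only one chunk is held in memory at a time, which is what keeps
    memory usage low for long text.
    """
    total = 0
    for chunk in iter_audio_chunks(response, chunk_size):
        out.write(chunk)  # a player could be fed here instead of a file
        total += len(chunk)
    return total


if __name__ == "__main__":
    # In-memory stand-in for a streaming HTTP response (no network needed).
    fake_response = io.BytesIO(b"\x00" * 10000)
    sink = io.BytesIO()
    print(save_stream(fake_response, sink))
```

With a real request, the object returned by urllib.request.urlopen can be passed as response and a file opened in "wb" mode as out. Playback can begin from the first chunk yielded, without waiting for generation to finish.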
Example
curl -X POST "https://platform.perso.ai/api/speech/v1/text-to-speech/{voice_id}/stream?output_format=mp3_44100_192" \
  -H "PersoPlatform-APIKey: <your-api-key>" \
  -H "Content-Type: application/json" \
  -H "Accept: */*" \
  -d '{"text": "Hello, this is a streaming test."}' \
  --output audio.mp3