Gemini 3.1 Flash TTS Preview

gemini_3_1_flash_tts_preview

Gemini 3.1 Flash TTS Preview converts text into speech audio with configurable voices.

API Notes

gemini_3_1_flash_tts_preview returns the generated audio in the task result as a URL to an audio/wav file.

Usage for gemini_3_1_flash_tts_preview is reported as input_text_tokens and output_audio_tokens when token usage is available.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY
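
As a minimal sketch of the header described above, the helper below builds the Authorization value from an API key. The function name build_auth_headers is hypothetical, and the Content-Type entry is an assumption (a JSON request body) that this page does not state.

```python
def build_auth_headers(api_key: str) -> dict:
    """Return request headers carrying the Bearer token.

    The Content-Type header is an assumption (JSON body), not something
    this page specifies.
    """
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",  # assumed, see docstring
    }


headers = build_auth_headers("YOUR_API_KEY")
```

The resulting dictionary can be passed to whichever HTTP client you use when calling the API.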

Parameters

model string required

Model ID to use for the request

Value: gemini_3_1_flash_tts_preview


input array required

Input array for the unified generation request

params object required

Text-to-speech parameters

text string required

Text to convert to speech. The field name prompt is also accepted as a compatibility alias, but text is recommended.

voice_name string

Prebuilt voice name for speech synthesis

Default: Kore
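
The parameters above could be assembled into a request body roughly as follows. This is a sketch under a stated assumption: the listing does not show exactly how params nests inside input, so the code assumes each element of input is an object with a params key. The helper name build_tts_request is hypothetical.

```python
import json


def build_tts_request(text: str, voice_name: str = "Kore") -> dict:
    """Build a request body for gemini_3_1_flash_tts_preview.

    Assumption: each item of `input` is an object holding a `params`
    object. Check the nesting against a real request before relying on it.
    """
    if not text:
        raise ValueError("text is required")
    return {
        "model": "gemini_3_1_flash_tts_preview",
        "input": [
            {"params": {"text": text, "voice_name": voice_name}},
        ],
    }


# Serialize the body as JSON for sending.
print(json.dumps(build_tts_request("Hello, world."), indent=2))
```

Omitting voice_name in the call falls back to Kore, the documented default.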


Polling

Because audio generation takes time, poll the task status after creating the task.

The initial response returns only the task ID and the initial status. The generated audio URL must be retrieved from the task status endpoint.
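
One way to structure the polling loop is sketched below. The task status endpoint is not named on this page, so the sketch takes a caller-supplied fetch_status function (hypothetical) that performs the status request and returns the parsed response. Only the completed and failed statuses appear on this page; any other value is treated as still in progress. The interval and timeout defaults are arbitrary choices.

```python
import time
from typing import Callable


def poll_task(
    fetch_status: Callable[[], dict],
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status until the task is completed or failed.

    fetch_status is a caller-supplied function (hypothetical) that queries
    the task status endpoint and returns the decoded response as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status()
        # "completed" and "failed" are the terminal statuses named in the docs.
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval_s)
```

Passing the sleep function as a parameter keeps the loop easy to exercise in tests without real delays.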

Response Format

error object

Error information. Only present when status is failed

code integer

Error code

error_message string

Detailed error message


output array

Generation results. Only present when status is completed

content array

List of generated audio content

type string

Resource type, fixed as audio

mime_type string

Audio MIME type, fixed as audio/wav

url string

Generated audio file URL
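
To illustrate how the error and output fields might be consumed, the sketch below collects audio URLs from a task status response and raises if the task failed. The helper name extract_audio_urls is hypothetical, and the nesting (output items each holding a content list) is read from the field listing above rather than from a confirmed sample response.

```python
def extract_audio_urls(task: dict) -> list:
    """Return the audio URLs found in a task status response.

    Raises RuntimeError with the documented error fields when the task
    status is "failed".
    """
    if task.get("status") == "failed":
        err = task.get("error") or {}
        raise RuntimeError(
            f"task failed ({err.get('code')}): {err.get('error_message')}"
        )
    urls = []
    for item in task.get("output") or []:
        for part in item.get("content") or []:
            # The docs describe type as fixed to "audio".
            if part.get("type") == "audio" and part.get("url"):
                urls.append(part["url"])
    return urls
```

A response that is not yet completed has no output, so the function returns an empty list in that case.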


usage object

Usage statistics. Only present when status is completed

extra_info object

Normalized token usage details

input_text_tokens integer

Number of input text tokens

output_audio_tokens integer

Number of generated audio tokens

total_tokens integer

Total token count
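
The token counts could be read out as sketched below. The listing does not make clear whether total_tokens sits inside extra_info or directly under usage, so this hypothetical helper checks extra_info first and falls back to usage. Counts that are absent come back as None, since usage is only reported when token usage is available.

```python
def summarize_usage(task: dict) -> dict:
    """Collect the documented token counts from a task status response.

    Looks in usage.extra_info first, then in usage itself, because the
    exact nesting is an assumption. Missing counts are returned as None.
    """
    usage = task.get("usage") or {}
    extra = usage.get("extra_info") or {}
    fields = ("input_text_tokens", "output_audio_tokens", "total_tokens")
    return {name: extra.get(name, usage.get(name)) for name in fields}
```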


metadata object

Metadata information