
GPT-4o Transcribe

vtrix-gpt-4o-transcribe

Speech-to-text model powered by GPT-4o. It offers a lower word error rate and better language recognition and accuracy than the original Whisper models. Supports a 16,000-token context window and up to 2,000 output tokens.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add the following header to every request:

Authorization: Bearer YOUR_API_KEY
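
As a minimal sketch, the header above could be built in Python as follows. The helper name `build_auth_headers` is hypothetical and not part of any SDK; it only formats the value shown above.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header for a Bearer-token request."""
    key = api_key.strip()
    if not key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {key}"}


if __name__ == "__main__":
    # YOUR_API_KEY is a placeholder; substitute your real key.
    print(build_auth_headers("YOUR_API_KEY"))
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.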

Parameters

file file required

Audio file to transcribe.

Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm

File size limit: 25 MB
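
Checking a file locally before upload can avoid a rejected request. The sketch below is one way to do that; the function name is hypothetical, the format check looks only at the file extension, and it assumes that 25 MB means 25 × 1024 × 1024 bytes, which this page does not specify.

```python
from pathlib import Path

SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Assumption: "25 MB" is interpreted as 25 MiB.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_audio_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    ext = Path(name).suffix.lower().lstrip(".")
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_FILE_BYTES:
        problems.append(f"file is {size_bytes} bytes, limit is {MAX_FILE_BYTES}")
    return problems


if __name__ == "__main__":
    print(check_audio_file("lecture.mp3", 3_000_000))
    print(check_audio_file("lecture.flac", 30 * 1024 * 1024))
```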


model string required

Model ID to use for the request.

Value: vtrix-gpt-4o-transcribe


response_format string

Format of the output transcript.

Options: json, text

Default: json


prompt string

Text to guide the model’s style or provide context. Can be used to correct specific words or acronyms, preserve context from split files, or control punctuation and filler words.

Examples:
Correct specific terms: “The transcript is about OpenAI which makes technology like DALL·E, GPT-4, and ChatGPT.”
Preserve punctuation: “Hello, welcome to my lecture.”
Keep filler words: “Umm, let me think like, hmm… Okay, here’s what I’m, like, thinking.”
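
To preserve context across split files, one possible approach is to pass the end of the previous segment's transcript as the prompt for the next segment. The sketch below illustrates this; the function name and the 500-character cap are assumptions chosen for illustration, not limits documented on this page.

```python
def prompt_from_previous(previous_transcript: str, max_chars: int = 500) -> str:
    """Return the tail of the previous transcript, trimmed to whole words."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    # Collapse runs of whitespace so the length check is predictable.
    text = " ".join(previous_transcript.split())
    if len(text) <= max_chars:
        return text
    start = len(text) - max_chars
    tail = text[start:]
    if text[start - 1] != " ":
        # The cut landed inside a word: drop the partial first word.
        _, _, rest = tail.partition(" ")
        tail = rest or tail
    return tail.lstrip()


if __name__ == "__main__":
    previous = "and that concludes the first half of the lecture on DALL·E"
    print(prompt_from_previous(previous, max_chars=30))
```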


stream boolean

Whether to stream the transcription incrementally. When enabled, the response is a stream of transcript.text.delta events followed by a single transcript.text.done event.

Default: false
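
The sketch below shows one way a client might assemble a transcript from the streamed events. This page does not describe the wire format, so the code assumes server-sent events in which each `data:` line carries a JSON object with a `type` field, a `delta` field on delta events, and a `text` field on the done event. Verify these field names against real responses before relying on them.

```python
import json
from typing import Iterable


def collect_transcript(lines: Iterable[str]) -> str:
    """Join delta events into a transcript; prefer the done event's text if present."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if not payload or payload == "[DONE]":  # assumed end-of-stream sentinel
            continue
        event = json.loads(payload)
        kind = event.get("type")
        if kind == "transcript.text.delta":
            parts.append(event.get("delta", ""))  # assumed field name
        elif kind == "transcript.text.done":
            # Assumed: the done event repeats the full text.
            return event.get("text", "".join(parts))
    return "".join(parts)


if __name__ == "__main__":
    sample = [
        'data: {"type": "transcript.text.delta", "delta": "Hello"}',
        "",
        'data: {"type": "transcript.text.delta", "delta": " world"}',
        'data: {"type": "transcript.text.done", "text": "Hello world"}',
    ]
    print(collect_transcript(sample))
```

The function accepts any iterable of text lines, so it can be fed from a streaming HTTP response or from a recorded sample for testing.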


language string

Language of the input audio, as an ISO 639-1 (two-letter) or ISO 639-3 (three-letter) code. Providing the input language improves accuracy and reduces latency.

Examples: en (English), zh (Chinese), ja (Japanese), es (Spanish)
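
A client can catch obvious mistakes, such as passing a language name instead of a code, before sending the request. The hypothetical helper below only checks the shape of the code (two or three letters); it does not confirm that the code is a registered ISO 639 entry or that the model supports it.

```python
import re

# ISO 639-1 codes have two letters, ISO 639-3 codes have three.
_LANGUAGE_CODE = re.compile(r"[a-z]{2,3}")


def normalize_language(code: str) -> str:
    """Lowercase and trim a language code, rejecting values with the wrong shape."""
    cleaned = code.strip().lower()
    if not _LANGUAGE_CODE.fullmatch(cleaned):
        raise ValueError(f"expected a 2- or 3-letter language code, got {code!r}")
    return cleaned


if __name__ == "__main__":
    print(normalize_language(" EN "))
    print(normalize_language("zho"))
```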


Response Format

text string

The transcribed text from the audio file. Present when response_format is json.
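
A minimal sketch of reading the transcript from a response body follows. The `text` field for json comes from the description above; treating the whole body as the transcript when response_format is text is an assumption, since this page does not describe that case. The function name is hypothetical.

```python
import json


def extract_transcript(body: str, response_format: str = "json") -> str:
    """Return the transcript from a response body for the given response_format."""
    if response_format == "json":
        data = json.loads(body)
        if "text" not in data:
            raise ValueError("response JSON has no 'text' field")
        return data["text"]
    if response_format == "text":
        # Assumption: the body is the plain transcript.
        return body
    raise ValueError(f"unsupported response_format: {response_format!r}")


if __name__ == "__main__":
    print(extract_transcript('{"text": "Hello, welcome to my lecture."}'))
```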


Supported Languages

Supports 98 languages including: Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.