GPT-4o Transcribe | Vtrix API Docs

Authentication

authorization `string` required

All API requests require authentication with a Bearer token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add the following header to every request:

Authorization: Bearer YOUR_API_KEY
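A small helper can keep the header format in one place. This is a minimal sketch: `auth_headers` is a hypothetical name, and reading the key from an environment variable called `VTRIX_API_KEY` is an assumption for illustration, not something the API requires.

```python
import os


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the Authorization header in the Bearer format shown above."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {api_key}"}


# VTRIX_API_KEY is an assumed variable name; fall back to the docs placeholder.
headers = auth_headers(os.environ.get("VTRIX_API_KEY") or "YOUR_API_KEY")
```

Keeping the key out of source code and loading it at runtime avoids committing it to version control by accident.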

Parameters

file `file` required

Audio file to transcribe.

Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm

File size limit: 25 MB
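Checking the format and size locally avoids uploading a file the API would reject. The sketch below is one way to do that; `check_audio_file` is a hypothetical helper, and reading "25 MB" as 25 * 1024 * 1024 bytes is an assumption, since the exact byte count is not stated here.

```python
from pathlib import Path

# Formats as listed above.
SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}
# Assumption: "25 MB" means 25 * 1024 * 1024 bytes.
MAX_FILE_BYTES = 25 * 1024 * 1024


def check_audio_file(path: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks uploadable."""
    problems = []
    file_path = Path(path)
    extension = file_path.suffix.lower().lstrip(".")
    if extension not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if not file_path.is_file():
        problems.append("file does not exist")
    elif file_path.stat().st_size > MAX_FILE_BYTES:
        problems.append("file exceeds the 25 MB limit")
    return problems
```

The check only looks at the file extension, not the actual container or codec, so a mislabeled file can still be rejected by the server.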

model `string` required

Model ID to use for the request.

Value: vtrix-gpt-4o-transcribe

response_format `string`

Format of the output transcript.

Options: json, text

Default: json

prompt `string`

Text to guide the model’s style or provide context. Can be used to correct specific words or acronyms, preserve context from split files, or control punctuation and filler words.

Examples:
Correct specific terms: “The transcript is about OpenAI which makes technology like DALL·E, GPT-4, and ChatGPT.”
Preserve punctuation: “Hello, welcome to my lecture.”
Keep filler words: “Umm, let me think like, hmm… Okay, here’s what I’m, like, thinking.”
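For the split-file case, one possible approach is to pass the end of the previous segment's transcript as the prompt for the next segment. The sketch below assumes that approach; `context_prompt` is a hypothetical helper, and the 500-character default is an arbitrary choice, not a documented limit.

```python
def context_prompt(previous_transcript: str, max_chars: int = 500) -> str:
    """Return the tail of the previous transcript, cut at a word boundary."""
    text = previous_transcript.strip()
    if len(text) <= max_chars:
        return text
    tail = text[-max_chars:]
    # Drop the first, possibly partial, word so the prompt starts cleanly.
    first_space = tail.find(" ")
    return tail[first_space + 1:] if first_space != -1 else tail
```

The returned string would then be sent as the `prompt` parameter alongside the next audio segment.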

stream `boolean`

Whether to stream the transcription incrementally. When enabled, returns a stream of transcript.text.delta events followed by a transcript.text.done event.

Default: false
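A client consuming the stream typically accumulates the delta events until the done event arrives. The following is a sketch of that loop over already-decoded events. The event type names come from the description above, but the payload field names `delta` and `text` are assumptions, and `collect_transcript` is a hypothetical helper.

```python
from typing import Iterable, Mapping


def collect_transcript(events: Iterable[Mapping[str, str]]) -> str:
    """Accumulate delta events and return the final transcript."""
    parts = []
    for event in events:
        if event["type"] == "transcript.text.delta":
            # Assumption: the incremental text is carried in a "delta" field.
            parts.append(event["delta"])
        elif event["type"] == "transcript.text.done":
            # Assumption: the done event carries the full text in a "text" field.
            return event.get("text", "".join(parts))
    # The stream ended without a done event; return what was received.
    return "".join(parts)
```

Falling back to the joined deltas keeps partial output usable if the connection drops before the done event.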

language `string`

Language of the input audio, given as an ISO 639-1 (two-letter) or ISO 639-3 (three-letter) code. Providing the input language can improve accuracy and reduce latency.

Examples: en (English), zh (Chinese), ja (Japanese), es (Spanish)
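Because ISO 639-1 codes are two letters and ISO 639-3 codes are three letters, a client can reject obviously malformed values before sending a request. This is a minimal sketch with a hypothetical helper name; it checks only the shape of the code, not whether the code is actually assigned or supported by the model.

```python
import re

# Two lowercase letters (ISO 639-1) or three lowercase letters (ISO 639-3).
_LANGUAGE_CODE = re.compile(r"[a-z]{2,3}")


def normalize_language(code: str) -> str:
    """Lowercase and trim a language code, rejecting values of the wrong shape."""
    cleaned = code.strip().lower()
    if not _LANGUAGE_CODE.fullmatch(cleaned):
        raise ValueError(f"not a 2- or 3-letter language code: {code!r}")
    return cleaned
```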

Response Format

text `string`

The transcribed text from the audio file. Present when response_format is json.
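When calling the endpoint without an SDK, the caller has to handle both response formats. The sketch below shows one way to do so; `extract_transcript` is a hypothetical helper, and treating the `text` format as a plain-text body with no JSON wrapper is an assumption, since only the JSON shape is documented here.

```python
import json


def extract_transcript(body: str, response_format: str = "json") -> str:
    """Return the transcript from a raw response body."""
    if response_format == "json":
        # Documented shape: an object with a "text" field.
        return json.loads(body)["text"]
    if response_format == "text":
        # Assumption: the body is the transcript itself, with no JSON wrapper.
        return body.strip()
    raise ValueError(f"unsupported response_format: {response_format!r}")
```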

Supported Languages

Supports 98 languages including: Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.

curl --location 'https://cloud.vtrix.ai/llm/v1/audio/transcriptions' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --header 'Content-Type: multipart/form-data' \
  --form file=@/path/to/file/audio.mp3 \
  --form model=vtrix-gpt-4o-transcribe

import openai

client = openai.OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://cloud.vtrix.ai/llm"
)

# Open the audio in binary mode; the context manager closes it after the upload.
with open("/path/to/file/audio.mp3", "rb") as audio_file:
    transcription = client.audio.transcriptions.create(
        model="vtrix-gpt-4o-transcribe",
        file=audio_file
    )

print(transcription.text)

import fs from 'fs';
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'YOUR_API_KEY',
  baseURL: 'https://cloud.vtrix.ai/llm'
});

const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('/path/to/file/audio.mp3'),
  model: 'vtrix-gpt-4o-transcribe'
});

console.log(transcription.text);
