Vtrix CLI Models

Vtrix CLI provides multimodal models covering video, image, and audio. All models share a single endpoint, POST /model/v1/generation; the model field of the request selects which model runs.
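To illustrate the unified endpoint, here is a minimal Python sketch that builds (but does not send) two requests differing only in the model field. The source gives only the path, so the host `api.example.invalid`, the bearer-token header, and the `prompt` parameter are placeholders assumed for illustration, not documented Vtrix values.

```python
import json
from urllib.request import Request

# Assumption: placeholder host. The source documents only the path.
BASE_URL = "https://api.example.invalid"
ENDPOINT = "/model/v1/generation"


def build_generation_request(model: str, params: dict, api_key: str = "YOUR_API_KEY") -> Request:
    """Build a POST request for the unified endpoint without sending it."""
    # The model field is what distinguishes one model from another.
    body = {**params, "model": model}
    return Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check the real API for its scheme.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


# Same URL and method for a video model and an image model.
# "prompt" is a hypothetical parameter name used only for this sketch.
video_req = build_generation_request("kirin_v3_t2v", {"prompt": "a red kite over a beach"})
image_req = build_generation_request("spark_dream_5_0", {"prompt": "a red kite over a beach"})

print(video_req.get_method(), video_req.full_url)
print(json.loads(video_req.data)["model"], json.loads(image_req.data)["model"])
```

Each model accepts its own parameters, so consult its spec before filling in the body.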

Video Models

Spark Dance Series

| Model ID | Name | Input | Notes |
| --- | --- | --- | --- |
| spark_dance_v2_0 | Seedance 2.0 | text / image / video / audio | Flagship, up to 15s, multimodal input |
| spark_dance_v2_0_fast | Seedance 2.0 Fast | text / image / video / audio | Fast variant, ideal for draft iteration |

Kirin Series

| Model ID | Type | Input |
| --- | --- | --- |
| kirin_v2_6_t2v | T2V | text |
| kirin_v2_6_i2v | I2V | text / image |
| kirin_v3_t2v | T2V | text |
| kirin_v3_i2v | I2V | image |
| kirin_v3_omni_video | Omni | text / image / video |
| kirin_video_o1 | O1 | text / image / video |
| kirin_v3_motion_control | Motion Control | text / image / video |
| kirin_duration_extension | Duration Extension | text / video |
| kirin_identify_face | Face Recognition | video |

Vidu Series

| Model ID | Type | Input |
| --- | --- | --- |
| viduq3_pro_text2video | Q3 Pro T2V | text |
| viduq3_pro_img2video | Q3 Pro I2V | text / image |
| viduq3_turbo_text2video | Q3 Turbo T2V | text |
| viduq3_turbo_img2video | Q3 Turbo I2V | text / image |
| viduq2_pro_img2video | Q2 Pro I2V | text / image |
| viduq2_pro_reference | Q2 Pro Reference | image / video |
| viduq1_text2video | Q1 T2V | text |
| viduq1_img2video | Q1 I2V | image |

Other Video Models

| Model ID | Provider | Input |
| --- | --- | --- |
| veo_3.1_generate_001 | Google | text / image |
| pixverse_v6_t2v | Pixverse | text |
| pixverse_v6_i2v | Pixverse | image |
| pixverse_v6_transition | Pixverse | image |

Image Models

| Model ID | Name | Provider |
| --- | --- | --- |
| spark_dream_5_0 | Spark Dream 5.0 | Vtrix |
| spark_dream_4_5 | Spark Dream 4.5 | Vtrix |
| kirin_v3_image | Kirin V3 Image | Vtrix |
| kirin_v3_omni_image | Kirin V3 Omni Image | Vtrix |
| gpt_image_1_5 | GPT Image 1.5 | OpenAI |
| gpt_image_1_5_edit | GPT Image 1.5 Edit | OpenAI |
| nano_banana_2 | Nano Banana 2 | Google |
| qwen_image_edit_plus | Qwen Image Edit Plus | Alibaba |
| wan27_image_pro | Wan2.7 Image Pro | Wan |

Audio Models

| Model ID | Name | Description |
| --- | --- | --- |
| kirin_text_to_audio | Kirin Text2Audio | Text to sound effects, 3–10 seconds |
| kirin_video_to_audio | Kirin Video2Audio | Video to audio, 3–20 seconds |
| mureka_song_generator | Mureka Song Generator | AI song generation |
| mureka_lyrics_generator | Mureka Lyrics Generator | AI lyrics generation |

Parameter Spec Example: Spark Dance 2.0

This section uses spark_dance_v2_0 as an example. To view the full spec, run vtrix models spec spark_dance_v2_0.

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| content | array | required | | Multimodal input array (text / image_url / video_url / audio_url) |
| resolution | string | optional | 720p | 480p / 720p / 1080p |
| ratio | string | optional | adaptive | 16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / adaptive |
| duration | integer | optional | 5 | Duration in seconds, range 4–15, -1 for auto |
| seed | integer | optional | -1 | Random seed |
| generate_audio | boolean | optional | true | Whether to generate synchronized audio |
| return_last_frame | boolean | optional | false | Whether to return the last frame as an image |
| service_tier | string | optional | default | flex = async inference, 50% off |
| camera_fixed | boolean | optional | false | Fix the camera angle |
| callback_url | string | optional | | Callback URL on task completion |
| safety_identifier | string | optional | | End-user identifier (≤ 64 characters) |
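
The constraints in this parameter list can be checked on the client before a request is submitted. The following Python sketch applies the listed defaults and validates the documented ranges. Several details are assumptions made for illustration: the helper name `build_spark_dance_payload` is hypothetical, the shape of each `content` item (`{"type": "text", "text": ...}`) is guessed, the rejection of an empty `content` array is a choice of this sketch, and placing parameters at the top level next to `model` is not confirmed by the source.

```python
ALLOWED_RESOLUTIONS = {"480p", "720p", "1080p"}
ALLOWED_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "21:9", "adaptive"}

# Defaults copied from the parameter list above.
DEFAULTS = {
    "resolution": "720p",
    "ratio": "adaptive",
    "duration": 5,
    "seed": -1,
    "generate_audio": True,
    "return_last_frame": False,
    "service_tier": "default",
    "camera_fixed": False,
}
# Optional parameters that have no listed default.
OPTIONAL_NO_DEFAULT = {"callback_url", "safety_identifier"}


def build_spark_dance_payload(content, **overrides):
    """Hypothetical helper: merge defaults and validate documented constraints."""
    # Assumption: an empty array is rejected; the source only says "required".
    if not isinstance(content, list) or not content:
        raise ValueError("content is required and must be a non-empty list")
    unknown = set(overrides) - set(DEFAULTS) - OPTIONAL_NO_DEFAULT
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    params = {**DEFAULTS, **overrides}
    if params["resolution"] not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if params["ratio"] not in ALLOWED_RATIOS:
        raise ValueError(f"ratio must be one of {sorted(ALLOWED_RATIOS)}")
    duration = params["duration"]
    if duration != -1 and not 4 <= duration <= 15:
        raise ValueError("duration must be 4-15 seconds, or -1 for auto")
    if len(params.get("safety_identifier", "")) > 64:
        raise ValueError("safety_identifier must be at most 64 characters")

    # Assumption: parameters sit at the top level alongside the model field.
    return {"model": "spark_dance_v2_0", "content": content, **params}


payload = build_spark_dance_payload(
    # Assumption: the item shape below is illustrative, not taken from the spec.
    [{"type": "text", "text": "a paper boat drifting down a rainy street"}],
    duration=8,
    resolution="1080p",
)
print(payload["duration"], payload["resolution"], payload["ratio"])
```

Validating locally catches out-of-range values such as a 20-second duration before a task is submitted; the output of vtrix models spec spark_dance_v2_0 remains the authoritative source for the constraints.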