Vidu Q3 - Reference to Video

viduq3_i2v_reference

viduq3 is Vidu's latest and most advanced video generation model, featuring synchronized audio-video output and intelligent scene cutting. It supports up to 7 reference images to maintain consistent subjects across scenes, and can generate videos from 3 to 16 seconds at up to 1080p resolution. Currently optimized for animated and comic-style content, it delivers richer, more expressive results with greater creative control.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit API Key Management Page to get your API Key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY
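
As an illustration of the header above, the following minimal sketch builds the request headers in Python. The helper name build_headers, the VIDU_API_KEY environment variable, and the JSON Content-Type are assumptions made for this example; only the Authorization header format comes from this page.

```python
import os


def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }


# VIDU_API_KEY is a hypothetical environment variable name.
headers = build_headers(os.environ.get("VIDU_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment rather than hard-coding it keeps it out of source control.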

Parameters (Non-Subject Calling)

model string required

Model ID used for the request

Value: viduq3_i2v_reference


images array[string]

Image references. Supports uploading 1–7 images. The model uses the subjects in these images as references to generate videos with consistent subjects across scenes.

Note 1: Supports image Base64 encoding or image URL (must be publicly accessible)
Note 2: Supported formats: png, jpeg, jpg, webp
Note 3: Image resolution must be at least 128×128 pixels, the aspect ratio must be between 1:4 and 4:1, and the file size must not exceed 50MB
Note 4: The HTTP POST body must not exceed 20MB. Base64-encoded images must include the appropriate content-type prefix, for example:

data:image/png;base64,{base64_encode}
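
The notes above can be sketched in Python as a pre-flight check plus a data URI builder. The function names image_ok and to_data_uri are hypothetical, the MIME types for jpg/jpeg/webp are the standard ones rather than values stated on this page, and 1MB is assumed to mean 1024×1024 bytes.

```python
import base64

MAX_FILE_BYTES = 50 * 1024 * 1024  # 50MB limit (assuming 1MB = 1024 * 1024 bytes)

# Standard MIME types for the supported formats (png, jpeg, jpg, webp).
MIME_BY_EXT = {
    "png": "image/png",
    "jpg": "image/jpeg",
    "jpeg": "image/jpeg",
    "webp": "image/webp",
}


def image_ok(width: int, height: int, size_bytes: int) -> bool:
    """Check the documented resolution, aspect-ratio, and file-size limits."""
    if width < 128 or height < 128:
        return False
    ratio = width / height
    return 0.25 <= ratio <= 4.0 and size_bytes <= MAX_FILE_BYTES


def to_data_uri(data: bytes, ext: str) -> str:
    """Encode raw image bytes as a Base64 data URI with its content-type prefix."""
    mime = MIME_BY_EXT[ext.lower().lstrip(".")]
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 inflates data by roughly a third, so for larger images a publicly accessible URL may be the easier way to stay under the request body limit.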

sounds array[string]

Audio references. Supports uploading 1–7 audio files, used by the model as audio references.

Note 1: Supports up to 7 audio files, each up to 20 seconds long
Note 2: Supported format: mp3
Note 3: Each audio file must not exceed 50MB
Note 4: The base64-decoded byte length must be less than 20MB, and the encoding must include the appropriate content-type string, for example:

data:audio/mp3;base64,{base64_encode}

⚠️ Currently not supported. This parameter is reserved for future use.


prompt string required

Text description for the generated video

Note: Character length must not exceed 5000 characters


duration integer

Video duration

Default: 5

Range: 3 - 16


seed integer

Random seed. When not provided or set to 0, a random number is used instead.


aspect_ratio string

Aspect ratio. Supports any ratio or auto.

Default: 16:9

Options: 1:1, 9:16, 16:9, 3:4, 4:3, auto (automatically recommended based on input image)


audio bool

Whether to enable synchronized audio-video output

Default: true

Options:
true: Enable audio-video sync, output video with sound (including dialogue and sound effects)
false: Disable synchronized audio and produce a silent video


resolution string

Video resolution

Default: 720p

Options: 720p, 1080p


payload string

Pass-through parameter. Transmitted as-is without any processing.

Note: Maximum 1048576 characters


off_peak bool

Off-peak mode

Default: false

Options:
true: Generate video in off-peak mode with lower credit cost. Tasks submitted in off-peak mode will be completed within 48 hours; tasks that cannot be completed will be automatically cancelled and credits will be refunded.
false: Generate video immediately

Note: Off-peak mode is supported when audio is true


watermark bool

Whether to add a watermark. Not added by default.

Options:
true: Add watermark
false: Do not add watermark

Note: Watermark content is fixed and AI-generated.


wm_position integer

Watermark position

Default: 3

Options:
1: Top-left
2: Top-right
3: Bottom-right
4: Bottom-left


wm_url string

Watermark content (image URL). Uses the default watermark if not provided.


meta_data string

Metadata identifier


callback_url string

Callback URL
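
The non-subject parameters above can be assembled into a request body along the following lines. This is a sketch under assumptions: the helper name build_reference_request is hypothetical, the endpoint URL is not given in this section and is therefore omitted, and images is treated as mandatory for this calling style even though the page does not mark it as required.

```python
MODEL_ID = "viduq3_i2v_reference"
RESOLUTIONS = {"720p", "1080p"}


def build_reference_request(prompt, images, duration=5, aspect_ratio="16:9",
                            resolution="720p", audio=True, **optional):
    """Validate the documented limits and return a JSON-serializable body."""
    if not prompt or len(prompt) > 5000:
        raise ValueError("prompt is required and must not exceed 5000 characters")
    if not 1 <= len(images) <= 7:
        raise ValueError("images must contain 1 to 7 entries")
    if not 3 <= duration <= 16:
        raise ValueError("duration must be between 3 and 16 seconds")
    if resolution not in RESOLUTIONS:
        raise ValueError("resolution must be 720p or 1080p")
    body = {
        "model": MODEL_ID,
        "prompt": prompt,
        "images": list(images),  # URLs or Base64 data URIs
        "duration": duration,
        "aspect_ratio": aspect_ratio,  # not validated here
        "resolution": resolution,
        "audio": audio,
    }
    # Pass-through for the remaining documented fields,
    # e.g. seed, off_peak, watermark, wm_position, callback_url.
    body.update(optional)
    return body
```

The returned dictionary can be serialized with json.dumps and sent in a POST request together with the Authorization header.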

Parameters (Subject Calling)

model string required

Model ID used for the request

Value: viduq3_i2v_reference


subjects array[object] required

Subject list. Supports up to 7 image or text subjects.

name string required

User-defined subject name. Can be referenced in the prompt using [@name].

server_id string

Subject ID obtained from the Create Subject API.


prompt string required

Text description for the generated video. Reference a subject with [@name], where name is the subject's name from the subjects list.

Example: "[@1] and [@2] eating hot pot together"

Note: Character length must not exceed 5000 characters


audio bool

Whether to enable synchronized audio-video output

Default: true

Options:
true: Enable audio-video sync
false: Disable synchronized audio and produce a silent video


duration integer

Video duration

Default: 5

Range: 3 - 16


seed integer

Random seed. When not provided or set to 0, a random number is used instead.


aspect_ratio string

Aspect ratio. Supports any ratio or auto.

Default: 16:9

Options: 1:1, 9:16, 16:9, 3:4, 4:3, auto (automatically recommended based on input image or video)


resolution string

Video resolution

Default: 720p

Options: 720p, 1080p


payload string

Pass-through parameter. Transmitted as-is without any processing.

Note: Maximum 1048576 characters


off_peak bool

Off-peak mode

Default: false

Options:
true: Generate video in off-peak mode with lower credit cost. Tasks submitted in off-peak mode will be completed within 48 hours; tasks that cannot be completed will be automatically cancelled and credits will be refunded.
false: Generate video immediately

Note: Off-peak mode is supported when audio is true


watermark bool

Whether to add a watermark. Not added by default.

Options:
true: Add watermark
false: Do not add watermark

Note: Watermark content is fixed and AI-generated.


wm_position integer

Watermark position

Default: 3

Options:
1: Top-left
2: Top-right
3: Bottom-right
4: Bottom-left


wm_url string

Watermark content (image URL). Uses the default watermark if not provided.


meta_data string

Metadata identifier


callback_url string

Callback URL
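
For subject calling, a request body might be built as in the sketch below, which also checks that every [@name] reference in the prompt matches a declared subject. The helper name build_subject_request and the server_id values are hypothetical, and the prompt cross-check is a client-side convenience for this example rather than documented API behavior.

```python
import re

MODEL_ID = "viduq3_i2v_reference"


def build_subject_request(prompt, subjects, **optional):
    """Build a subject-calling body and verify [@name] references in the prompt."""
    if not 1 <= len(subjects) <= 7:
        raise ValueError("subjects must contain 1 to 7 entries")
    if not prompt or len(prompt) > 5000:
        raise ValueError("prompt is required and must not exceed 5000 characters")
    names = {s["name"] for s in subjects}  # name is required on every subject
    referenced = re.findall(r"\[@([^\]]+)\]", prompt)
    unknown = [n for n in referenced if n not in names]
    if unknown:
        raise ValueError(f"prompt references undeclared subjects: {unknown}")
    body = {"model": MODEL_ID, "subjects": list(subjects), "prompt": prompt}
    body.update(optional)  # e.g. duration, resolution, audio, off_peak
    return body


# Hypothetical subject IDs; real ones come from the Create Subject API.
example = build_subject_request(
    "[@1] and [@2] eating hot pot together",
    [{"name": "1", "server_id": "subject-id-a"},
     {"name": "2", "server_id": "subject-id-b"}],
    duration=8,
)
```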

Polling

Because generation takes time, poll the task status after creating the task.

The initial response only returns information such as the task ID and initial status. The final result must be obtained by polling the task status endpoint using the task ID.
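
A polling loop for this flow could look like the following sketch. The status endpoint is not specified in this section, so the HTTP call is left to a caller-supplied fetch_status function (hypothetical). The terminal statuses completed and failed come from the Response Format section; the interval and timeout defaults are arbitrary choices for illustration.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_task(fetch_status: Callable[[], dict], interval_s: float = 5.0,
              timeout_s: float = 600.0, sleep=time.sleep) -> dict:
    """Call fetch_status until the task is completed or failed, or time runs out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status()  # caller performs the HTTP GET using the task ID
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not reach a terminal status in time")
        sleep(interval_s)
```

Tasks submitted with off_peak set to true may take up to 48 hours, so a much longer timeout, or the callback_url parameter, is likely a better fit for them.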

Response Format

error object

Error information, only present when status is failed

code string

Error code

message string

Detailed error message


output array

Generation results, only present when status is completed

content array

List of generated resource content

type string

Resource type

Value: image|video

url string

Processed resource URL

jobId string

Remote task ID


usage object

Usage statistics, only present when status is completed

cost string

Total cost in USD

discount number

Discount amount


metadata object

Metadata information
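
Reading the fields above from a finished task might look like the sketch below. The nesting assumed here (output items each holding a content list of type/url entries) is inferred from the field listing, and the sample response and function name extract_video_urls are hypothetical.

```python
def extract_video_urls(task: dict) -> list[str]:
    """Return video URLs from a completed task, or raise if the task failed."""
    if task.get("status") == "failed":
        err = task.get("error") or {}
        raise RuntimeError(f"{err.get('code')}: {err.get('message')}")
    urls = []
    # Assumed nesting: output[] -> content[] -> {type, url}
    for item in task.get("output") or []:
        for content in item.get("content") or []:
            if content.get("type") == "video" and content.get("url"):
                urls.append(content["url"])
    return urls


# Hypothetical response, shaped after the fields documented above.
sample = {
    "status": "completed",
    "output": [{"content": [{"type": "video", "url": "https://example.com/out.mp4"}]}],
    "usage": {"cost": "0.00", "discount": 0},
}
video_urls = extract_video_urls(sample)
```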