
Wan 2.6 - Reference to Video

wan26_r2v

Wan 2.6 - Reference to Video generates videos based on reference images with enhanced style consistency and quality control.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY
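As a minimal sketch, the header above could be assembled in Python like this. The helper name build_headers is hypothetical, and the Content-Type entry is an assumption (a JSON request body), not something this page states.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("api_key must be a non-empty string")
    return {
        # Format documented above: "Authorization: Bearer YOUR_API_KEY"
        "Authorization": f"Bearer {api_key}",
        # Assumption: the request body is sent as JSON.
        "Content-Type": "application/json",
    }
```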

Parameters

model string required

Model ID to use for the request

Value: wan26_r2v


input object required

Input parameters for the generation request

prompt string required

Text prompt describing the video content. Supports multi-character narratives by referencing characters as “character1”, “character2”, etc. These placeholders will be matched with reference materials in the order of the reference_urls array.

Supports Chinese and English. Each Chinese character or English letter counts as one character. Content beyond the maximum length is automatically truncated.

Maximum length: 1500 characters

Multi-character example:
“character1 is talking with character2 in the garden”
character1 → reference_urls[0]
character2 → reference_urls[1]

Example: character1 walking in the park with a happy expression
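The placeholder-to-reference mapping can be checked before sending a request. The sketch below is illustrative only: the function names are hypothetical, and it assumes placeholders are written exactly as "character" followed by a number, as in the examples above.

```python
import re

MAX_PROMPT_CHARS = 1500  # maximum length documented above


def referenced_characters(prompt: str) -> list[int]:
    """Return the distinct character numbers mentioned in the prompt, sorted."""
    return sorted({int(n) for n in re.findall(r"character(\d+)", prompt)})


def check_prompt(prompt: str, reference_urls: list[str]) -> list[str]:
    """Return a list of problems; an empty list means the prompt looks consistent."""
    problems = []
    # len() counts each Chinese character or English letter as one character.
    if len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters and will be truncated")
    for n in referenced_characters(prompt):
        # characterN corresponds to reference_urls[N-1].
        if n < 1 or n > len(reference_urls):
            problems.append(f"character{n} has no matching reference_urls[{n - 1}]")
    return problems
```

For instance, a prompt that mentions character2 together with a single reference URL yields one problem entry for reference_urls[1].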

reference_urls array required

Array of reference material URLs used to maintain character appearance and style consistency. Supports images and videos.

Reference Material Requirements:
Total count: 1-5 items (can be mixed images and videos)
Maximum images: 5
Maximum videos: 3
Order matters: reference_urls[0] corresponds to “character1” in prompt, reference_urls[1] to “character2”, etc.

Image Requirements:
Supported formats: JPEG, JPG, PNG (no transparency), BMP, WEBP
Image resolution: Width and height range [360, 2000] pixels
File size: Maximum 10MB

Video Requirements:
Supported formats: MP4, MOV
Duration: 3 - 30 seconds
File size: Maximum 100MB

Input Methods:
Method 1: Publicly accessible URL
Supports HTTP or HTTPS protocol
Example: https://example.com/character.jpg

Method 2: Base64 encoded string
Format: data:{MIME_type};base64,{base64_data}
Example: data:image/png;base64,iVBORw0KGgoAAAANS...

Example: ["https://example.com/char1.jpg", "https://example.com/char2.jpg"]
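The two input methods and the count limits can be sketched as follows. This is a hedged illustration: the helper names are hypothetical, and classifying a URL as image or video by its file extension is an assumption made here for the sake of the example, not documented behavior.

```python
import base64
import os
from urllib.parse import urlparse

IMAGE_EXTENSIONS = {".jpeg", ".jpg", ".png", ".bmp", ".webp"}
VIDEO_EXTENSIONS = {".mp4", ".mov"}


def to_data_uri(raw: bytes, mime_type: str) -> str:
    """Method 2: encode bytes as data:{MIME_type};base64,{base64_data}."""
    return f"data:{mime_type};base64,{base64.b64encode(raw).decode('ascii')}"


def kind_of(reference: str) -> str:
    """Guess 'image', 'video', or 'unknown' (assumption: extension or MIME prefix)."""
    if reference.startswith("data:"):
        mime_type = reference[len("data:"):].split(";", 1)[0]
        return mime_type.split("/", 1)[0]
    extension = os.path.splitext(urlparse(reference).path)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return "image"
    if extension in VIDEO_EXTENSIONS:
        return "video"
    return "unknown"


def check_reference_urls(references: list[str]) -> list[str]:
    """Return a list of problems against the documented count limits."""
    problems = []
    if not 1 <= len(references) <= 5:
        problems.append("reference_urls must contain 1-5 items")
    if sum(kind_of(r) == "video" for r in references) > 3:
        problems.append("reference_urls can contain at most 3 videos")
    return problems
```

The 5-image maximum needs no separate check here because it is implied by the 5-item total.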

audio_url string

URL of the audio file. The model will use this audio to generate the video.

Supports HTTP or HTTPS protocol. For a local file, upload it first to obtain a temporary URL.

Audio Limitations:
Supported formats: wav, mp3
Duration: 3 - 30 seconds
File size: Maximum 15MB

Handling Excess:
If audio length exceeds the duration value (5 or 10 seconds), the first 5 or 10 seconds are automatically extracted, and the rest is discarded.
If audio length is less than video duration, the portion beyond audio length will be silent. For example, if audio is 3 seconds and video duration is 5 seconds, the output video has sound for the first 3 seconds and is silent for the last 2 seconds.

Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.mp3
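The excess-handling rules amount to simple arithmetic, sketched below. The function name audio_layout is hypothetical; it only models the documented behavior and calls no API.

```python
def audio_layout(audio_seconds: float, duration: int) -> tuple[float, float, float]:
    """Return (audible, silent, discarded) seconds for a given audio and video length."""
    # Audio longer than the video is cut to the video duration.
    audible = min(audio_seconds, duration)
    # Video longer than the audio is silent for the remainder.
    silent = duration - audible
    discarded = audio_seconds - audible
    return audible, silent, discarded
```

With 3 seconds of audio and a 5-second video this gives 3 audible seconds, 2 silent seconds, and nothing discarded, matching the example above.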


parameters object

Generation parameters

size string

Important: The size parameter directly affects billing costs. Cost = Unit price (based on resolution) × Duration (seconds). For the same model: 1080P > 720P > 480P. Please confirm the model pricing before calling.

Size must be set to a specific width*height value (e.g., 1280*720), not an aspect ratio such as 1:1 or a tier name such as 480P.

Specifies the generated video resolution in the format width*height.

720P Tier:
1280*720: 16:9
720*1280: 9:16
960*960: 1:1
1088*832: 4:3
832*1088: 3:4

1080P Tier:
1920*1080: 16:9
1080*1920: 9:16
1440*1440: 1:1
1632*1248: 4:3
1248*1632: 3:4

Default: 1920*1080

duration integer

Important: The duration parameter directly affects billing costs. Cost = Unit price (based on resolution) × Duration (seconds). Please confirm the model pricing before calling.

Duration of the generated video in seconds (integers only).

Options: 5, 10, 15

Range: 2 - 15

Default: 5
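The billing formula above can be worked through in a few lines. The unit price used in any call is a placeholder supplied by the caller, since actual prices are not given on this page. The function name estimate_cost is hypothetical, and the check uses the 2 - 15 range stated above.

```python
from decimal import Decimal


def estimate_cost(unit_price_per_second: Decimal, duration: int) -> Decimal:
    """Cost = unit price (based on resolution) x duration (seconds)."""
    if isinstance(duration, bool) or not isinstance(duration, int):
        raise ValueError("duration must be an integer number of seconds")
    if not 2 <= duration <= 15:
        raise ValueError("duration must be within the documented range 2 - 15")
    # Decimal avoids binary floating-point rounding in money arithmetic.
    return unit_price_per_second * duration
```

Under a placeholder unit price of 0.10 per second, a 5-second video would be estimated at 0.50 and a 10-second video at 1.00.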

prompt_extend boolean

Whether to enable intelligent prompt rewriting. When enabled, uses a large model to intelligently rewrite the input prompt. This significantly improves generation results for shorter prompts but adds processing time.

Options: true, false

Default: true

watermark boolean

Whether to add a watermark. The watermark appears in the lower-right corner of the video with the fixed text “AI Generated”.

Options: false, true

Default: false

audio boolean

Whether to generate video with audio.

Options: true, false

Default: true

shot_type string

Specifies the shot type of the generated video, i.e., whether the video consists of one continuous shot or multiple switching shots.

Effective condition: Only takes effect when prompt_extend: true.

Parameter priority: shot_type > prompt. For example, if shot_type is set to single, even if the prompt contains “generate multi-shot video”, the model will still output a single-shot video.

Note: Set this parameter when you need strict control over the video's narrative structure (e.g., a single shot for product demonstrations, multiple shots for short stories).

Options: single, multi

Default: single

seed integer

Random seed. If not specified, the system generates a random seed automatically. Fixing the seed value is recommended to improve the reproducibility of results.

Note: Because model generation is probabilistic, using the same seed does not guarantee identical results on every run.

Example: 12345

Range: 0 - 2147483647
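Putting the parameters together, a request body might be assembled as in the sketch below. The nesting (model at the top level, prompt, reference_urls, and audio_url inside input, everything else inside parameters) is inferred from the layout of this page and should be treated as an assumption. The function name build_request is hypothetical, and raising an error for an ineffective shot_type is a choice made for this example.

```python
MODEL_ID = "wan26_r2v"


def build_request(prompt, reference_urls, audio_url=None, **parameters):
    """Assemble a request body as a plain dict (nesting is an assumption)."""
    # shot_type only takes effect when prompt_extend is true (the default).
    if "shot_type" in parameters and parameters.get("prompt_extend", True) is False:
        raise ValueError("shot_type only takes effect when prompt_extend is true")
    seed = parameters.get("seed")
    if seed is not None and not 0 <= seed <= 2147483647:
        raise ValueError("seed must be within 0 - 2147483647")
    body = {
        "model": MODEL_ID,
        "input": {"prompt": prompt, "reference_urls": list(reference_urls)},
    }
    if audio_url is not None:
        body["input"]["audio_url"] = audio_url
    if parameters:
        # Omitted parameters fall back to the documented defaults server-side.
        body["parameters"] = dict(parameters)
    return body
```

A call such as build_request("character1 walking in the park with a happy expression", ["https://example.com/char1.jpg"], size="1280*720", duration=5, seed=12345) returns a dict that can be serialized with json.dumps.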


Polling

Since video generation takes time, you need to poll the task status after creation.

The initial response returns the task ID and initial status. The actual generation results must be obtained through polling the task status endpoint.
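A polling loop might look like the sketch below. The status endpoint is not named on this page, so the loop takes a caller-supplied fetch_status function (hypothetical) that returns the task as a dict. It assumes the dict has a status field and treats completed and failed, the two values mentioned under Response Format, as terminal. The interval and timeout defaults are arbitrary.

```python
import time
from typing import Callable

TERMINAL_STATUSES = ("completed", "failed")


def poll_until_done(
    fetch_status: Callable[[], dict],
    interval: float = 5.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status repeatedly until the task is completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        # Wait between polls to avoid hammering the status endpoint.
        sleep(interval)
```

Passing sleep as a parameter keeps the loop testable without real waiting.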


Response Format

error object

Error information. Only present when status is failed.

code string

Error code

message string

Detailed error message


output array

Generation results. Only present when status is completed.

content array

List of generated video content

type string

Resource type

Value: video

url string

Processed video URL (CDN address)

jobId string

Remote task ID


usage object

Usage statistics. Only present when status is completed.

cost string

Total cost in USD

discount number

Discount amount


metadata object

Metadata information.
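Reading the result out of a finished task could be sketched as follows. The exact nesting is an assumption inferred from the field list above: output is taken to be an array of items, each with a content array of objects carrying type, url, and jobId, and the task is assumed to have a status field. The function name video_urls is hypothetical.

```python
def video_urls(task: dict) -> list[str]:
    """Return the generated video URLs, or raise if the task failed."""
    if task.get("status") == "failed":
        # error is only present when status is failed.
        error = task.get("error") or {}
        raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
    urls = []
    # output is only present when status is completed.
    for item in task.get("output") or []:
        for part in item.get("content") or []:
            if part.get("type") == "video" and part.get("url"):
                urls.append(part["url"])
    return urls
```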


Error Codes

Error Code    Description
001028001     reference_urls must be an array with 1-5 items
001028002     reference_urls can contain maximum 5 images
001028003     reference_urls can contain maximum 3 videos
001028004     Invalid reference URL format
001028095     Internal generation error
001028096     Result parsing error
001028097     HTTP error response
001028098     Status check error
001028099     Task creation error
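For client-side handling, the codes can be kept in a lookup table, as in the sketch below. Codes are stored as strings so the leading zeros survive. Grouping the first four codes as reference_urls input errors is this example's own reading of the descriptions, not a documented classification, and the function names are hypothetical.

```python
ERROR_CODES = {
    "001028001": "reference_urls must be an array with 1-5 items",
    "001028002": "reference_urls can contain maximum 5 images",
    "001028003": "reference_urls can contain maximum 3 videos",
    "001028004": "Invalid reference URL format",
    "001028095": "Internal generation error",
    "001028096": "Result parsing error",
    "001028097": "HTTP error response",
    "001028098": "Status check error",
    "001028099": "Task creation error",
}

# Assumption: these codes describe problems with the reference_urls input.
REFERENCE_INPUT_ERRORS = {"001028001", "001028002", "001028003", "001028004"}


def describe_error(code: str) -> str:
    """Return the documented description, or a fallback for unlisted codes."""
    return ERROR_CODES.get(code, "Unknown error code")


def is_reference_input_error(code: str) -> bool:
    """True when the code points at the reference_urls field."""
    return code in REFERENCE_INPUT_ERRORS
```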