Wan 2.6 - Text to Video

wan26_t2v

Wan 2.6 Text to Video generates high-quality videos from text descriptions using Alibaba's video generation technology.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY
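
A minimal sketch of building this header in Python. The helper name build_headers is hypothetical, and the Content-Type value assumes the request body is sent as JSON, which this page does not state.

```python
def build_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the Bearer token."""
    if not api_key or not api_key.strip():
        raise ValueError("api_key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        # Assumption: the API accepts a JSON request body.
        "Content-Type": "application/json",
    }
```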

Parameters

model string required

Model ID to use for the request

Value: wan26_t2v


input object required

Input parameters for the generation request

prompt string required

Text prompt describing the elements and visual characteristics expected in the generated video. Supports Chinese and English. Each Chinese character or Latin letter counts as one character; anything beyond the maximum length is truncated automatically.

Maximum length: 1500 characters for wan2.6-t2v

Example: A kitten running in the moonlight
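
Because over-long prompts are truncated silently, it can help to check the length on the client first. The sketch below assumes the service counts characters the way Python's len() does (one per Unicode code point); check_prompt is a hypothetical helper, not part of the API.

```python
MAX_PROMPT_CHARS = 1500  # limit stated above for wan2.6-t2v


def check_prompt(prompt: str) -> tuple[str, bool]:
    """Return (the part of the prompt that fits, whether anything was cut)."""
    truncated = len(prompt) > MAX_PROMPT_CHARS
    return prompt[:MAX_PROMPT_CHARS], truncated
```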

audio_url string

URL of the audio file. The model will use this audio to generate the video.

Supports HTTP and HTTPS URLs. For a local file, upload it first to obtain a temporary URL.

Audio Limitations:
Supported formats: wav, mp3
Duration: 3 - 30 seconds
File size: Maximum 15MB

Handling Excess:
If the audio is longer than the video duration, only the leading portion matching the duration (for example, the first 5 or 10 seconds) is used, and the rest is discarded.
If audio length is less than video duration, the portion beyond audio length will be silent. For example, if audio is 3 seconds and video duration is 5 seconds, the output video has sound for the first 3 seconds and is silent for the last 2 seconds.

Example: https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/xxx.mp3
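
The trimming and silence rules above can be worked through as simple arithmetic. This is an illustrative sketch only; audio_coverage is a hypothetical helper and does not inspect real audio files.

```python
def audio_coverage(audio_seconds: float, video_seconds: int) -> tuple[float, float]:
    """Return (seconds of audio used, seconds of trailing silence)."""
    if not 3 <= audio_seconds <= 30:
        raise ValueError("audio must be between 3 and 30 seconds long")
    used = min(audio_seconds, video_seconds)  # excess audio is discarded
    silent = video_seconds - used             # remainder of the video is silent
    return used, silent
```

For the case described above, audio_coverage(3, 5) gives 3 seconds of sound followed by 2 seconds of silence.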


parameters object

Generation parameters

size string

Important: The size parameter directly affects billing costs. Cost = Unit price (based on resolution) × Duration (seconds). For the same model: 1080P > 720P > 480P. Please confirm the model pricing before calling.

Size must be set to a specific value (e.g., 1280*720), not 1:1 or 480P.

Specifies the generated video resolution in the format width*height.

720P Tier:
1280*720: 16:9
720*1280: 9:16
960*960: 1:1
1088*832: 4:3
832*1088: 3:4

1080P Tier:
1920*1080: 16:9
1080*1920: 9:16
1440*1440: 1:1
1632*1248: 4:3
1248*1632: 3:4

Default: 1920*1080
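
Since size must be one of the exact width*height strings listed above, a client-side lookup can reject values such as 1:1 or 480P before a request is sent. The table below copies the listed values; describe_size is a hypothetical helper name.

```python
SIZE_TIERS = {
    "720P": {
        "1280*720": "16:9", "720*1280": "9:16", "960*960": "1:1",
        "1088*832": "4:3", "832*1088": "3:4",
    },
    "1080P": {
        "1920*1080": "16:9", "1080*1920": "9:16", "1440*1440": "1:1",
        "1632*1248": "4:3", "1248*1632": "3:4",
    },
}


def describe_size(size: str = "1920*1080") -> tuple[str, str]:
    """Return (resolution tier, aspect ratio) for a supported size string."""
    for tier, sizes in SIZE_TIERS.items():
        if size in sizes:
            return tier, sizes[size]
    raise ValueError(f"unsupported size {size!r}; use an exact width*height value")
```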

duration integer

Important: The duration parameter directly affects billing costs. Cost = Unit price (based on resolution) × Duration (seconds). Please confirm the model pricing before calling.

Duration of the generated video in seconds (integers only).

Options: 5, 10, 15

Range: 2 - 15

Default: 5
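
The billing formula (unit price × duration) can be sketched as below. Unit prices are not listed on this page, so the caller must supply one; any price used with this helper is a placeholder, not real pricing. The check follows the 2 - 15 range stated above, and estimate_cost is a hypothetical name.

```python
from decimal import Decimal


def estimate_cost(unit_price_per_second: Decimal, duration: int = 5) -> Decimal:
    """Cost = unit price (set by the resolution tier) x duration in seconds."""
    if not isinstance(duration, int) or not 2 <= duration <= 15:
        raise ValueError("duration must be an integer from 2 to 15 seconds")
    return unit_price_per_second * duration
```

With a made-up unit price of 0.10 per second, a 10-second video would be estimated at 1.00; confirm actual pricing before relying on any figure.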

prompt_extend boolean

Whether to enable intelligent prompt rewriting. When enabled, uses a large model to intelligently rewrite the input prompt. This significantly improves generation results for shorter prompts but adds processing time.

Options: true, false

Default: true

watermark boolean

Whether to add a watermark identifier. The watermark is located in the lower right corner of the video with fixed text “AI Generated”.

Options: false, true

Default: false

audio boolean

Whether to generate video with audio (must be false when using reference video).

Options: true, false

Default: true

shot_type string

Specifies the shot type of the generated video, i.e., whether the video consists of one continuous shot or multiple switching shots.

Effective condition: Only takes effect when prompt_extend: true.

Parameter priority: shot_type > prompt. For example, if shot_type is set to single, even if the prompt contains “generate multi-shot video”, the model will still output a single-shot video.

Note: When strict control over video narrative structure is needed (e.g., single shot for product demonstrations, multi-shot for short stories), this parameter can be specified.

Options: single, multi

Default: single

seed integer

Random seed. If not specified, the system automatically generates a random seed. To improve reproducibility of generation results, it is recommended to fix the seed value.

Note: Generation is probabilistic, so the same seed does not guarantee identical results on every run.

Example: 12345

Range: 0 - 2147483647


Polling

Since video generation takes time, you need to poll the task status after creation.

The initial response returns the task ID and initial status. The actual generation results must be obtained through polling the task status endpoint.
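
A polling loop could look like the sketch below. This page does not give the task status endpoint, so the HTTP call is left to a caller-supplied fetch_status function; that function, poll_task itself, the interval and timeout defaults, and the assumption that status is a top-level field of the returned object are all hypothetical.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"done", "failed"}


def poll_task(fetch_status: Callable[[], dict], interval: float = 5.0,
              timeout: float = 600.0,
              sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch_status until the task reaches a terminal status."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        if task.get("status") in TERMINAL_STATUSES:
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish before the timeout")
        sleep(interval)  # wait before asking again
```

Passing the sleep function as an argument keeps the loop easy to test without real delays.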


Response Format

error object

Error information. Only present when status is failed.

code string

Error code

message string

Detailed error message


output array

Generation results. Only present when the task has completed (status is done).

status string

Task status

Options: in_queue, processing, done, failed

content array

List of generated video content

type string

Resource type

Value: video

url string

Processed video URL (CDN address)

jobId string

Remote task ID


usage object

Usage statistics. Only present when the task has completed (status is done).

cost string

Total cost in USD

discount number

Discount amount


metadata object

Metadata information.
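
Reading the final response might look like the sketch below. The field nesting is not fully clear from this page; the code assumes status and error are top-level fields and that each output item holds a content list of {type, url, jobId} entries. The helper name video_urls is hypothetical.

```python
def video_urls(task: dict) -> list[str]:
    """Return the video URLs from a finished task, or raise if it failed."""
    if task.get("status") == "failed":
        error = task.get("error") or {}
        raise RuntimeError(f"{error.get('code')}: {error.get('message')}")
    urls = []
    for item in task.get("output") or []:
        for part in item.get("content") or []:
            if part.get("type") == "video" and part.get("url"):
                urls.append(part["url"])
    return urls
```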


Error Codes

Error Code    Description
001026001     reference_video_urls must be 1-2 video URL array
001026002     reference_video_urls cannot be used with audio parameter
001026003     When using reference_video_urls, duration must be 5 or 10
001026095     Internal generation error
001026096     Result parsing error
001026097     HTTP error response
001026098     Status check error
001026099     Task creation error
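
The codes begin with zeros, so they are best kept as strings rather than parsed as integers. A small lookup, using only the codes and descriptions listed above, might look like this; describe_error is a hypothetical helper.

```python
ERROR_CODES = {
    "001026001": "reference_video_urls must be 1-2 video URL array",
    "001026002": "reference_video_urls cannot be used with audio parameter",
    "001026003": "When using reference_video_urls, duration must be 5 or 10",
    "001026095": "Internal generation error",
    "001026096": "Result parsing error",
    "001026097": "HTTP error response",
    "001026098": "Status check error",
    "001026099": "Task creation error",
}


def describe_error(code: str) -> str:
    """Return the documented description, or a fallback for unlisted codes."""
    return ERROR_CODES.get(code, f"Unknown error code {code}")
```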