⌘K

Wan 2.6 Image

wan26_image

Wan2.6 Image is Alibaba's latest image editing model that transforms input images based on text prompts, served via mainland China (Beijing) endpoint.

Authentication

authorization string required

All APIs require authentication via Bearer Token.

Get API Key:

Visit API Key Management Page to get your API Key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY

Parameters

model string required

Model ID to use for the request

Value: wan26_image


input object required

Input data for the generation request

messages array required

Request content array. Currently only supports single-turn conversation, i.e., passing one set of role and content parameters. Multi-turn conversations are not supported. Array length must be 1.

role string required

The role of the message

Value: user

content array required

Content array, must contain exactly one text object and 0–4 image objects

text string required

Positive prompt used to describe the image content, style, and composition you expect to generate. Supports Chinese and English. Each Chinese character, letter, number, or symbol counts as one character. Content exceeding the limit will be automatically truncated.

Maximum length: 2000 characters

image string

Input image URL or Base64-encoded string.

Basic Limitations:
Supported formats: JPEG, JPG, PNG (no transparency), BMP, WEBP
Resolution requirements: Width and height must be between 240 and 8000 pixels
File size: Maximum 10MB

Image Quantity Rules:
Image quantity depends on the parameters.enable_interleave parameter:
When enable_interleave=true (interleaved output): Can input 0–1 images
When enable_interleave=false (image editing): Must input 1–4 images

Multi-image input: Pass multiple image objects in the content array. Image order is defined by array order.

Input Formats:
Method 1: Publicly accessible URL (HTTP or HTTPS)
Example: http://wanx.alicdn.com/material/xxx.jpeg

Method 2: Base64 encoding
Format: data:{MIME_type};base64,{base64_data}
Example: data:image/jpeg;base64,GDU7MtCZzEbTbmRZ...


parameters object

Image processing parameters

negative_prompt string

Negative prompt used to describe content you don’t want to appear in the image. Supports Chinese and English. Content exceeding the limit will be automatically truncated.

Maximum length: 500 characters

Example: Low resolution, low quality, deformed limbs, deformed fingers, oversaturated, waxy appearance, faceless details, overly smooth, AI-generated look, chaotic composition, blurred text, distorted.

size string

Output image resolution. Behavior depends on the enable_interleave mode.

Image editing mode (enable_interleave=false):
Method 1 — Reference tier (recommended): 1K (default, total pixels ≈ 1280×1280) or 2K (total pixels ≈ 2048×2048). Aspect ratio follows the last input image.
Method 2 — Explicit size: total pixels between [768×768, 2048×2048], aspect ratio [1:4, 4:1]. Actual output is rounded to the nearest multiple of 16.

Interleaved output mode (enable_interleave=true):
Method 1 — Follow input: if input total pixels ≤ 1280×1280, output matches input; if input total pixels > 1280×1280, output ≈ 1280×1280. If no input image, defaults to 1280×1280.
Method 2 — Explicit size: total pixels between [768×768, 1280×1280], aspect ratio [1:4, 4:1].

Recommended resolutions for common aspect ratios:
1:1: 1280×1280
2:3: 800×1200
3:2: 1200×800
3:4: 960×1280
4:3: 1280×960
9:16: 720×1280
16:9: 1280×720
21:9: 1344×576

Default: 1K

enable_interleave boolean

Controls the image generation mode.

false: Image editing mode — supports multi-image input and subject consistency generation. Requires 1–4 reference images. Outputs 1–4 result images.

true: Interleaved text-image output mode — supports 0 or 1 input image. Used for mixed text-and-image content generation, or pure text-to-image generation.

Options: false, true

Default: false

n integer

Specifies the number of images to generate. This parameter directly affects billing costs (Cost = Unit price × Number of successfully generated images), please confirm model pricing before calling.

When enable_interleave=false (image editing mode): controls the number of generated images directly. Recommended to set to 1 during testing for low-cost verification.

When enable_interleave=true (interleaved mode): must be fixed at 1; setting any other value will result in an API error. Use max_images to control the maximum number of generated images instead.

Range: 1 - 4

Default: 1

max_images integer

Specifies the maximum number of images the model generates in a single response. Only effective in interleaved mode (enable_interleave=true). This parameter affects billing costs (Cost = Unit price × Number of successfully generated images), please confirm model pricing before calling.

This parameter only represents the upper limit. Actual generated images are determined by model inference and may be less than the set value (e.g., setting 5 may result in only 3 images based on content).

Range: 1 - 5

Default: 5

prompt_extend boolean

Whether to enable intelligent Prompt rewriting. Only effective in image editing mode (enable_interleave=false). This feature only optimizes and refines positive prompts and does not change negative prompts.

Options: true, false

Default: true

watermark boolean

Whether to add a watermark identifier. The watermark is located in the lower right corner of the image with fixed text “AI Generated”.

Options: false, true

Default: false

seed integer

Random seed for generation. Using the same seed value can keep the generated content relatively stable. If not provided, the algorithm will automatically use a random seed.

Note: The model generation process is probabilistic; even with the same seed, completely consistent results cannot be guaranteed.

Range: 0 - 2147483647


Polling

Since result generation takes time, you need to poll the task status after creating the task.

The initial response only returns information such as the task ID and initial status. The final result must be obtained by polling the task status endpoint using the task ID.

See the examples on the right for polling requests and responses.


Response Format

error object

Error information, only present when status is failed

code string

Error code

message string

Detailed error message


output array

Generation results, only present when status is completed

content array

List of generated resource content

type string

Resource type

Value: image|video

url string

Processed resource URL

jobId string

Remote task ID


usage object

Usage statistics, only present when status is completed

cost string

Total cost in USD

discount number

Discount amount


metadata object

Metadata information