API Tips
Image-to-Video (First Frame): Generates a video from your input: first-frame image + text prompt (optional) + parameters (optional)
Text-to-Video: Generates a video from your input: text prompt + parameters (optional)
Authentication
authorization string required
All APIs require authentication via Bearer Token.
Get API Key:
Visit the API Key Management page to get your API key.
Usage:
Add to request header:
Authorization: Bearer YOUR_API_KEY
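A minimal sketch of building this header in Python. The helper name `auth_headers`, the environment variable `SPARK_API_KEY`, and the JSON content type are assumptions made for illustration; only the `Authorization: Bearer` format comes from this page.

```python
import os


def auth_headers(api_key: str) -> dict:
    """Return HTTP headers carrying the Bearer token."""
    if not api_key:
        raise ValueError("API key is empty")
    return {
        "Authorization": f"Bearer {api_key}",
        # Assumption: request bodies are sent as JSON.
        "Content-Type": "application/json",
    }


# Hypothetical environment variable name; read the key from wherever you store it.
headers = auth_headers(os.environ.get("SPARK_API_KEY", "YOUR_API_KEY"))
```

Reading the key from the environment keeps it out of source control.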
Parameters
model string required
Model ID to use for the request
Value: spark_dance_v1_0_pro_fast
content array required
Input used to generate the video, given as text and image content objects. Supported combinations: Text, Text + Image (first frame)
Text Content Object
type string required
Content type
Value: text
text string required
Text content input to the model, describing the expected video, including:
Text prompt (required): Supports Chinese and English. Keeping it under 500 characters is recommended. Longer prompts dilute the information, and the model may focus only on the key points and ignore details, so elements can be missing from the video
Image Content Object
type string required
Content type
Value: image_url
image_url object required
Image URL object
url string required
Image information, can be an image URL or a Base64-encoded image
Image URL: Ensure the image URL is accessible
Base64 encoding: Follow this format
data:image/<image_format>;base64,<base64_encoded>
Note that <image_format> must be lowercase, e.g. data:image/png;base64,{base64_image}
Image requirements:
Image formats: jpeg, png, webp, bmp, tiff, gif
Aspect ratio (width/height): (0.4, 2.5)
Width and height (px): (300, 6000)
Size: less than 30 MB
role string conditional required
Image position or purpose
First Frame to Video:
role value: Pass 1 image_url object; the role field can be empty, or role is: first_frame
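The content objects described above can be assembled as in the following sketch. The helper names `image_data_url` and `build_content` are hypothetical. The sketch checks only the image format; aspect ratio, pixel dimensions, and file size are left to the caller.

```python
import base64

# Image formats listed in the requirements above.
ALLOWED_FORMATS = {"jpeg", "png", "webp", "bmp", "tiff", "gif"}


def image_data_url(image_bytes: bytes, image_format: str) -> str:
    """Encode raw image bytes as data:image/<image_format>;base64,<data>."""
    fmt = image_format.lower()  # the format part must be lowercase
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported image format: {image_format}")
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:image/{fmt};base64,{encoded}"


def build_content(prompt: str, image_url=None) -> list:
    """Build the content array: text only, or text plus a first-frame image."""
    content = [{"type": "text", "text": prompt}]
    if image_url is not None:
        content.append({
            "type": "image_url",
            "image_url": {"url": image_url},  # plain URL or Base64 data URL
            "role": "first_frame",            # may also be left out
        })
    return content
```

For example, `build_content("a cat walking", image_data_url(raw_bytes, "png"))` yields a text object followed by one first-frame image object.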
callback_url string
Callback notification address for task results
return_last_frame boolean
Whether to return the last frame image of the generated video
true: Returns the last frame image of the generated video. After setting to true, you can get the video’s last frame image through the query video generation task API. The last frame image format is png, with the same width and height pixels as the generated video, without watermark. This parameter can be used to generate multiple consecutive videos by using the end frame of one video as the first frame of the next video task
false: Does not return the last frame image
Default: false
service_tier string
Service tier type for processing this request
default: Online inference mode
flex: Offline inference mode
Default: default
Options: default, flex
execution_expires_after integer
Task timeout threshold. Specifies the expiration time (in seconds) after task submission, calculated from the created_at timestamp
Default: 172800 (48 hours)
Range: [3600, 259200]
resolution string
Video resolution
Default: 1080p
Options: 480p, 720p, 1080p
ratio string
Aspect ratio of the generated video
Default: 16:9 (text-to-video), adaptive (image-to-video)
Options: 16:9, 4:3, 1:1, 3:4, 9:16, 21:9, adaptive
Note: adaptive automatically selects the most suitable aspect ratio based on the uploaded first frame image
duration integer
Video duration in seconds
Specify either duration or frames, not both; if both are given, frames takes priority over duration. To generate a video lasting a whole number of seconds, specifying duration is recommended
Default: 5
Range: 2 - 12
frames integer
Number of frames for the generated video
duration and frames are mutually exclusive, and frames has higher priority. To generate a video with a fractional-second length, specifying frames is recommended, since the frame count controls the video length directly
Because frames accepts only certain values, only a limited set of fractional-second lengths is supported. Calculate the target frame count with the formula below and choose the closest valid value
Calculation formula: Frame count = Duration × Frame rate (24)
Value range: Supports all integer values in the range [29, 289] that satisfy the format 25 + 4n, where n is a positive integer
Example: If you need to generate a 2.4 second video, frame count = 2.4×24=57.6. Since frames does not support 57.6, you can only choose the closest value. According to 25+4n, the closest frame number is 57, and the actual generated video is 57/24=2.375 seconds
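The calculation above can be sketched in Python as follows. The function name `nearest_frames` is illustrative; the frame rate of 24 and the valid values 25 + 4n within [29, 289] come from the description above.

```python
FRAME_RATE = 24

# Valid frame counts: 29, 33, ..., 289 (25 + 4n for n = 1..66).
VALID_FRAMES = range(29, 289 + 1, 4)


def nearest_frames(seconds: float) -> int:
    """Return the valid frame count closest to seconds * FRAME_RATE.

    On an exact tie, min() keeps the lower of the two candidates.
    """
    target = seconds * FRAME_RATE
    return min(VALID_FRAMES, key=lambda frames: abs(frames - target))


frames = nearest_frames(2.4)           # target is 57.6 frames
actual_seconds = frames / FRAME_RATE   # length of the generated video
print(frames, actual_seconds)          # prints: 57 2.375
```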
seed integer
Seed integer for controlling randomness of generated content
Default: -1
Range: [-1, 2^32-1]
camera_fixed boolean
Whether to fix the camera
Default: false
Options: true, false
watermark boolean
Whether the generated video contains watermark
Default: false
Options: true, false
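As a hedged sketch, the parameters above can be checked on the client before a request is sent. It assumes that the optional parameters sit at the top level of the JSON body next to model and content, and that the duration range 2 - 12 is inclusive; `build_payload` is a hypothetical helper, not part of the API.

```python
MODEL_ID = "spark_dance_v1_0_pro_fast"

# Allowed values and ranges copied from the parameter list above.
CHOICES = {
    "service_tier": {"default", "flex"},
    "resolution": {"480p", "720p", "1080p"},
    "ratio": {"16:9", "4:3", "1:1", "3:4", "9:16", "21:9", "adaptive"},
}
RANGES = {
    "execution_expires_after": range(3600, 259200 + 1),
    "duration": range(2, 12 + 1),      # assumed inclusive
    "frames": range(29, 289 + 1, 4),   # 25 + 4n
    "seed": range(-1, 2**32),          # [-1, 2^32 - 1]
}
BOOLEANS = {"return_last_frame", "camera_fixed", "watermark"}


def build_payload(content: list, **options) -> dict:
    """Validate optional parameters and return the request body as a dict."""
    payload = {"model": MODEL_ID, "content": content}
    for name, value in options.items():
        if name in CHOICES:
            ok = isinstance(value, str) and value in CHOICES[name]
        elif name in RANGES:
            # bool is a subclass of int, so reject it explicitly.
            ok = (isinstance(value, int) and not isinstance(value, bool)
                  and value in RANGES[name])
        elif name in BOOLEANS:
            ok = isinstance(value, bool)
        elif name == "callback_url":
            ok = isinstance(value, str)
        else:
            raise ValueError(f"unknown parameter: {name}")
        if not ok:
            raise ValueError(f"invalid value for {name}: {value!r}")
        payload[name] = value
    return payload
```

Parameters that are not passed are left out of the body, so the service applies its documented defaults.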
Polling
Video generation takes time, so you need to poll the task status after creating a task
The initial response returns the task ID and initial status. The actual generation results must be obtained by polling the task status endpoint
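A minimal polling loop might look like the sketch below. This section does not give the status endpoint's URL, so the sketch takes a caller-supplied `fetch_status` function that performs the HTTP request and returns the parsed JSON. It assumes the response carries a top-level `status` field; the interval and timeout values are arbitrary illustrations.

```python
import time


def poll_until_done(fetch_status, interval_s=5.0, timeout_s=600.0,
                    sleep=time.sleep):
    """Call fetch_status() until the task is completed or failed.

    fetch_status: hypothetical callable returning the task as a dict.
    Returns the final task dict, or raises TimeoutError after timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status()
        # "completed" and "failed" are the terminal statuses named in
        # the response format; anything else is treated as pending.
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError("task did not finish within the timeout")
        sleep(interval_s)
```

Passing `sleep` as a parameter lets tests replace the real delay with a no-op.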
Response Format
error object
Error information. Only present when status is failed
code string
Error code
error_message string
Detailed error message
output array
Generation results. Only present when status is completed
content array
List of generated content
type string
Resource type
url string
Content URL
jobId string
Remote job ID
duration integer
Video duration (seconds)
format string
Video format, default mp4
resolution string
Video resolution
ratio string
Video aspect ratio
fps integer
Video frame rate
usage object
Usage statistics. Only present when status is completed
Usage information
metadata object
Metadata information
Error Codes
| Error Code | Description |
|---|---|
| 003021098 | Generation failed |
| 003021099 | Service unavailable |
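The sketch below shows one way to handle a finished task using the fields and error codes above. The nesting is an assumption: each entry of output is taken to hold a content array whose items carry url. `GenerationError` and `extract_urls` are hypothetical names, and a top-level `status` field is assumed.

```python
ERROR_DESCRIPTIONS = {
    "003021098": "Generation failed",
    "003021099": "Service unavailable",
}


class GenerationError(Exception):
    """Raised when a task reports status "failed"."""


def extract_urls(task: dict) -> list:
    """Return content URLs from a completed task, or raise on failure."""
    status = task.get("status")
    if status == "failed":
        error = task.get("error") or {}
        code = error.get("code", "")
        # Prefer the detailed message; fall back to the table above.
        message = (error.get("error_message")
                   or ERROR_DESCRIPTIONS.get(code, "unknown error"))
        raise GenerationError(f"{code}: {message}")
    if status != "completed":
        raise ValueError(f"task is not finished (status={status!r})")
    urls = []
    for item in task.get("output", []):
        # Assumed nesting: output[i]["content"][j]["url"].
        for part in item.get("content", []):
            if "url" in part:
                urls.append(part["url"])
    return urls
```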