Spark DreamO - Multi IP | Vtrix API Docs

API Tips

Input images must meet the following requirements:

Supported formats: JPEG, PNG only (JPEG format recommended)

File size: Maximum 4.7 MB

Image resolution: Maximum 4096 * 4096

Aspect ratio: Recommended range 16:9 to 9:16 (extreme aspect ratios may have poor results and may cause errors)
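The requirements above can be checked on the client before uploading. The following is a minimal sketch, not part of the API: the function name `check_image` is hypothetical, "4.7 MB" is assumed to mean 4.7 * 1024 * 1024 bytes, and "4096 * 4096" is assumed to limit each side to 4096 pixels. The aspect ratio is reported as a warning only, because the docs describe it as a recommendation.

```python
# Assumption: "MB" means MiB here; the docs do not specify the unit precisely.
MAX_BYTES = int(4.7 * 1024 * 1024)
# Assumption: the resolution limit applies to each side independently.
MAX_SIDE = 4096
ALLOWED_FORMATS = {"JPEG", "JPG", "PNG"}


def check_image(fmt: str, size_bytes: int, width: int, height: int):
    """Return (errors, warnings) for one input image described by its metadata."""
    errors, warnings = [], []
    if fmt.upper() not in ALLOWED_FORMATS:
        errors.append(f"unsupported format: {fmt}")
    if size_bytes > MAX_BYTES:
        errors.append("file is larger than 4.7 MB")
    if width > MAX_SIDE or height > MAX_SIDE:
        errors.append("resolution exceeds 4096 * 4096")
    # Integer comparison avoids floating-point edge cases at exactly 16:9 or 9:16.
    if width * 9 > height * 16 or height * 9 > width * 16:
        warnings.append("aspect ratio is outside the recommended 16:9 to 9:16 range")
    return errors, warnings
```

The caller supplies the format, byte size, and pixel dimensions, so the sketch does not depend on any particular image library.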

Authentication

authorization `string` required

All APIs require authentication via Bearer Token.

Get API Key:

Visit the API Key Management page to get your API key.

Usage:

Add to request header:

Authorization: Bearer YOUR_API_KEY
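As an illustration of the header above, here is a sketch that builds (but does not send) an authenticated request with Python's standard library. The endpoint URL and the `model` / `input` / `params` body layout are taken from the curl example on this page; the helper name `build_request` is hypothetical.

```python
import json
import urllib.request

# Endpoint as it appears in the curl example on this page.
API_URL = "https://cloud.vtrix.ai/model/v1/generation"


def build_request(api_key: str, params: dict) -> urllib.request.Request:
    """Build a POST request carrying the Bearer token; nothing is sent here."""
    body = {"model": "spark_multi_dreamo", "input": [{"params": params}]}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
```

The returned object can be passed to `urllib.request.urlopen` when you are ready to submit the task.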

Parameters

model `string` required

Model ID to use for the request

Value: spark_multi_dreamo

binary_data_base64 `array` required (one of two)

Image files in Base64 encoding. Supports up to 5 input images

Either image_urls or binary_data_base64 must be provided (one of two)

image_urls `array` required (one of two)

Image file URLs (must be publicly accessible). Supports up to 5 input images

Either image_urls or binary_data_base64 must be provided (one of two)
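The choice between the two image fields can be sketched as follows. This is an illustration under assumptions: the helper name `image_params` is hypothetical, the two fields are treated as mutually exclusive (the docs only say one of them must be provided), and the Base64 strings are assumed to be plain encoded bytes without a data-URI prefix.

```python
import base64


def image_params(image_urls=None, image_bytes=None) -> dict:
    """Build the image part of params from exactly one of the two sources."""
    # Assumption: supplying both sources at once is treated as an error.
    if bool(image_urls) == bool(image_bytes):
        raise ValueError("provide exactly one of image_urls or image_bytes")
    images = image_urls or image_bytes
    if len(images) > 5:
        raise ValueError("at most 5 input images are supported")
    if image_urls:
        return {"image_urls": list(image_urls)}
    # Assumption: raw Base64 text, no "data:image/...;base64," prefix.
    encoded = [base64.b64encode(data).decode("ascii") for data in image_bytes]
    return {"binary_data_base64": encoded}
```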

prompt `string` required

Prompt describing the image edit. Both Chinese and English are supported

Recommended length: around 300 characters. Overly long prompts may not take effect and may cause errors

ref_type_list `array`

Reference type for each reference image. The length of this array must equal the number of reference images

The default, AUTO, detects the reference type automatically but increases inference time. When the reference type is known in advance, specify it explicitly for each image

IP: Reference subject features
ID: Reference facial features
STYLE: Reference style features
AUTO: Automatically match reference type (default)

Options: IP, ID, STYLE, AUTO

Default: AUTO
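The length and value rules for ref_type_list can be checked before sending a request. The sketch below is illustrative only; `build_ref_type_list` is a hypothetical helper, and filling in AUTO for every image when no list is given is an assumption about how you might want to make the default explicit.

```python
REF_TYPES = ("IP", "ID", "STYLE", "AUTO")


def build_ref_type_list(num_images: int, ref_types=None) -> list:
    """Return a ref_type_list with one valid entry per reference image."""
    if ref_types is None:
        # Mirror the documented default explicitly.
        return ["AUTO"] * num_images
    normalized = [t.upper() for t in ref_types]
    if len(normalized) != num_images:
        raise ValueError("ref_type_list must have one entry per reference image")
    unknown = [t for t in normalized if t not in REF_TYPES]
    if unknown:
        raise ValueError(f"unknown reference types: {unknown}")
    return normalized
```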

guidance_scale1 `number`

Controls the consistency of generation results with text descriptions. Higher values result in higher text consistency but lower image consistency

Range: 1.0 to 7.0

Default: 2.5

guidance_scale2 `number`

Controls the consistency of generation results with images. Higher values result in higher image consistency but lower text consistency

Range: 1.0 to 7.0

Default: 2.5

ddim_steps `integer`

Number of steps for image generation

Range: 1 to 50

Default: 12
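A small range check for the three numeric parameters above might look like the following sketch. The name `check_ranges` is hypothetical, and the documented ranges are assumed to be inclusive at both ends.

```python
# Documented ranges; treated as inclusive (an assumption).
RANGES = {
    "guidance_scale1": (1.0, 7.0),
    "guidance_scale2": (1.0, 7.0),
    "ddim_steps": (1, 50),
}


def check_ranges(params: dict) -> list:
    """Return one message per numeric parameter that is out of range."""
    problems = []
    for name, (low, high) in RANGES.items():
        if name in params and not (low <= params[name] <= high):
            problems.append(f"{name}={params[name]} is outside {low} to {high}")
    return problems
```

Parameters that are absent are skipped, so the server-side defaults (2.5, 2.5, and 12) still apply.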

swap_face `boolean`

Whether to use facial ID enhancement. When enabled, facial consistency is higher, but it may affect facial attribute editing such as expressions and makeup, and will increase processing time

Options: true, false

Default: false

use_rephraser `boolean`

Whether to rephrase the input text prompt to optimize results. Keeping this enabled is recommended under normal conditions

Set this parameter to false if the input text is very long, if the prompt content must not be changed, or if you want to reduce processing time

Options: true, false

Default: true

rephraser_level `string`

Granularity of intelligent prompt rewriting. Finer levels help the model better understand the reference images and prompt instructions, but also increase processing time

Note that a finer level does not necessarily produce better generation quality

general: General level
fine: Fine level
coarse: Coarse level

Options: general, fine, coarse

Default: general

seed `integer`

Random seed that determines the initial diffusion state. With the same positive integer seed and otherwise identical parameters, the generated results will most likely be consistent

Default: -1 (random)

width `integer`

Width of the generated image

If you exceed the upper limit, make sure the product width * height is less than 2048 * 2048. Doing so may cause abnormal results or timeouts

Recommended ratios and corresponding dimensions (width * height):
1:1: 1328 * 1328
4:3: 1472 * 1104
3:2: 1584 * 1056
16:9: 1664 * 936
21:9: 2016 * 864

Range: 512 to 2048

Default: 1328

height `integer`

Height of the generated image

If you exceed the upper limit, make sure the product width * height is less than 2048 * 2048. Doing so may cause abnormal results or timeouts

Range: 512 to 2048

Default: 1328
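The width and height rules can be sketched as a lookup plus a check. This is an illustration, not the service's own validation: `size_for_ratio` and `check_size` are hypothetical names, the 512 to 2048 range is assumed to be inclusive, and the product rule is applied strictly as "less than 2048 * 2048".

```python
# Recommended dimensions (width, height) as listed under the width parameter.
RECOMMENDED = {
    "1:1": (1328, 1328),
    "4:3": (1472, 1104),
    "3:2": (1584, 1056),
    "16:9": (1664, 936),
    "21:9": (2016, 864),
}


def size_for_ratio(ratio: str) -> tuple:
    """Look up the recommended (width, height) for a documented ratio."""
    return RECOMMENDED[ratio]


def check_size(width: int, height: int) -> list:
    """Return a list of problems with the requested output size."""
    problems = []
    for name, value in (("width", width), ("height", height)):
        if not 512 <= value <= 2048:
            problems.append(f"{name}={value} is outside 512 to 2048")
    if width * height >= 2048 * 2048:
        problems.append("width * height must be less than 2048 * 2048")
    return problems
```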

Polling

Since image generation takes time, you need to poll the task status after creation

The initial response returns the task ID and initial status. The actual generation results must be obtained through polling the task status endpoint
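A polling loop along these lines is one way to wait for a result. This page does not give the status endpoint's URL, so the sketch takes a caller-supplied `fetch_status` function (hypothetical) that returns the task as a dict. Only the `completed` and `failed` statuses are named on this page; any other status is assumed to mean the task is still in progress, and the interval and timeout defaults are arbitrary.

```python
import time


def poll_task(fetch_status, task_id, interval=2.0, timeout=300.0, sleep=time.sleep):
    """Call fetch_status(task_id) until the task is completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status(task_id)
        # "completed" and "failed" are the only terminal statuses named in the docs.
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout} seconds")
        sleep(interval)
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.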

Response Format

error `object`

Error information. Only present when status is failed

code `string`

Error code

error_message `string`

Detailed error message

output `array`

Generation results. Only present when status is completed

content `array`

List of generated content

type `string`

Resource type

Value: image

url `string`

Image URL

jobId `string`

Remote job ID

usage `object`

Usage statistics. Only present when status is completed

cost `string`

Total cost in USD

discount `number`

Discount amount

metadata `object`

Metadata information
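Reading the fields described above might look like the following sketch. The helper name `image_urls_from` is hypothetical; it assumes the response has already been parsed into a dict and that `output` is a list of objects, each with a `content` list, as in the sample response on this page.

```python
def image_urls_from(task: dict) -> list:
    """Collect image URLs from a finished task, or raise if the task failed."""
    if task.get("status") == "failed":
        error = task.get("error") or {}
        raise RuntimeError(f"{error.get('code')}: {error.get('error_message')}")
    urls = []
    # output is only present when status is completed, so default to empty.
    for item in task.get("output") or []:
        for content in item.get("content") or []:
            if content.get("type") == "image":
                urls.append(content["url"])
    return urls
```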

Error Codes

Error Code	Description
003013001	Missing prompt
003013002	Missing image
003013003	Invalid prompt length
003013004	Invalid parameter
003013095	Internal generation error
003013096	Result parsing error
003013097	HTTP error response
003013098	Status check error
003013099	Service unavailable
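The table above can be carried into client code as a lookup. In this sketch the codes are kept as strings so the leading zeros survive. The split into "request" errors (001 to 004) and everything else is an assumption drawn from the descriptions, not something the docs state, and both function names are hypothetical.

```python
ERROR_CODES = {
    "003013001": "Missing prompt",
    "003013002": "Missing image",
    "003013003": "Invalid prompt length",
    "003013004": "Invalid parameter",
    "003013095": "Internal generation error",
    "003013096": "Result parsing error",
    "003013097": "HTTP error response",
    "003013098": "Status check error",
    "003013099": "Service unavailable",
}

# Assumption: these four describe problems with the request itself.
REQUEST_ERRORS = {"003013001", "003013002", "003013003", "003013004"}


def describe_error(code: str) -> str:
    """Return the documented description for an error code string."""
    return ERROR_CODES.get(code, "Unknown error code")


def is_request_error(code: str) -> bool:
    """True if the code points at the request rather than the service."""
    return code in REQUEST_ERRORS
```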

curl --location 'https://cloud.vtrix.ai/model/v1/generation' \
  --header 'Content-Type: application/json' \
  --header 'Authorization: Bearer YOUR_API_KEY' \
  --data '{
    "model": "spark_multi_dreamo",
    "input": [
      {
        "params": {
          "image_urls": [
            "https://example.com/image1.jpg",
            "https://example.com/image2.jpg"
          ],
          "prompt": "A group photo with consistent character features",
          "ref_type_list": ["ID", "STYLE"],
          "guidance_scale1": 2.5,
          "guidance_scale2": 2.5,
          "use_rephraser": true,
          "width": 1328,
          "height": 1328
        }
      }
    ]
  }'

{
  "id": "d5u5obte8783ap44qtj0",
  "created_at": 1769757744021,
  "status": "completed",
  "model": "spark_multi_dreamo",
  "output": [
    {
      "content": [
        {
          "type": "image",
          "url": "https://example.com/generated-image.jpg",
          "jobId": "remote_job_id_12345"
        }
      ]
    }
  ],
  "usage": {
    "cost": "0.000500",
    "discount": 0,
    "input_tokens": null,
    "output_tokens": null,
    "quantity": 1,
    "time_per_unit": 0,
    "total_tokens": null,
    "unit_price": "0.000500",
    "user_discount": 1
  },
  "metadata": {
    "completed_at": 120.5,
    "in_queue_at": 0,
    "upload_at": 1.2,
    "usage": {
      "input_tokens": 20,
      "input_tokens_details": {
        "text_tokens": 20
      },
      "output_tokens": 0,
      "total_tokens": 20
    }
  }
}
