POST /generations
Generate Asset
curl --request POST \
  --url https://api.hedra.com/web-app/public/generations \
  --header 'Content-Type: application/json' \
  --header 'X-API-Key: <api-key>' \
  --data '
{
  "generated_video_inputs": {
    "text_prompt": "<string>",
    "ai_model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "resolution": "<string>",
    "aspect_ratio": "<string>",
    "duration_ms": 123,
    "bounding_box_target": {
      "[0]": 0.5,
      "[1]": 0.5
    },
    "character_orientation": "video",
    "enhance_prompt": false
  },
  "workspace_id": "<string>",
  "agent_thread_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "generation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "generation_ids": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ],
  "type": "video",
  "ai_model_id": "d1dd37a3-e39a-4854-a298-6510289f9cf2",
  "start_keyframe_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "start_keyframe_url": "<string>",
  "end_keyframe_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "end_keyframe_url": "<string>",
  "audio_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "audio_generation": {
    "voice_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "text": "<string>",
    "workspace_id": "<string>",
    "agent_thread_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "generation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "generation_ids": [
      "3c90c3cc-0d44-4b50-8888-8dd25736052a"
    ],
    "type": "text_to_speech",
    "model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "stability": 1,
    "speed": 1,
    "language": "auto"
  },
  "audio_start_ms": 123,
  "reference_image_ids": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ],
  "video_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "batch_size": 1
}
'

Example response:
{
  "generated_video_inputs": {
    "text_prompt": "<string>",
    "ai_model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "resolution": "<string>",
    "aspect_ratio": "<string>",
    "duration_ms": 123,
    "bounding_box_target": {
      "[0]": 0.5,
      "[1]": 0.5
    },
    "character_orientation": "video",
    "enhance_prompt": false
  },
  "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "asset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "created_at": "<string>",
  "status": "complete",
  "progress": 123,
  "workspace_id": "<string>",
  "agent_thread_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "generation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "generation_ids": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ],
  "type": "video",
  "ai_model_id": "d1dd37a3-e39a-4854-a298-6510289f9cf2",
  "start_keyframe_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "start_keyframe_url": "<string>",
  "end_keyframe_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "end_keyframe_url": "<string>",
  "audio_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "audio_generation": {
    "voice_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "text": "<string>",
    "workspace_id": "<string>",
    "agent_thread_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "generation_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "generation_ids": [
      "3c90c3cc-0d44-4b50-8888-8dd25736052a"
    ],
    "type": "text_to_speech",
    "model_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "stability": 1,
    "speed": 1,
    "language": "auto"
  },
  "audio_start_ms": 123,
  "reference_image_ids": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ],
  "video_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
  "batch_size": 1,
  "eta_sec": 123,
  "batch_generation_id": "<string>",
  "batch_results": [
    {
      "created_at": "<string>",
      "status": "complete",
      "progress": 123,
      "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
      "asset_id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
      "error": "<string>"
    }
  ]
}

Authorizations

X-API-Key
string
header
required
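
As an illustration of the header-based authorization, here is a minimal Python sketch that builds the request without sending it. The URL and the X-API-Key header come from the curl example above. The helper name build_generation_request, the sample prompt, and the assumption that text_prompt alone makes a valid generated_video_inputs object are illustrative, not confirmed by this reference.

```python
import json
import urllib.request

API_URL = "https://api.hedra.com/web-app/public/generations"


def build_generation_request(api_key: str, text_prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST /generations request."""
    body = {
        "type": "video",
        # Assumption: a text prompt alone is an acceptable generated_video_inputs object.
        "generated_video_inputs": {"text_prompt": text_prompt},
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_generation_request("<api-key>", "A red kite over a beach")
```

Passing req to urllib.request.urlopen would send it; the sketch stops short of that so it can be inspected offline.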

Body

application/json
generated_video_inputs
GeneratedVideoInputs · object
required

Inputs for generating the video.

workspace_id
string | null
agent_thread_id
string<uuid> | null

Optional agent thread ID to associate this generation with.

generation_id
string<uuid> | null

Optional pre-reserved generation ID. If provided, this ID will be used instead of generating a new one. For batch operations (batch_size > 1), use generation_ids instead.

generation_ids
string<uuid>[] | null

Optional list of pre-reserved generation IDs for batch operations. Length must match batch_size. Mutually exclusive with generation_id.
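
The rules for pre-reserved IDs can be checked on the client before sending. The sketch below is one possible reading: the function name check_reserved_ids is hypothetical, the 1 to 8 range is taken from the batch_size field, and rejecting a single generation_id when batch_size > 1 is an interpretation of "use generation_ids instead", not documented server behavior.

```python
from typing import Optional, Sequence


def check_reserved_ids(
    batch_size: int = 1,
    generation_id: Optional[str] = None,
    generation_ids: Optional[Sequence[str]] = None,
) -> None:
    """Raise ValueError if the pre-reserved ID fields break the documented rules."""
    if not 1 <= batch_size <= 8:
        raise ValueError("batch_size must be between 1 and 8")
    if generation_id is not None and generation_ids is not None:
        raise ValueError("generation_id and generation_ids are mutually exclusive")
    if generation_ids is not None and len(generation_ids) != batch_size:
        raise ValueError("generation_ids length must match batch_size")
    # Assumption: a single reserved ID is treated as invalid for batch requests.
    if generation_id is not None and batch_size > 1:
        raise ValueError("use generation_ids when batch_size > 1")
```

For example, check_reserved_ids(batch_size=2, generation_ids=["id-1", "id-2"]) passes, while supplying only one ID for the same batch raises ValueError.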

type
string
default: video
Allowed value: "video"
ai_model_id
string<uuid>
default: d1dd37a3-e39a-4854-a298-6510289f9cf2

ID of the model to use for the generation.

start_keyframe_id
string<uuid> | null

The ID of the Image asset to use as the start keyframe. Ignored if reference_image_ids is provided.

start_keyframe_url
string<uri> | null

The URL of the image to use as the start keyframe.

Required string length: 1 - 2083
end_keyframe_id
string<uuid> | null

The ID of the Image asset to use as the end keyframe. Ignored if reference_image_ids is provided.

end_keyframe_url
string<uri> | null

The URL of the image to use as the end keyframe.

Required string length: 1 - 2083
audio_id
string<uuid> | null

The ID of the Audio asset to use.

audio_generation
GenerateTextToSpeechRequest · object

Optional TTS parameters for server-side audio generation. If provided (and audio_id is not), audio will be generated from these params before video generation.
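
The precedence between audio_id and audio_generation can be expressed as a small helper. This is a sketch of how a client might predict which audio source applies; the function name audio_source and its return labels are hypothetical, and treating audio_id as the winner when both are present follows from the wording "if provided (and audio_id is not)".

```python
def audio_source(body: dict) -> str:
    """Return which audio input a request body would use."""
    if body.get("audio_id") is not None:
        # An existing Audio asset takes precedence over TTS parameters.
        return "existing_asset"
    if body.get("audio_generation") is not None:
        # Audio is generated server-side before video generation.
        return "text_to_speech"
    return "none"
```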

audio_start_ms
integer | null

Audio start offset in milliseconds. Negative values prepend silence (e.g., -1000 adds 1s silence before audio). Positive values crop from the beginning of the source audio (e.g., 2000 skips the first 2s). Use with generated_video_inputs.duration_ms to control total output length.
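
The sign convention can be made concrete with a short sketch. The function name describe_audio_offset and the returned keys are illustrative; the arithmetic only restates the two cases described above.

```python
def describe_audio_offset(audio_start_ms: int) -> dict:
    """Split audio_start_ms into prepended silence and cropped source audio."""
    return {
        # Negative offsets prepend silence of the same magnitude.
        "leading_silence_ms": max(0, -audio_start_ms),
        # Positive offsets skip that much of the source audio.
        "cropped_ms": max(0, audio_start_ms),
    }
```

With the examples from the description, -1000 yields 1000 ms of leading silence and no cropping, and 2000 yields no silence and 2000 ms cropped.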

reference_image_ids
string<uuid>[] | null

The IDs of the images to reference in the generation. Used only for multi-image video generation; when provided, supersedes start_keyframe_id and end_keyframe_id.
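
A sketch of how the image inputs resolve, based on the field descriptions: when reference_image_ids is provided, the keyframe IDs are ignored. The function name effective_image_inputs is hypothetical, treating an empty list as "not provided" is an assumption, and the keyframe URL fields are left out because the reference does not say how they interact with reference_image_ids.

```python
def effective_image_inputs(body: dict) -> dict:
    """Return the image ID fields that would take effect for a request body."""
    refs = body.get("reference_image_ids")
    if refs:  # Assumption: None and [] both mean "not provided".
        return {"reference_image_ids": list(refs)}
    keyframe_keys = ("start_keyframe_id", "end_keyframe_id")
    return {key: body[key] for key in keyframe_keys if body.get(key) is not None}
```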

video_id
string<uuid> | null

The ID of the Video asset to use as motion input for V2V (motion control) models.

batch_size
integer
default: 1

Number of video variations to generate (1-8). When batch_size > 1, batch_results contains every generation result.

Required range: 1 <= x <= 8

Response

Successful Response

generated_video_inputs
GeneratedVideoInputs · object
required

Inputs for generating the video.

id
string<uuid>
required

The ID of the generation created.

asset_id
string<uuid>
required

The ID of the video asset resulting from the generation.

created_at
string
required

Date the generation was submitted.

status
enum<string>
required

Status of the generation.

Available options: complete, error, processing, queued, finalizing
progress
number
required

Current progress toward completion, as a fraction between 0 and 1.
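
A client that tracks a generation typically needs to know when to stop waiting and how to display progress. The sketch below assumes that "complete" and "error" are the only terminal statuses; the reference lists the status values but does not state which are final. The names TERMINAL_STATUSES, is_finished, and progress_label are illustrative.

```python
# Assumption: only these two statuses mean the generation will not change further.
TERMINAL_STATUSES = {"complete", "error"}


def is_finished(generation: dict) -> bool:
    """True once the generation has reached an assumed terminal status."""
    return generation["status"] in TERMINAL_STATUSES


def progress_label(generation: dict) -> str:
    """Format status plus progress, converting the 0-1 fraction to a percentage."""
    percent = round(generation["progress"] * 100)
    return f'{generation["status"]} ({percent}%)'
```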

workspace_id
string | null
agent_thread_id
string<uuid> | null

Optional agent thread ID to associate this generation with.

generation_id
string<uuid> | null

Optional pre-reserved generation ID. If provided, this ID will be used instead of generating a new one. For batch operations (batch_size > 1), use generation_ids instead.

generation_ids
string<uuid>[] | null

Optional list of pre-reserved generation IDs for batch operations. Length must match batch_size. Mutually exclusive with generation_id.

type
string
default: video
Allowed value: "video"
ai_model_id
string<uuid>
default: d1dd37a3-e39a-4854-a298-6510289f9cf2

ID of the model to use for the generation.

start_keyframe_id
string<uuid> | null

The ID of the Image asset to use as the start keyframe. Ignored if reference_image_ids is provided.

start_keyframe_url
string<uri> | null

The URL of the image to use as the start keyframe.

Required string length: 1 - 2083
end_keyframe_id
string<uuid> | null

The ID of the Image asset to use as the end keyframe. Ignored if reference_image_ids is provided.

end_keyframe_url
string<uri> | null

The URL of the image to use as the end keyframe.

Required string length: 1 - 2083
audio_id
string<uuid> | null

The ID of the Audio asset to use.

audio_generation
GenerateTextToSpeechRequest · object

Optional TTS parameters for server-side audio generation. If provided (and audio_id is not), audio will be generated from these params before video generation.

audio_start_ms
integer | null

Audio start offset in milliseconds. Negative values prepend silence (e.g., -1000 adds 1s silence before audio). Positive values crop from the beginning of the source audio (e.g., 2000 skips the first 2s). Use with generated_video_inputs.duration_ms to control total output length.

reference_image_ids
string<uuid>[] | null

The IDs of the images to reference in the generation. Used only for multi-image video generation; when provided, supersedes start_keyframe_id and end_keyframe_id.

video_id
string<uuid> | null

The ID of the Video asset to use as motion input for V2V (motion control) models.

batch_size
integer
default: 1

Number of video variations to generate (1-8). When batch_size > 1, batch_results contains every generation result.

Required range: 1 <= x <= 8
eta_sec
integer | null

Estimated time until completion, in seconds. May be null if no historical data is available.

batch_generation_id
string | null

Unique identifier linking all generations in a batch.

batch_results
BatchVideoResultItem · object[]

All generation results in the batch. Always populated (even for batch_size=1). The main response fields (id, asset_id, etc.) reflect the first successful generation.
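
Because batch_results is populated even for a single generation, a client can read results from it uniformly instead of special-casing the top-level id and asset_id. The following is a minimal sketch over the item fields shown in the example response (id, asset_id, status, error); the function name split_batch_results is hypothetical, and items that are neither complete nor error are simply treated as still in flight.

```python
from typing import List, Optional, Tuple


def split_batch_results(response: dict) -> Tuple[List[str], List[Tuple[str, Optional[str]]]]:
    """Return (asset IDs of completed items, (generation ID, error) pairs of failed items)."""
    completed: List[str] = []
    failed: List[Tuple[str, Optional[str]]] = []
    for item in response.get("batch_results") or []:
        if item["status"] == "complete":
            completed.append(item["asset_id"])
        elif item["status"] == "error":
            failed.append((item["id"], item.get("error")))
        # Other statuses (queued, processing, finalizing) are skipped as still in flight.
    return completed, failed
```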