All examples use the base URL https://api.hedra.com/web-app/public and require an X-API-Key header. Set your key as an environment variable so the commands below can read it:
export HEDRA_API_KEY="your_api_key"
Avatar videos are driven by audio: the character in the image lip-syncs and moves to the provided audio. Two models are available:
| Model | ID | Best for |
|---|---|---|
| Hedra Avatar | 26f0fc66-152b-40ab-abed-76c43df99bc8 | Talking-head videos, lip-sync, up to 10 minutes |
| Hedra Omnia | ab372b84-432f-44f5-bacc-c2542465f712 | Motion avatar videos with full-body movement, up to 8s |
Both models require an image and audio input.
Step 1: Upload your audio
The audio determines the video length. For a 10-minute video, provide ~10 minutes of audio.
# Create the asset record
curl -X POST https://api.hedra.com/web-app/public/assets \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -d '{
    "name": "ten-minute-audio.mp3",
    "type": "audio"
  }'
# Upload the file (replace {audio_asset_id} with the ID of the asset record created above)
curl -X POST https://api.hedra.com/web-app/public/assets/{audio_asset_id}/upload \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -F "file=@/path/to/ten-minute-audio.mp3"
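The upload URL needs the ID of the asset record created by the first call. The following is a minimal sketch of capturing that ID in a shell variable. It assumes the create response is a JSON object with a top-level string field named `id`; that field name is an assumption, so check it against a real response. The helper names `json_field` and `create_asset` are hypothetical.

```shell
# Pull one string field out of a flat JSON object read from stdin.
# Sketch only: sed is not a JSON parser; a tool such as jq is more robust.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Create an asset record and print the raw response body.
# $1 = file name, $2 = asset type ("audio" or "image")
create_asset() {
  curl -s -X POST https://api.hedra.com/web-app/public/assets \
    -H "Content-Type: application/json" \
    -H "X-API-Key: $HEDRA_API_KEY" \
    -d "{\"name\": \"$1\", \"type\": \"$2\"}"
}

# Usage (assumes the response carries the asset ID in an "id" field):
#   audio_asset_id=$(create_asset ten-minute-audio.mp3 audio | json_field id)
#   curl -X POST "https://api.hedra.com/web-app/public/assets/$audio_asset_id/upload" \
#     -H "X-API-Key: $HEDRA_API_KEY" \
#     -F "file=@/path/to/ten-minute-audio.mp3"
```

The same two helpers would work for the image asset by passing `image` as the type.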
Step 2: Upload your portrait image
# Create the asset record
curl -X POST https://api.hedra.com/web-app/public/assets \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -d '{
    "name": "portrait.png",
    "type": "image"
  }'
# Upload the file (replace {image_asset_id} with the ID of the asset record created above)
curl -X POST https://api.hedra.com/web-app/public/assets/{image_asset_id}/upload \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -F "file=@/path/to/portrait.png"
Step 3: Generate the avatar video
Use the Hedra Avatar model (26f0fc66-152b-40ab-abed-76c43df99bc8):
curl -X POST https://api.hedra.com/web-app/public/generations \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -d '{
    "type": "video",
    "ai_model_id": "26f0fc66-152b-40ab-abed-76c43df99bc8",
    "start_keyframe_id": "{image_asset_id}",
    "audio_id": "{audio_asset_id}",
    "generated_video_inputs": {
      "text_prompt": "A person speaking to the camera",
      "aspect_ratio": "9:16",
      "resolution": "720p",
      "duration_ms": 600000
    }
  }'
Set duration_ms to 600000 for 10 minutes (600 seconds = 600,000 ms). The video duration will match your audio length.
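The conversion is plain arithmetic, but it is easy to drop a zero when typing the value by hand. Here is a minimal sketch of a helper that derives `duration_ms` from an audio length in seconds. The name `seconds_to_ms` is hypothetical, and rounding to the nearest millisecond is a choice made here, not something the API is known to require.

```shell
# Convert a length in seconds (fractions allowed) to whole milliseconds.
# Hypothetical helper; rounds to the nearest millisecond.
seconds_to_ms() {
  awk -v s="$1" 'BEGIN { printf "%d\n", s * 1000 + 0.5 }'
}

seconds_to_ms 600   # prints 600000 (10 minutes)
seconds_to_ms 8     # prints 8000 (8 seconds)
```

The result can be substituted into the `duration_ms` field of the request body.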
Using Hedra Omnia
For motion avatar videos with full-body movement, swap the model ID to Hedra Omnia. Omnia videos run up to 8 seconds, so duration_ms is 8000 here:
curl -X POST https://api.hedra.com/web-app/public/generations \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -d '{
    "type": "video",
    "ai_model_id": "ab372b84-432f-44f5-bacc-c2542465f712",
    "start_keyframe_id": "{image_asset_id}",
    "audio_id": "{audio_asset_id}",
    "generated_video_inputs": {
      "text_prompt": "A person gesturing expressively while speaking",
      "aspect_ratio": "9:16",
      "resolution": "720p",
      "duration_ms": 8000
    }
  }'
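Because the two models have very different length limits, it may help to check a requested duration before submitting a generation. The sketch below encodes the limits from the model table (10 minutes for Hedra Avatar, 8 seconds for Hedra Omnia). The function names `max_duration_ms` and `check_duration` are hypothetical, and the check runs entirely on the client; it makes no claim about how the API itself responds to an over-long request.

```shell
# Print the maximum duration in milliseconds for a model ID.
# Limits are taken from the model table in this guide.
max_duration_ms() {
  case "$1" in
    26f0fc66-152b-40ab-abed-76c43df99bc8) echo 600000 ;;  # Hedra Avatar: 10 minutes
    ab372b84-432f-44f5-bacc-c2542465f712) echo 8000 ;;    # Hedra Omnia: 8 seconds
    *) echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Succeed only if the requested duration fits the model's limit.
# $1 = model ID, $2 = requested duration in milliseconds
check_duration() {
  limit=$(max_duration_ms "$1") || return 1
  [ "$2" -le "$limit" ]
}

# Usage:
#   check_duration ab372b84-432f-44f5-bacc-c2542465f712 8000 || echo "too long for this model"
```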
Inline audio generation
Instead of uploading audio separately, you can generate speech inline by passing audio_generation instead of audio_id.
To find a voice_id, list the available voices:
curl https://api.hedra.com/web-app/public/voices \
  -H "X-API-Key: $HEDRA_API_KEY"
See the Generate Audio guide for the full list of voice options, including voice cloning. Then pass the chosen voice_id in the audio_generation object:
curl -X POST https://api.hedra.com/web-app/public/generations \
  -H "Content-Type: application/json" \
  -H "X-API-Key: $HEDRA_API_KEY" \
  -d '{
    "type": "video",
    "ai_model_id": "26f0fc66-152b-40ab-abed-76c43df99bc8",
    "start_keyframe_id": "{image_asset_id}",
    "audio_generation": {
      "type": "text_to_speech",
      "voice_id": "f412c62f-e94f-41c0-bfc6-97f63289941c",
      "text": "Hello, this is a demo of inline audio generation for avatar videos."
    },
    "generated_video_inputs": {
      "text_prompt": "A person speaking to the camera",
      "aspect_ratio": "9:16",
      "resolution": "540p"
    }
  }'
Step 4: Poll for completion
Long-form avatar videos take longer to generate. The status response includes progress (a value from 0 to 1) and eta_sec, which you can use to estimate the remaining time:
curl https://api.hedra.com/web-app/public/generations/{generation_id}/status \
  -H "X-API-Key: $HEDRA_API_KEY"
When status is "complete", the response includes an asset_id for the generated video.
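Polling by hand gets tedious for long videos, so the status check can be wrapped in a loop. The sketch below repeats the status request until `status` is `"complete"` and then prints the `asset_id`. Several details are assumptions rather than documented behavior: the failure status is assumed to be named `"error"`, the response is assumed to be a flat JSON object, and `POLL_INTERVAL` and `MAX_POLLS` are hypothetical shell variables for this script, not API parameters. The function names are hypothetical as well.

```shell
# Pull one string field out of a flat JSON object read from stdin.
# Sketch only: sed is not a JSON parser; a tool such as jq is more robust.
json_field() {
  sed -n 's/.*"'"$1"'"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Fetch the raw status response for one generation ($1 = generation ID).
fetch_status() {
  curl -s "https://api.hedra.com/web-app/public/generations/$1/status" \
    -H "X-API-Key: $HEDRA_API_KEY"
}

# Poll until the generation completes, then print the video asset_id.
# Gives up after MAX_POLLS attempts (default 720), sleeping
# POLL_INTERVAL seconds (default 5) between attempts.
wait_for_generation() {
  polls=0
  while [ "$polls" -lt "${MAX_POLLS:-720}" ]; do
    body=$(fetch_status "$1") || return 1
    status=$(printf '%s' "$body" | json_field status)
    case "$status" in
      complete)
        printf '%s' "$body" | json_field asset_id
        return 0 ;;
      error)  # assumed name of the failure status
        echo "generation $1 failed" >&2
        return 1 ;;
    esac
    polls=$((polls + 1))
    sleep "${POLL_INTERVAL:-5}"
  done
  echo "timed out waiting for generation $1" >&2
  return 1
}

# Usage:
#   video_asset_id=$(wait_for_generation "$generation_id")
```

A fixed interval keeps the sketch short; a script for long videos could instead sleep for a fraction of the reported `eta_sec`.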