# Audio to Video

Audio-to-Video workflows turn the output of an audio node into an input for a video node, so the sound you've generated on the canvas (a voiceover, a line of dialogue, a sound effect, a music cue) drives the visuals you produce next. This is how speech becomes a lipsynced performance, how a still image becomes a talking avatar, and how generated sound lands on the same canvas as the video it belongs to, without ever leaving FLORA.

The shape of the workflow is always the same. You generate or upload audio in an audio node, connect that audio node's output to a video node, connect a second input (either a video of a speaker for lipsync or a reference image for avatars), and pick a model that accepts audio as an input. The video node treats the audio as a conditioning signal: it matches mouth movement to speech, aligns motion to rhythm, or syncs timing to a beat. It then renders a clip that already has the sound baked in.

<figure><img src="/files/7I0b44fiwy0ayVj4KviH" alt=""><figcaption></figcaption></figure>

### Parameters

<table><thead><tr><th width="221">Parameter</th><th>Type</th><th>Effect on Output</th></tr></thead><tbody><tr><td>Audio</td><td>Audio</td><td>The audio output from an audio node. Connect a generated voiceover, SFX, music track, or uploaded clip — this is the track that drives the video.</td></tr><tr><td>Video</td><td>Video</td><td>The visual source the audio is applied to. Connect a video node for lipsync workflows, or an image node for talking-avatar generations.</td></tr><tr><td>Prompt</td><td>Text</td><td>Optional text guidance passed to models that accept it alongside audio and visual inputs. Useful for steering delivery, style, or camera behavior on models that expose those controls.</td></tr><tr><td>Duration</td><td>Derived</td><td>Most audio-to-video models produce a clip whose length matches the input audio. Trim the audio track before generating if you want a shorter output.</td></tr></tbody></table>
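The Duration row suggests trimming the audio before generating when you want a shorter clip. If the track lives outside FLORA, one way to do that is sketched below. It assumes the file is an uncompressed PCM WAV and uses only Python's standard `wave` module. The helper name `trim_wav` and the file paths are illustrative and not part of FLORA.

```python
import wave


def trim_wav(src_path: str, dst_path: str, max_seconds: float) -> float:
    """Copy at most max_seconds of audio from src_path to dst_path.

    Assumes an uncompressed PCM WAV file. Returns the duration, in
    seconds, of the file that was written.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        # Never ask for more frames than the source actually contains.
        keep = min(src.getnframes(), int(max_seconds * rate))
        frames = src.readframes(keep)

    with wave.open(dst_path, "wb") as dst:
        # Reuse channel count, sample width, and sample rate from the source;
        # the frame count in the header is corrected when the file is closed.
        dst.setparams(params)
        dst.writeframes(frames)

    return keep / rate
```

For example, `trim_wav("voiceover.wav", "voiceover_8s.wav", 8.0)` would write a copy capped at eight seconds, which you could then upload to an audio node. Compressed formats such as MP3 would need a different tool.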

### Audio to Video

The Audio to Video workflow takes an audio track and generates a synced video from it. Connect an audio node to a video node and switch to an audio-aware model. The model then reads the audio as part of its conditioning, so the delivery, timing, and motion in the output all follow the track you provided.

You can interact with it in a number of ways:

* Connect any audio node output to the video node to populate the audio input
* Optionally connect an image node as a subject or scene reference
* Use the prompt to describe the setting, camera, and style the model should render
* Switch the model to WAN 2.6 or WAN 2.7 for audio-driven generation
* Adjust aspect ratio and resolution to match your delivery format

### How to use

Prompting matters for Audio to Video too: your audio carries dialogue and timing, but the prompt carries the scene.
