Show HN: Audio-to-Video with LTX-2

LTX-2 is an open-source diffusion model that generates video and audio together.

Visually, it's not at the level of Seedance 2.0, Veo 3.1, or Sora 2, but it's open-weights, so anyone can play with it.

I wanted to see how good it is at generating video from just audio.

Off the shelf, it's not very good. But I found that if you first run the audio through Gemini to generate a text prompt, then feed that prompt into LTX-2 along with the original audio, the output matches the audio much more often.
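For anyone who wants to try the same two-stage idea, here is a rough Python sketch of the flow. It is not the actual implementation: `describe_audio` and `generate_video` are hypothetical callables you would supply yourself (one wrapping whatever Gemini client you use, the other wrapping your LTX-2 inference code), and the `Result` type is just for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-ins, NOT real Gemini or LTX-2 APIs:
#   describe_audio: audio bytes -> text prompt describing the scene
#   generate_video: (prompt, audio bytes) -> encoded video bytes
DescribeAudio = Callable[[bytes], str]
GenerateVideo = Callable[[str, bytes], bytes]


@dataclass
class Result:
    prompt: str   # the prompt derived from the audio
    video: bytes  # whatever the video backend returned


def audio_to_video(
    audio: bytes,
    describe_audio: DescribeAudio,
    generate_video: GenerateVideo,
) -> Result:
    """Caption the audio first, then condition the video model on both."""
    # Stage 1: turn the audio into a text prompt.
    prompt = describe_audio(audio).strip()
    if not prompt:
        # Assumption: an empty caption is treated as a failure rather than
        # silently falling back to audio-only generation.
        raise ValueError("audio description came back empty")

    # Stage 2: pass the prompt AND the original audio to the video model.
    video = generate_video(prompt, audio)
    return Result(prompt=prompt, video=video)
```

Passing the two stages in as plain callables keeps the sketch independent of any particular SDK, and makes it easy to swap in a different captioning model to compare results.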

Foley sounds work particularly well, and one fun use case is uploading audio of yourself to see what AI thinks you look like.

Limitations:

- Doesn't know real people, so a famous person's voice just gets a generic person

- Sometimes gets gender wrong if the voice is more androgynous

- In dialogue with similar voices, it can render the same person saying both lines

Summary

Magichour.ai offers an audio-to-video conversion service that allows users to create professional-looking videos from their audio files, with features like automatic video generation, text-to-speech, and customizable templates.
