Show HN: Audio-to-Video with LTX-2
runshouse Monday, March 02, 2026
LTX-2 is an open-source diffusion model that combines video and audio.
Visually it's not at the level of Seedance 2.0, Veo 3.1, or Sora 2, but it’s open-weights, so anyone can play with it.
I wanted to see how good it is at generating video from just audio.
Off the shelf, it's not very good. But I found that if you first run the audio through Gemini to generate a text prompt, then feed that prompt into LTX-2 along with the audio, the output matches the audio much more often.
Foley sounds work particularly well, and one fun use case is uploading audio of yourself to see what AI thinks you look like.
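For anyone who wants to try the same idea, here is a rough sketch of the two-step flow described above: caption the audio first, then condition the video model on both the caption and the original audio. The post doesn't include the actual code, so everything here is an assumption: `describe_audio` and `generate_video` are hypothetical callables you would wire up to Gemini and LTX-2 yourself, and `CAPTION_INSTRUCTION` is a made-up instruction, not the one actually used.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical instruction for the captioning model (not from the original post).
CAPTION_INSTRUCTION = (
    "Listen to this audio and describe, in one paragraph, a video scene that "
    "would plausibly produce it: who or what is visible, the setting, the action."
)


@dataclass
class VideoRequest:
    """What gets handed to the video model: a text prompt plus the source audio."""
    prompt: str
    audio_path: str


def build_request(
    audio_path: str,
    describe_audio: Callable[[str, str], str],
) -> VideoRequest:
    # Step 1: ask the captioning model (e.g. Gemini) for a scene description.
    prompt = describe_audio(audio_path, CAPTION_INSTRUCTION).strip()
    if not prompt:
        raise ValueError("captioning model returned an empty prompt")
    # Step 2: pair that prompt with the same audio file.
    return VideoRequest(prompt=prompt, audio_path=audio_path)


def audio_to_video(
    audio_path: str,
    describe_audio: Callable[[str, str], str],
    generate_video: Callable[[str, str], str],
) -> str:
    """Run both steps and return whatever generate_video returns (e.g. an output path)."""
    request = build_request(audio_path, describe_audio)
    # The video model receives the prompt in addition to the audio, not instead of it.
    return generate_video(request.prompt, request.audio_path)
```

Passing the two model calls in as functions keeps the sketch independent of any particular SDK, and makes it easy to swap in a different captioning model to compare results.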
Limitations:
- Doesn't know real people, so a famous person's voice just gets a generic person
- Sometimes gets gender wrong if the voice is more androgynous
- In dialogue with similar voices, it can render the same person saying both lines