Show HN: FaceTime-style calls with an AI Companion (Live2D and long-term memory)
summerlee9611, Sunday, January 25, 2026

Hi HN, I built Beni (https://thebeni.ai), a web app for real-time video calls with an AI companion.
The idea started from a pretty simple observation: text chatbots are everywhere, but they rarely feel present. I wanted something closer to a call, where the character actually reacts in real time (voice, timing, expressions), not just “type, wait, reply”.
Beni is basically:
- A Live2D avatar that animates during the call (expressions + motion driven by the conversation)
- Real-time voice conversation (streaming response, not “wait 10 seconds then speak”)
- Long-term memory so the character can keep context across sessions
The hardest part wasn’t generating text; it was making the whole loop feel synchronized. Mic input, model response, TTS audio, and Live2D animation all have to line up, or it feels broken immediately. I ended up spending more time on state management, latency, and buffering than on prompts.
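To make that concrete, here’s a minimal sketch (names are illustrative, not the actual production code) of the kind of explicit state machine that keeps all four pieces keyed off one source of truth:

```typescript
// Simplified call-loop state machine. Every stage transition is explicit,
// so TTS playback and avatar animation can react to the same state instead
// of drifting apart.
type CallState = "idle" | "listening" | "thinking" | "speaking";

interface CallLoop {
  state: CallState;
  audioBuffer: Float32Array[]; // queued TTS chunks awaiting playback
}

function transition(loop: CallLoop, event: string): CallLoop {
  switch (`${loop.state}:${event}`) {
    case "idle:mic_open":            return { ...loop, state: "listening" };
    case "listening:utterance_end":  return { ...loop, state: "thinking" };
    case "thinking:first_tts_chunk": return { ...loop, state: "speaking" };
    case "speaking:playback_done":   return { ...loop, state: "listening" };
    default: return loop; // ignore events that don't apply in this state
  }
}
```

The useful property is that out-of-order events (a stray TTS chunk arriving after the user interrupts, say) get dropped instead of corrupting the UI.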
Some implementation details (happy to share more if anyone’s curious):
- Browser-based real-time calling, with audio streaming and client-side playback control
- Live2D rendering on the front end, with animation hooks tied to speech / state
- A memory layer that stores lightweight user facts/preferences and conversation summaries to keep continuity
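The memory layer is roughly shaped like this (a simplified sketch with made-up field names, not the production schema):

```typescript
// Structured user facts plus per-session summaries, flattened into a
// context string for the next call.
interface MemoryStore {
  facts: Map<string, string>; // e.g. "timezone" -> "UTC+9"
  summaries: string[];        // one compact summary per past session
}

function remember(store: MemoryStore, key: string, value: string): void {
  store.facts.set(key, value); // upsert: a newer fact overwrites an older one
}

function contextFor(store: MemoryStore, maxSummaries = 3): string {
  const factLines: string[] = [];
  store.facts.forEach((v, k) => factLines.push(`${k}: ${v}`));
  const recent = store.summaries.slice(-maxSummaries).join("\n");
  return `Known about user:\n${factLines.join("\n")}\n\nRecent sessions:\n${recent}`;
}
```

The idea is that continuity comes from tiny key-value facts and short summaries rather than replaying raw transcripts, which keeps the per-call context small.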
Current limitation: sign-in is required today (to persist memory and prevent abuse). I’m adding a guest mode soon so it’s faster to try out, and a mobile view is in progress.
What I’d love feedback on:
- Does the “real-time call” loop feel responsive enough, or still too laggy?
- Any ideas for better lip sync / expression timing on 2D/3D avatars in the browser?
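For reference on the lip-sync question, a common browser baseline (not necessarily what ships in Beni) is amplitude-driven: take per-frame RMS loudness of the TTS audio, smooth it with fast attack / slow release, and map the result onto the avatar’s mouth-open parameter (ParamMouthOpenY in Live2D Cubism):

```typescript
// RMS loudness of one audio frame.
function rms(samples: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < samples.length; i++) sum += samples[i] * samples[i];
  return Math.sqrt(sum / samples.length);
}

// prev is the previous mouth value in [0, 1]; samples is one audio frame
// (e.g. from an AnalyserNode's getFloatTimeDomainData call).
function mouthOpen(prev: number, samples: Float32Array): number {
  const target = Math.min(1, rms(samples) * 4); // scale speech RMS into [0, 1]
  const k = target > prev ? 0.5 : 0.1;          // open fast, close slowly
  return prev + (target - prev) * k;
}
```

The slow release keeps the mouth from flickering shut between syllables; phoneme-aware viseme mapping would look better but needs timing data from the TTS engine.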
Thanks, and I’ll be around in the comments.