Killing the Awkward Silence: Architecting an Ultra-Low-Latency Voice AI

Hack Session

About the session

We’ve all talked to voice bots that feel like talking to a walkie-talkie: awkward pauses, annoying interruptions, and robotic responses. Building a Voice AI that actually feels human isn’t just about plugging an LLM into a Text-to-Speech API. It’s an exercise in extreme asynchronous engineering.

In this live, code-heavy hack session, we are going to build an enterprise-grade, multilingual Voice Agent from the ground up. We won’t just look at slides; we will write code, execute it, watch it fail, and then engineer our way out of the problem.

Starting from a naive, highly flawed voice loop, we will progressively add architectural layers that solve real-world problems such as acoustic echo, network lag, race conditions, and language code-switching, until we achieve a guaranteed sub-second response time.

Speaker
