Instructors: Russ d’Sa, Shayne Parmelee, Nedelina Teneva
Understand the core architecture of voice agents, including the trade-offs between modular pipelines and real-time APIs, and how components like STT, LLMs, and TTS work together.
Build and deploy a voice agent that handles speech input, generates LLM responses, and replies using custom voices while managing latency and user interruptions.
Measure and optimize latency across your voice pipeline, and apply strategies to make your agent feel more natural, responsive, and scalable in real-world settings.
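As a rough illustration of the modular pipeline and latency measurement described above, here is a minimal sketch in Python. It is not the course's code and does not use LiveKit's API: the three stage functions (`fake_stt`, `fake_llm`, `fake_tts`) and the `run_turn` helper are hypothetical stand-ins, assumed only to show how STT, LLM, and TTS stages chain together and how per-stage timing can be recorded.

```python
import time
from typing import Dict, Tuple


def fake_stt(audio: bytes) -> str:
    """Hypothetical speech-to-text stage: treats the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_llm(prompt: str) -> str:
    """Hypothetical LLM stage: returns a canned reply instead of calling a model."""
    return f"You said: {prompt}"


def fake_tts(text: str) -> bytes:
    """Hypothetical text-to-speech stage: treats UTF-8 bytes as synthesized audio."""
    return text.encode("utf-8")


def run_turn(audio_in: bytes) -> Tuple[bytes, Dict[str, float]]:
    """Run one conversational turn and record per-stage latency in seconds."""
    timings: Dict[str, float] = {}
    value = audio_in
    # Each stage consumes the previous stage's output: audio -> text -> reply -> audio.
    for name, stage in (("stt", fake_stt), ("llm", fake_llm), ("tts", fake_tts)):
        start = time.perf_counter()
        value = stage(value)
        timings[name] = time.perf_counter() - start
    # Total is the sum of the three stage latencies measured above.
    timings["total"] = sum(timings.values())
    return value, timings


if __name__ == "__main__":
    audio_out, timings = run_turn(b"hello")
    print(audio_out)
    for stage_name, seconds in timings.items():
        print(f"{stage_name}: {seconds * 1000:.3f} ms")
```

In a real agent each stub would be replaced by a call to an actual speech or language service, and the per-stage timings would show which component dominates the response delay.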
Join Building AI Voice Agents for Production, created in collaboration with LiveKit and RealAvatar, and taught by Russ d’Sa (Co-founder & CEO of LiveKit), Shayne Parmelee (Developer Advocate, LiveKit), and Nedelina Teneva (Head of AI at RealAvatar, an AI Fund portfolio company). The course also incorporates voice technology from ElevenLabs, a supporting contributor to the project.
Voice agents combine speech and reasoning capabilities to enable real-time, human-like conversations. They’re already being used to enhance learning, support customer service, and improve accessibility in healthcare and talk therapy.
In this course, you’ll learn how to build voice agents that listen, reason, and respond naturally. You’ll follow the architecture used to create Andrew Avatar, a collaborative project between DeepLearning.AI and RealAvatar that responds to users in Andrew Ng’s voice. You’ll build a voice agent from scratch and deploy it to the cloud, enabling support for many simultaneous users.
What you’ll learn:
By the end of this course, you’ll have learned the components of an AI voice agent pipeline, combined them into a system with low-latency communication, and deployed that system on cloud infrastructure so it scales to many users.
Start building your voice agent today with LiveKit.
Anyone who wants to build conversational voice applications using LLMs. You’ll get the most out of this course if you’re already familiar with basic Python and foundational AI workflows.
Introduction
Voice Agent Overview
End-to-end architecture - Part 1
End-to-end architecture - Part 2
Voice Agent Components
Optimizing Latency
Conclusion
Course access is free for a limited time during the DeepLearning.AI learning platform beta!