Instructors: Chen Wang, Chen Almagor
Learn how the Jamba model integrates transformers and the Mamba architecture to efficiently process long contexts while maintaining quality.
Understand the training process for long context models, and the metrics used to evaluate their performance.
Gain hands-on experience applying Jamba to tasks such as processing large documents, tool-calling, and building large context RAG apps.
Learn to use the Jamba model, a hybrid transformer-Mamba architecture trained to handle long contexts, in this new course, Build Long-Context AI Apps with Jamba, built in partnership with AI21 Labs and taught by Chen Wang and Chen Almagor.
The transformer architecture is the foundation of most large language models, but it is computationally expensive when handling very long input contexts.
There's an alternative to transformers called Mamba. Mamba is a selective state space model that can process very long contexts at a much lower computational cost. However, researchers found that a pure Mamba architecture underperforms at understanding the context, and output quality can suffer even on tasks as simple as repeating the input back in the model's output.
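To see why the cost difference matters, here is a rough, back-of-the-envelope illustration (not a benchmark of either model): self-attention compares every token with every other token, so its work grows roughly quadratically with context length, while a state space layer makes a single pass over the sequence, growing roughly linearly.

```python
# Rough illustration only: counts of "operations" as context length grows,
# assuming ~O(n^2) for self-attention and ~O(n) for a Mamba-style scan.
for n in (1_000, 10_000, 100_000):
    attention_ops = n * n   # every token attends to every other token
    mamba_ops = n           # one recurrent/scan update per token
    print(f"context={n:>7,}  attention~{attention_ops:>15,}  mamba~{mamba_ops:>7,}")
```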
To overcome these challenges, AI21 Labs developed the Jamba model, which combines Mamba's computational efficiency with the transformer's attention mechanism to improve output quality.
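As a mental model of the hybrid design, you can picture a stack of layers in which occasional attention layers are interleaved with Mamba layers. The sketch below is only an illustration, not AI21's implementation, and the one-attention-layer-per-eight ratio is an assumption used for demonstration.

```python
# Illustrative sketch of the hybrid idea, not AI21's actual code. The ratio of
# one attention layer per eight layers is an assumption used for demonstration.
def jamba_style_layer_plan(num_layers: int = 32, attention_every: int = 8) -> list[str]:
    """Label each layer 'attention' or 'mamba' in an interleaved pattern."""
    return [
        "attention" if i % attention_every == attention_every - 1 else "mamba"
        for i in range(num_layers)
    ]

print(jamba_style_layer_plan(16))  # mostly 'mamba' layers with periodic 'attention' layers
```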
In this course, you'll learn about the Jamba architecture, how it works, and how it is trained. You'll also learn how to prompt Jamba and use it to process long documents and build long-context RAG apps.
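For example, prompting Jamba over a long document might look roughly like the sketch below. It assumes the AI21 Python SDK's chat completions interface, an AI21_API_KEY environment variable, and placeholder model and file names; check the course and AI21's documentation for the exact usage.

```python
# Minimal sketch, not the course's exact code. Assumes `pip install ai21`,
# an AI21_API_KEY environment variable, and a hypothetical local text file.
import os
from ai21 import AI21Client
from ai21.models.chat import ChatMessage

client = AI21Client(api_key=os.environ["AI21_API_KEY"])

with open("annual_report.txt") as f:   # placeholder long document
    long_document = f.read()

response = client.chat.completions.create(
    model="jamba-1.5-mini",            # assumed Jamba model identifier
    messages=[
        ChatMessage(
            role="user",
            content=f"Summarize the key risks mentioned in this report:\n\n{long_document}",
        )
    ],
)
print(response.choices[0].message.content)
```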
In detail, you'll explore the transformer-Mamba hybrid architecture, prompting Jamba with long documents, tool calling, expanding the context window, long-context prompting, and conversational RAG.
Start building AI apps that can handle context as long as all of your unread emails from the last 20 years!
Anyone who has basic Python knowledge and wants to learn more about how the Jamba model works and how it is used for building long-context AI apps.
Introduction
Overview
Transformer-Mamba Hybrid LLM Architecture
Jamba Prompting and Documents
Tool Calling
Expand the Context Window Size
Long Context Prompting
Conversational RAG
Conclusion
Quiz
Graded Quiz · 10 mins
Course access is free for a limited time during the DeepLearning.AI learning platform beta!