Welcome to Build Long-Context AI Apps with Jamba, built in partnership with AI21 Labs. The transformer architecture is the foundation of most large language models, but they're not efficient at processing long context lengths. There's an alternative to transformers called Mamba that could process very long input contexts by reading an arbitrary long context and compressing it to a fixed-size representation. A lot of people have been excited about Mamba as a possible alternative or successor to the transformer, but researchers found that the pure Mamba architecture underperforms when the context is very long, since a compression mechanism causes it to lose information. AI21 developed a novel Jamba model that combines a traditional transformer with Mamba to take advantage of Mamba's efficiency, and also the transformer's attention mechanism to how retrieve the right piece of information at the right time. In detail, one of the strengths of the transformer is that it compares every pair of input tokens to see how related to each other. This is the attention mechanism that the decider would say when processing one word, what other words in a sentence they should be paying attention to. But this is quadratic cost in the input length. And while there are various techniques to bring down this cost, it has still been expensive because transformers can handle very long input context length. Then the paper "Mamba: Linear-Time Sequence Modeling with Selective State Spaces" by Albert Gu and Tri Dao describe a modern state space model to perform as well and can be efficiently implemented on GPUs. But Mamba was not as good as a transformer at long-distance relationships and in-context learning. Then the Jamba model, which is a hybrid transformer Mamba architecture, tries to capture the best of both worlds. I'm delighted the instructors for this course are Chen Wang, who is a lead Solution architect and Chen Almagor, who is an algorithm tech Lead from AI21 Labs. Both experts in Mamba and Jamba and Transformers. We're excited to be here and teach a course on Jamba. In the course, we will explore the underlying structure of the Jamba model, including transformers, but focusing on the less well-known state based aspects of the model and understand why this hybrid architecture is beneficial. We will also learn about strategies for expanding the context length of language models, review key aspects in evaluating long context models, and understand the advantages of Jamba in these settings. You will also get hands-on experience using Jamba with labs on prompting processing documents, tool calling, and RAG applications, all with a focus on the advantage of long context window of the Jamba model to improve performance. Many people have worked to create this course. I'd like to thank Ian Cox from AI21 labs and from DeepLearning.AI, Esmaeil Gargari and and Geoff Ladwig. The first lesson will be an overview of the Jamba model. That sounds great. Let's go on to the next video and learn about Jamba.

Please sign in to view this content

Next Lesson

Build Long-Context AI Apps with Jamba

Introduction
Video
・
3 mins

Overview
Video
・
5 mins

Transformer-Mamba Hybrid LLM Architecture
Video
・
14 mins

Jamba Prompting and Documents
Video with Code Example
・
8 mins

Tool Calling
Video with Code Example
・
6 mins

Expand the Context Window Size
Video
・
13 mins

Long Context Prompting
Video with Code Example
・
3 mins

Conversational RAG
Video with Code Example
・
8 mins

Conclusion
Video
・

Appendix – Tips and Help
Code Example
・

Course Feedback

Community