Gain an understanding of the key components of transformers, including tokenization, embeddings, self-attention, and transformer blocks, to build a strong technical foundation.
Instructors: Jay Alammar, Maarten Grootendorst
Co-authors of "Hands-On Large Language Models"
Understand recent transformer improvements to the attention mechanism such as KV cache, multi-query attention, grouped query attention, and sparse attention.
Compare tokenization strategies used in modern LLMs and explore transformer models in the Hugging Face Transformers library.
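The learning outcome above mentions comparing tokenization strategies; here is a minimal sketch of what that comparison can look like with the Hugging Face Transformers library. The checkpoints "bert-base-uncased" (WordPiece) and "gpt2" (byte-level BPE) are illustrative choices, not necessarily the ones used in the course.

```python
# Minimal sketch: compare how two tokenizers split the same text.
# The model names below are example checkpoints, not course-specific choices.
from transformers import AutoTokenizer

text = "Transformers power today's large language models."

for name in ["bert-base-uncased", "gpt2"]:
    tokenizer = AutoTokenizer.from_pretrained(name)
    tokens = tokenizer.tokenize(text)
    print(f"{name}: {len(tokens)} tokens -> {tokens}")
```

Running this shows that different tokenizers produce different numbers and kinds of tokens for the same sentence, which is exactly the sort of difference the course examines.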
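The outcome on attention improvements mentions the KV cache, multi-query attention, and grouped-query attention. As a rough back-of-the-envelope sketch, with assumed dimensions rather than values from any particular model in the course, here is why reducing the number of key/value heads shrinks the per-token KV cache:

```python
# Rough sketch (assumed dimensions): fewer key/value heads means fewer cached
# vectors per generated token, which is the motivation behind MQA and GQA.
d_model = 4096              # hidden size (illustrative assumption)
n_heads = 32                # number of query heads (illustrative assumption)
head_dim = d_model // n_heads

def kv_cache_floats_per_token(n_kv_heads: int) -> int:
    # one key vector and one value vector per KV head, per token, per layer
    return 2 * n_kv_heads * head_dim

for label, n_kv_heads in [("multi-head attention", n_heads),
                          ("grouped-query attention (8 KV heads)", 8),
                          ("multi-query attention", 1)]:
    print(f"{label:38s}: {kv_cache_floats_per_token(n_kv_heads)} floats per token per layer")
```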
Introducing "How Transformer LLMs Work," created with Jay Alammar and Maarten Grootendorst, authors of the "Hands-On Large Language Models" book. This course offers a deep dive into the main components of the transformer architecture that powers large language models (LLMs).
The transformer architecture revolutionized generative AI. In fact, the "GPT" in ChatGPT stands for "Generative Pre-trained Transformer."
Originally introduced in the groundbreaking 2017 paper "Attention Is All You Need" by Ashish Vaswani and others, the transformer began as a highly scalable model for machine translation tasks. Variants of this architecture now power today's LLMs, such as those from OpenAI, Google, Meta, Cohere, and Anthropic.
In their book, Jay and Maarten beautifully illustrated the underlying architecture of LLMs through insightful and easy-to-understand explanations.
In this course, you'll learn how the transformer architecture that powers LLMs works. You'll build intuition for how LLMs process text and work with code examples that illustrate the key components of the transformer architecture.
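To give a taste of those code examples, here is a minimal NumPy sketch of single-head scaled dot-product self-attention, the operation at the heart of the transformer block. The toy dimensions and random weights are assumptions for illustration only, not code from the course.

```python
# Minimal sketch: single-head scaled dot-product self-attention in plain NumPy.
# Sizes and the random "embeddings" are toy placeholders.
import numpy as np

rng = np.random.default_rng(0)
seq_len, d_model = 4, 8                          # 4 tokens, 8-dimensional embeddings

X = rng.normal(size=(seq_len, d_model))          # token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))

Q, K, V = X @ W_q, X @ W_k, X @ W_v              # queries, keys, values
scores = Q @ K.T / np.sqrt(d_model)              # scale by sqrt of the key dimension
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)   # softmax over each row
output = weights @ V                             # context-aware token representations

print(output.shape)                              # (4, 8): one updated vector per token
```

Each output row mixes information from all tokens, weighted by how relevant they are to the token in question, which is the intuition the course builds up step by step.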
Key topics covered in this course include tokenization, embeddings, self-attention, the transformer block, and recent improvements such as the KV cache, multi-query and grouped-query attention, sparse attention, and mixture of experts (MoE).
By the end of this course, you'll have a deep understanding of how LLMs process language, and you'll be able to read papers describing new models and understand the architectural details they discuss. This intuition will help improve your approach to building LLM applications.
Anyone interested in understanding the inner workings of the transformer architectures that power today's LLMs.
Introduction
Understanding Language Models: Language as a Bag-of-Words
Understanding Language Models: (Word) Embeddings
Understanding Language Models: Encoding and Decoding Context with Attention
Understanding Language Models: Transformers
Tokenizers
Architectural Overview
The Transformer Block
Self-Attention
Model Example
Recent Improvements
Mixture of Experts (MoE)
Conclusion
Director and Engineering Fellow at Cohere and co-author of Hands-On Large Language Models
 Senior Clinical Data Scientist at Netherlands Comprehensive Cancer Organization and co-author of Hands-On Large Language Models
Course access is free for a limited time during the DeepLearning.AI learning platform beta!