DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human lives.

Quick Guide & Tips

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

You will then see all the notebook files for the lesson in the left sidebar, including any helper functions used in the notebook. See the following image for the steps above.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps shown in the previous section ("File" => "Open"), click on the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from Quality, choose the quality that works the best for your Internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

☰   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set out specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read or run notebooks or watch the material. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your practical projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum



Welcome to How Transformer LLMs Work. In this course, you learn about the main components of the transformer architecture that has transformed the field of language processing. I'm delighted that the instructors for this course are Jay Alammar and Maarten Grootendorst. In their book, Hands-On Large Language Models, Jay and Maarten beautifully illustrated the underlying architecture of LLMs and provided insightful explanations of transformers.

Thanks, Andrew. I'm so happy to be here today and to have the opportunity for Maarten and me to teach this course. We wrote our book to provide an easy-to-understand introduction to transformer-based LLMs, and this course allows us to present that information in person. Our hope is that as you leave this course, you will be able to read through papers describing models and understand the details used to describe these architectures. These intuitions will help you use LLMs better, too.

Let me add that it is a pleasure to work with you on this, Andrew. I've taken so many of your courses over the years, and I appreciate all the effort you've put into making machine learning and AI accessible to all. We're so happy to add this course to that effort.

Thank you, Maarten and Jay. It's so good to work with both of you. Let me introduce the main topic of this course: the transformer. The transformer architecture was first introduced in the 2017 paper "Attention Is All You Need" by Ashish Vaswani and others for machine translation tasks. The idea was to, say, input an English sentence and have the network output a German sentence. The same architecture turned out to be great at inputting, say, a prompt and outputting a response to that prompt, like a question and the answer to that question. And so this helped herald the early rise of large language models. The original transformer architecture consisted of two main parts: an encoder and a decoder.
Consider translating English into German: the encoder preprocesses the entire input English text to extract the context needed to perform the translation. Then the decoder uses the encoder's context to generate the German. The encoder and the decoder form the basis for the models used in many language models today. The encoder model provides rich, context-sensitive representations of the input text, and is the basis for BERT and most of the embedding models used in RAG applications. The decoder model performs text generation tasks, such as summarizing text, writing code, and answering questions, and is the basis for most popular LLMs, such as those from OpenAI, Anthropic, Cohere, and Meta.

Let's go over what you'll learn in this course. You first delve into recent developments in LLMs to see how a sequence of increasingly sophisticated building blocks led to the modern transformer. You then learn about tokenization, which consists of taking text and breaking it down into tokens that comprise words or word fragments, which can then be fed into the LLM. After that, you gain intuition about how the transformer network works, focusing on decoder-only models.

A generative model takes in a text prompt and generates text in response, one token at a time. Here's how the generation process works. The model starts by mapping each input token into an embedding vector that captures the meaning of that token. After that, the model passes these token embeddings through a stack of transformer blocks, where each block is a specific neural network architecture that is designed to learn flexibly from data and also scale well on GPUs. You learn how each block is made up of an attention layer and a feed-forward network. The model then uses the output vectors of the transformer blocks and passes them to the last component, the language modeling head, which generates the output token.
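The generation pipeline described above (token ids → embeddings → a transformer block of self-attention plus feed-forward → language modeling head → next-token logits) can be sketched in a few lines of NumPy. This is a toy, single-block, single-head illustration with random weights and made-up dimensions, not the real architecture: it omits layer normalization, positional embeddings, multi-head attention, and of course training, so the "prediction" is meaningless. The point is only the shape of the computation.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, d_model, d_ff, seq_len = 50, 16, 32, 4

# Embedding matrix: one d_model-dimensional vector per vocabulary entry.
W_embed = rng.normal(size=(vocab_size, d_model))

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    # Single-head causal self-attention with random projection matrices.
    W_q, W_k, W_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    scores = q @ k.T / np.sqrt(d_model)
    # Causal mask: each position may only attend to itself and earlier positions.
    mask = np.triu(np.ones_like(scores), k=1).astype(bool)
    scores[mask] = -1e9
    return softmax(scores) @ v

def feed_forward(x):
    # Two-layer ReLU MLP, applied independently at each position.
    W1 = rng.normal(size=(d_model, d_ff))
    W2 = rng.normal(size=(d_ff, d_model))
    return np.maximum(x @ W1, 0) @ W2

token_ids = np.array([3, 17, 42, 8])  # a 4-token "prompt" (arbitrary ids)
x = W_embed[token_ids]                # (seq_len, d_model) token embeddings
x = x + self_attention(x)             # one transformer block, residual-style
x = x + feed_forward(x)

# Language modeling head: project the last position back to vocabulary logits
# (here sharing weights with the embedding matrix, a common choice).
logits = x[-1] @ W_embed.T
next_token = int(np.argmax(logits))   # greedy pick of the next token id
print(next_token)
```

In a real model this loop repeats: the chosen token is appended to the input and the forward pass runs again to produce the token after it.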
I'd like to thank Geoff Ladwig and Hawraa Salami from DeepLearning.AI for helping with this course. By the way, I know that transformers might seem a little bit like magic to some people, and in fact, one common experience after you learn how transformers work is, I've heard some people go, "Oh, that's it?" I think part of the reason for that reaction is that the magic of LLMs actually comes from two parts: one, the transformer architecture, which is well worth learning; and two, all the incredibly rich data that the models learn from. So while the magic of LLMs comes not just from the transformer architecture but also from the data, having a solid understanding of what this architecture is doing will give you better intuition about why these models behave in certain ways, as well as how to use them. Let's get started with Maarten in the first lesson.
How Transformer LLMs Work
  • Introduction
    Video
    ・
    5 mins
  • Understanding Language Models: Language as a Bag-of-Words
    Video
    ・
    5 mins
  • Understanding Language Models: (Word) Embeddings
    Video
    ・
    5 mins
  • Understanding Language Models: Encoding and Decoding Context with Attention
    Video
    ・
    5 mins
  • Understanding Language Models: Transformers
    Video
    ・
    7 mins
  • Tokenizers
    Video with Code Example
    ・
    11 mins
  • Architectural Overview
    Video
    ・
    6 mins
  • The Transformer Block
    Video
    ・
    6 mins
  • Self-Attention
    Video
    ・
    10 mins
  • Model Example
    Video with Code Example
    ・
    9 mins
  • Recent Improvements
    Video
    ・
    10 mins
  • Mixture of Experts (MoE)
    Video
    ・
    9 mins
  • Conclusion
    Video
    ・
    1 min
  • Course Feedback
  • Community