Professional Certificate ・ Intermediate ・ 6 hours 10 mins

Fine-tuning and Reinforcement Learning for LLMs: Intro to Post-Training

Instructor: Sharon Zhou

43 Video Lessons
AMD

Turn pretrained LLMs into production-ready models through post-training

Align a pretrained model for real tasks: use SFT and RLHF to improve instruction following, reasoning, and safer behavior.
Use evaluation to guide improvements: build evals that reveal problems, choose data and rewards accordingly, and iterate.
Get models ready for production with cost in mind: plan promotion and serving, monitor reliably, and account for compute and budget.

Why Enroll

Large language models are powerful, but raw pretrained models aren’t ready for production applications. Post-training is what adapts an LLM to follow instructions, show reasoning, and behave more safely.

Many developers still assume that “LLMs inherently hallucinate” or that “only experts can tune models.” Recent advances have changed what’s feasible. If you ship LLM features (e.g., developer copilots, customer support agents, internal assistants) or work on ML/AI platform teams, understanding post-training is becoming a must-have skill.

This course consists of 5 modules and is taught by Sharon Zhou (VP of AI at AMD and instructor of popular DeepLearning.AI courses). It will guide you through the following aspects of post-training:

Post-training in the LLM lifecycle: Learn where post-training fits, key ideas in fine-tuning and RL, how models gain reasoning, and how these methods power products.
Core techniques: Understand fine-tuning, RLHF, reward modeling, and RL algorithms (PPO, GRPO). Use LoRA for efficient fine-tuning.
Evaluation and error analysis: Design evals, detect reward hacking, diagnose failures, and red team to test model robustness.
Data for post-training: Prepare fine-tuning/LoRA datasets, combine fine-tuning + RLHF, create synthetic data, and balance data and rewards.
From post-training to production: Learn industry-leading production pipelines, set go/no-go rules, and run data feedback loops from your logs.

In partnership with AMD

We built this course with AMD to bring post-training practices used in leading labs to working engineers. You’ll get hands-on labs powered by AMD GPUs, while the methods you learn remain hardware-agnostic.

Who should join?

This course is designed for developers, ML engineers, software engineers, data scientists, and students who want to apply post-training techniques to production LLM systems. It’s also valuable for product managers and technical leaders who need to make informed decisions about post-training strategies and lead cross-functional teams working on LLM products.

To make the most of this course, we recommend strong familiarity with Python and a basic understanding of how LLMs work.

Course Outline

Fine-tuning & RL for LLMs: Intro to Post-training

Module 1: Post-Training Overview

Conversation between Sharon Zhou and Andrew Ng ・ Video ・ 10 mins
Background ・ Video ・ 5 mins
Where post-training (fine-tuning and RL) fits into LLM training ・ Video ・ 6 mins
Intuitions behind fine-tuning and RL ・ Video ・ 4 mins
Key components to making fine-tuning and RL work ・ Video ・ 10 mins
Post-training example: Reasoning ・ Video ・ 5 mins
Post-training example: Safety and security (RLAIF) ・ Video ・ 4 mins
Post-training in the wild ・ Video ・ 4 mins
Graded・Quiz
Graded・Code Assignment ・ 1 hour
Join the DeepLearning.AI Forum to ask questions, get support, or share amazing ideas! ・ Reading ・ 5 mins
Module 1 Lecture Notes ・ Reading ・ 1 min

Module 2: Core techniques in Fine-Tuning and RL

Data: What you need and how to prepare it ・ Video ・ 7 mins
Data: Tokens for models to read/write Data (OPTIONAL) ・ Video ・ 10 mins
Fine-tuning math: Loss, gradients, weight updates (Part 1) ・ Video ・ 6 mins
Fine-tuning math: Loss, gradients, weight updates (Part 2) ・ Video ・ 5 mins
Fine-tuning: Hyperparameters & hyperparameter tuning (Part 1) ・ Video ・ 7 mins
Fine-tuning: Hyperparameters & hyperparameter tuning (Part 2) ・ Video ・ 4 mins
Module 2 Graded Lab: 1 ・ Graded・Code Assignment ・ 1 hour
Fine-tuning: Parameter efficient fine-tuning (PEFT) ・ Video ・ 9 mins
RL: Rewards and preference learning ・ Video ・ 9 mins
RL: Training objective and RLHF ・ Video ・ 8 mins
RL: PPO and GRPO Algorithms ・ Video ・ 8 mins
Module 2: Quiz ・ Graded・Quiz ・ 30 mins
Module 2 Graded Lab: 2 ・ Graded・Code Assignment ・ 1 hour
Module 2 Lecture Notes ・ Reading ・ 1 min

Module 3: Evaluation as the North Star

Why evals are the north star ・ Video ・ 2 mins
Evals for post-training: Test sets and metrics ・ Video ・ 8 mins
RL test environments and monitoring RL updates ・ Video ・ 5 mins
Reward hacking ・ Video ・ 4 mins
Error analysis: Why it matters ・ Video ・ 2 mins
Error analysis: Diagnosing errors & interventions ・ Video ・ 4 mins
Error analysis: errors → causes → fixes ・ Reading ・ 10 mins
How to invest in good evals ・ Video ・ 5 mins
Red Teaming: Real world failures ・ Video ・ 4 mins
Graded・Quiz
Graded・Code Assignment ・ 1 hour
Module 3 Lecture Notes ・ Reading ・ 1 min

Module 4: Data Driven Post-Training

How much data you need for post-training ・ Video ・ 7 mins
Data for fine-tuning ・ Video ・ 6 mins
Data for RL (Part 1) ・ Video ・ 7 mins
Data for RL (Part 2) ・ Video ・ 4 mins
Putting it together ・ Video ・ 2 mins
Synthetic data pipelines ・ Video ・ 7 mins
Template engineering ・ Video ・ 4 mins
Constitutional AI, revisited ・ Video ・ 5 mins
Balancing data and rewards ・ Video ・ 5 mins
Graded・Quiz
Graded・Code Assignment ・ 1 hour
Module 4 Lecture Notes ・ Reading ・ 1 min

Module 5: Production Considerations

A production post-training pipeline ・ Video ・ 7 mins
Agents ・ Video ・ 9 mins
RL promotion rules (go/no-go) ・ Video ・ 6 mins
Data-feedback flywheel ・ Video ・ 5 mins
Monitoring and observability ・ Video ・ 4 mins
Infrastructure (Part 1) ・ Video ・ 4 mins
Infrastructure (Part 2) ・ Video ・ 7 mins
Production-ready checklist ・ Video ・ 3 mins
Graded・Quiz
Graded・Code Assignment ・ 1 hour
Acknowledgments ・ Reading ・ 1 min
Module 5 Lecture Notes ・ Reading ・ 1 min

Instructor

Sharon Zhou

Co-Founder and CEO of Lamini

What Learners From Previous Courses Say About DeepLearning.AI

Jan Zawadzki

“Within a few minutes and a couple slides, I had the feeling that I could learn any concept. I felt like a superhero after this course. I didn’t know much about deep learning before, but I felt like I gained a strong foothold afterward.”

Kritika Jalan

“The whole specialization was like a one-stop-shop for me to decode neural networks and understand the math and logic behind every variation of it. I can say neural networks are less of a black box for a lot of us after taking the course.”

Chris Morrow – Deep Learning Specialization

“During my Amazon interview, I was able to describe, in detail, how a prediction model works, how to select the data, how to train the model, and the use cases in which this model could add value to the customer.”

Frequently Asked Questions

I don’t have any programming experience. Can I take this course?

We recommend starting with a beginner course such as the Machine Learning Specialization.

I already have Python experience. Is this course for me?

Yes! This course is perfect for anyone with a background in Python ready to dive deeper into the post-training of large language models.

I have questions about my DeepLearning.AI Pro subscription. Whom can I ask?

Please send an email to [email protected] to receive assistance.

How much does the course cost?

A DeepLearning.AI Pro membership costs $25/month.

Will I receive a certificate at the end of the course?

Yes! You’ll earn a certificate upon completing the course, recognizing your skills in post-training large language models.
