Welcome to Federated Fine-Tuning of LLMs with Private Data, built in partnership with Flower Labs. I'm delighted to introduce the instructor, Nicholas Lane, who is co-founder and Chief Scientific Officer of Flower Labs, as well as a professor at the University of Cambridge.

Thanks, Andrew. Wonderful to be here. Imagine you work for a multi-site healthcare provider and want to develop a large language model to answer health-related queries. The data needed to train the model is spread out across multiple sites, but privacy constraints prevent you from accessing all this data directly. In the previous Flower Labs course, "Intro to Federated Learning", you may have learned that federated learning allows you to distribute model training to the data, rather than bring the data to the model. In this course, you will learn how to apply that same concept to fine-tuning an LLM.

Now, LLMs often have billions of parameters. In federated learning, you distribute the model to the data and then exchange parameters with other sites at each iteration. This can involve exchanging a significant amount of data.

That's right, Andrew. In this course, you will learn two techniques that make the whole process much more efficient. First, rather than attempting to train the LLM from scratch, you will start by using a pre-trained model and fine-tune it with private data. There are now many very good open-source LLMs that can serve as a great starting point to speed everything up. Second, you will further refine the standard fine-tuning approach by using parameter-efficient fine-tuning, or PEFT. PEFT only needs to modify a small fraction of the LLM weights during fine-tuning, rather than updating all of the parameters. In this course, you will see that this can be done with as little as 0.1% of the total.

One of the worries of developers is whether a trained LLM might reveal sensitive training data. For example, if someone's personal information, such as their home address or credit card number, was somehow in the training data, might that get leaked by the LLM?

In this course, you will see examples of how training data can be recovered even from current open-source LLMs. Then, you will learn to apply federated learning and differential privacy techniques to minimize the risk of private data being exposed when fine-tuning an LLM. In the course example, thanks to federated learning, each healthcare site never needs to transmit its data during the training process. This provides a strong foundation to build on, and through differential privacy, model updates have calibrated noise added to make data recovery even more difficult. This combination makes the data much harder to recover, and you will also learn how easy it is to enhance privacy protection even further by adding additional methods, such as encryption, if your data requires it.

Many people have worked to create this course. I'd like to thank Javier Fernandez-Marques, Preslav Alexandrov, Yang Gao, and Ruth Galindo from Flower Labs, as well as Diala Ezzeddine and Geoff Ladwig from DeepLearning.AI. The first lesson will be an introduction to federated LLMs and the key strengths of using federated fine-tuning with LLMs.

That sounds great. Let's go on to the next video and get started.
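To make the PEFT point from the conversation above concrete, here is a minimal sketch of a LoRA-style adapter in PyTorch. The layer size, rank, and scaling values are illustrative assumptions, not the course's actual configuration; they simply show why only a tiny fraction of weights ends up trainable.

```python
# Minimal LoRA-style adapter sketch (illustrative values, not the course's exact setup).
# Only the small A and B matrices are trained; the pre-trained weight stays frozen.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, rank=8, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad = False          # freeze the pre-trained weight
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scaling = alpha / rank

    def forward(self, x):
        # Pre-trained path plus low-rank update: W x + scaling * (B A) x
        return self.base(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

layer = LoRALinear(4096, 4096, rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"Trainable fraction: {trainable / total:.4%}")  # ~0.39% for this single layer
```

In a full LLM, where adapters are attached to only some layers and the embedding and other weights remain frozen, the trainable fraction can drop to roughly the 0.1% mentioned above.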
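The combination of federated averaging and differential privacy described above can also be sketched in a few lines. The clipping threshold, noise scale, and update sizes below are made-up values for illustration only, not the course's actual Flower or privacy configuration.

```python
# Sketch of one federated round with differentially private model updates
# (illustrative values; not the course's actual configuration).
import numpy as np

rng = np.random.default_rng(0)

def clip_and_noise(update, clip_norm=1.0, noise_std=0.05):
    """Clip a site's update to a maximum L2 norm, then add calibrated Gaussian noise."""
    norm = np.linalg.norm(update)
    clipped = update * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, noise_std, size=update.shape)

# Pretend each of three healthcare sites computed a small PEFT update on its local data.
site_updates = [rng.normal(0.0, 0.5, size=100) for _ in range(3)]

# Each site privatizes its own update before sending it; raw patient data never leaves the site.
private_updates = [clip_and_noise(u) for u in site_updates]

# The server only ever sees the clipped, noisy updates and averages them (federated averaging).
global_update = np.mean(private_updates, axis=0)
print(global_update[:5])
```

The key property illustrated here is that the server never receives raw data, and even the model updates it does receive are clipped and perturbed, which is what makes recovering individual training examples much harder.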