Welcome to Improving Accuracy of LLM Applications, built in partnership with Lamini and Meta. AI applications can now perform tasks that were previously very challenging for computers, such as providing a natural language interface to a database. But these applications often perform well in some areas and yet struggle in others. In this course, you'll learn a framework, a set of development steps, for systematically improving your application's performance. Specifically, you'll create an evaluation dataset to measure performance, prompt-engineer, and then finally fine-tune your model. I'm delighted to introduce our instructors for this course. Sharon Zhou is a returning instructor and CEO of Lamini, a company that provides LLM serving and fine-tuning for its users, and in which I made a seed investment. And another returning instructor, Amit Sangani, is Director of Partnerships for the wonderful Llama team at Meta. Sharon and Amit, welcome back. Thanks so much. We're excited to be back. Thank you, Andrew. So Amit, the Llama family of open-source models has become a huge success. Indeed, Andrew. The Llama models are widely used in numerous applications. They have undergone extensive training and rank highly in most foundation model benchmarks. It's gratifying to see so many organizations using our models, but it also presents challenges. Since the models are trained on public datasets and tasks, they may not perform as well on applications requiring proprietary data or on tasks they weren't specifically trained for. But Meta has released these models as open source, and this allows any developer to fine-tune them for their specific applications. Exactly, Andrew. A crucial aspect of these models is their open-source nature. This means users can fine-tune the models for their specific tasks.
We have seen many applications, especially with the smaller models, where users have fine-tuned them for tasks like text-to-SQL, classification, question answering, recommendation, and summarization. They've also been adapted to understand proprietary datasets such as financial data, customer data, and legal information. I really appreciate Meta's work releasing these models in an open way. And yet, fine-tuning isn't the very first step we typically recommend to application builders; it's one of the options we might pursue only after exhausting other, simpler options. That's right, Andrew, and we'll go through exactly that. At Lamini, we work with many enterprises to improve the accuracy of their LLM applications, specifically to make LLMs factual and precise, so that users can start talking about nines in their LLM accuracy, like 95% or 99%. We've observed a successful pattern that has inspired the outline of this course for you. First, you'll build your LLM application, hook it all up, and prompt-engineer with some self-reflection. Next, it's important to get rigorous in evaluating the model's performance, so you know whether it's ready for prime time and where the next frontier of accuracy is. If this meets your needs, then great. But if you find that even after a lot of prompt engineering work it's still not working accurately enough, the next step is to use LLMs to create a dataset for fine-tuning your model. One myth at this step is thinking you don't have enough data. You'll actually learn that you probably do have enough data, and there are ways to significantly amplify the data you do have with LLMs themselves. Fine-tuning also used to be slow and costly, but by using parameter-efficient fine-tuning techniques like LoRA, which stands for Low-Rank Adaptation, the time and costs have dropped dramatically.
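The rigorous evaluation step described above can be sketched as a simple exact-match accuracy loop over a small labeled eval set. This is a minimal illustration, not Lamini's actual evaluation API: the `generate_sql` callable, the toy eval set, and `fake_model` are all hypothetical stand-ins for whatever model call your application makes.

```python
# Minimal sketch of rigorous evaluation: score a model's SQL output
# against a small labeled eval set. `generate_sql` is a hypothetical
# stand-in for your application's model call.

def normalize(sql: str) -> str:
    """Collapse whitespace and case so trivial formatting differences don't count as errors."""
    return " ".join(sql.lower().split())

def accuracy(eval_set, generate_sql) -> float:
    """Fraction of eval examples where the generated SQL exactly matches the reference."""
    correct = sum(
        normalize(generate_sql(ex["question"])) == normalize(ex["sql"])
        for ex in eval_set
    )
    return correct / len(eval_set)

# Toy eval set and a fake "model", just to show the mechanics.
eval_set = [
    {"question": "How many users are there?",
     "sql": "SELECT COUNT(*) FROM users;"},
    {"question": "List all user names.",
     "sql": "SELECT name FROM users;"},
]

def fake_model(question: str) -> str:
    return "select count(*) from users;" if "How many" in question else "SELECT id FROM users;"

print(accuracy(eval_set, fake_model))  # → 0.5
```

Exact match is a deliberately strict metric; real text-to-SQL evaluations often also execute both queries against the schema and compare result sets, since two differently written queries can be equivalent.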
What's more, Lamini Memory Tuning is a technique that gives you a new frontier of factual accuracy, reducing the time and developer effort needed to get to the same level of accuracy or higher. Using optimized fine-tuning techniques, you can teach your LLM a thousand new facts in just minutes on a single A100 or MI250 GPU. In this course, we'll use the example of fine-tuning an LLM to generate SQL queries for a specific schema. Through fine-tuning, you'll see accuracy improve from around 30% all the way up to 95%, using just a small dataset of 128 examples, in about 30 minutes, at a cost of just a few dollars. What's more, that process can be further optimized down to 6 or 7 seconds, with even higher-performance memory tuning. Many people have contributed to this course. I'd like to thank, from Lamini, Johnny Li and Nina Wei, and from DeepLearning.AI, Geoff Ladwig and Esmaeil Gargari. All right, let's dive into optimizing applications by going on to the next video.