I'm very excited about this lesson because you are going to learn about performance optimization. As an engineer, when you deploy systems in production, you know that you need to think about performance. Sometimes that will mean that you need to favor speed. Other times that will mean that you need to favor quality. Whatever it is, you know that you've got to keep consistency. So in this lesson, we're going to dive into how you optimize performance for your agents and your crew. And this is going to unlock a lot of potential, because it's going to allow you to get consistent results from your crews. Let's dive into that.

The two main components to think about with these AI automations are speed and quality. Usually speed comes from smaller models, smaller LLMs that you can run on smaller devices, or that are still pretty fast even if you run them in the cloud. For quality, you're usually thinking about bigger models, things like GPT-4o and other models out there. They're very good at producing complex results, but they take a little longer to get there. No matter what you're optimizing for, a few tasks may require speed and other tasks might require quality, but the one thing that you want to make sure you get is consistency. Consistency is key here, because you want to make sure that even though this is a fuzzy automation, you're always getting the same speed or the same quality as you go.

So when you think about the distribution of these models, from the smaller to the bigger ones, what you see is a close-to-linear relationship regarding speed, meaning that as the models get bigger and bigger, they get slower and slower. But quality is not necessarily as linear as the speed, because quality depends a lot on the kind of tasks that you're trying to accomplish. If an individual task that an agent is trying to accomplish is not that complex, then a smaller model might actually produce great quality, and it might be good enough. The beauty of this is that when you're thinking about production use cases and complex use cases, you can actually choose different points in this distribution: you can have your agents use one model for one task and another model for another task. The key thing is that you want to keep consistency no matter what. So whatever you are optimizing for individually, for every task, you want to make sure that you keep that the same.

All right. So we know that speed and quality are the two main variables that we want to keep consistent here. But how do we measure this? That's where testing our agents' performance comes in, and we're going to talk about a specific feature in CrewAI, the crewai test command. So let's dive into that. When you think about a task, you basically have a description, an expected output, and an agent. And the interesting thing here is that comparing the result against the description and the expected output actually allows you to check how close the result is to what you expected it to be. So that allows us to score each task on how good its output was. This was a design choice in CrewAI from the get-go, to allow you to test your tasks at scale. And you can actually run this by using the crewai test command on your terminal.
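To make this concrete, here is a minimal sketch of what such a crew might look like in code: one agent pinned to a smaller, faster model and another to a larger one, with each task carrying the description and expected output that the evaluation compares against. The roles, model names, topic placeholder, and CLI flags are illustrative assumptions and may differ depending on your CrewAI version.

```python
# Minimal sketch (hypothetical roles, models, and flags; adjust to your CrewAI version).
from crewai import Agent, Task, Crew

# Smaller, faster model for a simpler task where speed matters.
researcher = Agent(
    role="Researcher",
    goal="Collect key facts about a topic",
    backstory="You find and summarize relevant information quickly.",
    llm="gpt-4o-mini",  # assumed model name
)

# Bigger model for a task where quality matters more than speed.
writer = Agent(
    role="Writer",
    goal="Write a polished report from the research",
    backstory="You turn raw research into clear, well-structured prose.",
    llm="gpt-4o",  # assumed model name
)

# Each task pairs a description with an expected output,
# which is what the judge LLM scores the actual result against.
research_task = Task(
    description="Research {topic} and list the main findings with their sources.",
    expected_output="A bullet list of findings, each with the original source it came from.",
    agent=researcher,
)

write_task = Task(
    description="Write a short report about {topic} based on the research.",
    expected_output="A three-paragraph report that cites the research findings.",
    agent=writer,
)

crew = Crew(agents=[researcher, writer], tasks=[research_task, write_task])

# From the terminal, you can then score the crew over multiple runs, for example:
#   crewai test --n_iterations 3 --model gpt-4o
# (flag names may vary between CrewAI versions)
```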
And once you run that test, what happens is your crew basically does all the tasks that you want it to do, and then that information is passed to a judge LLM, which you can set to be whatever LLM you want. By the end of it, you get a final report where you can see your tasks and your crew, and how each task scored during each run. And here you can see that we got all nines, but this is not always the case. This is an easy way to measure how consistent your agents and your tasks' outputs are, and how good they are, so that you can act on it.

Now, we can basically run this for our entire crew and learn that, let's say, task number one is actually giving an output of quality seven. How do we bring this to a nine? How do we change and improve this task, this agent, and this crew so that it can improve its output in a consistent manner? Well, usually it's smaller things. It's usually a specific format that it's not following, or a specific style that it should adhere to, or maybe there is some missing information, like the original source the information was taken from. So if you can help your agents understand that they're missing those small pieces, they can actually follow that to the letter, to the point that your output quality is going to be higher and you're going to have better results from your crew.

So how do you do that without having to spend long periods of time improving your descriptions and expected outputs? That's where the CrewAI train feature comes in. This feature is so powerful that it can have a major impact on your crew's performance, and you can easily execute it by going into a terminal and running the crewai train command. Once you execute this, a few things will happen. Let's dive into that. Your crew is going to run as usual, but now, whenever it finishes a single task, it's going to stop and ask you for feedback, for every task in your crew, and you're going to be able to tell it what it's missing. Maybe for task number one, it's missing the original sources of the research. For task number two, it might be missing a very specific format that you wanted it to comply with. The thing is, now you can give this very specific feedback on every single task. Once you finish with the feedback, it is automatically passed to another judge LLM. This judge LLM is going to extract every single learning for that specific task, and it's going to push that into your crew's memory. The way that it works is that from now on, whenever your crew runs again, for every single task it's going to remember the feedback that you gave it and make sure to comply with it.

This is so powerful because it allows you to build very complex use cases and get pretty consistent results across many runs, so I definitely recommend you check it out. So let's actually build one ourselves. Let's dive into it and learn how we can use the training and the test features. This is going to be a lot of fun, so come along and I'll see you in a second.
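Before we jump into the notebook, here is a rough sketch of what kicking off a training run might look like, reusing the hypothetical crew from the earlier sketch. The iteration count, filename, and inputs below are placeholder values, and the exact CLI flags and method arguments may differ between CrewAI versions.

```python
# Rough sketch of a training run (placeholder values; check your CrewAI version's API).
# From the terminal, inside your crew project:
#   crewai train -n 2
# or programmatically, reusing the `crew` object defined in the earlier sketch:
crew.train(
    n_iterations=2,                           # how many feedback rounds to run
    filename="trained_agents_data.pkl",       # where the distilled learnings are stored (assumed default name)
    inputs={"topic": "AI agent frameworks"},  # hypothetical inputs for the crew
)

# During each iteration the crew pauses after every task and asks for your feedback
# in the terminal; a judge LLM then distills that feedback into learnings that are
# applied on subsequent runs, for example:
result = crew.kickoff(inputs={"topic": "AI agent frameworks"})
```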