You just saw a sample of the kinds of unreliable LLM behaviors that can affect your proof-of-concept genAI application. In this lesson, you'll take a deeper look at how AI guardrails can help you mitigate these problems by validating and verifying everything your application does.

The first question is: what are guardrails? Guardrails are secondary checks or validations that ensure the inputs or outputs of an LLM call are as expected. An expectation might be something very simple, like "Is my LLM output formatted correctly? If it's a simple string, does it split into a list the way I want? If it's JSON, does it match the schema I expect?" Or it might be a much more complex expectation, like making sure the AI output isn't hallucinated, or that no jailbreak attempts are detected. Essentially, it's not blindly trusting the LLM to do the right thing, but explicitly verifying whatever your expectation or success criterion for that LLM call might be.

The previous lesson looked at a standard RAG chatbot. If you now want to take that RAG chatbot to a production-ready stage, where would you apply your AI guardrails? On the left is a very standard, barebones LLM call: you start with the prompt, and if you're doing RAG or something more complicated, your retrievals from your vector database, your system prompt, and everything else go into that prompt as well. You then send the prompt over to your LLM, which does some next-token prediction and produces an output that you send back to the rest of your genAI application.

The core idea behind guardrails is very straightforward: once again, explicitly verify whatever your LLM does. Before you send your prompt to the LLM, you first send it through an explicit input verification suite, or input guard, which contains a number of guardrails that explicitly validate your expectations. These expectations can be many different things depending on what you're building. Maybe your expectation is "if my prompt contains any PII, I don't want to send it over." That's something you can explicitly check for with a guardrail. Maybe you want to check whether the prompt is off-topic and not what your chatbot is intended for. Maybe it's something more sophisticated, like detecting a jailbreak attempt in the prompt.

Once you know your prompt doesn't contain any sources of unreliable behavior that would in turn make your application unreliable, you send it over to your LLM. The LLM does its thing and produces an output, which you first pass through an explicit output verification suite, or output guard, containing a number of guardrails that check for whatever criteria you care about on the output side. Are there any hallucinations present in the text? Are there off-topic responses, profanity, or unsafe outputs? You can explicitly test for any of these types of criteria failures.
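To make that input-guard / LLM / output-guard flow concrete, here is a minimal sketch in plain Python. Everything in it is illustrative: `guarded_chat`, `run_guard`, and the validator signature are hypothetical names I'm using for this sketch, not the API of any particular library, and the actual checks and LLM call are passed in from outside.

```python
from typing import Callable, List

# Hypothetical names throughout; this is a sketch of the pattern, not a library API.
Validator = Callable[[str], bool]   # returns True when the text passes the check

class GuardrailViolation(Exception):
    """Raised when an input or output fails a guardrail check."""

def run_guard(text: str, validators: List[Validator], stage: str) -> str:
    """Run every guardrail in a guard, failing fast on the first violation."""
    for validator in validators:
        if not validator(text):
            raise GuardrailViolation(f"{stage} guard failed: {validator.__name__}")
    return text

def guarded_chat(prompt: str,
                 call_llm: Callable[[str], str],
                 input_validators: List[Validator],
                 output_validators: List[Validator]) -> str:
    # 1. Explicitly verify the prompt before it ever reaches the model.
    safe_prompt = run_guard(prompt, input_validators, stage="input")
    # 2. Let the LLM do its next-token prediction.
    raw_response = call_llm(safe_prompt)
    # 3. Explicitly verify the output before handing it back to the application.
    return run_guard(raw_response, output_validators, stage="output")
```

You would plug in whatever checks matter for your application, for example a PII detector and an off-topic check on the input side, and a hallucination check on the output side.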
So how does a guardrail really work under the hood? A guardrail might be something as simple as a rules engine containing, say, regex rules: is this particular kind of text present in my LLM output? It might be something more sophisticated, like a small fine-tuned machine learning model running within the guardrail's process, looking at the named entities present in the text you pass in, or checking for specific topics, profane outputs, and so on. Some guardrails might also be secondary LLM calls, for example "score this output on how well it answers the user's question" or "how well it aligns with some rules that I have." Very commonly, though, a guardrail ends up being a combination of all of these techniques, where you mix and match tools so that you can accurately capture the behavior you want from your LLM application; we'll sketch this combination idea in just a moment.

So far in this lesson we've looked at what guardrails are and how they really work. But let's go back to our first lesson and look at how guardrails help mitigate the unreliable behavior that exists in modern generative AI applications.

The first way they help is that they can explicitly verify the inviolable constraints that are too high-cost for your system to get wrong, such as "never leak PII." There are real financial fines and risks for leaking PII from a system that shouldn't. Guardrails make sure you're not just relying on your AI system: you're explicitly verifying that the inviolable constraint of never leaking PII is, in fact, never violated.

Second, even a few guardrails running alongside your system can measure the occurrence of undesirable behavior, such as how many times your LLM refused to answer a question, or how many times it got questions about specific topics. This takes something that's very hard to measure, like "how well does my chatbot resolve my customers' questions?", and breaks it down into specific, small checks that give you an idea of the overall performance of the system.

The third key technique, which becomes especially necessary as we move toward agentic, multi-step, complex applications, is that guardrails can contain the cascading errors that happen in these multi-step applications. You're essentially drawing a bounding box around what the AI model can do, so that as you chain many LLM calls in sequence, you know their errors are contained and you don't get the same compounding of mistakes. And by combining all of this, you end up limiting the worst-case risk for your genAI application.
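Here is the combination idea from above as a rough sketch, using the "never leak PII" constraint as the example. Only the regex layer is real code; the NER-model and LLM-judge layers are stand-in stubs, and all of the function names and patterns are my own illustrative choices rather than anything from a specific guardrails library.

```python
import re

# Rules-engine layer: toy regex patterns for a couple of common PII shapes.
PII_PATTERNS = [
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),        # email addresses
    re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),  # US-style phone numbers
]

def regex_pii_check(text: str) -> bool:
    """Rules-engine layer: passes only if none of the regex rules match."""
    return not any(pattern.search(text) for pattern in PII_PATTERNS)

def ml_entity_check(text: str) -> bool:
    """Placeholder for a small fine-tuned NER model that flags names, addresses, etc."""
    return True  # a real model call would go here

def llm_judge_check(text: str) -> bool:
    """Placeholder for a secondary LLM call that scores the text against your rules."""
    return True  # a real scoring prompt to another model would go here

def no_pii_guardrail(text: str) -> bool:
    """A combination guardrail: the text passes only if every layer agrees."""
    return regex_pii_check(text) and ml_entity_check(text) and llm_judge_check(text)
```

You could then pass `no_pii_guardrail` in as one of the `input_validators` in the earlier `guarded_chat` sketch, which is exactly the "never leak PII" constraint described above.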
Now that you understand what guardrails are and how they increase the reliability of your application, let's head on to the next lesson and implement your first guardrail. It will be a very simple example where we constrain the chatbot we've been building to not reveal the exciting secret project the pizzeria has been working on.