In this lesson, you will learn how to build AI applications with DSPy. Let's dive into the code. This lesson comes with a lab. In the lab, you will learn the DSPy fundamentals by building a simple sentiment analysis program with DSPy's built-in modules, and then you will learn how to build a custom agent with DSPy through a demo of the "name the celebrity" game. It's a very simple game where player one thinks of a celebrity name and player two keeps asking yes or no questions until they find the name or use up the question quota.

DSPy programming has two important abstractions: the signature and the module. A signature is where you define the input and output contract of your LLM calls, and a DSPy module is the interface for talking to the LLM with custom logic. Let's look at them one by one.

As we mentioned in lesson one, in the DSPy context, interacting with an LLM is similar to calling a RESTful API that has a well-defined input and output format, but the data format is defined on the client side instead of the server side as in a normal RESTful API. That definition happens through the DSPy signature. In short, a signature defines the input and output fields of the LM interaction, along with their types and descriptions. There are two ways of defining a DSPy signature: the class-based signature and the string-based signature.

The first way is the class-based signature. All you need to do is subclass from dspy.Signature, mark input fields with dspy.InputField and output fields with dspy.OutputField, and optionally provide a type and a description for each field. There are five important parts of a class-based signature. The red box, which is the docstring of the signature class, is the signature instruction that defines the purpose of the LM call, like a brief overview of your task. This usually only requires a few sentences, but if you already have an existing prompt and don't want to simplify it, you can paste the whole prompt into the docstring. The orange box is the field name; you will use this name to pass in input data and access the output data. The blue box simply tells the program whether a field is an input field or an output field. The purple box carries the description of the field, which is useful when the field name is not self-explanatory. Lastly, the green box carries the type information, which could be a built-in Python type, your own custom class, or a Pydantic model. This is mostly useful for output fields: if you specify the type, then when you access the output fields, the value will automatically be of the type you want.

DSPy also supports a lighter way of defining signatures: the string-based signature. You just write the input fields before the arrow and the output fields after the arrow, separating multiple fields with commas. This is good for prototyping, but for general usage we recommend going with the class-based signature for its flexibility and more powerful support.

Now we have defined the input-output format, but this is still only static information; we need a way to use the signature to talk to the LM. That is the purpose of the DSPy module. A module is the minimal building block of a DSPy program, and in most situations it has a signature attached. The simplest module is dspy.Predict, which formats a user query into an LLM prompt and parses the LM response according to the task signature. Modules also have configurable attributes aside from the signature; for example, the demos attribute carries few-shot examples.
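To make the two flavors concrete, here is a minimal sketch using a made-up question-answering task that is not part of the lab; the comments map the code to the colored boxes from the slide:

```python
import dspy

# Class-based signature.
class AnswerQuestion(dspy.Signature):
    """Answer the question concisely."""  # red box: the instruction (docstring)

    # orange box: field name | blue box: InputField/OutputField
    # purple box: description | green box: type annotation
    question: str = dspy.InputField(desc="the user's question")
    answer: str = dspy.OutputField(desc="a short, factual answer")

# String-based equivalent: inputs before the arrow, outputs after it.
qa = dspy.Predict("question -> answer")
```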
These modules can be customized to implement custom logic, and a module can consist of submodules. DSPy provides a list of built-in modules that makes it easy for users to get started. The most important one is dspy.Predict, which simply performs the LM interaction and is the building block for all the more complex modules. ChainOfThought is also commonly used: aside from the plain response, it also asks for the reasoning behind the answer. ReAct, which stands for reasoning and acting, will be used in the next lesson and is a common standard for building AI agents. dspy.ProgramOfThought is similar to ReAct, except that the tool calling is just code execution. And with dspy.Refine, users can set a reward function and a threshold; if the threshold is not met, it issues retries with LM feedback.

That's the definition of a DSPy module. Now let's talk about how to use them. To use a built-in module, simply pass a signature to it, then provide the inputs through the input fields, passed in as keyword arguments. For example, on this slide, you create a ChainOfThought instance that has a single input field called question; then, when you invoke the module, you just pass in the value for question. We have a complete list of built-in modules available on our documentation site, so please read more there.

More commonly, though, you have complex logic that cannot be fully covered by the built-in modules. In those cases, you will need to write a custom module to define your custom logic. This is very similar to PyTorch: you subclass from dspy.Module and implement the forward method with your custom logic, then you create a module instance, and that instance is callable just like the built-in modules. This is very flexible. You can call any Python function within it, other frameworks like LangChain or LlamaIndex, or any tools like a SQL or file-system handler. As long as it is Python code, you are allowed to put it in the forward method.

That's a lot of context. Let's do some coding to get a better sense of how this module-signature system works. Before we start, we need to set up the API key. Don't worry, in this lab we have set up the key for you: simply call our helper function to get the OpenAI API key and set it as an environment variable, and you're good to go. The first step of DSPy programming is choosing your LM. In this lab, I use GPT-4o-mini. To change the LM, simply change the string here. It follows the format of a provider name followed by a model name. The provider name is something like openai or anthropic, and the model name is the actual model name, like gpt-4o-mini or gpt-4o.

Now let's start building a sentiment classifier to see how the module-signature system works. As we mentioned in the slides, we recommend going with a class-based signature. Here we subclass from dspy.Signature and write the task description in the docstring. The input is a single field called text, of type string, with a description, marked as dspy.InputField. The output field, sentiment, of type integer, is marked as dspy.OutputField with a description and restricted to the range 0 to 10. This restriction is a Pydantic constraint, which is fully supported in DSPy. We can also use a string-based signature: just write the input before the arrow and the output after the arrow. You can see that this is cleaner but loses some information compared to the class-based signature.
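A rough sketch of what the LM setup and the sentiment signature from this walkthrough might look like (the field descriptions and the use of ge/le to express the 0-to-10 Pydantic constraint are assumptions, not copied from the lab):

```python
import dspy

# Choose the LM with "<provider>/<model name>".
dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

# Class-based signature for the sentiment classifier.
class ClassifySentiment(dspy.Signature):
    """Classify the sentiment of the given text."""

    text: str = dspy.InputField(desc="the text to analyze")
    # ge/le is one way to express the 0-to-10 Pydantic constraint.
    sentiment: int = dspy.OutputField(desc="sentiment score from 0 to 10", ge=0, le=10)

# Lighter string-based alternative (cleaner, but drops the descriptions).
string_signature = "text: str -> sentiment: int"
```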
Now we have the signature, which is static information about the input and output. Next we create a module to actually interact with the LM. Let's use the most basic DSPy module, dspy.Predict: we feed the signature into dspy.Predict and create an instance. Now it's time to invoke the instance. We only have one input field, text, so we specify the value for text, call it, and print the result. The output is a dspy.Prediction, which is similar to a dictionary but allows both key access and attribute access. It has only one value, which maps to our output field sentiment, of type integer, in the range 0 to 10. Let's see how we access the sentiment field: we can use either the key accessor or the attribute accessor, and both give the same value.

Let's see how to change the LM behind the scenes. To do that, simply call dspy.configure and change the LM instance. Here, let's change to GPT-4o and feed in the same input to see if it generates a different value. Okay, that gives us the same value. Now let's change the LM back to GPT-4o-mini.

Now, a lot of people may ask: where is my prompt? This looks clean, but there is definitely a prompt somewhere when we talk to the LLM. To answer this question, DSPy provides an API called inspect_history, and the n argument determines how many entries you want to pull from memory. Let's run it. The output is a pretty print of the multi-turn messages and the LM response. The system message shows information like the input fields, output fields, and Pydantic constraints, and also defines the input-output format we use to talk to the LM. The user message carries the actual user input, formatted according to the format defined above. And the response section is the LM response, which is also formatted according to the format we defined.

Let's now try a different built-in module. We use the ChainOfThought module and see what happens to the output. You can see that, in addition to the sentiment field with the same value as before, the output also includes a reasoning field. Let's see why we get a different output. We can check the LM interaction history by calling dspy.inspect_history, and we can see that the output fields, in addition to sentiment, now include a reasoning field with a specific format, and the response contains the reasoning field, which gets parsed by the module.

Let's see what's happening behind the scenes, explained through the minimal DSPy module, dspy.Predict. dspy.Predict has the signature and other information, like demos, attached to itself. When it receives user input in the forward method, it sends all the information to something called the DSPy adapter, which talks to the LLM. After receiving all the information, including the signature, the user query, and other attributes, the adapter formats the actual prompt by combining all this information and sends the prompt to the LM. The prompt tells the LM about the response format, according to the adapter type we use. We will show in the lab how to change the adapter based on your language model; normally, DSPy automatically selects one for you. Let's dive deeper into the prompt. We can see the fields' information at the top. Then we define the input and output data format, where the default adapter uses a section header followed by the value.
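Here is a small sketch of the calls described above, reusing the ClassifySentiment signature sketched earlier; the example text is made up:

```python
# Create a Predict module from the signature and invoke it with keyword arguments.
classify = dspy.Predict(ClassifySentiment)
pred = classify(text="This movie was a delight from start to finish.")

# The result is a dspy.Prediction; key access and attribute access are equivalent.
print(pred.sentiment)
print(pred["sentiment"])

# Swap the LM behind the scenes and call the same module again.
dspy.configure(lm=dspy.LM("openai/gpt-4o"))
print(classify(text="This movie was a delight from start to finish.").sentiment)

# Pretty-print the last n LM interactions to see the actual prompt.
dspy.inspect_history(n=1)

# ChainOfThought adds a reasoning field alongside the declared outputs.
cot_classify = dspy.ChainOfThought(ClassifySentiment)
result = cot_classify(text="This movie was a delight from start to finish.")
print(result.reasoning, result.sentiment)
```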
Then the user input is formatted according to that defined format in the user message. The output format is very important, because only if we know the data format can we extract the fields' values automatically. The upward flow is just the reverse: the LM gives back a response in our defined format, then the adapter parses it into the output fields and sends them back to the module. Because the response comes in the format we defined in the prompt, the adapter knows how to parse it into the required fields. The parsed result is wrapped in a dspy.Prediction, which is similar to a dict but allows both attribute access and key access. To summarize concisely: DSPy combines the signature, the module information, and the actual inputs into a multi-turn prompt, and parses the LM response according to the signature. So the LM functions like a RESTful API with well-defined inputs and outputs.

Now let's build a more complex program by customizing a DSPy module. First, let's talk about how to use a different adapter. To do that, simply call dspy.configure and set the adapter to your preferred adapter class. Here we use the JSON adapter, which is a good choice if your model supports structured output, like GPT-4o or GPT-4o-mini. Then we invoke the same ChainOfThought instance with the same input and see what happens with the different adapter. We can see that, compared to before, the prompt asks for the outputs in JSON format instead of a section header followed by the value, and the response is formatted as a JSON object so that the adapter can parse it.

Now let's build the "name the celebrity" game by customizing a DSPy module. Let's recap what the game does. Player one is us, thinking of a celebrity name, and player two is the LM, which keeps asking yes or no questions until it finds the name or uses up the question quota. We allow 20 questions for this game. Before we talk about what's happening inside the module and how we build it, let's play with it to get a sense of how it works. I think of a celebrity name and type it in; let's use "LeBron James". The LM starts asking yes or no questions, and we just answer them. Not an actor, not a musician. A sports figure? Yes. A current player? Yes. Lakers? Yes. LeBron James? Yes. So it got the answer.

Let's take a look at how we write this module. We subclass from dspy.Module, and we have two submodules. The first one is a question generator, which generates a yes or no question; it's a ChainOfThought module with a question-generator signature. Let's take a look at that signature. It has two input fields, the past questions and the past answers, which start as empty lists, and it has two outputs: a new question, which is the yes or no question, and a guess-made flag indicating whether it is a general question or a direct guess of the name. The second submodule is a reflection module. After we wrap up the game, we want to do a self-reflection, which takes in the correct celebrity name, the final guess, and the past questions and answers. The output is a single string reflecting on the process: what went well and what went wrong. Then we define the custom logic in the forward method, as sketched below. We first get the user to enter a name, and then we start accumulating questions. In the for loop, we keep generating a question, asking the user for the answer, and keeping a record, until we reach the correct guess or use up the question quota.
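Here is a hedged sketch of what such a custom module could look like, based only on the description above; the signature field names, descriptions, and loop details are assumptions rather than the lab's actual code:

```python
import dspy

# Optional: switch to the JSON adapter described earlier (good for models
# with structured-output support).
dspy.configure(adapter=dspy.JSONAdapter())

class GuessQuestion(dspy.Signature):
    """Ask a yes/no question to narrow down a celebrity's identity, or make a direct guess."""

    past_questions: list[str] = dspy.InputField(desc="questions asked so far")
    past_answers: list[bool] = dspy.InputField(desc="answers to the questions so far")
    new_question: str = dspy.OutputField(desc="the next yes/no question, or a direct name guess")
    guess_made: bool = dspy.OutputField(desc="True if new_question directly guesses the name")

class Reflection(dspy.Signature):
    """Reflect on the finished game: what went well and what went wrong."""

    correct_name: str = dspy.InputField()
    final_guess: str = dspy.InputField()
    past_questions: list[str] = dspy.InputField()
    past_answers: list[bool] = dspy.InputField()
    reflection: str = dspy.OutputField()

class CelebrityGuess(dspy.Module):
    def __init__(self, max_questions: int = 20):
        super().__init__()
        self.question_generator = dspy.ChainOfThought(GuessQuestion)
        self.reflector = dspy.ChainOfThought(Reflection)
        self.max_questions = max_questions

    def forward(self):
        # Any plain Python is allowed here, including interactive input().
        celebrity_name = input("Think of a celebrity and type the name: ")
        past_questions, past_answers = [], []
        final_guess = ""
        for _ in range(self.max_questions):
            pred = self.question_generator(
                past_questions=past_questions, past_answers=past_answers
            )
            answer = input(f"{pred.new_question} (y/n): ").strip().lower() == "y"
            past_questions.append(pred.new_question)
            past_answers.append(answer)
            # Stop when a direct name guess is confirmed by the player.
            if pred.guess_made and answer:
                final_guess = pred.new_question
                break
        reflection = self.reflector(
            correct_name=celebrity_name,
            final_guess=final_guess,
            past_questions=past_questions,
            past_answers=past_answers,
        )
        return dspy.Prediction(final_guess=final_guess, reflection=reflection.reflection)

# The instance is callable just like a built-in module.
game = CelebrityGuess()
```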
When the loop finishes, we do a self-reflection based on the record of this run. This demo is just for fun, but I want to use it to explain how flexible the DSPy module is. Basically, you can write any Python code inside this forward method; we don't impose any restrictions, so it's easy to migrate to DSPy and to migrate off DSPy. We can also see how the signature system makes it easy to interact with the LM. Because the signature explicitly defines the outputs, the new question and the guess-made flag, we don't need to worry about parsing the fields out of the LM response, and we don't need to worry about whether the guess-made flag can robustly indicate if it's a general question or a direct guess of the name.

The last thing I'll show in this lab is how to save and load a DSPy module. DSPy provides two ways of saving and loading. The first way is state-only saving, which only saves the internal state of the DSPy module. To do that, set the path to a JSON file and set the flag save_program to false. To load it back, recreate the instance, or if you have an existing instance, call load on your module to load the state back. DSPy also supports whole-program saving through cloudpickle, so you don't need to depend on the original class definition to recreate the instance. To do that, give a directory as the path, set save_program to true, and then call save. To load it back, use dspy.load with that same path, and it will be loaded as a new instance. After you load back a program, you can call it just as if it were the original program, and here it restarts the game process.

That's all for this lesson. In this lesson, you have learned how to program with DSPy. In the next lesson, you will use MLflow tracing to debug your DSPy program. See you there.
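For reference, a minimal sketch of the two save/load flows just described, using the CelebrityGuess module sketched earlier; the file and directory names are placeholders:

```python
# State-only saving: writes the module's internal state to a JSON file.
game.save("celebrity_game_state.json", save_program=False)

# Load it back into a freshly created (or existing) instance.
restored = CelebrityGuess()
restored.load("celebrity_game_state.json")

# Whole-program saving via cloudpickle: the path is a directory, and
# dspy.load returns a ready-to-call instance.
game.save("celebrity_game_program/", save_program=True)
restored_program = dspy.load("celebrity_game_program/")
restored_program()  # call it just like the original program
```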