In this lesson, we'll show you how to create and interact with MemGPT agents using the Letta framework. We'll also go over how to understand the agent state, such as the system prompt, the tools, and the memory of the agent, and we'll learn how to view and edit the agent's archival memory. All right, let's go.

You can use Letta to create MemGPT agents. MemGPT agents are stateful and have explicitly managed sections of their context window. In this lesson, we'll go over the different parts of the agent state, as well as archival memory and recall memory, and how you can design agents.

Agents can be designed by controlling the following knobs: the prompt, which includes things like the system prompt and the persona that define the agent's behavior; the agent's tools; the way the agent manages and organizes its memory; and the content of the agent's memory, both core and archival. These knobs define what's placed into the LLM context at each step, and that in turn defines the agent's behavior.

I'm first going to create a Letta client. I can import the Letta client from the Letta client package, and this will connect my client to the running Letta service at localhost:8283. Letta is a bit different from other agent frameworks in that it runs as a service: you connect to a running Letta server, and it's by interacting with that server that you create agents and talk to them. You already have a running Letta server in your notebook sandbox environment, so you don't need to worry about starting it up. But if you're using Letta yourself outside of this course, you should make sure to start a Letta service, or connect to Letta Cloud with an API key, or use Letta Desktop.

So I'm going to first create this client, and then I'm also going to initialize a helper function to print messages in a nice way. When we get responses back from Letta agents, they return a message type that has a couple of different variants. One is a reasoning message generated by the agent. Another is an assistant message, which is the agent's communication back to us. There are also tool calls and tool returns; a tool return is the result of any tool execution, which is optional but often contains important information. And then there's of course also the user message, which is just any message that we send to the agent.

Letta basically allows you to create persistent LLM agents that have memory built in. The Letta service saves all the state related to an agent into a database, so you don't need to worry about checkpointing or reloading an agent: that state will always be there.
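Here's a minimal sketch of that setup, assuming the letta_client Python SDK. The helper name and the exact message attribute names (message_type, reasoning, content, tool_call, tool_return) are my assumptions based on recent SDK versions and may differ slightly from the course notebook.

```python
from letta_client import Letta

# Connect to the Letta server that's already running in the sandbox environment.
client = Letta(base_url="http://localhost:8283")

def print_message(message):
    """Pretty-print the message variants a Letta agent can return.

    Attribute names below follow recent letta_client versions and may vary.
    """
    if message.message_type == "reasoning_message":
        print("[reasoning]", message.reasoning)
    elif message.message_type == "assistant_message":
        print("[agent]", message.content)
    elif message.message_type == "tool_call_message":
        print("[tool call]", message.tool_call.name, message.tool_call.arguments)
    elif message.message_type == "tool_return_message":
        print("[tool return]", message.tool_return)
    elif message.message_type == "user_message":
        print("[user]", message.content)
```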
To start, we're going to create a really simple agent using Letta. We can do this by calling the agent creation function on the Letta client, passing in a couple of fields. One is the agent name, which is optional, but I'm just going to name this "simple agent". There are also the memory blocks that are stored in context. I have two here: a human memory block and a persona block. For the human block, I'm overriding the character limit, which tells Letta how much space in the context window can be allocated to the human section of memory. Sometimes you'll see that you start running out of space, and this is somewhat intentional: it's meant to make sure we don't spend too much space on in-context memory, but sometimes you'll want to allocate more than the default 5k character limit to a memory block. We'll also specify the LLM model, GPT-4o-mini, and an embedding model. This will create an agent on the server that we can now message.

To message the agent, we also go through the client, specifying the agent ID that we got back in the agent state from agent creation, and then passing in a list of messages. We'll just send a user message that says "how's it going", and this gives us back a response that includes a list of the messages generated by the server. We can see here that we got two messages back. The first one was a reasoning message, basically the agent thinking to itself: "The user seems casual and friendly. Let's match the energy." And then the agent sends a message back to us, saying "Hey there, I'm doing great. Thanks for asking. How about you?"

The response also contains usage information, so we can print things like the completion tokens, the prompt tokens, and the number of steps taken. The step count here is how many times the LLM was invoked, in other words how many steps the agent took. In this case the agent just did one thing, it just messaged us back, so the step count is one. But for more complex tasks, where there's a long sequence of tool calling, you'll see that the step count can often be higher.

When we created the agent, we got back an agent state. This contains all the state associated with an agent, which includes a couple of different things: the system prompt that defines the agent's behavior, the in-context memory blocks, and the tools that the agent can call. We can observe all of these by either looking into the agent state or calling more client functions.

We can view the system prompt of the agent by looking at the agent state's system field. This is fairly long; it has a lot of details that are trying to override the default system prompts of the LLM providers, and then a lot of information about how memory should be managed. You can of course edit this if you want to, but we generally recommend being pretty careful with it, since it can affect how memory management is done.

We can also see each of the tools that the agent can call by looking at the tool names inside the agent state's tools field. We can see that it has archival memory insert and archival memory search, to insert memories into external memory and search them. It can also search its conversation history: not all messages will be in context if you have a really long conversation, so the agent has the ability to search the conversational history. There are also core memory append and core memory replace, which give the agent the ability to edit the in-context memory blocks with new information. And then finally there's of course the send message tool. The agent's messages are actually generated by the agent calling send message explicitly. This allows the agent to understand that it's making an explicit choice to communicate back to the user, by making the actual message sending a send message tool call.
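A rough sketch of those steps, continuing from the setup above and again assuming the letta_client SDK. The block values, the character limit, the model and embedding handles, and the usage field names are illustrative assumptions, not the exact values from the course notebook.

```python
from letta_client import MessageCreate

# Create a simple agent with two in-context memory blocks.
agent_state = client.agents.create(
    name="simple_agent",
    memory_blocks=[
        # "limit" overrides the character budget for this block (value is illustrative).
        {"label": "human", "value": "My name is Charles.", "limit": 10000},
        {"label": "persona", "value": "You are a friendly, helpful assistant."},
    ],
    model="openai/gpt-4o-mini",                  # LLM handle (assumed format)
    embedding="openai/text-embedding-3-small",   # embedding handle (assumed)
)

# Message the agent by ID and print what comes back.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[MessageCreate(role="user", content="how's it going")],
)
for message in response.messages:
    print_message(message)

# Usage info: token counts and how many agent steps were taken.
print(response.usage.prompt_tokens,
      response.usage.completion_tokens,
      response.usage.step_count)

# Inspect the agent state: system prompt and tool names.
print(agent_state.system)
print([tool.name for tool in agent_state.tools])
```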
We can also see the in-context memory of the agent. This returns a list of blocks, which of course have the values that we set: "my name is Charles", and then also the persona. You can see that each block has a unique ID, so you can retrieve the data in these blocks, or even modify the blocks, by using that ID.

If we want to see things like the external message history of the agent, or the archival memories, then we do need to go through the client. For example, we can list the messages, which is the full message history of the agent. You can see that there's an initial message sequence populated into the context window to give some examples to the agent; you can override this if you don't want to include it. And then we can also see the message that we just sent to the agent, "How's it going?", the response that we got back, and so on. As this grows, you'll be able to list these messages out and paginate through them.

We can also list the archival memory of the agent. To do this, we just list the passages; passages are the rows in the archival memory. Right now this agent has no archival memory, so this is just empty, but we'll modify it later.

We're now going to dive a little deeper into core memory. Core memory is the in-context memory of the agent: the agent can modify memories stored in its context window, for things that might be about the human, the persona, or other memory blocks that you create. This is kind of unique because it allows the agent to essentially learn by writing new information into its context, by editing a memory block.

We can see an example of this by sending a message to the agent that gives it new information. For example, we can tell the agent "my name is actually Sarah", whereas right now the in-context memory about the human says that the human's name is Charles. We can then see the new messages generated and returned to us. First, the agent reasons that the user's name is updated to Sarah, so it's ready to engage. It then calls core memory replace, which is one of the tools we add to Letta agents by default, and decides, for the block with the label human, to replace the old content, Charles, with the new content, Sarah. It also makes a heartbeat request. This is a special keyword argument inserted into all Letta tools to allow the LLM to control whether or not the agent gets invoked again, in other words whether or not we run another step. If this were false, we would just stop here, but because it's true, the agent gets invoked again. On the second invocation it reasons that it's ready to interact with Sarah and excited to get to know her, and then sends a message saying "Nice to meet you Sarah. What's on your mind today?"

If we print out the usage this time, we'll see that the step count is actually two because of this requested heartbeat: the agent did two things as opposed to just one. We can also retrieve the block and see the new value. The value now says my name is Sarah, whereas before it said my name is Charles.
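In code, these inspection and core-memory steps might look roughly like the following. This is a sketch that assumes letta_client method names such as agents.blocks.list, agents.messages.list, agents.passages.list, and agents.blocks.retrieve; your SDK version may differ.

```python
from letta_client import MessageCreate

# In-context memory blocks, each with a unique ID we can use later.
for block in client.agents.blocks.list(agent_id=agent_state.id):
    print(block.id, block.label, block.value)

# Full message history of the agent, which we can paginate as it grows.
for message in client.agents.messages.list(agent_id=agent_state.id):
    print_message(message)

# Archival memory: list the passages (rows). Empty for now.
print(client.agents.passages.list(agent_id=agent_state.id))

# Give the agent new information so it edits its core memory.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[MessageCreate(role="user", content="my name is actually Sarah")],
)
for message in response.messages:
    print_message(message)
print(response.usage.step_count)  # 2: core_memory_replace, then send_message

# Retrieve the human block again to confirm the updated value.
human_block = client.agents.blocks.retrieve(agent_id=agent_state.id, block_label="human")
print(human_block.value)
```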
Letta agents also have built-in memory that's not just in-context memory, but also externally stored memory, which we refer to as archival memory. This is information that maybe isn't worth putting into the context window, or doesn't fit into the context window, so the agent is given the ability to insert it into an external vector database and later retrieve it if needed.

The agent itself can write to archival memory by calling the archival memory insert tool. We can explicitly trigger this by telling the agent to save the information "Bob loves cats" into archival memory. We can see that this triggered the agent to call archival memory insert and store Bob's love of cats, and the tool returns with request heartbeat set to true as well, so the agent is called again and sends us a message saying it has saved the information that Bob loves cats. And now, when we list the passages and get the text, we get back "Bob loves cats", so this information is saved in archival memory.

We can also explicitly create archival memories as a developer. If we want to insert some information into the agent's archival memory, we can call the passage creation function on the client to insert something like "Bob loves Boston Terriers". This returns a bunch of information, including the embedding generated for the passage we just inserted.

Now that we have a couple of memories saved in archival, we can see how the agent uses them. I'm going to explicitly trigger the agent to search archival memory by asking it a question like "what animals do I like?" and hinting that it should search archival. In general, the agent can decide on its own when it needs to search archival memory, but here, just for reliability, I'm going to tell it explicitly to please search it. So we run this, and we can see that the agent reasons that the user wants to know their animal preferences, so it decides to search. It calls archival memory search and specifies the query "animals". The agent actually gets to decide what query goes into archival memory search; in this case it decided to query the single word "animals", but it could also be a sentence or a phrase. The agent also has the ability to page through results: sometimes there can be a very long list of results from an archival memory search, in which case the agent might need to page multiple times to see them all. In this case that's not really relevant, since we only have two memories inside archival memory. So it returns both archival memories, "Bob loves cats" and "Bob loves Boston Terriers": one memory that the agent explicitly saved, and one that we saved into its memory. It now gets this information into its context window and is able to say back that you like cats and Boston Terriers, based on what was in archival memory.

So congratulations, you've now created your first Letta agent and learned what's actually contained inside the agent state of a Letta agent. In future lessons we'll add more capabilities, things like custom tool calling and customizing in-context memory management.
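Putting the archival memory pieces from this lesson together, here's a rough recap sketch of those calls, continuing from the earlier snippets and again assuming letta_client method names (agents.passages.create, agents.passages.list) that may differ in your version.

```python
from letta_client import MessageCreate

# Ask the agent to save a fact; it should call archival_memory_insert itself.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[MessageCreate(
        role="user",
        content="Save the information that 'Bob loves cats' to archival memory.",
    )],
)
for message in response.messages:
    print_message(message)

# Insert a passage directly as the developer (no agent step involved).
client.agents.passages.create(agent_id=agent_state.id, text="Bob loves Boston Terriers")

# Both passages should now show up when we list archival memory.
for passage in client.agents.passages.list(agent_id=agent_state.id):
    print(passage.text)

# Nudge the agent to call archival_memory_search before answering.
response = client.agents.messages.create(
    agent_id=agent_state.id,
    messages=[MessageCreate(
        role="user",
        content="What animals do I like? Search your archival memory.",
    )],
)
for message in response.messages:
    print_message(message)
```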