In this lesson, we'll start by defining what an AI agent actually is. Then we'll look at what happens when you interact with a stateless agent versus one augmented with memory. You'll see why conversational history alone isn't enough and what agent memory really means. We'll wrap up by introducing the agent stack and the Agent Memory Core, the architectural foundation that the rest of this course builds on. Let's dive in.

What is an AI agent? An AI agent is a computational entity that can perceive its environment through inputs, take action through tool use, and reason, with that reasoning enabled by an LLM. More importantly, an AI agent has some form of augmented memory, typically to allow it to store, retrieve, and apply knowledge across sessions and interactions. This is what we define as an AI agent. Generally, AI agents should be able to operate independently, acting on behalf of a human with little to no feedback, and ideally they are goal- and objective-bound.

Now that we have an understanding of what an AI agent is, it's time to get a very clear understanding of the importance of memory for AI agents. We'll do this by looking at how an AI agent operates without memory. Typically, you would have an AI agent embedded within an application, and a user interacting with the agent through that application. Imagine the user asking for restaurant recommendations around the area; we'll refer to this as turn 1. This message is then sent to the agent.
The agent would generate a response: ideally detecting the user's intent, reasoning about the message, and using a set of tools to achieve the objective, such as searching for a location and finding restaurants around it, before finally providing an output to the user. We can refer to these as interaction turns. In subsequent turns, the user interacts again with the same agent to continue the conversation; we can refer to this as turn 3. Having received the recommendations, the user might ask to book the first recommendation on the list. But an agent without memory would respond as shown on the screen: it has no recollection of the conversation the user is referring to, and would ask the user to please specify. We can refer to this as turn 4. Here we see that, due to the lack of memory, the agent is unable to complete the task the user specified. We can refer to this whole exchange as a multi-turn interaction, and clearly this one played out without memory.

Now we've seen how an AI agent with only perception, action, and reasoning capabilities operates in a real-life scenario. We refer to this type of AI agent as a stateless agent. Let's talk more about what a stateless agent is. A stateless agent can still perceive its environment through inputs, reason over those inputs, and produce outputs back to the user, all powered by an LLM, and this feedback is crucial. But with no memory, as we've seen, the agent doesn't retain or recall information beyond a single turn. There are significant disadvantages to this. For example, the agent will not be able to complete long-horizon tasks: tasks where the agent is expected to run for several minutes, hours, or even days.
Without information about previous interactions or steps taken, it would be very difficult to complete such a task. The agent also has no context awareness across sessions: if a user interacts with the agent, leaves, and then comes back or starts a new session, the information about the user is lost rather than carried across sessions. More importantly, the agent lacks adaptation capabilities, meaning any new information provided during an interaction is not retained or used in subsequent interactions. Stateless agents can also have high operational costs, because to keep (or at least approximate) continuity, you have to stuff a lot of information into the context window at every single turn. There are more disadvantages, but these are the significant ones.

Now that we've had an overview of stateless agents, agents without memory, let's see what an AI agent with memory looks like. We'll use a scenario similar to the previous one, but start from turn 3. In a memory-augmented agent, turns 1 and 2 would have been stored in an external memory source such as a database. So when the user interacts with the application, and indirectly with the agent, turns 1 and 2 are already within the agent's memory. What this means is that when the user asks for recommendations and then asks to book the first one on the list, the agent responds appropriately: it identifies the first restaurant on the list and provides a reasonable output, such as asking the user what date and time the booking should be made for. We can refer to turns 1 and 2 together with turns 3 and 4 as the interaction history, and all of it will be stored in a database for all subsequent turns. Essentially, we have a memory-augmented agent. Let's take a deeper dive into memory-augmented agents and their advantages.
Just like stateless agents, they can perceive inputs, reason over inputs, and produce outputs, all powered by the LLM's reasoning capabilities. But the important addition is a database where information is stored and retrieved. There are many benefits to this: the agent can complete long-horizon tasks, has sustained context awareness, and can actually adapt. Let's go over these advantages in more detail. One key advantage is the ability of memory-augmented agents to complete long-horizon tasks, mainly because they can reference previous interactions and context held in previous sessions. This leads to sustained context awareness, which feels to the user like one continuous interaction with the agent. We also get improved efficiency and reduced operational cost, because we reduce the amount of information we have to pass into the context window, passing in only what is relevant to the interaction from the external memory store. Finally, we get greater reliability in multi-step workflows, mainly because we can reference the previous steps taken and their context, which makes subsequent steps more likely to produce successful outcomes. We'll explore all of these advantages in future lessons.

Memory-augmented agents come in all shapes and sizes, and what we've seen so far might be a naive form of memory-augmented agent. Here's what we mean: going from a stateless agent to one that can remember means storing interaction history in an external store. We've touched on the key benefits, but the most important one is continuity of interaction. Essentially, we bring continuity to an agent by storing conversational history, or interactions.
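As a rough sketch of this continuity idea, here is a toy comparison of a stateless turn versus a memory-augmented turn. Everything here is illustrative: `call_llm` is a hypothetical stand-in for a real inference call, and the in-memory `history` list stands in for an external store such as a database.

```python
def call_llm(messages):
    # Hypothetical stand-in for a real LLM call; it just reports how
    # much context the model actually received on this turn.
    return f"(model saw {len(messages)} message(s))"

def stateless_turn(user_message):
    # A stateless agent sends only the current message, so the model
    # has no recollection of earlier turns: "book the first one on
    # the list" cannot be resolved.
    return call_llm([{"role": "user", "content": user_message}])

history = []  # stands in for an external memory store such as a database

def memory_augmented_turn(user_message):
    # A memory-augmented agent retrieves stored turns and prepends them,
    # giving the model the context needed to continue the conversation.
    messages = history + [{"role": "user", "content": user_message}]
    reply = call_llm(messages)
    # Persist both sides of the turn for all subsequent interactions.
    history.append({"role": "user", "content": user_message})
    history.append({"role": "assistant", "content": reply})
    return reply

stateless_turn("Recommend restaurants nearby")           # model sees 1 message
memory_augmented_turn("Recommend restaurants nearby")    # model sees 1 message
memory_augmented_turn("Book the first one on the list")  # model sees 3 messages
```

The key difference is only where the turn's messages come from: the stateless path builds its message list from scratch every time, while the memory-augmented path reads from and writes back to the store.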
These interactions are typically between a user (or users) and the agent or assistant. Interaction history, when stored in an external store, can be referred to as conversational memory. We'll look at more forms of memory later, but conversational memory is one of the simplest to understand, and the one that interaction history naturally lends itself to. An interaction is typically the back-and-forth of information between a user and an agent. Conversational memory has very specific attributes: a timestamp recording when the interaction took place, plus the user and assistant messages, all stored in a database, in external memory. Ideally, conversational memory is time-ordered, which means that when we retrieve conversational memory data, we retrieve it ordered by time, so we can see the sequence of actions and interactions taken.

Let's see what this looks like in the context window of an LLM. An LLM has a context window that can hold a certain number of tokens. Into the context window we place the system prompt and instructions. Interaction history is then retrieved as conversational memory, where we put all the multi-turn past interactions from the external store, and finally we place the user prompt. This is a depiction of what the context window of an LLM with conversational memory looks like.

But we actually need to go beyond this, and there are several reasons why we need to move beyond conversational memory, or just using interaction history, to create memory-augmented agents. The first is simple: conversation windows are finite, but user relationships are not. We can capture more of the relationship between users and the assistant by looking at conversations, or other data associated with conversations, for example, information about entities mentioned during an interaction: places, people, and relationships between people. Not all valuable information is stored in a single conversation, so we have to move beyond conversational memory to extract useful information we can use in cross-session interactions. Agents need structured, queryable knowledge, not just chat logs. Data stored in conversational memory is just interaction history, but agents can operate within workflows where the steps taken, and their outcomes, are themselves useful information. That is neither conversation history nor interaction history, so we need to expand beyond conversational memory, which we will see in future lessons.

Now that we've seen conversational memory, it's time to explore the distinct forms of agent memory you'll come across. A simple way to view agent memory is in two distinct forms: short-term and long-term. Let's look into short-term memory first. Two common forms of short-term memory are the semantic cache and working memory. A semantic cache is a caching mechanism that leverages vector search over previously received responses from an inference provider, so they can be reused as responses for similar queries in subsequent interactions. Working memory can be seen as the LLM context window and any session-based memory; it's essentially a scratch pad the LLM can operate within, but it's lost after an interaction or session. Those are the main types of short-term memory. For long-term memory, we have three main forms: procedural, semantic, and episodic. Let's look at examples of each. For procedural memory, a common memory type we'll use in agents is workflow memory: we store the steps and interactions an agent has taken to achieve its objective, and these steps can include tool calls and other processes.
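To make workflow memory a bit more concrete, here is a minimal sketch of recording and recalling the steps an agent takes. The record schema (task name, step number, action, outcome) is an assumption for illustration, not a fixed standard, and the in-memory list stands in for an external store.

```python
workflow_memory = []  # stands in for an external store of past steps

def record_step(task, step_number, action, outcome):
    # Each step the agent takes, including tool calls, is stored so
    # later runs can reference it as experience.
    workflow_memory.append({
        "task": task,
        "step": step_number,
        "action": action,
        "outcome": outcome,
    })

def recall_steps(task):
    # Retrieve the recorded steps for a task in order, so a future
    # interaction can reuse them instead of rediscovering the workflow.
    steps = [s for s in workflow_memory if s["task"] == task]
    return sorted(steps, key=lambda s: s["step"])

# Hypothetical tool calls an agent might have made while booking:
record_step("book_restaurant", 1,
            "search_location(city='Berlin')", "found 12 restaurants")
record_step("book_restaurant", 2,
            "create_booking(name='Trattoria Uno')", "booking confirmed")
```

On the next "book a restaurant" request, `recall_steps("book_restaurant")` hands the agent a proven sequence of actions to follow.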
It's ideal to record these steps in a form of external memory that can be referenced and pulled in during subsequent interactions, as a form of experience for the agent to refer to. We call this workflow memory. A good example of semantic memory is simply a knowledge base: any external knowledge the agent needs in order to complete a task, such as domain-specific knowledge for the domain the agent operates within. And the conversational memory we explored previously is a type of episodic memory, because each piece of captured information carries a timestamp attribute, so we can reference specific data using time.

Now that we have a good understanding of the different types of agent memory, it's time to pin down what, specifically, agent memory is. Agent memory can be defined as the composition of system components, along with some architectural components, that enables an agent to adapt and learn. The system components you'll typically find in agent memory are embedding models, a database, and a large language model. The combination of these system components, along with some control mechanisms and a software harness (which is code), enables an agent to store, retrieve, and recall information, allowing it to adapt to interactions and learn.

To understand agent memory, we can draw on your previous knowledge of retrieval-augmented generation, or RAG. We'll quickly review how RAG works and then connect it to agent memory. Let's go over a typical RAG pipeline. First, you have a data source, which you pass through a data processing pipeline that breaks each data object down into chunks. These chunks are then passed into an embedding model, which creates a numerical representation of each one. This numerical representation captures the semantics and context of the data object that was passed into the embedding model.
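The chunking and embedding steps just described can be sketched as follows. The toy `embed` function is purely illustrative (it just counts words into buckets); a real pipeline would call an actual embedding model to produce a dense semantic vector.

```python
def chunk(text, size=40):
    # Break a data object into fixed-size chunks; real pipelines often
    # split on sentences or tokens rather than raw characters.
    return [text[i:i + size] for i in range(0, len(text), size)]

def embed(chunk_text, dims=8):
    # Toy numerical representation: hash each word into a small vector.
    # A real embedding model captures semantics and context instead.
    vec = [0.0] * dims
    for word in chunk_text.lower().split():
        vec[sum(ord(c) for c in word) % dims] += 1.0
    return vec

document = ("Our agent operates in the restaurant domain "
            "and needs opening hours and menus.")
records = [{"chunk": c, "embedding": embed(c)} for c in chunk(document)]
# Each record (chunk text plus its vector) is now ready to be stored
# in the database along with any other metadata.
```

The important structural point is that ingestion turns one data object into many (chunk, vector, metadata) records, which is exactly what gets written to the database in the next step of the pipeline.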
Along with the embeddings and other metadata, we can store all of these different data types in a single database. Then a user interacts with an application by sending a user query, which is vectorized by passing it through the embedding model to generate its numerical representation. Typically, the embedding model used to embed the user query is the same one used during the ingestion pipeline. Once we retrieve the rows semantically similar to the user query, we pass them into a reranking model. The reranked rows are then concatenated with the user query and passed into the LLM, grounding the LLM's response in domain-specific data. That is a typical RAG pipeline.

To connect RAG to agent memory, we take the following steps. We keep the typical ingestion process from the earlier portion of the RAG pipeline, and we bring in our knowledge of an AI agent and its main characteristics: the perception, memory, action, and reasoning capabilities we covered previously. The connection is this: the abstractions, the memory types we covered in the previous slides, take computational form within the database, ideally as tables. So you'd have your semantic memory, your procedural memory, and your episodic memory (which can be conversational memory) represented as tables in a database. A memory manager then abstracts away the methods and programs used to read, write, update, and delete data in these tables. Our agent gets access to all of these capabilities by connecting to the memory manager through tools, providing them to the agent as memory.
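The memory-types-as-tables idea just described might be sketched like this, using an in-memory SQLite database. The table names and schemas are assumptions for illustration, not a prescribed layout.

```python
import sqlite3

# Memory types represented as tables within a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE semantic_memory (fact TEXT)")
conn.execute("CREATE TABLE procedural_memory (step_order INTEGER, step TEXT)")
conn.execute("CREATE TABLE episodic_memory (ts TEXT, user_msg TEXT, assistant_msg TEXT)")

class MemoryManager:
    """Abstracts away reading and writing the memory tables; an agent
    would be handed these methods as tools."""

    def __init__(self, conn):
        self.conn = conn

    def write(self, table, row):
        # Parameterized insert; one placeholder per column value.
        placeholders = ", ".join("?" * len(row))
        self.conn.execute(f"INSERT INTO {table} VALUES ({placeholders})", row)

    def read(self, table):
        return self.conn.execute(f"SELECT * FROM {table}").fetchall()

manager = MemoryManager(conn)
manager.write("episodic_memory",
              ("2024-01-01T10:00:00Z", "Recommend restaurants",
               "Here are three options..."))

# Exposing the manager to the agent as tool functions:
agent_tools = {"read_memory": manager.read, "write_memory": manager.write}
```

The agent never touches SQL directly; it only sees the `read_memory` and `write_memory` tools, which is exactly the abstraction the memory manager provides.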
And this is how we bring our previous knowledge of RAG, the characteristics of an agent, and the different forms of memory together into agent memory. Now let's cover the term Agent Memory Core. In an agentic system, there are three main components where memory is, or could be said to be, located. There's your large language model, which has parametric memory of all the data it was trained on; your embedding model, which draws on semantic and contextual information when generating an embedding; and your database. The database is where you'll see the most data traffic within your agentic system: it's where data is stored, retrieved, and optimized. These are the main system components associated with agent memory, but the Agent Memory Core is your database. It's where the most information and data are stored and retrieved in the entirety of your agentic system; most of the data traffic we see in AI agents flows through an external memory backed by a database. An Agent Memory Core can be defined as the primary infrastructure that sees the most data traffic within your agentic system. It should handle the storage, retrieval, and optimization of information within the store itself. This is why we refer to the database as the Agent Memory Core, and it's an important concept to carry with you through this course.