In this lesson, you will build the memory infrastructure that powers everything else in this course. You will design persistent memory stores for different agent memory types, model memory data for efficient retrieval, and implement a memory manager that orchestrates how your agent reads, writes, and retrieves memory during execution. All right, let's dive into the code.

In this lesson, you're going to be constructing the Memory Manager. You'll come away with a good understanding of the agent stack, and of what we mean when we refer to the Memory Layer and the Memory Manager. You'll also get an overview of memory operations, and we'll look into terms such as memory engineering, memory units, and context engineering. All of this leads up to the implementation of a memory manager and its memory operations.

The agent stack refers to the set of tools and technologies that work together to enable an AI agent to perform reliably and efficiently in production. The agent stack typically comprises the layers you see on the screen, but for the purposes of this course, we're going to compress it into three main layers: the Application Layer, the Data Layer, and the Infrastructure Layer. Because we're talking about AI agents and looking at things from an agentic perspective, we can swap out the Data Layer for a Memory Layer.

Now, let's look at the Memory Layer in more detail. It's the part of your agent stack that contains the Memory Core, which we covered in a previous lesson, and the Memory Manager. The Memory Core and the Memory Manager work together to create memory-augmented agents that can handle continuous tasks, operate on long-horizon tasks, and adapt to and learn from new information. If we zoom in on the Memory Manager, we can understand it as an abstraction over the database, with control flows and control logic to read from and write to the memory stores in our database.
We're going to see an implementation of this later in the lesson. But before we go there, let's go over some common memory operations. The memory manager holds abstractions containing create, read, update, and delete (CRUD) methods for the memory stores within a database. These memory stores correspond to the memory types we've covered before, so Conversational, Knowledge Base, Workflow, and Summary will each have their own allocated table within the database.

Each of these memory types has a different storage requirement. For conversational memory, it's a relational SQL table, whereas for the knowledge base and other memory types, we're still storing data in relational form, but with a data type that can hold vector embeddings. Every memory type has read and write operations; these are just examples of the methods you'll have in your Memory Manager. There are also update, delete, and create methods you can implement as well.

Another thing to cover is the classification of memory operations, which depends on how a memory operation is called. We've already gone over what memory operations are: create, read, update, and delete operations on tables held within a database. But we can distinguish these operations into two forms. The first is deterministic, which means we execute the memory operations programmatically: when we interact with an agent, deterministic memory operations are executed regardless of the situation, on a fixed schedule or under predefined conditions. The other classification is agent-triggered. Here, the memory operations are provided to the agent as tools, and the agent decides when and where to use them; triggering these operations is left to the discretion of the AI agent.
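To make these operations concrete, here is a minimal, hypothetical sketch of the read/write abstraction in Python, using in-memory lists in place of database tables; the class and method names are illustrative, not the course's exact API.

```python
from datetime import datetime, timezone

class MemoryManager:
    """Routes create/read/update/delete operations to per-type memory stores."""

    def __init__(self):
        # One store per memory type; in the course these are database tables.
        self.stores = {name: [] for name in
                       ("conversational", "knowledge_base", "workflow", "summary")}

    def write(self, memory_type, content, **metadata):
        # A deterministic write: called programmatically, e.g. after every turn.
        unit = {"content": content,
                "timestamp": datetime.now(timezone.utc).isoformat(),
                **metadata}
        self.stores[memory_type].append(unit)
        return unit

    def read(self, memory_type, predicate=lambda unit: True):
        # An agent-triggered setup would expose this method to the agent as a tool.
        return [u for u in self.stores[memory_type] if predicate(u)]

manager = MemoryManager()
manager.write("conversational", "Hello!", role="user")
print(len(manager.read("conversational")))  # 1
```

A deterministic setup would call `write` on a fixed schedule or condition; an agent-triggered setup would instead register `read` and `write` as tools the agent invokes at its own discretion.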
Another term you're going to come across in this course is Memory Unit. A Memory Unit can be thought of as the smallest atomic representation of information held within a database and used within an agentic system. We've already seen an example of a Conversational Memory Unit, with the timestamp, the role of the entity having the conversation, and the content of the conversation itself. Another example is a Workflow Memory Unit, a row with fields such as the content of the workflow, the type of the workflow, the timestamp, and a vector representation of some of the workflow's content. The content of a Workflow Memory Unit would typically be the steps taken in an execution and their outcomes. You'll hear more about workflows in later lessons.

Another key term is Context Engineering: the practice of curating the specific content we want to pass into the context window. Our data sources can provide far more information than we should pass in, so rather than stuffing everything into the context window, we think carefully about what context to include. The aim of context engineering is to maximize the value of each token within the context window. Ideally, we want a high signal-to-noise ratio for every token, which means we're more likely to get the desired output and outcome.

The last term you need to take with you is Memory Engineering: the discipline of building and maintaining memory systems for an AI agent, enabling it to adapt and learn. In Memory Engineering, you're responsible for all the processes and operations within the memory lifecycle. Now let's look at what the Memory Lifecycle is. A Memory Lifecycle goes through several steps.
First, you have a raw data source, which goes through an ingestion pipeline and is then enriched, perhaps using an embedding model or an LLM, to augment the information. This is then stored in a database, in different tables representing either short-term or long-term memory. Next comes the organization of information, which involves processes such as indexing and mapping out relationships within the information. Then we retrieve the information. Memory can be retrieved with different retrieval strategies; the common ones are text (lexical), vector, graph traversal, and hybrid, which combines two or more strategies. When we retrieve information within the Memory Lifecycle, that information, which is memory, is passed into the LLM. The output from the LLM can also be used as memory: it goes through steps such as serialization and augmentation, where we enrich the output to be stored in the database, and then it flows through the cycle of storage, organization, and retrieval, and back into the LLM again. This is a very simple overview of the Memory Lifecycle, but it enables the continuous learning cycle an AI agent requires to address long-horizon tasks.

Memory Engineering might be a new term you've just come across, but it's a combination of existing disciplines, taking practices and principles from each to enable the efficient implementation of memory operations within AI agents. For example, from Database Engineering we leverage principles such as ACID transactions, persistent storage, and an understanding of storage architectures. Agent Engineering also comes into play: as we're building AI agents, we need to understand how to engineer the agent and when and where to put in memory operations. Machine Learning Engineering is also required in Memory Engineering.
This is because we might fine-tune some of these embedding models, or even smaller language models; typical processes such as model versioning, reranking pipelines, and continual learning come into play within Memory Engineering. Lastly, there's Information Retrieval, the discipline concerned with implementing and optimizing retrieval strategies. We draw on it within Memory Engineering when implementing vector indexes or any other indexing strategies for the efficient retrieval of data from a database. These, essentially, are the disciplines involved in Memory Engineering. As you can see, there's nothing new here, just an intersection of existing disciplines.

Now we come to memory-aware agents, and this is an important part of this course, as we're moving from memory-augmented agents to memory-aware agents, and it's good to understand the difference. We started off with a naive implementation of a memory-augmented agent that only had conversational memory, meaning it only had interaction history. We then added explicit memory type allocation to our AI agent, which moved us to a fully memory-augmented agent: an agent that can retrieve information from different memory stores, such as conversational memory, workflow memory, toolbox memory, and other forms of agent memory within our database.

But we can take things further and make our AI agent memory-aware in a few steps. The first is giving the AI agent awareness of its memory stores via the system prompt. The second is giving memory operations to the AI agent as tools, so it can store, retrieve, read, and forget memory at its own discretion. The next step is giving the AI agent the ability to reason through the memory lifecycle.
And the final step toward a memory-aware agent is segmenting the context window into portions, or partitions, allocated to specific memory types. We'll see all these implementation steps in this lesson.

In this section, we're going to set up all the key system components, such as the database, the embedding model, and the vector stores, that we require to implement a memory-aware AI agent. We start with the database: we load environment variables so we have access to all the API keys in our environment, set up our Oracle AI Database, create a connection to it by passing in the right parameters, and then confirm we have an active connection by viewing the database banner. Once you execute the cell, you should see an output that walks through the steps of successfully connecting to the database, followed by the banner confirming which Oracle AI Database version you're using. We're using Oracle AI Database 26ai for this lesson.

The second key component is our embedding model. We'll get it from Hugging Face, specifically using the Hugging Face integration within the LangChain library. We'll be using the sentence-transformers library and the paraphrase-mpnet-base-v2 model. When this cell is executed, you'll have an embedding_model on your local machine.

Next, we set up the database tables, which are the key system components we need for agent memory and a memory-aware AI agent. We define names for the tables that represent the different forms of agent memory we covered previously. For example, for conversational memory, we'll have the CONVERSATIONAL_TABLE, identified by the text CONVERSATIONAL_MEMORY; the KNOWLEDGE_BASE_TABLE will hold semantic memory, and so on.
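As a rough sketch of these setup cells (the driver calls, environment variable names, and some of the table-name strings here are assumptions, not the course's exact code):

```python
import os

# Table names per memory type; CONVERSATIONAL_MEMORY and TOOL_LOG_MEMORY
# appear in the lesson's output, the others are assumed names.
TABLE_NAMES = {
    "conversational": "CONVERSATIONAL_MEMORY",
    "tool_log": "TOOL_LOG_MEMORY",
    "knowledge_base": "SEMANTIC_MEMORY",
    "workflow": "WORKFLOW_MEMORY",
    "summary": "SUMMARY_MEMORY",
}

def connect_to_oracle():
    # Requires a running Oracle AI Database and credentials in the environment.
    import oracledb
    connection = oracledb.connect(user=os.environ["DB_USER"],
                                  password=os.environ["DB_PASSWORD"],
                                  dsn=os.environ["DB_DSN"])
    print(connection.version)  # the version banner confirms the connection
    return connection

def load_embedding_model():
    # The sentence-transformers model named in the lesson, via LangChain's
    # Hugging Face integration (requires the langchain-huggingface package).
    from langchain_huggingface import HuggingFaceEmbeddings
    return HuggingFaceEmbeddings(
        model_name="sentence-transformers/paraphrase-mpnet-base-v2")
```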
All the table names are stored in a list assigned to the variable ALL_TABLES. We iterate through its contents and drop each table if it exists, since we're running this lesson from scratch, and lastly we commit the database transaction. When you run the cell, you should see output noting that the tables don't exist, because we're running this for the first time.

The next step is to create the actual conversational history table. We do this with a function called create_conversational_history_table, which takes our connection and the table name as arguments. Again, we drop the table if it exists, because we want to start with a fresh table, and then run a SQL statement to create the table with the attributes we expect for its rows. Notably, for a conversational memory unit, we want to capture the content, the role, and the timestamp, as covered earlier in the lesson. Additional fields can be captured too, such as the metadata associated with the conversational memory unit; created_at, which is different from the timestamp at which the conversational memory unit was captured; and a summary_id, which will be explained in later sections of this course. With the SQL statement in place, we execute it and commit the transaction.

To enable faster lookups, we create indexes on the thread_id and timestamp attributes. This ensures that traversing the rows of our conversational memory table doesn't take a lot of time and is done efficiently. We'll also create a tool log table using a function imported from the helper module; it follows the same steps as the conversational history table to create the table within the Oracle AI Database.
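The table-creation step can be sketched like this; the column types and exact schema are assumptions based on the lesson's description of the conversational memory unit, not the helper module's real code:

```python
def conversational_history_ddl(table_name):
    """Build DDL for a conversational memory table (column names follow the
    lesson's description; the real helper's schema may differ)."""
    return f"""
        CREATE TABLE {table_name} (
            id         NUMBER GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
            thread_id  VARCHAR2(64),
            role       VARCHAR2(32),
            content    CLOB,
            metadata   CLOB,
            timestamp  TIMESTAMP,
            created_at TIMESTAMP DEFAULT SYSTIMESTAMP,
            summary_id NUMBER
        )"""

def index_ddl(table_name):
    # Indexes on thread_id and timestamp for fast lookups, as in the lesson.
    return [f"CREATE INDEX idx_{table_name}_thread ON {table_name} (thread_id)",
            f"CREATE INDEX idx_{table_name}_ts ON {table_name} (timestamp)"]

def create_conversational_history_table(connection, table_name):
    # Drop-if-exists, create, index, then commit, mirroring the lesson's flow.
    with connection.cursor() as cursor:
        cursor.execute(f"""
            BEGIN
                EXECUTE IMMEDIATE 'DROP TABLE {table_name}';
            EXCEPTION WHEN OTHERS THEN NULL;
            END;""")
        cursor.execute(conversational_history_ddl(table_name))
        for statement in index_ddl(table_name):
            cursor.execute(statement)
    connection.commit()
```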
Now we just call create_conversational_history_table, passing in the database connection along with the table name we specified earlier. We do the same for the tool log history table by calling create_tool_log_table, which we imported from the helper module. Once this cell is executed, you should see a message that the CONVERSATIONAL_MEMORY and TOOL_LOG_MEMORY tables were created successfully, with indexes.

Now that we've created our SQL tables for conversational memory and the tool log, it's time to create SQL tables that can handle vector data. For this part, we'll use the Oracle database integration in the LangChain library. Specifically, we import the OracleVS module, which allows us to create indexes and tables that can handle vector data within Oracle AI Database. We also specify the DistanceStrategy used to measure the distance between two or more vectors, and import the ability to conduct hybrid search from the LangChain Oracle database integration.

To create our vector stores, we define a class that abstracts all the methods and the vector stores themselves. We'll refer to this as the StoreManager; it creates all our vector stores. Let's look at the specific methods called to create them. For the knowledge base, which is semantic memory, we initialize a vector store using the OracleVS LangChain integration. We specify the client, which is our database connection; the embedding_function, which uses the embedding model we initialized earlier; the table_name we specified; and lastly the distance_strategy, which in this case is cosine, a mathematical operation for measuring the distance between two or more vectors in a high-dimensional space.
We repeat this step for each form of memory we want our AI agent to have: knowledge base (semantic memory), workflow (workflow memory), toolbox (toolbox memory), entity (entity memory), and summary (summary memory). The StoreManager also contains methods to retrieve the memory stores themselves, so for each memory type we initialize, we can retrieve the store object and perform operations on it. Lastly, for the knowledge base, we set up hybrid search capability to enable the use of multiple retrieval strategies when retrieving information. Make sure you run the cell.

Next, we create an instance of the StoreManager class we defined earlier. We pass in the database connection as the client, the embedding model we initialized earlier as the embedding function, the table names we specified initially, the distance function, which, as mentioned, is COSINE, and the conversational and tool log table names. When you run this cell, your memory stores and vector stores are created. In the next step, we get instances of our memory stores and assign them to variables in our development environment, using the getter functions for each specific memory store that we defined within the store_manager.

Lastly, to ensure efficient retrieval of information from a database, you should always create an index. An index is a data structure that enables retrieval of information without scanning every item in the database. We'll create an index for every vector store we've made: using the safe_create_index function imported from the helper file, we create an index for each of the memory types backing our vector stores.
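Putting the StoreManager and index steps together, a sketch might look like this (the class shape and the HNSW index parameters are assumptions; the course's helper code may differ):

```python
class StoreManager:
    """Creates and hands out one vector store per memory type."""

    MEMORY_TYPES = ("knowledge_base", "workflow", "toolbox", "entity", "summary")

    def __init__(self, client, embedding_function, table_names):
        self.client = client
        self.embedding_function = embedding_function
        self.table_names = table_names  # mapping: memory type -> table name
        self.vector_stores = {}

    def create_stores(self):
        # Uses LangChain's Oracle integration; requires a live DB connection.
        from langchain_community.vectorstores.oraclevs import OracleVS
        from langchain_community.vectorstores.utils import DistanceStrategy
        for memory_type in self.MEMORY_TYPES:
            self.vector_stores[memory_type] = OracleVS(
                client=self.client,
                embedding_function=self.embedding_function,
                table_name=self.table_names[memory_type],
                distance_strategy=DistanceStrategy.COSINE)

    def get_store(self, memory_type):
        return self.vector_stores[memory_type]

def safe_create_index(client, vector_store, index_name):
    # Create a vector index, skipping stores whose index already exists.
    from langchain_community.vectorstores import oraclevs
    try:
        oraclevs.create_index(client, vector_store,
                              params={"idx_name": index_name, "idx_type": "HNSW"})
    except Exception as error:
        print(f"Skipped {index_name}: {error}")
```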
Once this cell is executed, you should get an output message stating that all the vector indexes have been created.

Now we move on to part two, where we create an instance of our memory manager. Remember, a memory manager abstracts all the operations we use to read and write information from the database. We import the MemoryManager from our helper file; it takes in the connection to our database and needs to be aware of all the tables we require access to.

Now it's time to use the memory manager, and the best way to do that is to ingest data into it and retrieve data from it. Remember, these are the read and write operations we highlighted. The data we'll use is arXiv papers retrieved from Hugging Face using the load_dataset method. To ensure the retrieved data follows a consistent format, the following cell extracts the key information from each paper in the loaded dataset: specifically, the title, abstract, subject, and submission date of each data point. We then concatenate all of this into a text variable and call the memory_manager's write_knowledge_base method, which writes to our agent's semantic memory, passing the concatenated string as the text argument and the remaining attributes as the metadata, as specified in these cells. Writing into the knowledge base performs a few operations: a vector representation of the text is created within the knowledge base, which enables semantic search against the rows of our semantic memory table, and the metadata is stored alongside it, so each row holds both the metadata and the vector representation.
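The ingestion step can be sketched as follows; format_paper is a hypothetical helper, and the dataset field names are assumed from the lesson's description rather than taken from the actual dataset:

```python
def format_paper(paper):
    # Concatenate the key fields into one text blob and keep them as metadata.
    text = (f"Title: {paper['title']}\n"
            f"Abstract: {paper['abstract']}\n"
            f"Subject: {paper['subject']}\n"
            f"Submitted: {paper['submission_date']}")
    metadata = {"title": paper["title"],
                "subject": paper["subject"],
                "submission_date": paper["submission_date"]}
    return text, metadata

def ingest_papers(memory_manager, papers):
    # write_knowledge_base embeds the text and stores the vector plus the
    # metadata as one row of the semantic memory table.
    for paper in papers:
        text, metadata = format_paper(paper)
        memory_manager.write_knowledge_base(text=text, metadata=metadata)
```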
To complete the overview of the memory_manager and its operations, we'll read from knowledge base memory. We use the memory_manager's read_knowledge_base function to retrieve the rows that match a query. Remember, each row in our semantic memory table contains a vector representation of its text, so we expect the rows returned to be semantically similar to the phrase "space exploration". When you execute this cell, you'll see information specifying which type of memory we're reading from, the Knowledge Base Memory; what the memory is, including its content and how it should be queried; and a specification of how information within this memory should be used. This is specifically for the LLM: remember, we're building memory-aware agents, which means these agents are aware of the memory types they have and how to use them. Notably, the passages returned from the query are those semantically similar to the query we passed in. This, essentially, is our AI agent's semantic memory, containing an existing knowledge base for a specific domain, space exploration. Great job on completing this coding session, where we got to see how a memory manager can be implemented and how it operates.
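To see what "semantically similar" means mechanically, here is a tiny self-contained illustration of what a read like this does under the hood: embed the query, then rank stored rows by cosine similarity. The two-dimensional vectors are toy stand-ins for real embeddings.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: 1.0 means same direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

rows = {  # pretend these vectors came from the embedding model
    "Mars rover findings": [0.9, 0.1],
    "Sourdough recipes":   [0.1, 0.9],
}
query_vector = [1.0, 0.0]  # toy embedding of "space exploration"
best_match = max(rows, key=lambda title: cosine_similarity(rows[title], query_vector))
print(best_match)  # Mars rover findings
```

A real vector index (such as the HNSW indexes created earlier) exists precisely to find these nearest rows without computing the similarity against every row in the table.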