Now that you have learned tools and techniques for document processing, you'll see them in action on AWS. You'll learn to implement an event-driven pipeline that automatically triggers ADE to process new documents uploaded to an S3 bucket. These parsed documents can then be loaded into a Bedrock knowledge base, enabling you to answer questions through agentic RAG built with AWS Strands Agents. Let's get to it. Before we begin, I would like to thank Nneoma Okoroafor, Partner Solutions Architect at AWS, for the collaboration.

Here's what we will cover in this lesson. First, how to implement a RAG pipeline on AWS. Second, what we mean by event-driven and serverless architecture. Third, an overview of the AWS resources you will use to implement the pipeline. And finally, how to build an agentic application with memory and a knowledge base search tool using Strands Agents.

Here's how you implemented the pre-processing phase in the previous lesson. First, the raw documents were stored locally and passed to the ADE parser, which also ran locally, using the computing and memory resources of your local environment. The parsed outputs were then passed to the embedder; we used an embedding model from OpenAI. Finally, the embedding vectors were stored in a local vector database using ChromaDB.

We're now going to swap all of these local components for AWS services. This will make the pipeline production-ready on the cloud, where it can easily scale to a larger number of documents. Instead of storing the documents locally, we'll upload them to an S3 bucket, which is cloud object storage. Instead of using the computing resources of your own machine, we'll use a Lambda function to run the parsing logic. The Lambda function provides a serverless, isolated computing environment on the cloud, and it will be automatically triggered when new documents are uploaded to S3. Finally, for the embedder and vector database, we'll use Amazon Bedrock, which provides serverless embeddings and knowledge bases. If you're unfamiliar with these resources or with what serverless means, don't worry; I'll explain them in this video.

You also saw how to use LangChain to define a retrieval object that extracts information from the vector database to answer user queries. You'll do the same on AWS. You'll build an agent using Strands Agents, an open-source framework for building production-ready agents with native AWS support, and you'll equip the agent with a retrieval tool that connects it to the knowledge base. You'll also use Amazon Bedrock to give your agent access to an LLM, and Amazon Bedrock AgentCore Memory, so you can create an agent that remembers user interactions for full context.

So here's how you can think about your architecture. It has AI-based components: LandingAI ADE for parsing and Strands Agents as the agentic framework. It also has three main AWS components: Amazon S3, AWS Lambda, and Amazon Bedrock.

Previously, I mentioned that AWS Lambda and Amazon Bedrock provide serverless services. So what does that mean? Serverless means two things. First, you don't provision or manage any servers; AWS handles the infrastructure and security updates. Second, you only pay when your code actually runs, not during idle time. This also means AWS automatically scales your application based on demand, whether you have 10 concurrent users or 10,000. And because there's no infrastructure to configure, you can rapidly prototype and deploy new features.
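To make the first swap concrete, here's a minimal sketch of uploading a raw document to S3 with boto3. The bucket name and key prefix are placeholders for illustration; your lab environment will define its own names.

```python
# Minimal sketch: upload a raw document to S3 so the event-driven
# pipeline can pick it up. Bucket and key names are illustrative only.
import boto3

s3 = boto3.client("s3")

BUCKET = "my-document-pipeline-bucket"   # hypothetical bucket name
KEY = "input/medical/sample_paper.pdf"   # hypothetical prefix and key

# Uploading the file is the "event producer" step: S3 emits an
# object-created event that will later trigger the parsing Lambda.
s3.upload_file("sample_paper.pdf", BUCKET, KEY)
print(f"Uploaded to s3://{BUCKET}/{KEY}")
```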
The second characteristic of your architecture is that it includes an event-driven component. The ADE parser will run automatically when a new document is uploaded to S3. How does that work? This is what's called event-driven architecture: a modern design pattern where systems are decoupled and communicate by sending and receiving events, which are essentially notifications that something has happened.

Let me break this down into three parts. The first is event producers. These are components, like S3 buckets, that emit events when pertinent actions occur. In your case, when you upload a file to S3, it emits a file-uploaded event. The second is the event channel, or broker. Services like Amazon EventBridge route these events to interested parties; think of it as a notification system that knows who needs to be informed about what. There are also other messaging services on AWS, like Amazon SNS, a pub-sub service, and Amazon SQS for queuing, which support simpler pub-sub and queuing patterns; EventBridge is best for event-driven architectures with more complex routing. The third is event consumers. Services like AWS Lambda subscribe to and react to specific events. Your Lambda function with the ADE parser is listening for that file-uploaded event and automatically starts processing when it arrives.

Here's a helpful analogy. Instead of constantly polling, asking "are we there yet?" by checking S3 every few seconds to see if there's a new file, an event-driven approach is like a push notification: S3 notifies Lambda the moment an interesting event happens, and Lambda immediately reacts. So when you upload a document to S3, S3 creates an event, EventBridge routes it, and your Lambda function automatically runs the ADE parser. No manual trigger is needed.

Let's now break down each component of your system and understand its specific role. You have three main AWS components: Amazon S3, AWS Lambda, and Amazon Bedrock. Let's dive into each one and see what they do.

First up is Amazon S3, which stands for Simple Storage Service. S3 is a scalable object storage service from AWS for files of any type and size. Think of it as an infinite digital filing cabinet accessible from anywhere on the internet; no matter how many files you add, it never runs out of space. This will serve as the central repository for your data. You store raw input data, like PDFs, images, or text files, in dedicated S3 buckets, and then you save the results of your AI processing, such as summaries, parsed documents, and transformed data, back into organized folders within those buckets. Just as a quick note, a bucket is the top-level container you create in S3 to organize your files, similar to a main folder on your computer.

Next is AWS Lambda. It is a serverless compute service that runs your code without you managing any servers. Think of it as a part-time robot assistant, or a function-on-demand, that only activates when called. It's not running all the time; it springs into action only when it's needed. Lambda executes small pieces of code in response to specific events, like S3 uploads, API calls, or scheduled triggers. In your case, when a document is uploaded to S3, Lambda will automatically run your ADE parsing logic. And you pay only for the compute time consumed, measured in milliseconds. So if your Lambda function runs for only 2 milliseconds, you pay for only those 2 milliseconds of compute.
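Here's a minimal sketch of what the event-consumer side can look like: a Lambda handler that reads the uploaded object's bucket and key from the event and writes the parsed result back to S3. The parse_with_ade function is a hypothetical placeholder for the ADE parsing logic from the lab, and the sketch assumes the direct S3 event notification format; the event shape differs slightly if it arrives via EventBridge.

```python
# Minimal sketch of a Lambda handler reacting to an S3 upload event.
# parse_with_ade() is a hypothetical placeholder for the ADE parsing logic.
import json
import boto3

s3 = boto3.client("s3")

def parse_with_ade(pdf_bytes: bytes) -> str:
    # Placeholder: in the lab, this is where the ADE parser runs.
    return "# parsed markdown"

def lambda_handler(event, context):
    # S3 event notifications deliver one or more records per invocation.
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]

        # Download the newly uploaded document.
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()

        # Parse it and write the result back under an output prefix.
        markdown = parse_with_ade(body)
        out_key = key.replace("input/", "output/").rsplit(".", 1)[0] + ".md"
        s3.put_object(Bucket=bucket, Key=out_key, Body=markdown.encode("utf-8"))

    return {"statusCode": 200, "body": json.dumps("done")}
```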
Now, for Lambda to access other AWS services, like reading from S3, it needs appropriate permissions. This is where IAM comes in. IAM, which stands for Identity and Access Management, is the security framework for controlling who or what can access your resources on AWS. When you create a Lambda function, you need to set up two things: an IAM role and an IAM policy. Let me explain the difference.

The role represents who has access. Think of a role as an ID badge or job title; it's the identity that your Lambda function assumes when it runs. The role proves "I am the S3-to-Bedrock-Processor function" or "I'm the document parser." Services like Lambda don't have usernames or passwords. Instead, they assume roles to prove their identity to other AWS services. So when your Lambda function starts running, it tells AWS, "I'm assuming the role of document processor, and here is my temporary access credential."

The IAM policy, however, is what it can do. A role by itself doesn't grant any permissions; that's where the policy comes in. Think of the policy as the list of rules written on that ID badge: it specifies exactly what actions the role is allowed to perform. The policy is a JSON document that defines actions like s3:GetObject, which means you can read files from S3, or s3:PutObject, which means you can write files to S3.

Putting it all together, when you create a Lambda function, you have to create an IAM role, the identity or badge; attach an IAM policy to that role, the rules on the badge; and assign the role to your Lambda function. Hopefully the key takeaways are clear: the role defines who it is, and the policy specifies what it can do.
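To make that concrete, here's a rough sketch of creating a role, attaching a policy, and getting the role's ARN to hand to a Lambda function, using boto3. The role name, policy name, and bucket ARN are illustrative only; in the lab these resources are typically set up for you.

```python
# Rough sketch: create an IAM role (the badge), attach a policy
# (the rules on the badge), and use it for a Lambda function.
# Names and permissions below are illustrative only.
import json
import boto3

iam = boto3.client("iam")

# The trust policy says "the Lambda service may assume this role."
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

role = iam.create_role(
    RoleName="document-processor-role",          # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# The permissions policy lists what the role is allowed to do.
permissions_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": ["s3:GetObject", "s3:PutObject"],
        "Resource": "arn:aws:s3:::my-document-pipeline-bucket/*",  # hypothetical bucket
    }],
}

iam.put_role_policy(
    RoleName="document-processor-role",
    PolicyName="s3-read-write",
    PolicyDocument=json.dumps(permissions_policy),
)

# When creating the Lambda function, you pass the role's ARN so the
# function assumes this identity at runtime.
print(role["Role"]["Arn"])
```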
Finally, Amazon Bedrock. Amazon Bedrock is a fully managed service that provides access to foundation models, large language models like Claude and Amazon Nova, along with embedding models, all through a single API. Think of it as a menu of pre-trained AI models: instead of training and hosting your own models from scratch, you can simply select the model you need and use it.

Amazon Bedrock powers three key parts of your architecture. First, the knowledge base. This is where your parsed documents are automatically embedded and stored as vectors. Bedrock handles the embedding process, converting your text into numerical representations, and provides semantic search capabilities so your agent can retrieve relevant information. Second, the agent runtime. Bedrock provides the foundation model that powers your agent's reasoning and responses; when your agent receives a query, it uses Bedrock's LLM to generate intelligent, grounded answers. Third, AgentCore Memory. This is where your agent's memory is stored: conversation history, user preferences, and semantic facts. Memory allows your agent to remember context across interactions, making it feel more like a real assistant. Keep in mind that Bedrock is serverless in nature, so it scales automatically and you only pay for what you use.

Now, let's go through the agentic component of your architecture: Strands Agents. Strands Agents is an open-source SDK from AWS, specifically designed for building agents in notebooks and production environments, and it helps simplify orchestration. Strands Agents seamlessly integrates with AWS resources such as S3, Bedrock, and other tools without requiring complex manual coding. It also helps you create the agent definition: with Strands Agents, you specify which Bedrock models to use, what tools your agent can access, and how memory works. This makes your agent configuration clear, maintainable, and easy to modify. Last but not least, it is also enterprise-ready. Strands Agents is production-grade: it comes with built-in tooling for tracing and logging, performance monitoring, and error handling, and it supports flexible deployment patterns.

Now let's see how all these components work together, step by step. Step one is uploading the document. In the lab, you'll work with medical research papers. You upload a PDF to the S3 input medical folder. The S3 upload event triggers your Lambda function, Lambda runs the ADE parser to parse and structure the content, and the parsed output is uploaded back to S3 in the output medical folder in two formats: a markdown file for the parsed content, and a JSON file containing chunk information for visual grounding, like chunk type and bounding box coordinates. This all happens automatically: you just upload the file, and the event-driven architecture takes care of the rest.

Step two is ingesting the parsed documents into your knowledge base. You start ingesting the markdown files from the S3 output medical folder into the knowledge base. Amazon Bedrock reads the files, generates the embeddings for each chunk of text, and stores them in a vector database. Once the ingestion is complete, your knowledge base is searchable, and the agent can query it to retrieve relevant information. Note that you could also implement a separate Lambda function that runs the ingestion job from the S3 bucket into the knowledge base automatically, but in the lab we'll keep it simple and implement only one Lambda function, for parsing.

Step three is creating the search_knowledge_base tool. This tool connects your agent to the knowledge base. When the agent needs information from your documents, it calls this tool, which queries the vector database and returns the most relevant content. This is what enables your agent to answer questions based on your uploaded documents.

Step four is setting up the agent memory. Amazon Bedrock AgentCore Memory provides three types of long-term memory. The first is user preferences, which stores likes, dislikes, and personal context. The second is semantic memory, which stores facts, entities, and relationships. The third is summary memory, which stores conversation summaries and key points. Importantly, memory persists across sessions, so your agent can remember past interactions and provide personalized responses.

Step five is building the agent itself. You will configure the system prompt, the instructions for how the agent should behave; the search_knowledge_base tool you created; the memory, enabled by AgentCore Memory; and the LLM hosted on Bedrock. Once everything is configured, your agent is ready to interact with users.

Finally, step six: you can chat with your agent. Here is an example. The user might say, "I like sushi with tuna." The agent might respond, "Got it! I'll remember that preference." In a later session, you might ask, "What should I eat for lunch today?" and the agent might respond, "How about sushi? You mentioned that you like tuna." The key feature is that the agent remembers you. It's not just answering isolated questions in one session; it's building context and learning your preferences over time.

Now it's lab time. Let's switch to the notebook and start building.
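As a quick reference while you work through the lab, here's a minimal sketch of steps three and five: a search_knowledge_base tool backed by the Bedrock Retrieve API, wired into a Strands agent. It assumes the Strands Agents SDK exposes an Agent class and a tool decorator roughly as shown, the knowledge base ID and model ID are placeholders, and the AgentCore Memory wiring is omitted here.

```python
# Minimal sketch: a knowledge base search tool plus a Strands agent.
# KB_ID and MODEL_ID are placeholders; the Strands usage is a simplified
# sketch of what the lab notebook sets up, not the exact lab code.
import boto3
from strands import Agent, tool

bedrock_rt = boto3.client("bedrock-agent-runtime")

KB_ID = "YOUR_KNOWLEDGE_BASE_ID"      # placeholder knowledge base ID
MODEL_ID = "us.amazon.nova-pro-v1:0"  # placeholder Bedrock model ID

@tool
def search_knowledge_base(query: str) -> str:
    """Retrieve the most relevant chunks from the Bedrock knowledge base."""
    response = bedrock_rt.retrieve(
        knowledgeBaseId=KB_ID,
        retrievalQuery={"text": query},
        retrievalConfiguration={"vectorSearchConfiguration": {"numberOfResults": 3}},
    )
    chunks = [r["content"]["text"] for r in response["retrievalResults"]]
    return "\n\n".join(chunks)

# The agent combines the system prompt, the tool, and the Bedrock-hosted LLM.
agent = Agent(
    model=MODEL_ID,
    tools=[search_knowledge_base],
    system_prompt="Answer questions using the knowledge base when relevant.",
)

# The agent decides when to call the tool while answering.
print(agent("What does the uploaded paper say about treatment outcomes?"))
```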