In this lesson, we'll go through a problem that doesn't get enough attention: deploying agents to production is surprisingly hard. You've built an agent, maybe with LangGraph, BeeAI, Google's Agent Development Kit, or your own custom code. It works great on your laptop, but now you need to actually deploy it so your organization can use it. In this lesson, we're going to take all of the A2A agents you've built in previous lessons and deploy them onto Agent Stack, IBM Research's open-source platform that provides self-hostable infrastructure for sharing your agents. Let's code and innovate.

When you decide to take your agent from running locally on your device to something that can be run remotely, suddenly you're dealing with a whole list of infrastructure to-dos. You need persistent storage for session and conversation history. You need to configure the LLM APIs and manage those connections. If you're doing RAG, you need file storage and vector search. Then there's the deployment layer, which requires containers, scaling, monitoring, logging, and more. And you can't forget security, where you'll handle rate limiting, authentication, and access controls. Oh, and you'll need a way for end users to interact with your deployed agents through a UI. There's a lot that goes into a full platform.

So what are your options? There are framework-specific platforms like LangGraph Cloud or CrewAI's platform. These are self-hostable, which is great, but you're locked into that one framework. If your team wants to experiment with different approaches, or you're managing multiple projects that use different frameworks, that's not supported. Then there are cloud vendor platforms like Azure AI, Vertex AI, AWS Bedrock, and Vercel AI. These are framework-agnostic, which solves that problem, but now you're committed to a specific cloud vendor and locked into specific parts of the stack, which means you lose flexibility if you want to change your setup. And depending on your industry and use case, hosted platforms may simply not be an option, because data can't leave your infrastructure and needs to stay on your own servers. That leaves building it yourself, which takes months of infrastructure work before you even deploy your first agent, or you can use Agent Stack.

Agent Stack is an open-source Linux Foundation project by IBM Research that provides self-hostable infrastructure for your agents, optimized for speed, flexibility, and control. It lets you deploy any agent, regardless of what framework you built it with, to production in hours instead of weeks or months. Because it's infrastructure, you only need to adopt the components of the platform that suit you and your company's needs, and you're free to build the rest as you see fit. It's designed for two types of teams. First, agent development teams who are exploring multiple frameworks and need real infrastructure that doesn't lock them in; that's why every agent uses the A2A protocol to standardize communication. And second, platform teams who are managing multiple internal projects and need one unified system for all their deployments. Both of these teams need the same things: control over their data, flexibility to use any framework, and the ability to deploy fast. That's what Agent Stack gives you: no vendor lock-in, no months of infrastructure work, just straightforward, production-ready deployment for AI agents. Now let's install it together in just a couple of short commands.
You'll only need one line to install Agent Stack through the CLI. Now we'll go through the setup process. It asks whether we want to start the Agent Stack platform now: yes. It asks whether we want to configure our LLM provider, and we do, so we hit yes. Next it asks us to choose our provider; I'm going to choose Google Gemini for this course, and we enter our API key. When it asks whether to use the default model, I hit no, because we're going to use a lighter-weight model for this exercise: gemini-2.5-flash-lite. We'll use the recommended embedding model, and we're all set. Anytime you want to change your model provider, you can run agentstack model setup.

Inside Agent Stack, you have three layers. At the core of the stack is the Agent Stack Server, a self-hostable server that can be deployed via Helm charts and provides your agents with a scalable runtime. Then we have the top-level components: the Agent Stack Server SDK, which we'll use in this lesson to get agents ready to deploy onto Agent Stack; a CLI to deploy, configure, and manage your agents on the platform; and both auto-generated UI extensions to get your agent in front of others quickly and a client SDK for custom-built UIs. Optionally, you can take advantage of the infrastructure services, such as LLM provider management, RAG capabilities, file storage, a data layer, authentication, and secret management. On Agent Stack, every deployed agent is implicitly an A2A agent, because the Agent Stack Server SDK is built on top of the agent-to-agent protocol. You can learn more about the Agent Stack SDK in the resources available at the end of the lesson.

Now we're going to move on to the AgentStack-HealthcareAgent repo. This is a healthcare-based example of how to build an A2A agent that calls other A2A agents and deploy it onto an open-source platform called Agent Stack. It's a public repo, so you can take a look at it and go through the code at any time. We're going to review some of the agents we built in the previous lessons and see how to deploy them onto Agent Stack. I'll walk through transforming two of the agents we created, the provider agent and the healthcare agent. You'll learn how the Agent Stack platform supplies the server wrapper that makes it an A2A agent, the LLM provisioning, the UI if desired, and the handoff between agents. You don't need to rewrite the agent logic; instead, you're adding the platform plumbing.

Let's start with the provider agent: the provider_agent folder, then the agentstack_agents folder inside it, and finally our provider_agent.py file. We're not going to read through every line, as these are the same agents we built in the previous lessons, but I am going to point out the parts that are necessary to bring the agent onto the Agent Stack platform. First we have our ProviderAgent class; this is the same agent logic we created in the previous lesson. Then we create an instance of the Agent Stack server. We decorate the entry point with the server.agent decorator and give it the name ProviderAgent. This binds the agent to the server and gives it a name by which it can be discovered on the Agent Stack platform. Since we want to use the LLM inference, the message input, and the context managed by the platform, we create a provider agent wrapper. This makes sure that the ProviderAgent uses the Agent Stack extension functionality.
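To make that wiring concrete, here is a minimal sketch of what the binding looks like. The import paths, signatures, and parameter names are assumptions based on how the SDK is described in this lesson, so treat it as a shape to compare against provider_agent.py rather than code to copy verbatim.

```python
# Sketch only: import paths and signatures are assumed from the lesson's description
# of the Agent Stack SDK, not copied from the real package.
from agentstack_sdk.server import Server          # assumed module path
from agentstack_sdk.a2a.types import Message      # assumed platform message type


class ProviderAgent:
    """Unchanged agent logic from the earlier lesson (the LangChain code lives here)."""

    async def answer(self, question: str) -> str:
        ...


server = Server()  # instance of the Agent Stack server


@server.agent(name="ProviderAgent")  # binds the agent to the server under a discoverable name
async def provider_agent(message: Message, context):
    # Wrapper entry point: receives the platform-managed message and context, so the
    # agent can use the LLM inference, input, and session state supplied by Agent Stack.
    # Building the LLM client from platform credentials and yielding the response
    # message are covered next.
    ...
```

The decorator is what registers the agent with the platform; everything inside the wrapper function is still your own agent code.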
We build our LangChain OpenAI client using the credentials from the platform rather than the local environment variables or secrets we might be used to. Just as the input needed to be an Agent Stack message object, so does the output, in order for it to be properly returned to the Agent Stack platform, so we yield an agent message as the agent response. Finally, we run the server and set the appropriate host and port. It's important to remember that we didn't change any of the base agent logic; the LangChain code remains the same. We just take advantage of the Agent Stack SDK's extensions to turn the A2A agent into an Agent Stack compatible agent.

Now let's move on to the healthcare agent. We'll see the same basic platform setup, plus extra features that let the user interact with the healthcare agent through an automatically generated UI. We have our imports, then we instantiate our server. We have some helper functions that let us keep session memory on the platform across agent handoffs, along with a few other helpers. Then, again, we use the @server.agent decorator to bind the agent. This agent is named Healthcare Concierge, and we also set a number of extra parameters that give the platform details about the agent, like its input and output modalities, its AgentDetail, its UI greeting, contributors, tools, and more. Again, we're using the LLM inference managed by the platform, so we add the LLMServiceExtension. You might also notice that we've added a TrajectoryExtension. This shows the agent's trace as it works through a problem, giving the user better insight into what's happening behind the scenes.

Just like in the last lesson, where we built our local healthcare agent, we're adding the PolicyAgent, ResearchAgent, and ProviderAgent as tools for the main orchestrating agent to hand off to. This time, however, we're making the assumption that those agents have already been deployed and are managed by Agent Stack, so we discover them by name from the directory of active agents on Agent Stack and wrap them as handoff tools. Then we build our RequirementAgent and add the other agents as conditional requirements. For demo purposes, we want to ensure that each agent is run at least once, to show that the main orchestrator agent, which is an A2A agent, can call other A2A agents managed and deployed on the platform (there's a sketch of this handoff pattern just after the takeaways below). We stream our response, trajectory, and final answer to the UI. Finally, we start the Agent Stack server for the Healthcare Concierge and set the appropriate host and port.

Some key takeaways from this lesson: to transform an agent built in any framework into one that will run on Agent Stack, you wrap it with the server and add the @server.agent decorator; use the Agent Stack compatible message input and output classes; use the LLM extension to build your LLM client rather than relying on local secrets or environment variables; and, if the agent is stateful, load, store, and maintain its session history. Optionally, you can add trajectory for observability, and lastly configure a run() entry point to start the server. Remember that your core agent logic remains unchanged; Agent Stack just adds the hosting and wiring to make it run on a deployed platform so you can share your agents with your organization.
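Here is a rough sketch of that orchestration pattern: discover the already-deployed specialist agents by name, wrap them as handoff tools, and stream trajectory updates plus a final message back to the platform. The extension classes are assumed from the lesson's description, and helpers like discover_agent, make_handoff_tool, build_requirement_agent, and final_answer_message are hypothetical stand-ins for the repo's actual code.

```python
# Sketch only: extension classes and helper functions are assumed or hypothetical,
# based on the lesson's description of the Healthcare Concierge agent.
from agentstack_sdk.server import Server                                              # assumed
from agentstack_sdk.a2a.extensions import LLMServiceExtension, TrajectoryExtension    # assumed

server = Server()

SPECIALISTS = ["PolicyAgent", "ResearchAgent", "ProviderAgent"]  # names as deployed on the platform


@server.agent(name="Healthcare Concierge")  # plus detail, modalities, greeting, tools, contributors, ...
async def healthcare_concierge(message, context,
                               llm: LLMServiceExtension,
                               trajectory: TrajectoryExtension):
    # 1. Discover the specialist agents already deployed on Agent Stack by name and
    #    wrap each one as a handoff tool (hypothetical helpers standing in for the repo's).
    tools = [make_handoff_tool(discover_agent(name)) for name in SPECIALISTS]

    # 2. Build the orchestrating RequirementAgent with conditional requirements so each
    #    specialist runs at least once (the demo behaviour described above).
    orchestrator = build_requirement_agent(llm, tools)  # hypothetical helper

    # 3. Stream reasoning steps to the UI via the trajectory extension, then yield the
    #    final answer as a platform message.
    async for step in orchestrator.run(message):
        yield trajectory.update(step)            # assumed API for emitting a trajectory step
    yield final_answer_message(orchestrator)     # hypothetical helper returning an agent message
```

The design point to notice is that the orchestrator never imports the specialists' code; it only knows their deployed names, and the platform handles the A2A calls between them.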
In my public repo, each agent has a folder, and within that folder are a few important components for deploying the agent. We already walked through the agent logic itself in the agentstack_agents folder, but I'd also like to show you the toml file. You need to make sure the toml file has a [project.scripts] section; this is what actually starts the server (there's a sketch of that section at the very end of this lesson). Then we have a Dockerfile, which you can use to build your agents. In the resources section of this lesson, you'll find a repo that you can use as a starter template for deploying your own agents. Once you have everything in the correct structure, you can create a new release in GitHub. This is optional, but it gives us version control, so you can update your agents when a new version is available or roll back to a previous one.

Now that everything is set up correctly, it only takes one command to deploy your agent to Agent Stack, and whether you're deploying to your local instance of Agent Stack or your organization's hosted version, you do it the same way. First, let's install the PolicyAgent. We can see that our PolicyAgent has been deployed onto the platform. Now let's deploy the ProviderAgent, then the ResearchAgent, and finally our main orchestrating agent, the healthcare agent. You can check that your agents have been deployed correctly by using agentstack list. You can see that all of the agents we've deployed are here, plus some other agents that the platform comes with. Before we run the agents, we'll need to set our Serper API key, which the ResearchAgent needs. We can add it directly through the CLI, which I'm going to do, or you can manage your API keys through the UI as well.

You can interact with agents through the CLI, or you can use the UI. Let's use the UI, since it's more user-friendly. We start the UI with agentstack ui. We can see that all the agents we've deployed have made it onto Agent Stack. Let's try the PolicyAgent individually first. We can ask it: What is my coinsurance for office visits, both in and out of network? Now let's use the healthcare agent, the main orchestrator that hands off tasks to other A2A agents. Let's ask it: I need mental health assistance and live in Austin, Texas. Who can I see, and what is covered by my policy? Because we implemented the trajectory extension, we can actually see what's happening behind the scenes. The agent was initialized, and we can watch it thinking and reasoning. First the ProviderAgent was called, and it found providers in Austin, Texas. Then the PolicyAgent was called and provided its response. Then the orchestrator went back to thinking and called the ResearchAgent last. Finally, it returned a final answer: it found a mental health provider in Austin, Texas and gave me her information, a board-certified psychiatrist with 13 years of experience, and it answered my question about my policy coverage.

In this lesson, we've taken four separate A2A agents and enabled them to run locally or remotely, using Agent Stack to manage the LLM inference, memory, agent runtime, automatic UI, and more, all without needing to build the platform plumbing from scratch. With Agent Stack, you can take the open-source platform and deploy it behind your organization's firewall to share your agents with others in your org, which is great for a team that needs full control over its data and wants to manage all its agent deployments in one place without vendor lock-in. Visit the resources at the end of the lesson to learn more.
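As a quick reference before you head to the resources, the [project.scripts] piece of the toml file mentioned earlier looks roughly like this. The project name, script name, and module path are illustrative, based on the folder layout described in this lesson, so match them to your own package.

```toml
# pyproject.toml (illustrative): the [project.scripts] entry is what starts the agent's server
[project]
name = "provider-agent"
version = "0.1.0"

[project.scripts]
# script name and module path are examples following the agentstack_agents/provider_agent.py layout
provider-agent = "agentstack_agents.provider_agent:main"
```

Together with the Dockerfile in the same folder, this is what lets Agent Stack build and run each agent from its GitHub release.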