DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human lives.

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

You will be able to see all the notebook files for the lesson, including any helper functions used in the notebook on the left sidebar. See the following image for the steps above.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps shown above ("File" => "Open"), click on the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from the Quality option, choose the quality that works best for your internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set out specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read or run notebooks or watch the material. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your practical projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


Welcome to Evaluating AI Agents, built in partnership with Arize AI. Say you're building an AI coding agent. You might have to carry out a lot of steps to generate good code, such as planning, using tools, reflecting, and so on. Using an evaluation-driven development process will make your development much more efficient. In this course, you'll learn how to add observability to your agent-based applications. This way, you will see what the agent is doing every step of the way, so you can evaluate it component by component and efficiently drive improvements at the component level, and then also at the whole-system level.

So if you're asking yourself questions like whether you should update a prompt at the last step, update the logic of the workflow, or change the large language model you're using, having a disciplined, evaluation-driven process will help you make these decisions systematically, rather than by randomly trying a lot of things to see what works. If you've heard of error analysis, which is a key concept in machine learning, this course teaches you how to do it in the agentic workflow development process. If you haven't heard of error analysis, that's fine too. This course takes an important set of ideas and shows you how to apply them to develop agentic workflows efficiently.

The instructors of this course are John Gilhuly, who is Head of Developer Relations, and Aman Khan, who is Director of Product at Arize AI. It's been fun working with you on this course.

Thank you, Andrew. We're excited to teach this course.

Thank you.

Say you're building a research agent that searches the web, identifies sources, collects content, summarizes findings, and then maybe iterates if it identifies any weaknesses in its output. When you're building a complex system like this, you need to evaluate the quality of each step's output.
For example, for source selection, you might create a test set that comprises research topics and the corresponding sets of expected sources, and then measure the percentage of times that the agent chooses the correct sources. For open-ended tasks like summarization, you can prompt a separate large language model, applying what we call large language model as a judge (LLM-as-a-judge), to evaluate the quality of more open-ended output such as a text summary. Apart from testing and improving the quality of your agent's output, you can also evaluate the path taken by the agent to ensure it doesn't get stuck in a loop or repeat steps unnecessarily.

So, in this course, you'll learn how to structure your evaluations to iterate on and improve both the output quality and the path taken by your agent. You'll do this by creating a code-based agent that operates as a data analyzer. The agent will have access to a set of tools that allow it to connect to a database and perform analysis, a router that identifies which tool to use, and a memory that keeps track of the chat history.

You'll collect and evaluate traces of the steps taken by your agent to process a query, and visualize the collected data. You'll then learn how to evaluate each tool in your agent workflow using different types of evaluators. You'll also evaluate whether the router chooses the right tool based on the user's query and whether it extracts the right parameters to execute the tool, and you'll assess the trajectory taken by the agent. Finally, you'll put all of your evaluators into a structured experiment that you can use to iterate on and improve your agent. While the course focuses on applying evaluation during development, you'll also learn how you can monitor your agent in production.

Many people have worked to create this course. I'd like to thank Mikyo King, Xander Song, and Aparna Dhinakaran from Arize AI. From DeepLearning.AI, Hawraa Salami has also contributed to this course.
John and Aman are both experts on the important topic of how to evaluate AI agentic workflows. Let's now go on to the next video, and I hope you enjoy the course and learn a lot from John and Aman.
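
To make two of the checks mentioned in the introduction more concrete, here is a minimal Python sketch, not the course's own code. It assumes a test set that maps each research topic to its expected sources, and a trajectory recorded as a plain list of step names. The function names, topics, and source names are all hypothetical and exist only for illustration.

```python
from collections import Counter


def source_selection_accuracy(test_set, chosen_sources):
    """Fraction of topics for which the agent chose exactly the expected sources.

    test_set: dict mapping topic -> set of expected sources (hypothetical data)
    chosen_sources: dict mapping topic -> iterable of sources the agent picked
    """
    if not test_set:
        raise ValueError("test_set must contain at least one topic")
    correct = sum(
        1
        for topic, expected in test_set.items()
        if set(chosen_sources.get(topic, ())) == set(expected)
    )
    return correct / len(test_set)


def repeated_steps(trajectory):
    """Return the steps that occur more than once in a trajectory, with counts.

    Repetition is only a crude signal: some repeats are legitimate, so treat
    the result as a prompt for inspection rather than a verdict.
    """
    counts = Counter(trajectory)
    return {step: n for step, n in counts.items() if n > 1}


# Hypothetical example data.
test_set = {
    "topic_1": {"source_a", "source_b"},
    "topic_2": {"source_c", "source_d"},
}
agent_choices = {
    "topic_1": ["source_b", "source_a"],  # matches the expected set
    "topic_2": ["source_c"],              # misses source_d
}

print(source_selection_accuracy(test_set, agent_choices))
print(repeated_steps(["search", "collect", "search", "summarize", "search"]))
```

Exact set matching is a deliberately strict scoring choice. A softer variant could score the overlap between chosen and expected sources instead.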
Evaluating AI Agents
  • Introduction (Video, 3 mins)
  • Evaluation in the time of LLMs (Video, 7 mins)
  • Decomposing agents (Video, 6 mins)
  • Lab 1: Building your agent (Video with Code Example, 16 mins)
  • Tracing agents (Video, 4 mins)
  • Lab 2: Tracing your agent (Video with Code Example, 16 mins)
  • Adding router and skill evaluations (Video, 12 mins)
  • Lab 3: Adding router and skill evaluations (Video with Code Example, 17 mins)
  • Adding trajectory evaluations (Video, 5 mins)
  • Lab 4: Adding trajectory evaluations (Video with Code Example, 9 mins)
  • Adding structure to your evaluations (Video, 7 mins)
  • Lab 5: Adding structure to your evaluations (Video with Code Example, 15 mins)
  • Improving your LLM-as-a-judge (Video, 4 mins)
  • Monitoring agents (Video, 6 mins)
  • Conclusion (Video, 1 min)
  • Appendix - Resources, Tips and Help (Code Example, 10 mins)