Welcome to "Building AI Browser Agents", built in partnership with AGI Inc. AI browser agents, or AI web agents, can log into websites, fill out forms, click through web pages, or even place an online order for you. Your AI web agent can use both visual information (that is, screenshots) and structural information, such as the HTML or the Document Object Model (DOM) representation of a web page, to reason and take actions. If you open a page on a website and look at the code underlying it, you see how large the action space can be for the agent at each step. Since these agents can run long sequences of actions automatically, any error can have unintended consequences, like paying for the wrong flights or ordering random products. If the agent misreads a single field, say a product name, it can head down the wrong path entirely, and these errors can compound quickly. In this course, you'll learn about these problems and several approaches to tackle them. I'm delighted to introduce the instructors, Div Garg and Naman Garg, who are the co-founders of AGI Inc. Div, Naman, and their team have built MultiOn, a web agent platform based on the approach they published in the AgentQ paper.

Thanks, Andrew. To address the challenges you mentioned, we have introduced AgentQ. AgentQ combines Monte Carlo Tree Search (MCTS) with a self-critique mechanism and iterative fine-tuning using Direct Preference Optimization (DPO). During AgentQ's search process, different branches of sequential actions are explored and their outcomes are evaluated. The simulations, combined with their feedback, are used to create preference pairs at each node of the search tree. The DPO algorithm is then used to fine-tune the underlying language model policy by learning from these high-level preferences. This helps the policy favor actions that lead to better outcomes or are ranked higher by the AI feedback. In this course, you will build several web agents.
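The preference-learning step described above can be sketched in a few lines of Python. This is a minimal illustration, not the AgentQ implementation: the helper names `preference_pairs` and `dpo_loss`, the `min_gap` threshold, the example action names, and the scores are all assumptions made for the sketch. The loss is the standard per-pair DPO objective, which compares how much the policy and a frozen reference model each prefer the chosen action over the rejected one.

```python
import math
from itertools import combinations


def preference_pairs(scored_actions, min_gap=0.2):
    """Turn sibling actions at one search-tree node into (chosen, rejected) pairs.

    scored_actions maps an action to a score (for example, a blend of search
    value and critic feedback). The pairing rule and min_gap are assumptions.
    """
    pairs = []
    for (a, score_a), (b, score_b) in combinations(scored_actions.items(), 2):
        if abs(score_a - score_b) >= min_gap:  # skip near-ties: weak signal
            pairs.append((a, b) if score_a > score_b else (b, a))
    return pairs


def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Standard DPO loss for one preference pair, from log-probabilities."""
    chosen_margin = policy_logp_chosen - ref_logp_chosen
    rejected_margin = policy_logp_rejected - ref_logp_rejected
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    return math.log1p(math.exp(-logits))


if __name__ == "__main__":
    # Hypothetical scores for three candidate actions at one node.
    node_scores = {"click_checkout": 0.9, "click_ad": 0.1, "scroll": 0.85}
    print(preference_pairs(node_scores))
    # Made-up log-probabilities for one pair under the policy and the reference.
    print(dpo_loss(-0.5, -3.0, -1.0, -2.0))
```

Minimizing this loss over many pairs pushes the policy to raise the probability of the chosen action relative to the rejected one, while `beta` controls how far it may drift from the reference model.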
You will build a simple web agent that analyzes the DeepLearning.AI website and lists all the courses on a specific topic. Then you will extend it to take actions like clicking on a course or summarizing it, and even signing up for The Batch newsletter. Next, you will dive deep into MCTS, which is an integral part of our AgentQ method, and solve a grid world problem of finding the optimal path. We will then explore a variant of AgentQ plus MCTS that takes a course title, searches the web, and navigates the results until it finds the right course. You will visualize and analyze the different tree paths the agent takes until it accomplishes the goal. Many people have worked to create this course. I'd like to thank Michelle Gee and Milind Maiti from AGI Inc. From DeepLearning.AI, Esmaeil Gargari and Geoff Ladwig have also contributed to this course. The first lesson will be an introduction to AI web agents.

That sounds great! Now let's have you, rather than your web agent, get started.
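As a rough preview of the grid world exercise mentioned in the transcript, here is a minimal MCTS sketch in Python. It is not the course's notebook code: the 4x4 grid, the start and goal cells, the rollout horizon, the discounted reward, and the exploration constant are all assumptions chosen for illustration. It follows the usual four phases: selection with UCT, expansion, random rollout, and backpropagation.

```python
import math
import random

GRID = 4                                    # assumed 4x4 grid
START, GOAL = (0, 0), (3, 3)                # assumed start and goal cells
MOVES = [(0, 1), (1, 0), (0, -1), (-1, 0)]  # right, down, left, up
MAX_STEPS = 12                              # assumed horizon for tree and rollouts


def step(state, move):
    r, c = state[0] + move[0], state[1] + move[1]
    # Moving into a wall leaves the agent where it is.
    return (r, c) if 0 <= r < GRID and 0 <= c < GRID else state


class Node:
    def __init__(self, state, depth, parent=None):
        self.state, self.depth, self.parent = state, depth, parent
        self.children = {}   # move -> Node
        self.visits = 0
        self.value = 0.0     # sum of rollout rewards seen through this node


def rollout(state, depth, rng):
    # Random walk until the goal or the horizon; shorter paths earn more.
    while state != GOAL and depth < MAX_STEPS:
        state = step(state, rng.choice(MOVES))
        depth += 1
    return 0.95 ** depth if state == GOAL else 0.0


def select_child(node, c=1.4):
    # UCT: average reward plus an exploration bonus for rarely tried moves.
    return max(
        node.children.values(),
        key=lambda ch: ch.value / ch.visits
        + c * math.sqrt(math.log(node.visits) / ch.visits),
    )


def mcts(iterations=2000, seed=0):
    rng = random.Random(seed)
    root = Node(START, 0)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while the node is fully expanded.
        while (node.state != GOAL and node.depth < MAX_STEPS
               and len(node.children) == len(MOVES)):
            node = select_child(node)
        # 2. Expansion: add one untried move, unless the node is terminal.
        if node.state != GOAL and node.depth < MAX_STEPS:
            move = rng.choice([m for m in MOVES if m not in node.children])
            child = Node(step(node.state, move), node.depth + 1, node)
            node.children[move] = child
            node = child
        # 3. Simulation: estimate the new node's value with a random rollout.
        reward = rollout(node.state, node.depth, rng)
        # 4. Backpropagation: update statistics up to the root.
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return root


def best_path(root):
    # Follow the most-visited child at each level of the tree.
    path, node = [root.state], root
    while node.children and node.state != GOAL:
        node = max(node.children.values(), key=lambda ch: ch.visits)
        path.append(node.state)
    return path


if __name__ == "__main__":
    print(best_path(mcts()))
```

With enough iterations the most-visited path tends to lead toward the goal, but how quickly that happens depends on the reward design and the exploration constant, so treat the defaults here as starting points to experiment with.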
Building AI Browser Agents
  • Introduction (Video, 2 mins)
  • Intro to Web Agents (Video, 11 mins)
  • Building a Simple Web Agent (Video with Code Example, 7 mins)
  • Building an Autonomous Web Agent (Video with Code Example, 9 mins)
  • Agent Q (Video, 8 mins)
  • Deep Dive into AgentQ and MCTS (Video with Code Example, 9 mins)
  • Future of AI Agents (Video, 5 mins)
  • Conclusion (Video, 1 min)
  • Appendix – Tips and Help (Code Example, 1 min)