Welcome to Building Toward Computer Use with Anthropic, built in partnership with Anthropic and taught by Colt Steele, who is Anthropic's Head of Curriculum. Welcome, Colt.
Thanks, Andrew. I'm delighted to have the opportunity to share this course with all of you.
Anthropic made a recent breakthrough and released a model that can use a computer. That is, it can look at the screen of a computer, usually one running in a virtual machine, by taking a screenshot, and then generate mouse clicks or keystrokes in sequence to execute a task, such as searching the web in a browser and downloading an image. This computer use capability is built by combining many features of large language models, including their ability to process an image, such as understanding what's happening in a screenshot, and to use tools that generate mouse clicks and keystrokes. These are wrapped in an iterative agent workflow that carries out complex tasks by taking many actions on that computer. In this course, you'll learn about the individual features, which will be useful for your applications even outside of LLM-based computer use, and see how they all come together for computer use. Colt will show you how all this works.
Thanks, Andrew. In this course, you will learn how to use many of the models and features that combine to enable computer use. Here's how the course will progress. You'll first learn a little about Anthropic's background and vision and what's unique about our family of models. Then we'll use the API to make some basic requests. This leads to multimodal requests, where you'll use the model to analyze images. Then you'll dive into prompting, an area Anthropic has really leaned into, since solid prompting makes models much more predictable. You'll learn about the prompting tips that actually matter, things like chain of thought and n-shot prompting, and get a chance to use our prompt improver tools.
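The screenshot, decide, act loop described above can be sketched in a few lines. This is only an illustration of the control flow, not Anthropic's implementation: `FakeComputer`, `fake_model`, and `run_agent` are hypothetical names, the "screenshot" is a text string rather than an image, and the "model" is a fixed script standing in for a real API call.

```python
from dataclasses import dataclass, field


@dataclass
class FakeComputer:
    """Hypothetical stand-in for a virtual machine: records actions instead of performing them."""
    log: list = field(default_factory=list)

    def screenshot(self) -> str:
        # A real system would return image data; here the "screen" is a text summary.
        return f"screen after {len(self.log)} actions"

    def perform(self, action: dict) -> None:
        self.log.append(action)


def fake_model(goal: str, screenshot: str, step: int) -> dict:
    """Hypothetical policy standing in for the model: picks the next action from a fixed script."""
    script = [
        {"type": "click", "x": 100, "y": 200},  # focus a search box
        {"type": "type", "text": goal},         # type the query
        {"type": "key", "key": "Enter"},        # submit
    ]
    return script[step] if step < len(script) else {"type": "done"}


def run_agent(goal: str, computer: FakeComputer, max_steps: int = 10) -> list:
    """Iterative agent workflow: observe the screen, ask for an action, act, repeat."""
    for step in range(max_steps):
        shot = computer.screenshot()
        action = fake_model(goal, shot, step)
        if action["type"] == "done":
            break
        computer.perform(action)
    return computer.log


actions = run_agent("cat pictures", FakeComputer())
for a in actions:
    print(a)
```

The `max_steps` cap is a design choice for the sketch: an agent that acts in a loop needs some bound so that a confused policy cannot run forever.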
Recently, large language models have been supporting large input contexts. Anthropic's Claude, for example, supports over 200,000 input tokens, which is more than 500 pages of text. Long inputs can be expensive to process, and so can any long conversation with a chatbot: if you're processing the conversation history over and over to generate each next response, that gets more expensive as the history grows.
Exactly. And that brings us right to prompt caching. Prompt caching retains some of the results of processing prompts between invocations of the model, which can be a large cost and latency saver. You'll also get to use the model to generate calls to external tools and produce structured output, such as JSON. At the very end, we'll walk through a complete example of computer use that you can run on your own machine. Note that because of the nature of the tool, you will have to run it using a Docker image on your computer rather than directly in the DeepLearning.AI notebook.
I've tried out computer use myself using Anthropic's models and found it really cool. I think this capability will make possible a lot of new applications where you can build an AI assistant that uses a computer to carry out tasks for you. Think of RPA, or robotic process automation, which has been good at repetitive tasks but is now easier to build and more general with LLM-based tools. And as this technology gets even better, it will handle even more flexible and open-ended tasks, and so gradually feel more and more like a personal assistant.
I could not agree more. Very excited to see where it goes.
Many people have worked to create this course. I'd like to thank, from Anthropic, Ben Mann, Maggie Vo, Kevin Garcia, the team working on computer use, and from DeepLearning.AI, Geoff Ladwig and Esmaeil Gargari. Anthropic has built a lot of really great models, and I regularly use them myself. Colt will share details of these models in the next video. All right, let's get started.
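The point made above about conversation history getting more expensive can be made concrete with a little arithmetic. The sketch below is a toy cost model, not Anthropic's pricing: the function names are hypothetical, every turn is assumed to resend the full history, and `cached_read_factor=0.1` is an assumed relative cost for re-reading a cached prefix. It also ignores any extra cost of writing to the cache.

```python
def tokens_processed_without_cache(turn_tokens):
    """Total input tokens processed when every turn resends the full history."""
    total = 0
    history = 0
    for t in turn_tokens:
        history += t      # history now includes this turn
        total += history  # the whole history is processed again
    return total


def cost_units_with_cache(turn_tokens, cached_read_factor=0.1):
    """Toy model: earlier history is read from cache at a reduced relative cost.

    cached_read_factor is an assumption for illustration, not a published price.
    """
    total = 0.0
    history = 0
    for t in turn_tokens:
        total += history * cached_read_factor + t  # cheap cached prefix + new tokens
        history += t
    return total


turns = [1000] * 5  # five turns of 1,000 new tokens each
no_cache = tokens_processed_without_cache(turns)
with_cache = cost_units_with_cache(turns)
print(no_cache, with_cache)
```

Without caching, the total grows roughly quadratically with the number of turns, because each turn pays again for everything before it. Under the assumed discount, most of that repeated work is charged at a fraction of the full rate.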
Building toward Computer Use with Anthropic
  • Introduction (Video, 3 mins)
  • Overview (Video, 7 mins)
  • Working with the API (Video with Code Example, 15 mins)
  • Multimodal Requests (Video with Code Example, 12 mins)
  • Real World Prompting (Video with Code Example, 17 mins)
  • Prompt Caching (Video with Code Example, 12 mins)
  • Tool Use (Video with Code Example, 17 mins)
  • Computer Use (Video, 10 mins)
  • Conclusion (Video, 1 min)
  • Appendix – Tips and Help (Code Example, 1 min)