DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human life.

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

You will be able to see all the notebook files for the lesson, including any helper functions used in the notebook, in the left sidebar.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps shown in the previous section ("File" => "Open"), click on the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then, from the Quality option, choose the quality that works best for your internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read the material, run the notebooks, or watch the videos. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your practical projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


🎥   Course Introduction (Transcript)

Welcome to Prompt Engineering for Vision Models, built in partnership with Comet and taught by Abby Morgan, Jacques Verré, and Caleb Kaiser. Prompting applies not just to text but also to vision, including image segmentation, object detection, and image generation models. Depending on the vision model, the prompt may be text, but it could also be pixel coordinates, bounding boxes, or segmentation masks. In this course, you prompt Meta's Segment Anything Model (SAM) to identify the outline of an object by giving it coordinates, points, or bounding boxes to help it identify the object, for example, a t-shirt that I was wearing. You also apply negative prompts to tell the model which regions to exclude when identifying an object. Using a combination of positive and negative prompts helps you isolate the region or object that you are interested in, such as a specific pattern on a t-shirt that a dog is wearing.

But prompting SAM is just one example. You'll learn about many other tools, as well as best practices for prompting, both to analyze and understand images and to generate or otherwise manipulate images. I'm delighted to introduce our instructors for this course. Abby Morgan is a machine learning engineer who creates technical content and tutorials for the developer community. Jacques Verré is Head of Product at Comet; he spearheaded the release of Comet's model production monitoring offering and is also an active contributor to Comet's technical documentation. Caleb Kaiser is also a machine learning engineer who works on Comet's open-source and community projects like Kangas and Comet LLM.

Thanks, Andrew. In this course, you'll also apply prompt engineering to image generation. For instance, you can provide the text prompt "a dragon" to the Stable Diffusion model to generate the image of a dragon, and you'll iterate on that prompt, for instance "a realistic green dragon," to get a different image of a dragon. You will also prompt the diffusion model to replace a photograph of a cat with the dragon, while keeping the rest of the photograph intact. This is called inpainting, in which you edit a painting or photograph by removing a segmented object and replacing it with a generated image. For inpainting, your prompt will not only be the text "a realistic green dragon," but also an outline of the cat that the dragon will replace. You'll obtain that outline, or mask, using image segmentation. Furthermore, you'll obtain the bounding box that SAM uses as input by prompting an object detection model, this time with a text prompt such as "cute dog with a pink jacket," to generate a bounding box around that dog. You will iterate on both the prompts and the model hyperparameters that you tune in this inpainting pipeline.

Diffusion models work by transforming a sample from a simple distribution, like Gaussian noise, into a complex, learned distribution, like images. The guidance scale hyperparameter determines how heavily the text input should affect the target distribution during the reverse diffusion process. A lower guidance scale allows the model to sample freely from its learned distribution of images, whereas a higher guidance scale guides the model towards a sample that more closely matches the text input. The number of inference steps hyperparameter controls how gradually the model transforms the noisy distribution back into a clean sample. More steps generally allow for more detailed and accurate generation, as the model has more opportunities to refine the data. However, more steps also mean more computation time. The strength hyperparameter, in the context of Stable Diffusion, determines how noisy the initial distribution is. During the inpainting process, where the added noise is used to erase portions of the initial image, strength essentially determines how much of the initial image is retained in the diffusion process.
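To make the pipeline and hyperparameters described above concrete, here is a minimal sketch, not the course's exact code, assuming the segment-anything and diffusers packages, a locally downloaded SAM checkpoint, a GPU, and placeholder file names, coordinates, and model IDs; the course notebooks may use different models, wrappers, or helper functions.

```python
# Sketch: prompt SAM with points/boxes to get a mask, then pass that mask plus
# a text prompt to a Stable Diffusion inpainting pipeline.
import numpy as np
import torch
from PIL import Image
from segment_anything import sam_model_registry, SamPredictor
from diffusers import StableDiffusionInpaintPipeline

# --- 1. Segment the object to replace (e.g., the cat) with SAM ---
image = Image.open("cat_photo.png").convert("RGB")               # placeholder image
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")    # placeholder checkpoint path
predictor = SamPredictor(sam)
predictor.set_image(np.array(image))

masks, scores, _ = predictor.predict(
    point_coords=np.array([[320, 240], [100, 60]]),  # pixel-coordinate prompts (placeholders)
    point_labels=np.array([1, 0]),                   # 1 = positive prompt, 0 = negative prompt
    box=np.array([200, 120, 460, 400]),              # optional bounding-box prompt (placeholder)
    multimask_output=False,
)
mask_image = Image.fromarray((masks[0] * 255).astype(np.uint8))  # white = region to repaint

# --- 2. Inpaint the masked region with a text prompt ---
pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

result = pipe(
    prompt="a realistic green dragon",
    image=image,
    mask_image=mask_image,
    guidance_scale=7.5,       # how strongly the text prompt steers generation
    num_inference_steps=50,   # more steps = more refinement, but more compute
    strength=0.99,            # how much of the original image is noised away
).images[0]
result.save("dragon_inpainted.png")
```

Iterating usually means changing the prompt, the point/box prompts, or the three hyperparameters above and comparing the resulting images.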
Furthermore, what if you wanted to personalize the diffusion model to generate not just a generic dragon, cat, or person, but a specific dragon, your specific pet cat, or your best friend? You'll use a fine-tuning method called DreamBooth, developed by Google Research, to tune the Stable Diffusion model to associate a text label with a particular object, such as your good friend. In this course, you'll use the DreamBooth tuning process on the Stable Diffusion model to associate the words "Andrew Ng" with just six photographs of Andrew. After fine-tuning, you can prompt the model with text such as "a Van Gogh painting of Andrew," and the model can generate that image.

One unique aspect of vision model development workflows is that evaluation metrics won't always tell you the full story. Oftentimes, you'll want to visualize your image outputs and inspect them manually to understand where your model is getting things right and where it's getting things wrong. We'll talk through best practices for efficiently carrying out this type of iteration as well. Let's say your object detection model is suddenly performing poorly, and your input data distribution hasn't changed. You open up a few incorrect predictions and realize that a new object has been introduced to your images that your model is mistaking for your target object. It's time to further train your model on this new object. It's unlikely that evaluation metrics alone would be able to tell the full story of what was going on here, so visualizing your output can be very important. Similarly, when iterating across different sets of hyperparameters, you'll sometimes need to see the output image in order to understand how the hyperparameter values are affecting it. Experiment tracking tools can help you compare these output images side by side and track and organize which inputs lead to which outputs, so you can reproduce them later on. Computer vision workflows are highly iterative, so it's valuable to track each of your experiment runs.

Many people have worked to create this course. On the Comet side, I'd like to thank Sid Mehta, Senior Growth Engineer at Comet. From DeepLearning.AI, Eddy Shyu also contributed to this course. In the first lesson, you'll get an overview of visual prompting for the image segmentation, object detection, and diffusion models that you'll use in this course. That sounds great. Let's get started. And after finishing this course, I think you could be a real visionary when it comes to prompting.
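As a small illustration of the experiment-tracking point above, here is a hedged sketch using Comet's Python SDK; the project name and parameter values are placeholders, and `result` is assumed to be the output image from an inpainting run like the earlier sketch.

```python
# Sketch: log an inpainting run to Comet so output images can be compared
# side by side across runs (assumes COMET_API_KEY is configured).
from comet_ml import Experiment

experiment = Experiment(project_name="vision-prompting")  # hypothetical project name

# Log the prompt and hyperparameters that produced this output.
experiment.log_parameters({
    "prompt": "a realistic green dragon",
    "guidance_scale": 7.5,
    "num_inference_steps": 50,
    "strength": 0.99,
})

# Log the generated image (PIL images, file paths, and NumPy arrays are accepted).
experiment.log_image(result, name="inpainted_output")
experiment.end()
```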
Prompt Engineering for Vision Models
  • Introduction (Video, 6 mins)
  • Overview (Video, 7 mins)
  • Image Segmentation (Video with Code Example, 12 mins)
  • Object Detection (Video with Code Example, 23 mins)
  • Image Generation (Video with Code Example, 11 mins)
  • Fine-tuning (Video with Code Example, 20 mins)
  • Conclusion (Video, 1 min)
  • Course Feedback
  • Community