DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human lives.

Quick Guide & Tips

💻 Accessing Utils File and Helper Functions

In each notebook on the top menu:

1: Click on "File"

2: Then, click on "Open"

You will be able to see all the notebook files for the lesson, including any helper functions used in the notebook on the left sidebar. See the following image for the steps above.


💻 Downloading Notebooks

In each notebook on the top menu:

1: Click on "File"

2: Then, click on "Download as"

3: Then, click on "Notebook (.ipynb)"


💻 Uploading Your Files

After following the steps shown in the previous section ("File" => "Open"), click the "Upload" button to upload your files.


📗 See Your Progress

Once you enroll in this course (or any other short course on the DeepLearning.AI platform) and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱 Features to Use

🎞 Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣 Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅 Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from Quality, choose the quality that works best for your Internet speed.

🖥 Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√ Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑 Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑 Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅 Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕ Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬 Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍ Practice Active Learning: Don't just read or run notebooks or watch the material. Engage actively by taking notes, summarizing what you learn, teaching the concept to someone else, or applying the knowledge in your practical projects.


📚 Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit the DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂 Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lessons list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


Welcome to Prompt Compression and Query Optimization, built in partnership with MongoDB and taught by Richmond Alake. Richmond is a developer advocate at MongoDB, has worked as a machine learning architect, and has taught AI and ML for many years.

Thanks, Andrew. This course shows you how to combine features of a mature, established database with vector search to reduce the cost of serving a large RAG application.

Say you're building a conversational RAG application that helps users select a rental property. A user might enter a text query for a one-level ranch on a quiet street. You can use semantic search to find a close match to the user's description, using an embedding of the user's request and searching a vector database for homes with descriptions that match. But the user may also have hard requirements, like three bedrooms, two bathrooms, and maybe no swimming pool. These are better handled with more traditional retrieval: selecting data based on fields in the database that explicitly store the number of bedrooms, bathrooms, and so on. In this course, you'll learn to use the best of both worlds: a traditional database with an added vector index.

In RAG applications, you retrieve results that you then provide to an LLM for final processing. If the retrieved context is very long, this results in a very long prompt, which can be costly. Suppose retrieval were to return, say, 10,000 tokens. If you were to run a rental comparison website serving, say, a million queries per day, and if LLM input tokens cost $10 per million tokens, you could be spending over $36 million a year. So, to help you reduce costs, this course will also cover ways to keep the retrieved results as small and relevant as possible.

Thanks, Andrew. Let me describe some of the techniques you will learn. Consider your rental app. Filtering on the number of bedrooms or bathrooms can be done with a pre-filter or a post-filter. Efficient pre-filtering is set up at the database index creation stage: you build a new index of entries that match common queries. So, for example, if you know you frequently get queries that specify the number of bedrooms, you can build an index that includes the bedroom field. So that's pre-filtering. In contrast, post-filtering is done after a vector search query is performed: you apply a filter to the results to select the subset matching the required condition. Large-scale applications may use both of these techniques simultaneously.

Another technique to minimize the size of the output is something called projection, which selects a subset of the fields returned from a query. For example, out of 15 fields of a potential rental, you may want to return only three of them: name, number of bedrooms, and price. Now, you could implement all of these operations directly in your application, but the database can optimize them for performance and enforce role-based access control, so they are best accomplished there.

Another powerful technique is reranking the results of a search. For example, after using the text embedding of the renter's description to perform a semantic search, you can rerank the results based on other data fields, such as average star rating or number of ratings, to move the more desirable results higher up the list and so generate better context for the LLM.

One final technique is prompt compression. If the retrieved information is very lengthy, feeding all this context into an LLM prompt results in a very long prompt, which is expensive to process. To reduce this cost, you can use a small, low-cost LLM fine-tuned to compress prompts before sending them to the final LLM.

There are many opportunities to improve relevance and save costs.

Thank you, Andrew. You will learn all these techniques in the next few lessons. You will start this course by implementing a vanilla vector search and end by implementing prompt compression.

Many people have worked to create this course. From MongoDB, I'd like to thank Apoorva Joshi, Pavel Duchovny, Prakul Agarwal, Jesse Hall, Rita Rodrigues, Henry Weller, and Shubham Ranjan. Esmaeil Gargari from DeepLearning.AI also contributed to this course.

I hope you enjoy this course. Please go to the next video and let's dive in.
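To make the techniques described in this introduction concrete, here is a minimal sketch in plain Python. It is not the course's code and does not use MongoDB: the listings, their field names, the `score` values, the `rating_weight` parameter, and all function names are hypothetical, chosen only for illustration. It applies a post-filter, a rerank, and a projection to an in-memory list that stands in for vector search results, and it reproduces the cost estimate quoted above. In the course itself, these steps are meant to run inside the database rather than in application code.

```python
# Hypothetical listings standing in for documents returned by a vector search.
# "score" imitates a semantic similarity score; every value here is made up.
CANDIDATES = [
    {"name": "Quiet Ranch", "bedrooms": 3, "bathrooms": 2, "pool": False,
     "price": 2400, "avg_rating": 4.2, "score": 0.91},
    {"name": "Poolside Villa", "bedrooms": 3, "bathrooms": 2, "pool": True,
     "price": 3100, "avg_rating": 4.9, "score": 0.93},
    {"name": "Maple Bungalow", "bedrooms": 3, "bathrooms": 2, "pool": False,
     "price": 2250, "avg_rating": 4.9, "score": 0.88},
    {"name": "Downtown Loft", "bedrooms": 1, "bathrooms": 1, "pool": False,
     "price": 1900, "avg_rating": 4.5, "score": 0.80},
]


def post_filter(docs, **required):
    """Keep only documents whose fields equal every required value."""
    return [d for d in docs if all(d.get(k) == v for k, v in required.items())]


def rerank(docs, rating_weight=0.1):
    """Reorder by similarity score plus a small bonus for average rating.

    The additive formula and the weight are illustrative assumptions.
    """
    return sorted(
        docs,
        key=lambda d: d["score"] + rating_weight * d["avg_rating"],
        reverse=True,
    )


def project(docs, fields):
    """Return only the requested fields of each document."""
    return [{f: d[f] for f in fields if f in d} for d in docs]


def annual_input_cost(tokens_per_query, queries_per_day,
                      usd_per_million_tokens, days=365):
    """Yearly spend on LLM input tokens, in US dollars."""
    total_tokens = tokens_per_query * queries_per_day * days
    return total_tokens * usd_per_million_tokens / 1_000_000


# Hard requirements first, then reordering, then trimming the fields that
# would be placed in the LLM prompt.
matching = post_filter(CANDIDATES, bedrooms=3, bathrooms=2, pool=False)
results = project(rerank(matching), ["name", "bedrooms", "price"])

# The figures from the transcript: 10,000 tokens per query, one million
# queries per day, $10 per million input tokens.
yearly_cost = annual_input_cost(10_000, 1_000_000, 10.0)
```

With these made-up values, the higher-rated "Maple Bungalow" moves ahead of "Quiet Ranch" even though its similarity score is lower, and `yearly_cost` comes out to 36.5 million dollars, consistent with the "over $36 million a year" figure.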
Prompt Compression and Query Optimization
  • Introduction (Video, 4 mins)
  • Vanilla Vector Search (Video with Code Example, 33 mins)
  • Filtering With Metadata (Video with Code Example, 19 mins)
  • Projections (Video with Code Example, 10 mins)
  • Boosting (Video with Code Example, 12 mins)
  • Prompt Compression (Video with Code Example, 17 mins)
  • Conclusion (Video, 1 min)
  • Course Feedback
  • Community