DeepLearning.AI
AI is the new electricity and will transform and improve nearly all areas of human life.

💻   Accessing Utils File and Helper Functions

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Open"

You will be able to see all the notebook files for the lesson, including any helper functions used in the notebook, in the left sidebar.


💻   Downloading Notebooks

In each notebook on the top menu:

1:   Click on "File"

2:   Then, click on "Download as"

3:   Then, click on "Notebook (.ipynb)"


💻   Uploading Your Files

After following the steps shown in the previous section ("File" => "Open"), click on the "Upload" button to upload your files.


📗   See Your Progress

Once you enroll in this course—or any other short course on the DeepLearning.AI platform—and open it, you can click on 'My Learning' at the top right corner of the desktop view. There, you will be able to see all the short courses you have enrolled in and your progress in each one.

Additionally, your progress in each short course is displayed at the bottom-left corner of the learning page for each course (desktop view).


📱   Features to Use

🎞   Adjust Video Speed: Click on the gear icon (⚙) on the video and then from the Speed option, choose your desired video speed.

🗣   Captions (English and Spanish): Click on the gear icon (⚙) on the video and then from the Captions option, choose to see the captions either in English or Spanish.

🔅   Video Quality: If you do not have access to high-speed internet, click on the gear icon (⚙) on the video and then from Quality, choose the quality that works the best for your Internet speed.

🖥   Picture in Picture (PiP): This feature allows you to continue watching the video when you switch to another browser tab or window. Click on the small rectangle shape on the video to go to PiP mode.

√   Hide and Unhide Lesson Navigation Menu: If you do not have a large screen, you may click on the small hamburger icon beside the title of the course to hide the left-side navigation menu. You can then unhide it by clicking on the same icon again.


🧑   Efficient Learning Tips

The following tips can help you have an efficient learning experience with this short course and other courses.

🧑   Create a Dedicated Study Space: Establish a quiet, organized workspace free from distractions. A dedicated learning environment can significantly improve concentration and overall learning efficiency.

📅   Develop a Consistent Learning Schedule: Consistency is key to learning. Set aside specific times in your day for study and make it a routine. Consistent study times help build a habit and improve information retention.

Tip: Set a recurring event and reminder in your calendar, with clear action items, to get regular notifications about your study plans and goals.

☕   Take Regular Breaks: Include short breaks in your study sessions. The Pomodoro Technique, which involves studying for 25 minutes followed by a 5-minute break, can be particularly effective.

💬   Engage with the Community: Participate in forums, discussions, and group activities. Engaging with peers can provide additional insights, create a sense of community, and make learning more enjoyable.

✍   Practice Active Learning: Don't just read, run notebooks, or watch the material passively. Engage actively by taking notes, summarizing what you learn, teaching the concepts to someone else, or applying the knowledge in your own practical projects.


📚   Enroll in Other Short Courses

Keep learning by enrolling in other short courses. We add new short courses regularly. Visit DeepLearning.AI Short Courses page to see our latest courses and begin learning new topics. 👇

👉👉 🔗 DeepLearning.AI – All Short Courses


🙂   Let Us Know What You Think

Your feedback helps us know what you liked and didn't like about the course. We read all your feedback and use it to improve this course and future courses. Please submit your feedback by clicking on the "Course Feedback" option at the bottom of the lesson list menu (desktop view).

Also, you are more than welcome to join our community 👉👉 🔗 DeepLearning.AI Forum


Welcome to this short course, Retrieval Optimization: From Tokenization to Vector Quantization, built in partnership with Qdrant and taught by Kacper Łukawski.

Retrieval augmented generation involves two main steps. First, a retriever searches a large document corpus to find relevant information. Then a generator uses this information to produce accurate and contextually relevant results for the user's query. This course focuses on enhancing and optimizing the first step, the retrieval step, in your RAG and search applications.

You begin by learning how tokenization is done in large language models and in embedding models, and how the tokenizer can affect the quality of your search. Specifically, tokenizers create a sequence of numerical ids, or integers, representing the tokens, which usually correspond to words or parts of words in a text. There are multiple ways to turn a piece of text into a sequence of tokens. For example, simple word-level tokenization would split a sentence like "I enjoy learning." into "I", "enjoy", and "learning", while subword tokenization can break it down even further into "I", "en", "joy", "learn", and "ing". The tokenizer is a key component of the language model that is also trainable. You learn about tokenization techniques like WordPiece, byte-pair encoding, and unigram tokenization, as well as how the token vocabulary impacts search results, especially with special characters like emojis, typos, and numerical values.

I'm delighted to introduce the instructor for this course, Kacper Łukawski, who is the developer relations lead for Qdrant. Kacper has been helping many developers create and optimize their search and RAG systems.

Thanks, Andrew. The first step in optimizing your system is measuring the quality of its outputs. Without this, how can you even tell if your changes have made an improvement? So in this course you will learn how to assess the quality of your search using several quality metrics.
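The subword splitting described above can be sketched with a toy greedy longest-match tokenizer. This is a simplified, WordPiece-style illustration: the vocabulary, the "##" continuation prefix, and the "[UNK]" token here are made up for the example, not taken from any real model.

```python
# Toy greedy longest-match subword tokenizer (WordPiece-style sketch).
# The vocabulary below is illustrative, not a real model's vocabulary.
VOCAB = {"i": 0, "en": 1, "##joy": 2, "learn": 3, "##ing": 4, ".": 5, "[UNK]": 6}

def tokenize_word(word, vocab):
    """Split one lowercase word into the longest subwords found in vocab."""
    tokens, start = [], 0
    while start < len(word):
        end, match = len(word), None
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation pieces get a "##" prefix
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]  # no subword fits: the whole word is unknown
        tokens.append(match)
        start = end
    return tokens

def encode(text, vocab):
    """Turn text into subword pieces and a sequence of integer token ids."""
    words = text.lower().replace(".", " .").split()
    pieces = [p for w in words for p in tokenize_word(w, vocab)]
    return pieces, [vocab[p] for p in pieces]

pieces, ids = encode("I enjoy learning.", VOCAB)
print(pieces)  # ['i', 'en', '##joy', 'learn', '##ing', '.']
print(ids)     # [0, 1, 2, 3, 4, 5]
```

Note how "enjoy" and "learning" are absent from the vocabulary, so they decompose into "en" + "joy" and "learn" + "ing" — the same effect the lesson describes, and the reason an embedding model's vocabulary shapes what your search index actually sees.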
Vector databases use specialized data structures to approximate the search for nearest neighbors. HNSW, which stands for Hierarchical Navigable Small Worlds, is the most commonly used one, and it has some parameters that give you control over how good the approximation is. HNSW search is built on top of a multi-layer graph, and it's like shipping a package in the mail: the top layer gets you close to the state the mail should be delivered to, the next layer then finds the city, and as you go down the layers, you get closer and closer. You will see how to balance the parameters used for forming and searching the HNSW graph for higher speed and maximum relevance.

If you have millions of vectors to search, then storing, indexing, and searching can become resource intensive and slow. Say you built a news analysis app that gathers all the news related to an industry and summarizes it. After chunking thousands of articles published each day, within a month you might easily end up with millions of vector embeddings. If you end up with, say, 5 million vectors, then using OpenAI's Ada embedding model, which generates a vector with a little over 1,500 dimensions, you need about 30 gigabytes of memory. And this will continue to grow every month.

This is where quantization techniques come in. Using the techniques you learn in this course, you can reduce the memory needed for your vector search by up to 64x. You will learn three main quantization techniques. The first is product quantization. This maps subvectors to the nearest centroid, reducing memory usage at the cost of increased indexing time. The second is scalar quantization. By converting each float value to a 1-byte integer, it significantly reduces memory and speeds up both indexing and search operations. However, it hurts precision slightly.
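The ~30 GB figure above can be checked with quick arithmetic, assuming 5 million float32 vectors of 1,536 dimensions (the Ada embedding size) and 4 bytes per float:

```python
# Back-of-the-envelope memory estimate for the news-app scenario above.
num_vectors = 5_000_000
dims = 1536           # OpenAI Ada embedding dimensions
bytes_per_float = 4   # float32

raw_bytes = num_vectors * dims * bytes_per_float
print(f"float32: {raw_bytes / 1024**3:.1f} GiB")    # ~28.6 GiB (~30 GB decimal)

# Scalar quantization: one 1-byte integer per dimension -> 4x smaller.
scalar_bytes = num_vectors * dims
print(f"int8:    {scalar_bytes / 1024**3:.1f} GiB")  # ~7.2 GiB

# Binary quantization: one bit per dimension -> 32x smaller than float32.
binary_bytes = num_vectors * dims // 8
print(f"binary:  {binary_bytes / 1024**3:.2f} GiB")  # ~0.89 GiB
```

The 32x here counts only the stored vectors; the "up to 64x" mentioned in the course also accounts for additional savings in how the index is stored and searched.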
And the third is an extreme form of quantization called binary quantization, which converts float values to Boolean values. This improves memory usage significantly and improves search speed, but at an even greater cost to precision.

In the next few lessons, you will first learn about embedding models and how they turn text into vectors. You will then learn how tokenization is done. Next, you will look at practical issues with tokenization and how they can affect your vector search and retrieval relevance. You will also learn how to measure the quality of search results in RAG applications, and why this is important for making improvements. We will review HNSW and learn ways to improve its search results. We will also explore vector quantization to reduce memory use and make searches more efficient. By the end of this course, you will know how to optimize semantic search and build more reliable AI applications. Let's get started and create something great.

Many people have worked to create this course. I'd like to thank David Myriel from Qdrant. In addition, Esmaeil Gargari and Geoff Ladwig from DeepLearning.AI have also contributed to this course.

Up first is a video on how embedding models turn text into vectors, and the role of the tokenizer in this whole process. I've always thought tokenization is one of the really critical, but often ignored and underappreciated, aspects of the models we use. So let's go on to the next video and learn about that.
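As a rough illustration of the scalar and binary quantization ideas from the introduction, here is a minimal sketch on a single vector. This is a simplification: real vector databases calibrate quantization ranges per dimension or per segment rather than using a single min-max range per vector as done here.

```python
# Minimal sketches of scalar and binary quantization on one vector.

def scalar_quantize(vec):
    """Map each float to an integer in 0..255 using the vector's own range."""
    lo, hi = min(vec), max(vec)
    scale = (hi - lo) / 255 or 1.0  # avoid division by zero for flat vectors
    return [round((x - lo) / scale) for x in vec], lo, scale

def scalar_dequantize(codes, lo, scale):
    """Approximately recover the original floats from the 1-byte codes."""
    return [lo + c * scale for c in codes]

def binary_quantize(vec):
    """Keep only the sign of each component: one bit per dimension."""
    return [1 if x > 0 else 0 for x in vec]

vec = [0.12, -0.40, 0.95, -0.03]
codes, lo, scale = scalar_quantize(vec)
print(codes)                  # [98, 0, 255, 70] -- 1 byte each vs 4 bytes
print(scalar_dequantize(codes, lo, scale))  # close to vec, not exact
print(binary_quantize(vec))   # [1, 0, 1, 0] -- 1 bit each
```

Dequantizing the scalar codes gives values close to, but not exactly, the originals, which is the slight precision loss mentioned above; binary quantization discards even more information, which is why it costs the most precision.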
Retrieval Optimization: Tokenization to Vector Quantization
  • Introduction
    Video
    ・
    6 mins
  • Embedding models
    Video with Code Example
    ・
    16 mins
  • Role of the tokenizers
    Video with Code Example
    ・
    15 mins
  • Practical implications of the tokenization
    Video with Code Example
    ・
    14 mins
  • Measuring Search Relevance
    Video with Code Example
    ・
    14 mins
  • Optimizing HNSW search
    Video with Code Example
    ・
    10 mins
  • Vector quantization
    Video with Code Example
    ・
    16 mins
  • Conclusion
    Video
    ・
    1 min
  • Course Feedback
  • Community