In this lesson, you will have a brief recap of HNSW, the foundation of vector databases, which enables efficient nearest-neighbor searches. You will learn how to manage and then enhance the quality of the results it delivers. Let's go!

Most vector databases use HNSW to approximate nearest-neighbor search, and there are certain ways in which we can control how well this approximation works. The Hierarchical Navigable Small World (HNSW) is a multi-layer graph that approximates nearest-neighbor search. It uses vectors as nodes, connected to each other based on their similarity. For that reason, the similarity measure has to be selected upfront. This data structure builds several stacked graphs, with the top one being the most sparse; it contains just a small subset of all the vectors. Each consecutive layer contains more and more, but the rule is that all the vectors from the layer above are also present in the current layer. The bottom layer is really dense, as it contains all the vectors we have. In addition to the similarity-based edges, there are also inter-layer connections that help us jump between the layers. They are created between the same vectors, just at different layers.

HNSW has two parameters that we can control. The first one, m, defines how many edges per node we create. Increasing this parameter's value should improve the approximation quality, but it won't fix any problems with your embedding model. If your representations are just too weak, you cannot expect the search to derive the semantics of the data magically. Higher values of m will just bring you closer to the exact k-nearest-neighbor search. However, controlling this parameter may also help to adjust latency; there are some cases where you can sacrifice quality for faster retrieval.

The most common operation performed with the help of the HNSW graph is search. When we receive a query, it is vectorized using the same embedding model, and the resulting vector acts as the actual query. The search operation starts from the top layer. Here the algorithm finds the ef closest points; ef is the second parameter we can consider. In our case, its value was set to two, so this is how many points we select in the first layer. If we selected more, we could possibly get better results, but it would also make the process slower. The points selected in the top layer are then followed down to the layer below, thanks to the inter-layer edges. We do not traverse all the points here, but only focus on the direct neighbors of the points selected in the previous step. Again, we select the ef closest matches, and the process continues. We repeat the same procedure until we reach the bottom layer. Each time, we only check the previously selected points and their direct neighbors, which effectively narrows down the scope of the search. In the last layer, we select the top k entries, and these vectors become the most relevant documents that we return. K might be different from ef, and it's usually lower.

Let's see how changing both parameters might impact the search quality on the dataset you already know. You will use the collection built in the previous lesson and try to improve the quality by controlling the HNSW parameters. The only difference is that this collection will be loaded from a snapshot, and it will contain the vectors of all the products available in the dataset, so you don't have to recompute the embeddings again. The process of importing the snapshot may also take a while.
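As a rough sketch of that setup, assuming a local Qdrant instance, a collection called "products", and a snapshot_url pointing at the snapshot prepared for this lesson (all of these names are illustrative, not the lesson's exact values), restoring the collection and inspecting its HNSW settings with qdrant-client might look like this:

```python
# Minimal sketch: restore a collection from a snapshot and inspect its HNSW settings.
# The endpoint, collection name ("products"), and snapshot_url are assumptions.
from qdrant_client import QdrantClient

client = QdrantClient("http://localhost:6333")

snapshot_url = "http://localhost:6333/collections/products/snapshots/products.snapshot"

# Recover the collection from a previously created snapshot (this may take a while).
client.recover_snapshot(collection_name="products", location=snapshot_url)

# Inspect the HNSW configuration; with Qdrant defaults this reports m=16, ef_construct=100.
collection_info = client.get_collection("products")
print(collection_info.config.hnsw_config)
```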
If you haven't modified the collection configuration, then all the HNSW parameters should be set to the defaults. They can be accessed in the hnsw_config property. In our case, m is set to 16 and ef to 100.

Let's start with the evaluation. You will only recompute the embeddings of all the queries, as we still need them to test various configurations of the collection. Let's load the model. Once the model is ready, we can also load the set of queries and create embeddings for every single one.

Vector databases approximate nearest-neighbor search, and that comes with some quality degradation. The search operation runs the approximate search by default, but we can also enable the exact mode to force the system to run pure k-nearest-neighbor search. That won't be as fast, but you will always get the highest quality possible given the selected embedding model. HNSW configuration changes won't improve the quality of the embedding model; they can only control how well the approximation does its job.

The ground truth dataset will no longer be built based on the perfect matches for the benchmark queries; instead, you will measure the quality of the approximation itself. For that reason, let's build a Qrels object with the exact search mode enabled. This will be our reference to measure the effectiveness of various HNSW configurations. Everything starts with iterating over the set of queries we have. For each query, we run the search operation in the exact mode, so it's pure kNN, and then we iterate over the set of results returned by the system and build the corresponding dictionary that is then used to create the Qrels object.

Our Qrels are ready, and the baseline will be defined by how well the approximation works with the current default configuration. This time we are performing the same operations as for the exact kNN search, but with the exact mode set to false. You will now evaluate the first run using ranx again. It doesn't make much sense to calculate a metric other than precision at k, as the scores and document ranks won't change; the embedding model still produces the same embeddings, but some of the results might just be missing compared to exact kNN. The current default configuration of our collection achieves a precision@25 of around 0.998.

Even though the precision@25 seems to be really high, increasing the values of both parameters should result in a quality improvement. You can experiment with various numbers, or even decrease them if your main concern is the speed of search. Let's wait for Qdrant to finish building the new HNSW graph. We need to wait until Qdrant finishes the optimization process and builds the newly updated HNSW structures. Otherwise, we might still be using the old ones and get the same quality measures.

As a next step, you will perform the same search process as before. Nothing changes in how you call Qdrant, but the newly updated parameters should impact the results of each query. This cell should result in another Run object being created. As a final step, you will calculate the same metric, but with the new parameters in place. It shouldn't be surprising that the quality of our results has improved. Maybe we didn't have to increase the values that much. In real-world applications, keeping them at a minimum sensible level makes a lot of sense, as high values also come with a cost. The higher they are, the slower the search operations will be. Also, the number of edges in the graph increases its size, so that's a memory overhead.
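Putting the whole evaluation loop together, a condensed sketch could look as follows. It assumes a query_embeddings dictionary mapping query ids to the query vectors computed earlier, a collection named "products" with a single unnamed vector, and illustrative increased values m=32 and ef_construct=200; apart from k=25, none of these exact values come from the lesson.

```python
# Rough sketch of the evaluation flow described above. Collection name, the
# query_embeddings dict, and the increased values (m=32, ef_construct=200) are assumptions.
import time

from qdrant_client import QdrantClient, models
from ranx import Qrels, Run, evaluate

client = QdrantClient("http://localhost:6333")
k = 25

def search(query_vector, exact: bool):
    # Exact mode forces brute-force kNN; otherwise Qdrant uses the HNSW index.
    # Assumes the collection stores a single default (unnamed) vector per point.
    return client.query_points(
        collection_name="products",
        query=query_vector,
        limit=k,
        search_params=models.SearchParams(exact=exact),
    ).points

# Ground truth: exact kNN results for every benchmark query.
qrels = Qrels({
    query_id: {str(point.id): 1 for point in search(vector, exact=True)}
    for query_id, vector in query_embeddings.items()
})

# Baseline run: approximate HNSW search with the current configuration.
run = Run({
    query_id: {str(point.id): point.score for point in search(vector, exact=False)}
    for query_id, vector in query_embeddings.items()
})
print(evaluate(qrels, run, f"precision@{k}"))

# Increase both parameters, which triggers a rebuild of the HNSW graph.
client.update_collection(
    collection_name="products",
    hnsw_config=models.HnswConfigDiff(m=32, ef_construct=200),
)

# Wait until the optimizer finishes rebuilding the HNSW structures.
while client.get_collection("products").status != models.CollectionStatus.GREEN:
    time.sleep(1)
```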
One important detail about how vector databases implement HNSW is that it's usually divided into segments, so there is no single global graph. In most vector databases, the HNSW graph is not created globally; instead, points are divided into non-overlapping segments, and each segment has its own HNSW structure. This helps a lot from the scalability perspective, as these segments may be distributed across a whole cluster of machines. That approach also improves the ability of the system to work concurrently in a single-node scenario, using multiple CPU cores simultaneously. It's also way easier to rebuild these single-segment HNSW graphs, because we can do it one by one. In Qdrant, the target number of segments can be adjusted through the optimizer configuration, as sketched below.

In this lesson, you learned that controlling the parameters of the HNSW graph may give you an additional boost in terms of search quality. You can play with different values of both parameters to see how they impact it even more. In the next lesson, you will learn some additional optimization techniques that can help you reduce memory requirements and sometimes even boost retrieval speed. See you there!
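As a small, hedged sketch of the segment-related configuration mentioned above (the collection name "products" and the value 4 are assumptions for illustration), the number of segments Qdrant's optimizer maintains could be adjusted like this:

```python
# Sketch: control how many non-overlapping segments (each with its own HNSW graph)
# the optimizer should maintain. "products" and the value 4 are illustrative assumptions.
from qdrant_client import QdrantClient, models

client = QdrantClient("http://localhost:6333")

client.update_collection(
    collection_name="products",
    optimizers_config=models.OptimizersConfigDiff(default_segment_number=4),
)
```

More segments can improve parallelism on multi-core machines, but every query then has to be executed against each segment and the partial results merged, so the value is a trade-off rather than something to maximize.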