In this course, you learned about token embeddings, sentence embeddings, and some of the ways in which they are created and trained. We conclude by introducing the idea of a two-stage retrieval pipeline that combines the strengths of embedding models and cross-encoders. Okay, let's finish up.

You learned that simple pooling of token embeddings is not sufficient for a good sentence representation. Instead, to create truly useful sentence embeddings, you need to train a specialized model on sentence pairs, or more specifically, question-answer pairs. Once trained, a sentence embedding model is a powerful tool in the retrieval process for RAG and semantic search.

Sentence embedding models are great, but it's important to note that they are not perfect. As we've seen, they are fast, but at a small cost to accuracy compared to cross-encoders. A common practical approach, therefore, is two-stage retrieval, also called Retrieve and Rerank: use a sentence embedding model as a first-line filter to retrieve, say, the top 100 matching documents, and then a cross-encoder-based reranker to home in on the ten best matches. This way you get a good trade-off between performance, latency, and accuracy: the best of both worlds. There are several rerankers that are open source, as well as some commercial ones from companies like Vectara and Cohere. A minimal code sketch of this pattern, along with sketches of the other techniques mentioned below, follows the closing notes.

You also learned about the training process of a sentence embedding model: the contrastive loss function of the dual encoder, and how you can use examples from a batch to construct positive and negative pairs.

Embedding models are an essential part of any RAG implementation, either directly or as part of a two-stage retrieval pipeline. Although in this course we focused on the ins and outs of embedding models, it is also very common to see other retrieval techniques in RAG that complement neural search. One useful approach is hybrid search, which lets you combine neural search with more traditional keyword-based search. Another is the ability to filter facts by metadata values, for example, including facts only if they come from papers by a certain author. MMR, or Maximal Marginal Relevance, is another common retrieval technique that balances retrieval relevance with diversity of results. All of these are levers that, in practice, help ensure that the facts reaching the LLM are the most appropriate for responding to the user's query.

Thank you for joining us to learn about sentence embeddings. I have listed here a few useful resources that can help you continue your learning journey. I hope you enjoyed this course.
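As a parting illustration, here is a minimal sketch of the Retrieve and Rerank pattern described above, written with the open-source sentence-transformers library. The model names and the tiny corpus are placeholder choices rather than something prescribed by the course, so treat this as a sketch under those assumptions, not a definitive implementation.

```python
from sentence_transformers import SentenceTransformer, CrossEncoder, util

# Stage 1: a fast bi-encoder (sentence embedding model) retrieves candidates.
bi_encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model choice
corpus = [
    "Sentence embeddings map a whole sentence to a single vector.",
    "Cross-encoders score a query-document pair jointly.",
    "Tokyo is the capital of Japan.",
]
corpus_embeddings = bi_encoder.encode(corpus, convert_to_tensor=True)

query = "How do sentence embedding models represent text?"
query_embedding = bi_encoder.encode(query, convert_to_tensor=True)

# Retrieve the top candidates by embedding similarity (top 100 on a real corpus).
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=100)[0]

# Stage 2: a slower but more accurate cross-encoder reranks the candidates.
cross_encoder = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")  # placeholder
pairs = [(query, corpus[hit["corpus_id"]]) for hit in hits]
scores = cross_encoder.predict(pairs)

# Keep the best matches after reranking (the top ten on a real corpus).
for score, (_, doc) in sorted(zip(scores, pairs), key=lambda x: x[0], reverse=True):
    print(f"{score:.3f}  {doc}")
```

On a real corpus you would retrieve around 100 candidates in the first stage and keep only the ten best after reranking, which is exactly the split discussed above.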
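Here is also a small sketch of the in-batch contrastive idea from the training lessons: each question in a batch is paired with its own answer as the positive example, while the other answers in the same batch serve as negatives. The scale factor and the random tensors are stand-in assumptions; real training would feed actual dual-encoder outputs. This is the same idea implemented by losses such as MultipleNegativesRankingLoss in sentence-transformers.

```python
import torch
import torch.nn.functional as F

def in_batch_contrastive_loss(
    q_emb: torch.Tensor, a_emb: torch.Tensor, scale: float = 20.0
) -> torch.Tensor:
    """Cross-entropy over in-batch similarities: q_emb[i] should match a_emb[i]."""
    q_emb = F.normalize(q_emb, dim=-1)
    a_emb = F.normalize(a_emb, dim=-1)
    # Similarity matrix: entry (i, j) scores question i against answer j.
    sim = q_emb @ a_emb.T * scale
    # The diagonal holds the positive pairs; every other column is a negative.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)

# Stand-in embeddings for a batch of 8 question-answer pairs of dimension 384.
questions = torch.randn(8, 384)
answers = torch.randn(8, 384)
print(in_batch_contrastive_loss(questions, answers))
```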
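For hybrid search, one common way to combine keyword-based and neural results is reciprocal rank fusion; the course did not cover a specific fusion method, so consider this an illustrative choice. It merges ranked lists without needing their raw scores to be comparable.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists (e.g., keyword and neural results) by summing 1/(k + rank)."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc3", "doc1", "doc7"]  # hypothetical keyword (e.g., BM25) results
neural_hits = ["doc1", "doc5", "doc3"]   # hypothetical embedding-search results
print(reciprocal_rank_fusion([keyword_hits, neural_hits]))
```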
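Finally, a compact sketch of Maximal Marginal Relevance. MMR repeatedly picks the candidate that scores high on relevance to the query but low on similarity to documents already selected; the lambda weight of 0.7 and the cosine-similarity assumption are conventional defaults, not values from the course.

```python
import numpy as np

def mmr(query_emb: np.ndarray, doc_embs: np.ndarray, k: int = 5, lam: float = 0.7) -> list[int]:
    """Select k document indices balancing query relevance and result diversity."""
    # Cosine similarities, assuming rows are unit-normalized embeddings.
    relevance = doc_embs @ query_emb
    selected: list[int] = []
    candidates = list(range(len(doc_embs)))
    while candidates and len(selected) < k:
        def score(i: int) -> float:
            # Penalize candidates that are too similar to what we already picked.
            redundancy = max((doc_embs[i] @ doc_embs[j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Usage with stand-in embeddings: 20 documents of dimension 384.
rng = np.random.default_rng(0)
docs = rng.normal(size=(20, 384))
docs /= np.linalg.norm(docs, axis=1, keepdims=True)  # unit-normalize rows
print(mmr(docs[0], docs, k=5))  # docs[0] doubles as a stand-in query embedding
```

Raising lambda favors pure relevance; lowering it pushes the selection toward more diverse results, which is the balance described above.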