DLAI Logo
AI is the new electricity and will transform and improve nearly all areas of human lives.

Hi! In this lesson, you will measure sentence similarity using the Sentence Transformers library. Sentence similarity measures how close two pieces of text are in meaning. For instance, the phrases "I like kittens" and "We love cats" have similar meanings. Sentence similarity is particularly useful for information retrieval and for clustering or grouping. Let's get started!

For this classroom, the libraries have already been installed for you. If you are running this on your own machine, you can install the Sentence Transformers library by running the following: pip install sentence-transformers. Since the libraries are already installed in this classroom, we don't need to run this cell, so I will just comment it out.

Let's load the SentenceTransformer class from the Sentence Transformers library. This class will enable you to load many models. To load a model, you just need to pass the name of the model to the SentenceTransformer class. For this lesson, we decided to use this particular model to perform sentence similarity.

This model was actually trained by the open source community. Back in 2021, Hugging Face and Google organized an event called Community Week, where anyone could join online, use the provided hardware, and train NLP and computer vision models. The MiniLM sentence embedding model was one of the models that came out of the event.

Sentence similarity models convert input text into vectors, or so-called embeddings. These embeddings capture semantic information. Let's encode the following sentences: "The cat sits outside", "A man is playing guitar", and "The movies are awesome". To encode these sentences, we will use the encode method. So for these sentences, we can get the embeddings this way: we call model.encode and pass in the sentences. We can also add an argument to make sure that we get tensors at the end, through the argument convert_to_tensor=True. And now let's print the embeddings.
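The loading and encoding steps described above can be sketched roughly as follows. This is a minimal sketch, not the notebook from the lesson: the transcript never names the checkpoint, so "all-MiniLM-L6-v2" is an assumed placeholder for the MiniLM sentence embedding model, and encode_sentences is a hypothetical helper introduced only for illustration.

```python
from typing import List

# Assumption: the transcript does not name the checkpoint. "all-MiniLM-L6-v2"
# is used here only as a placeholder MiniLM sentence-embedding model.
MODEL_NAME = "all-MiniLM-L6-v2"

# First list of sentences from the lesson.
sentences1 = [
    "The cat sits outside",
    "A man is playing guitar",
    "The movies are awesome",
]


def encode_sentences(sentences: List[str], model_name: str = MODEL_NAME):
    """Hypothetical helper: return one embedding per sentence as a tensor."""
    # Imported inside the function so this file loads even where the library
    # is missing; calling it requires `pip install sentence-transformers`.
    from sentence_transformers import SentenceTransformer

    # Passing the model name loads the model (weights are downloaded on first use).
    model = SentenceTransformer(model_name)

    # convert_to_tensor=True asks for a tensor instead of a NumPy array.
    return model.encode(sentences, convert_to_tensor=True)
```

Calling embeddings1 = encode_sentences(sentences1) and printing embeddings1 should show one row of numbers per input sentence. Wrapping the steps in a function is only a convenience here; in a notebook the same calls would usually be run cell by cell.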
And as you can see, we managed to encode the text into embeddings, or vectors. Let's do the same thing with another list of sentences. So we have "The dog plays in the garden", "A woman watches TV", and "The new movie is so great". We do the exact same thing, and we get the embeddings for these sentences too.

And now we can calculate how close these sentences are to each other. To do that, we will use cosine similarity, which is a measure of how close or how far apart two vectors are. We need to import the util module from the Sentence Transformers library. Then all we need to do is use the cos_sim function from util and pass it the two sets of embeddings. This will compute the cosine similarities. And let's print them.

Notice that these are pairwise similarities between every sentence in the first list and every sentence in the second list. If you look at the diagonal of the matrix, the first element is the similarity between the first sentences of both lists, the second element is the similarity between the second sentences of both lists, and the third element is the similarity between the third sentences of both lists.

Now let's output the score of each pair in both lists. You can see that "The cat sits outside" and "The dog plays in the garden" give a score of 0.28, which suggests that there is some similarity between those two sentences. The model probably picked up on the fact that cats and dogs are quite similar. For the second pair, we have "A man is playing guitar" and "A woman watches TV". These two sentences do not have any similarity, hence the very low score. For the last example, "The movies are awesome" and "The new movie is so great", the score suggests that there is a similarity between the two.
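To make the pairwise matrix and its diagonal concrete, here is a small sketch that computes cosine similarity by hand in plain Python. The vectors are made-up two-dimensional toy values, not real sentence embeddings, and the helper names cosine_similarity and pairwise_cosine are hypothetical. With the library, the corresponding call would be util.cos_sim on the two sets of embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between a and b: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def pairwise_cosine(
    first: Sequence[Sequence[float]], second: Sequence[Sequence[float]]
) -> List[List[float]]:
    """scores[i][j] compares the i-th vector of `first` with the j-th of `second`."""
    return [[cosine_similarity(a, b) for b in second] for a in first]


# Toy vectors invented for illustration; real embeddings have many more dimensions.
first = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
second = [[2.0, 0.0], [1.0, 0.0], [3.0, 3.0]]

scores = pairwise_cosine(first, second)

# The diagonal holds the score of each matching pair: first with first,
# second with second, third with third.
for i in range(len(first)):
    print(f"pair {i}: {scores[i][i]:.2f}")
```

Vectors pointing in the same direction score close to 1 regardless of their length, and perpendicular vectors score 0, which is why the toy pairs at positions 0 and 2 score high while the pair at position 1 does not.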
Open Source Models with Hugging Face