The more granular the quantization is, the more accurate it will be. However, note that it requires more memory, since we need to store more quantization parameters. There are different granularities when it comes to quantization. We have per tensor quantization, but as you can see, we don't have to use the same scale and zero point for a whole tensor. We can, for instance, calculate a scale and a zero point for each axis; this is called per channel quantization. We could also choose a group of n elements and quantize each group with its own scale and zero point; this is called per group quantization.

Per tensor quantization is what we did in the previous class. Let's refresh our minds with a simple example. Let's use the test tensor we had in the previous lab, and this time let's perform symmetric quantization on this tensor. So we will use the linear_q_symmetric function that we just coded, and we will get the quantized tensor and the scale; with this function, linear_q_symmetric, we just need to pass the test tensor. Then, to build the summary, we also need to dequantize it. We will use the linear dequantization function we coded in the last lab, and we need to pass the quantized tensor, the scale, and the zero point. But as you remember, the zero point is equal to zero for symmetric quantization.

Now we have everything to plot the summary. As you can see, the quantization worked pretty well: the values are pretty close, and the quantization error tensor looks pretty good. Let's have a look at the quantization error: we get 2.5. If you remember, in the previous lab, when we used asymmetric quantization, we had a quantization error of around 1.5.
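As a rough sketch of that per tensor symmetric example: the helper names linear_q_symmetric and linear_dequantization follow the functions coded in the earlier labs, but their exact signatures, the test tensor values, and the mean-squared error metric below are assumptions for illustration, not the lab's actual code.

import torch

def linear_q_symmetric(tensor, dtype=torch.int8):
    # Symmetric per tensor quantization: one scale for the whole tensor,
    # zero point fixed at 0, scale maps the max absolute value to q_max.
    q_max = torch.iinfo(dtype).max
    q_min = torch.iinfo(dtype).min
    scale = tensor.abs().max().item() / q_max
    quantized = torch.clamp(torch.round(tensor / scale), q_min, q_max).to(dtype)
    return quantized, scale

def linear_dequantization(quantized, scale, zero_point):
    # r = scale * (q - zero_point); zero_point is 0 in symmetric mode.
    return scale * (quantized.float() - zero_point)

# Illustrative stand-in for the lab's test tensor.
test_tensor = torch.tensor([[191.6, -13.5, 728.6],
                            [ 92.1, 295.5, -184.0],
                            [  0.0, 684.6, 245.5]])

quantized_tensor, scale = linear_q_symmetric(test_tensor)
dequantized_tensor = linear_dequantization(quantized_tensor, scale, 0)

# Quantization error (the lesson reports roughly 2.5 for symmetric mode,
# versus about 1.5 for the asymmetric mode of the previous lab).
print((dequantized_tensor - test_tensor).square().mean())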