Deep learning neural network models are available in multiple floating-point precisions, such as FP32 or FP16. If we calibrate a model to the 8-bit integer format (INT8), we can obtain an accurate 8-bit model that substantially speeds up inference, improves performance, and reduces the required memory bandwidth. Such models are called quantized models: models that were trained in floating-point precision and then transformed to an integer representation, with floating/fixed-point quantization operations inserted between the layers. This transformation can be done using the Post-Training Optimization Tool (POT). Let's get a brief overview of POT in the next topic.
The table in the INT8 vs FP32 Comparison illustrates the performance gain obtained by switching a model from an FP32 representation to an INT8 representation.
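To make the calibration step more concrete, below is a minimal sketch of post-training INT8 quantization using the POT Python API (assuming an OpenVINO 2022.x installation). The model paths, the ImageLoader class, and the randomly generated calibration data are illustrative assumptions, not part of the course material; in practice you would feed a few hundred representative samples from your real dataset.

import numpy as np
from openvino.tools.pot import DataLoader, IEEngine, load_model, save_model, create_pipeline

class ImageLoader(DataLoader):
    # Feeds calibration samples to POT; __getitem__ returns (data, annotation).
    def __init__(self, images):
        self._images = images
    def __len__(self):
        return len(self._images)
    def __getitem__(self, index):
        return self._images[index], None  # annotation is not needed for DefaultQuantization

# Stand-in calibration set (random data only for illustration).
calibration_images = [np.random.rand(1, 3, 224, 224).astype(np.float32) for _ in range(300)]

model_config = {"model": "model.xml", "weights": "model.bin"}  # assumed FP32 IR files
engine_config = {"device": "CPU"}
algorithms = [{
    "name": "DefaultQuantization",
    "params": {"target_device": "CPU", "preset": "performance", "stat_subset_size": 300},
}]

model = load_model(model_config)                                   # load the FP32 IR
engine = IEEngine(config=engine_config, data_loader=ImageLoader(calibration_images))
pipeline = create_pipeline(algorithms, engine)                     # build the quantization pipeline
int8_model = pipeline.run(model)                                   # calibrate and quantize to INT8
save_model(int8_model, save_path="quantized_model")                # write the INT8 IR to disk

The resulting INT8 IR can then be loaded and benchmarked in place of the original FP32 model to measure the kind of speedup shown in the comparison table.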