90 Days


Self-Paced Online


IIT Roorkee


About the Course

Computing systems have fueled the growth of AI. Improvements in deep-learning algorithms have gone hand in hand with improvements in hardware accelerators. Our ability to train increasingly complex AI models and achieve low-power, real-time inference depends on the capabilities of computing systems.

In recent years, the metrics used for optimizing and evaluating AI algorithms have diversified: along with accuracy, there is increasing emphasis on metrics such as energy efficiency and model size. Given this, researchers working on deep learning can no longer afford to ignore the computing system. Rather, knowledge of the potential and limitations of computing systems can provide invaluable guidance in designing efficient and accurate algorithms.

This course aims to inform students, practitioners and researchers in deep learning about the potential and limitations of various processor architectures for accelerating deep-learning algorithms. At the same time, it seeks to motivate and even challenge engineers and professionals in the architecture domain to optimize processors according to the needs of deep-learning algorithms.

This course discusses the acceleration of AI algorithms on various computing systems such as FPGAs, mobile GPUs, smartphones, ASICs (e.g., Google's TPU) and CPUs. We focus primarily on CNNs and also cover recurrent neural networks (RNNs). Apart from performance and energy metrics, the course also discusses hardware reliability and security issues and techniques for deep-learning algorithms and accelerators. We also draw on recent research papers to showcase the state of the art in these fields.

This course is at the intersection of deep-learning algorithms, computer architecture and chip design, and is therefore expected to benefit a broad audience.

Upon successfully completing the course, you will receive a certificate from IIT Roorkee, which you can use to advance your career and find better opportunities.

Program Highlights

PG Certificate from IIT Roorkee

Certificate from IIT Roorkee

Certificate of Completion by IIT Roorkee

1 Week Immersion Program

Learn from Experts

Learn from IIT Roorkee professors and Industry Experts

Placement Eligibility Test

Proctored exams on deep learning models, with an opportunity to get placed

Hands-On Project

Guided Projects

Get hands-on experience with our Guided Projects

Timely Doubt Resolution

Get access to a community of learners via our discussion forum

Access to Cloud Lab

Lab comes pre-installed with all the software you will need to learn and practice.


What is the certificate like?

  • Why IIT Roorkee?

    IIT Roorkee is ranked first among all the IITs and 20th globally in citations per faculty. Established in 1847, it is one of the oldest technical institutions in Asia. IIT Roorkee fosters a very strong entrepreneurial culture, and some of its alumni are highly successful entrepreneurs in the new-age digital economy.

  • Why CloudxLab?

    CloudxLab is a team of developers, engineers, and educators passionate about building innovative products that make learning fun, engaging, and lifelong. We are a highly motivated team that builds fresh and lasting learning experiences for our users. Powered by our innovation processes, we provide a gamified environment where learning is fun and constructive. From creative design to intuitive apps, we create a seamless learning experience for our users. We upskill engineers in deep tech, making them employable and future-ready.



  • Among the IITs in the ‘Citations per Faculty’ parameter (*QS World Rankings)

  • Ranked Engineering College (*India Today 2020)

  • Ranked for IITs (*NIRF 2020)

  • Ranked Best Global Universities in India (*QS World Rankings)

Hands-on Learning

  • Gamified Learning Platform
    Making learning fun and sustainable

  • Auto-assessment Tests
    Learn by writing code and executing it in the lab

  • No Installation Required
    The lab comes with software pre-installed and is accessible from anywhere


Instructor Sparsh Mittal

Prof. Sparsh Mittal

Faculty ECE Dept
IIT Roorkee

Dr. Sparsh Mittal is currently an assistant professor at IIT Roorkee, India. He received his B.Tech. degree from IIT Roorkee, India and his Ph.D. degree from Iowa State University (ISU), USA. He has worked as a Post-Doctoral Research Associate at Oak Ridge National Lab (ORNL), USA and as an assistant professor in the CSE department at IIT Hyderabad. He was the graduating topper of his B.Tech. batch, and his B.Tech. project received the best project award. He has received a fellowship from ISU and a performance award from ORNL.

He has published more than 100 papers at top venues, and his research has been covered by technical websites such as InsideHPC, HPCWire, Phys.org, and ScientificComputing. He is an associate editor of Elsevier's Journal of Systems Architecture. He has given invited talks at the ISC Conference in Germany, New York University, the University of Michigan and Xilinx (Hyderabad). In Stanford's list of the world's top researchers in the field of Computer Hardware & Architecture, he was ranked number 107 (for whole career) and number 3 (for the year 2019 alone).


Instructor Sandeep Giri

Sandeep Giri

Founder at CloudxLab

Past: Amazon, InMobi, D.E.Shaw

Instructor Abhinav Singh

Abhinav Singh

Co-Founder at CloudxLab

Past: Byjus

Instructor Praveen

Praveen Pavithran

Co-Founder at Yatis

Past: YourCabs, Cypress Semiconductor


Foundation Courses

1. Programming Tools and Foundational Concepts
    1. Getting Started with Linux
    2. Getting Started with Git
    3. Python Foundations
    4. Machine Learning Prerequisites (including NumPy, Pandas and Linear Algebra)
    5. Getting Started with SQL
    6. Statistics Foundations

Course on Accelerators for Deep Learning

  • Commonly used optimization strategies in deep learning
    Examples: tiling, loop optimizations, batching, quantization, pruning
    Model-size aware and processor architecture-aware pruning of DNNs
    Accuracy, performance and model-size achieved by model-size aware and architecture-aware pruning

  • Convolutional strategies: Direct, FFT-based, Winograd-based and Matrix-multiplication based
    Understanding their compute and memory characteristics and pros and cons; the deep learning frameworks that use these strategies

  • Deep learning on FPGAs and case study of Microsoft's Brainwave
    Optimizing deep learning applications on FPGAs, clustering, etc.; efficacy of FPGAs for binarized neural networks (BNNs)
    Architecture of Microsoft’s Brainwave and the optimizations used, such as pinning of parameters in the on-chip memory

  • Deep learning on an ASIC (especially Google's Tensor Processing Unit)
    Architecture of Google TPUv1/v2
    Qualitative comparison between Google’s TPU and Microsoft’s Brainwave

  • Deep learning on Embedded System (especially NVIDIA's Jetson Platform)
    Comparison of architectural parameters of Jetson (TK1, TX1, TX2) with Intel UP, Raspberry Pi, DSP and FPGA
    Study of some real-life applications mapped to Jetson platform, e.g., driver drowsiness detection, pill image recognition, local processing of CNN on a drone, drone racing, classifying weeds from drone imagery, detecting foot ulcers using a CNN, identifying faces of suspected people, etc.

  • Deep learning on Edge Devices (smartphones)
    Challenges and opportunities faced in running Facebook app (which uses deep learning models) on smartphones of varied configurations (architecture/compute/memory-capacity)

  • Deep learning on CPUs
    Pros and cons of using CPUs in deep learning
    Opportunities for CPUs in deep learning, such as low and medium-parallelism workloads, mobile platforms, etc.

  • Case study: Hardware/system challenges in autonomous driving
    Comparison of CPU/GPU/FPGA/ASIC in running CNN workloads used for autonomous driving

  • Accelerators for recurrent neural networks (RNNs)
    Unique architectural characteristics of RNNs compared to CNNs
    Acceleration of RNNs on FPGAs, ASICs, etc.
    Optimization techniques such as pipelining, parallelization, batching, pruning, low-precision, etc.; exploiting tradeoff between compute and memory

  • Understanding reliability of deep-learning accelerators and algorithms
    Reliability impact of errors in early and late layers of a CNN
    Resilience of convolution and fully-connected layers
    Techniques for designing resilient deep-learning accelerators

  • Understanding hardware security of deep-learning accelerators and algorithms
    Side-channel attacks
    Fault-injection attacks
    Defense mechanisms

  • Distributed training of DNNs
    Need for distributed training
    Challenges in and techniques for distributed training
    Case study: Training AlexNet in minutes using massively parallel supercomputers/GPU-clusters
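
To give a flavor of two of the optimization strategies named in the syllabus (pruning and quantization), here is a minimal sketch in plain Python. It is not taken from the course material: the function names `prune_by_magnitude`, `quantize_int8` and `dequantize` are hypothetical, and the choices of magnitude-based pruning and symmetric 8-bit quantization are assumptions made for illustration only.

```python
def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude weights (hypothetical helper).

    `sparsity` is the fraction of weights to set to zero, e.g. 0.5.
    """
    n_prune = int(len(weights) * sparsity)
    # Indices of the weights, ordered from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    to_zero = set(order[:n_prune])
    return [0.0 if i in to_zero else w for i, w in enumerate(weights)]


def quantize_int8(weights):
    """Symmetric uniform quantization to the range [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Toy weight vector, made up for this example.
weights = [0.02, -0.75, 0.10, 1.27, -0.01, 0.40]
pruned = prune_by_magnitude(weights, sparsity=0.5)
codes, scale = quantize_int8(pruned)
restored = dequantize(codes, scale)

print(pruned)
print(codes, scale)
print(restored)
```

Pruning removes half of the weights here, and quantization stores the rest as small integers plus one scale factor. The restored values differ from the pruned ones by at most half a quantization step, which is the kind of accuracy versus model-size tradeoff the syllabus refers to.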

Apply Now

Application Process

  • Step 1. Submit the application form and SOP (Statement of Purpose)
    Register by filling in the application form

  • Step 2. Reviewing the application
    The admission team will review the application and respond with the application status within 48 hours

  • Step 3. Join The Program
    Seat confirmation is subject to payment


The candidate should have a basic understanding of deep learning, especially the basics of CNNs and RNNs. A background in computer architecture or embedded systems is preferred but not mandatory.

No Cost EMI at


Or Program Fee 459

  • 3 Months Program
  • 90 Days of Online Lab Access
  • 24x7 Support
  • Certificate from IIT Roorkee

Placement Assistance

Placement Eligibility Test

We have 300+ recruitment partners who will interview you based on your performance in the Placement Eligibility Test (PET)

Profile Building Sessions

Sessions will be conducted to guide you on creating the perfect resume and professional profile to get noticed by recruiters

Career Guidance Webinars

Career Guidance Webinars from seasoned industry experts


Frequently Asked Questions

What are the prerequisites for this course?

The candidate should have a basic understanding of deep learning, especially the basics of CNNs and RNNs. A background in computer architecture or embedded systems is preferred but not mandatory.

What are the expected career options after pursuing this course?

Learners who successfully complete this course are expected to be able to solve problems more efficiently using some of the latest technologies in the industry. They will be a strong fit for roles in VLSI, semiconductor, and similar industries.

What is your refund policy?

If you are unhappy with the product for any reason, let us know within 7 days of purchasing or upgrading your account, and we'll cancel your account and issue a full refund. Please contact us at reachus@cloudxlab.com to request a refund within the stipulated time. We will be sorry to see you go though!

Do I need to install any software before starting this course?

No, we will provide you with access to our online lab and BootML so that you do not have to install anything on your local machine.

What is the validity of course material?

We understand that you might need the course material for a longer duration to make the most of your subscription. You will get lifetime access to the course material so that you can refer to it anytime.