Big Data with Hadoop & Spark by CloudxLab for $59 | Enroll Now
Learn Python, NumPy, Pandas, Scikit-learn, HDFS, ZooKeeper, Hive, HBase, NoSQL, Oozie, Flume, Sqoop, Spark, Spark RDD, Spark Streaming, Kafka, SparkR, SparkSQL, MLlib, Regression, Clustering, Classification, SVM, Random Forests, Decision Trees, Dimensionality Reduction, TensorFlow, Convolutional & Recurrent Neural Networks, Autoencoders, Reinforcement Learning, and more
The Electronics & ICT Academy program is sponsored by the Ministry of Electronics and Information Technology, Govt. of India.
The E&ICT Academy IIT Roorkee conducts short courses/FDPs in emerging areas to enrich and upgrade subject knowledge and technical skills, benefiting faculty, working professionals, and Govt. employees.
The trained beneficiaries are expected to create a cascading effect, transforming the competencies and standards in the parent institutes/organizations.
The E&ICT Academy IIT Roorkee, supported by the Ministry of Electronics and Information Technology (MeitY) with CloudxLab as industry partner, is conducting a training program in Data Science.
The E&ICT courses lay special emphasis on hands-on learning with participation from industry experts. These programs also enable the participants and institutes to build industry connects, upgrade lab facilities and create opportunities for collaboration.
E&ICT courses are recognized by the All India Council for Technical Education (AICTE) at par with QIP for recognition/credits.
As of now, the E&ICT Academy, IIT Roorkee has conducted 74 courses and trained over 4,000 beneficiaries.
For more details, please visit the E&ICT Academy (IIT Roorkee) official website here: https://eict.iitr.ac.in/
This Data Science Certification Program is a self-paced online course, giving you complete freedom over your schedule.
The course comprises over 180 hours of video content across 5 courses (Big Data with Hadoop, Big Data with Spark, Python, Machine Learning, and Deep Learning).
Additionally, the course comes with our exclusive lab access, so you can gain the much-needed hands-on experience of solving real-world problems.
Upon successfully completing the course, you will receive a certificate from E&ICT, IIT Roorkee, which you can use to progress in your career and find better opportunities.
1. Introduction to Linux
2. Introduction to Python
3. Hands-on using Jupyter on CloudxLab
4. Overview of Linear Algebra
5. Introduction to NumPy & Pandas
6. Quizzes, gamified assessments & projects
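As a quick taste of the NumPy and Pandas hands-on work above, here is a minimal sketch using a small, made-up table of scores (names and values are illustrative only):

```python
import numpy as np
import pandas as pd

# A small DataFrame of hypothetical student scores
df = pd.DataFrame({
    "name": ["Asha", "Ravi", "Meena"],
    "score": [82, 91, 77],
})

# Vectorised NumPy arithmetic on a column
mean_score = np.mean(df["score"].to_numpy())

# Filter rows with a boolean mask
above_mean = df[df["score"] > mean_score]
print(round(mean_score, 2))       # average of the three scores
print(list(above_mean["name"]))   # ['Ravi']
```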
1.1 Big Data Introduction
1.2 Distributed systems
1.3 Big Data Use Cases
1.4 Various Solutions
1.5 Overview of Hadoop Ecosystem
1.6 Spark Ecosystem Walkthrough
2.1 Understanding the CloudxLab
2.2 Getting Started - Hands on
2.3 Hadoop & Spark Hands-on
2.4 Quiz and Assessment
2.5 Basics of Linux - Quick Hands-On
2.6 Understanding Regular Expressions
2.7 Quiz and Assessment
2.8 Setting up VM (optional)
3.1 ZooKeeper - Race Condition
3.2 ZooKeeper - Deadlock
3.4 Quiz & Assessment
3.5 How does election happen - Paxos Algorithm?
3.6 Use cases
3.7 When not to use
3.8 Quiz & Assessment
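The race-condition problem introduced above can be seen in miniature with plain Python threads (a local analogy, not ZooKeeper itself): without coordination, concurrent read-modify-write updates can be lost, which is exactly the class of problem distributed locks solve at cluster scale.

```python
import threading

counter = 0
lock = threading.Lock()

def increment(n):
    global counter
    for _ in range(n):
        # Without the lock, the read-modify-write below could interleave
        # across threads and lose updates (a race condition).
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 40000 — the lock serialises the critical section
```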
4.1 Why HDFS or Why not existing file systems?
4.2 HDFS - NameNode & DataNodes
4.4 Advanced HDFS Concepts (HA, Federation)
4.6 Hands-on with HDFS (Upload, Download, SetRep)
4.7 Quiz & Assessment
4.8 Data Locality (Rack Awareness)
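A back-of-the-envelope sketch of the block and replication concepts above (the defaults here mirror common Hadoop settings, but the real values come from cluster configuration):

```python
import math

def hdfs_storage(file_size_mb, block_size_mb=128, replication=3):
    """Return (num_blocks, total_storage_mb) for a file on HDFS.

    HDFS splits a file into fixed-size blocks and stores `replication`
    copies of each block across DataNodes.
    """
    num_blocks = math.ceil(file_size_mb / block_size_mb)
    total_storage = file_size_mb * replication
    return num_blocks, total_storage

blocks, storage = hdfs_storage(500)
print(blocks, storage)  # 4 blocks, 1500 MB of raw storage
```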
5.1 YARN - Why not existing tools?
5.2 YARN - Evolution from MapReduce 1.0
5.3 Resource Management: YARN Architecture
5.4 Advanced Concepts - Speculative Execution
6.1 MapReduce - Understanding Sorting
6.2 MapReduce - Overview
6.4 Example 0 - Word Frequency Problem - Without MR
6.5 Example 1 - Only Mapper - Image Resizing
6.6 Example 2 - Word Frequency Problem
6.7 Example 3 - Temperature Problem
6.8 Example 4 - Multiple Reducer
6.9 Example 5 - Java MapReduce Walkthrough
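The "without MapReduce" baseline in Example 0 amounts to counting on a single machine; a minimal Python sketch of that baseline:

```python
from collections import Counter

def word_frequency(text):
    """Count word occurrences on one machine — the single-node baseline
    that the distributed MapReduce version parallelises."""
    return Counter(text.lower().split())

freq = word_frequency("the quick brown fox jumps over the lazy dog the end")
print(freq["the"])  # 3
```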
7.1 Writing MapReduce Code Using Java
7.2 Building MapReduce project using Apache Ant
7.3 Concept - Associative & Commutative
7.5 Example 8 - Combiner
7.6 Example 9 - Hadoop Streaming
7.7 Example 10 - Adv. Problem Solving - Anagrams
7.8 Example 11 - Adv. Problem Solving - Same DNA
7.9 Example 12 - Adv. Problem Solving - Similar DNA
7.10 Example 12 - Joins - Voting
7.11 Limitations of MapReduce
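Hadoop Streaming (Example 9) lets you write the mapper and reducer in any language that reads stdin and writes stdout. Here is a rough local sketch of a word-count mapper and reducer, with Hadoop's shuffle phase emulated by a sort:

```python
# Hadoop Streaming runs any executable as mapper/reducer, reading lines
# on stdin and writing key-value pairs on stdout. These two functions
# sketch the logic; the driver at the bottom emulates the shuffle
# (sort by key) that Hadoop performs between the two phases.

def mapper(lines):
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)

def reducer(sorted_pairs):
    current, total = None, 0
    for word, count in sorted_pairs:
        if word != current:
            if current is not None:
                yield (current, total)
            current, total = word, 0
        total += count
    if current is not None:
        yield (current, total)

# Local emulation: map -> sort (shuffle) -> reduce
pairs = sorted(mapper(["to be or not", "to be"]))
result = dict(reducer(pairs))
print(result)  # {'be': 2, 'not': 1, 'or': 1, 'to': 2}
```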
8.1 Pig - Introduction
8.2 Pig - Modes
8.3 Getting Started
8.4 Example - NYSE Stock Exchange
8.5 Concept - Lazy Evaluation
9.1 Hive - Introduction
9.2 Hive - Data Types
9.3 Getting Started
9.4 Loading Data in Hive (Tables)
9.5 Example: Movielens Data Processing
9.6 Advanced Concepts: Views
9.7 Connecting Tableau and HiveServer 2
9.8 Connecting Microsoft Excel and HiveServer 2
9.9 Project: Sentiment Analysis of Twitter Data
9.10 Advanced - Partition Tables
9.11 Understanding HCatalog & Impala
10.1 NoSQL - Scaling Out / Up
10.2 NoSQL - ACID Properties and RDBMS Story
10.3 CAP Theorem
10.4 HBase Architecture - Region Servers, etc.
10.5 HBase Data Model - Column-Family Orientation
10.6 Getting Started - Create table, Adding Data
10.7 Advanced Example - Google Links Storage
10.8 Concept - Bloom Filter
10.9 Comparison of NoSQL Databases
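The Bloom filter concept above can be sketched in a few lines of Python (a toy version; HBase's real implementation differs, but the lookups-may-be-false-positive, never-false-negative property is the same):

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter sketch: a bit array plus k hash functions.
    HBase uses Bloom filters to skip store files that cannot contain a
    given row key."""

    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive k positions by salting one cryptographic hash
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def might_contain(self, item):
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
bf.add("row-key-42")
print(bf.might_contain("row-key-42"))   # True — definitely added
print(bf.might_contain("row-key-99"))   # almost certainly False
```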
11.1 Sqoop - Introduction
11.2 Sqoop Import - MySQL to HDFS
11.3 Exporting to MySQL from HDFS
11.4 Concept - Unbounded Dataset Processing (Stream Processing)
11.5 Flume Overview: Agents - Source, Sink, Channel
11.6 Example 1 - Data from Local network service into HDFS
11.7 Example 2 - Extracting Twitter Data
11.9 Example 3 - Creating workflow with Oozie
1.1 Apache Spark ecosystem walkthrough
1.2 Spark Introduction - Why Spark?
2.1 Scala - Quick Introduction - Access Scala on CloudxLab
2.2 Scala - Quick Introduction - Variables and Methods
2.3 Getting Started: Interactive, Compilation, SBT
2.4 Types, Variables & Values
2.9 More Features
2.10 Quiz and Assessment
3.1 Apache Spark ecosystem walkthrough
3.2 Spark Introduction - Why Spark?
3.3 Using the Spark Shell on CloudxLab
3.4 Example 1 - Performing Word Count
3.5 Understanding Spark Cluster Modes on YARN
3.6 RDDs (Resilient Distributed Datasets)
3.7 General RDD Operations: Transformations & Actions
3.8 RDD lineage
3.9 RDD Persistence Overview
3.10 Distributed Persistence
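The lazy-transformation and lineage ideas above can be modelled in plain Python (a toy class for intuition, not the PySpark API): transformations only record what to do, and nothing runs until an action is called.

```python
class ToyRDD:
    """Toy model of an RDD: transformations just record lineage;
    nothing executes until an action like collect() is called."""

    def __init__(self, data, lineage=()):
        self._data = data
        self.lineage = lineage          # names of recorded transformations

    def map(self, fn):
        return ToyRDD((fn(x) for x in self._data),
                      self.lineage + ("map",))

    def filter(self, pred):
        return ToyRDD((x for x in self._data if pred(x)),
                      self.lineage + ("filter",))

    def collect(self):                  # action: triggers evaluation
        return list(self._data)

rdd = ToyRDD(range(10)).map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(rdd.lineage)    # ('map', 'filter') — recorded, not yet run
print(rdd.collect())  # [0, 4, 16, 36, 64]
```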
4.1 Creating the SparkContext
4.2 Building a Spark Application (Scala, Java, Python)
4.3 The Spark Application Web UI
4.4 Configuring Spark Properties
4.5 Running Spark on Cluster
4.6 RDD Partitions
4.7 Executing Parallel Operations
4.8 Stages and Tasks
5.1 Common Spark Use Cases
5.2 Example 1 - Data Cleaning (Movielens)
5.3 Example 2 - Understanding Spark Streaming
5.4 Understanding Kafka
5.5 Example 3 - Spark Streaming from Kafka
5.6 Iterative Algorithms in Spark
5.7 Project: Real-time analytics of orders in an e-commerce company
6.1 InputFormat and InputSplit
6.5 How to store many small files - SequenceFile?
6.7 Protocol Buffers
6.8 Comparing Compressions
6.9 Understanding Row Oriented and Column Oriented Formats - RCFile?
7.1 Spark SQL - Introduction
7.2 Spark SQL - Dataframe Introduction
7.3 Transforming and Querying DataFrames
7.4 Saving DataFrames
7.5 DataFrames and RDDs
7.6 Comparing Spark SQL, Impala, and Hive-on-Spark
8.1 Machine Learning Introduction
8.2 Applications Of Machine Learning
8.3 MLlib Example: k-means
8.4 SparkR Example
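As a single-machine illustration of the k-means algorithm that MLlib parallelises, here is a plain NumPy sketch of Lloyd's algorithm on two made-up blobs (the data and parameters are illustrative only):

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain NumPy k-means (Lloyd's algorithm) — a single-machine
    sketch of what MLlib runs in parallel over an RDD."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest center
        dists = np.linalg.norm(points[:, None] - centers[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each center to the mean of its assigned points
        centers = np.array([points[labels == j].mean(axis=0)
                            if np.any(labels == j) else centers[j]
                            for j in range(k)])
    return centers, labels

# Two well-separated blobs
pts = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
                [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]])
centers, labels = kmeans(pts, k=2)
print(sorted(centers[:, 0].round(1)))  # centers near x ≈ 0.1 and x ≈ 5.1
```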
Statistical Inference, Types of Variables, Probability Distribution, Normality, Measures of Central Tendencies, Normal Distribution
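The measures of central tendency covered here can be computed with Python's standard-library statistics module; a small sketch on a made-up sample:

```python
import statistics

# Measures of central tendency on a small hypothetical sample
data = [12, 15, 15, 18, 21, 24, 30]

mean = statistics.mean(data)       # arithmetic average
median = statistics.median(data)   # middle value, robust to outliers
mode = statistics.mode(data)       # most frequent value
stdev = statistics.stdev(data)     # sample standard deviation

print(round(mean, 2), median, mode)
```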
Introduction to Machine Learning, Machine Learning Application, Introduction to AI, Different types of Machine Learning - Supervised, Unsupervised, Reinforcement
Machine Learning Projects Checklist, Frame the problem and look at the big picture, Get the data, Explore the data to gain insights, Prepare the data for Machine Learning algorithms, Explore many different models and short-list the best ones, Fine-tune model, Present the solution, Launch, monitor, and maintain the system
Training a Binary classification, Performance Measures, Confusion Matrix, Precision and Recall, Precision/Recall Tradeoff, The ROC Curve, Multiclass Classification, Multilabel Classification, Multioutput Classification
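Precision and recall fall straight out of confusion-matrix counts; a minimal sketch with hypothetical counts:

```python
def precision_recall(tp, fp, fn):
    """Precision = TP/(TP+FP); Recall = TP/(TP+FN)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall

# Hypothetical binary classifier: 80 true positives, 20 false positives,
# 40 false negatives
p, r = precision_recall(tp=80, fp=20, fn=40)
f1 = 2 * p * r / (p + r)   # F1 score: harmonic mean of precision and recall
print(p, round(r, 3), round(f1, 3))  # 0.8, ~0.667, ~0.727
```

Raising the decision threshold typically trades recall for precision, which is the precision/recall tradeoff the module explores.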
Linear Regression, Gradient Descent, Polynomial Regression, Learning Curves, Regularized Linear Models, Logistic Regression
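Gradient descent for linear regression can be sketched in a few lines of NumPy (synthetic data with assumed true parameters w=3, b=2):

```python
import numpy as np

# Fit y = w*x + b by batch gradient descent on mean squared error —
# a minimal sketch of the algorithm before reaching for Scikit-learn.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 100)
y = 3.0 * x + 2.0 + rng.normal(0, 0.1, 100)   # true w=3, b=2, small noise

w, b, lr = 0.0, 0.0, 0.01
for _ in range(2000):
    y_hat = w * x + b
    # Gradients of MSE = mean((y_hat - y)^2) w.r.t. w and b
    grad_w = 2 * np.mean((y_hat - y) * x)
    grad_b = 2 * np.mean(y_hat - y)
    w -= lr * grad_w
    b -= lr * grad_b

print(round(w, 1), round(b, 1))  # close to the true 3.0 and 2.0
```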
Linear SVM Classification, Nonlinear SVM Classification, SVM Regression
Training and Visualizing a Decision Tree, Making Predictions, Estimating Class Probabilities, The CART Training Algorithm, Gini Impurity or Entropy, Regularization Hyperparameters, Regression, Instability
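The Gini impurity measure used by CART can be computed directly; a minimal sketch with made-up class counts:

```python
def gini_impurity(class_counts):
    """Gini impurity = 1 - sum(p_i^2); 0 means a pure node."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

def split_gini(left_counts, right_counts):
    """Weighted impurity of a candidate split; CART greedily picks
    the split that minimises this."""
    n_l, n_r = sum(left_counts), sum(right_counts)
    n = n_l + n_r
    return (n_l / n) * gini_impurity(left_counts) \
         + (n_r / n) * gini_impurity(right_counts)

print(gini_impurity([50, 0]))          # 0.0 — pure node
print(gini_impurity([25, 25]))         # 0.5 — maximally mixed (2 classes)
print(split_gini([40, 10], [10, 40]))  # ~0.32 — a fairly good split
```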
Voting Classifiers, Bagging and Pasting, Random Patches and Random Subspaces, Random Forests, Boosting, Stacking
The Curse of Dimensionality, Main Approaches for Dimensionality Reduction, PCA, Kernel PCA, LLE, Other Dimensionality Reduction Techniques
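PCA itself can be sketched in NumPy via the SVD (a minimal version of the idea, not Scikit-learn's full implementation; the sample data is made up to lie exactly on a 2-D plane):

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD: center the data, then project onto the top
    right-singular vectors (the principal components)."""
    X_centered = X - X.mean(axis=0)
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    return X_centered @ components.T, components

# 3-D points that actually lie on a 2-D plane
rng = np.random.default_rng(0)
plane = rng.normal(size=(100, 2))
X = plane @ np.array([[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]])

X_reduced, comps = pca(X, n_components=2)
print(X_reduced.shape)  # (100, 2) — dimensionality reduced from 3 to 2
```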
Deep Learning Applications, Artificial Neural Network, TensorFlow Demo, Deep Learning Frameworks
Installation, Creating Your First Graph and Running It in a Session, Managing Graphs, Lifecycle of a Node Value, Linear Regression with TensorFlow, Implementing Gradient Descent, Feeding Data to the Training Algorithm, Saving and Restoring Models, Visualizing the Graph and Training Curves Using TensorBoard, Name Scopes, Modularity, Sharing Variables
From Biological to Artificial Neurons, Training an MLP with TensorFlow’s High-Level API, Training a DNN Using Plain TensorFlow, Fine-Tuning Neural Network Hyperparameters
Vanishing / Exploding Gradients Problems, Reusing Pretrained Layers, Faster Optimizers, Avoiding Overfitting Through Regularization, Practical Guidelines
The Architecture of the Visual Cortex, Convolutional Layer, Pooling Layer, CNN Architectures
Recurrent Neurons, Basic RNNs in TensorFlow, Training RNNs, Deep RNNs, LSTM Cell, GRU Cell, Natural Language Processing
Efficient Data Representations, Performing PCA with an Undercomplete Linear Autoencoder, Stacked Autoencoders, Unsupervised Pretraining Using Stacked Autoencoders, Denoising Autoencoders, Sparse Autoencoders, Variational Autoencoders
Learning to Optimize Rewards, Policy Search, Introduction to OpenAI Gym, Neural Network Policies, Evaluating Actions: The Credit Assignment Problem, Policy Gradients, Markov Decision Processes, Temporal Difference Learning and Q-Learning, Learning to Play Ms. Pac-Man Using Deep Q-Learning
Download all the emails in your inbox using the GYB command-line tool. Then analyze your emails using NumPy and Pandas to uncover various interesting insights.
We start the Machine Learning course with this end-to-end project. Learn data manipulation, visualization, and cleaning techniques using Python libraries such as Pandas, Scikit-Learn, and Matplotlib.
The MNIST dataset is considered the "Hello World!" of Machine Learning. Write your first classification logic. Starting with binary classification, learn multiclass, multilabel, and multi-output classification, along with different error-analysis techniques.
Build a model that takes a noisy image as an input and outputs the clean image.
The Iris dataset contains 3 classes of 50 instances each, where each class refers to a type of iris plant: Setosa, Versicolor, and Virginica. Learn Decision Trees, the CART algorithm, and ensemble methods. Then use a Random Forest classifier to make predictions.
The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. In this project, you build a model to predict which passengers survived the tragedy.
Build a model to predict bike demand given past data.
Build a model to classify email as spam or ham. First, download examples of spam and ham from Apache SpamAssassin’s public datasets and then train a model to classify email.
In this project, you will build a basic neural network to classify whether a given image is of a cat or not.
Download images of various animals and then download the latest pretrained Inception v3 model. Run the model to classify downloaded images and display the top five predictions for each image, along with the estimated probability.
Build a model to classify clothes into various categories in Fashion MNIST dataset.
This is a time series prediction task: you are given snapshots of polarimetric radar values and asked to predict the hourly rain gauge total.
Sentiment analysis of "Iron Man 3" movie using Hive and visualizing the sentiment data using BI tools such as Tableau
Process the NSE (National Stock Exchange) data using Hive for various insights
Analyze MovieLens data using Hive
Generate movie recommendations using Spark MLlib
Derive the importance of various handles at Twitter using Spark GraphX
Churn the logs of NASA Kennedy Space Center WWW server using Spark to find out useful business and devops metrics
Write end-to-end Spark application starting from writing code on your local machine to deploying to the cluster
Real-time analytics dashboard for an e-commerce company using Apache Spark, Kafka, Spark Streaming, Node.js, Socket.IO and Highcharts
Our course is exhaustive, and the certificate awarded by us is proof that you have taken a big leap in Hadoop, Spark, Machine Learning, and Deep Learning.
The knowledge you have gained from working on projects, videos, quizzes, hands-on assessments and case studies gives you a competitive edge.
Highlight your new skills on your resume, LinkedIn, Facebook and Twitter. Tell your friends and colleagues about it.
You need to complete at least 60% of the topics from the course. You also need to complete projects 1 and 2 from Python, Sentiment Analysis (Hive) from Hadoop, Log Parsing from Spark, any 3 projects from Machine Learning, and any 2 projects from Deep Learning. All the above requirements must be met within 330 days of the course enrollment date to be eligible for the certificate.
Please log in at CloudxLab.com with your Gmail ID and access your course under "My Courses".
Have more questions? Please contact us at email@example.com