Course on
Big Data with Hadoop

Learn From Industry Experts With 1:1 Mentoring & Online Instructor-led Training

About the Course

As humans, we are immersed in data in our everyday lives. According to IBM, the world's data doubles every two years. The value that data holds can only be understood once we start to identify patterns and trends in it. Conventional computing approaches stop working when data becomes this large.

There is massive growth in the big data space, and job opportunities are skyrocketing, making this the perfect time to launch your career in this space.

In this specialization, you will learn Hadoop and Spark to drive better business decisions and solve real-world problems.



1 course

Learn from industry experts. Follow the suggested order or choose your own.

Projects & Lab

Apply the skills you learn on a distributed cluster to solve real-world problems.

Certificate

Highlight your new skills on your resume or LinkedIn.

1:1 Mentoring

Subscribe to 1:1 mentoring sessions and get guidance from industry leaders and professionals.

Best-in-class Support

24×7 support and forum access to answer all your queries throughout your learning journey.

Certifications

Compatible with the Cloudera CCA 175 certification
Learning Path

Course

Big Data with Hadoop

About the Course

Hardware and Software requirements:
The course requires a good internet connection (1 Mbps or more) and a browser to watch videos and do the hands-on exercises in the lab. We've configured all the tools in the lab so that you can focus on learning and practicing in a real-world cluster.

What is Big Data?

Why Now?

Big Data Use Cases

Various Solutions

Overview of Hadoop Ecosystem

Spark Ecosystem Walkthrough

Quiz

Understanding the CloudxLab

CloudxLab Hands-On

Hadoop & Spark Hands-on

Quiz and Assessment

Basics of Linux - Quick Hands-On

Understanding Regular Expressions

Quiz and Assessment

Setting up VM (optional)

Why Do we need it?

Understanding Data Model

Hands-On

Quiz & Assessment

How does election happen - Paxos Algorithm?

Use cases

When not to use

Quiz & Assessment

Why HDFS or Why not existing file systems?

Understanding the architecture

Quiz

Advanced HDFS Concepts (HA, Federation)

Quiz

Hands-on with HDFS (Upload, Download, SetRep)

Quiz & Assessment

Data Locality (Rack Awareness)

Computing - Why not existing tools?

MapReduce 1.0

Resource Management: YARN Architecture

Advanced Concepts - Speculative Execution

Quiz

Why MapReduce?

Understanding MapReduce Framework

Quiz

Example 0 - Word Frequency Problem - Without MR

Example 1 - Only Mapper - Image Resizing

Example 2 - Word Frequency Problem

Example 3 - Temperature Problem

Example 4 - Multiple Reducer

Example 5 - Java MapReduce Walkthrough

Quiz

Example 6 - Secondary Sorting (Word Recommendation)

Example 7 - Partitioner

Concept - Associative & Commutative

Quiz

Example 8 - Combiner

Example 9 - Hadoop Streaming

Example 10 - Adv. Problem Solving - Anagrams

Example 11 - Adv. Problem Solving - Same DNA

Example 12 - Adv. Problem Solving - Similar DNA

Example 13 - Joins - Voting

Limitations of MapReduce

Quiz

Why Pig?

Basic Structure of Pig Latin

Getting Started

Example - NYSE Stock Exchange

Concept - Lazy Evaluation

Why Hive?

Hive Architecture Overview

Getting Started

Loading Data in Hive (Tables)

Example: Movielens Data Processing

Advanced Concepts: Views

Connecting Tableau and HiveServer 2

Connecting Microsoft Excel and HiveServer 2

Project: Sentiment Analysis of Twitter Data

Advanced - Partition Tables

Understanding HCatalog & Impala

Quiz

Case Study: The days before NoSQL

What is NoSQL?

CAP Theorem

HBase Architecture - Region Servers etc

HBase Data Model - Column-Family Orientation

Getting Started - Create table, Adding Data

Adv Example - Google Links Storage

Concept - Bloom Filter

Comparison of NoSQL Databases

Quiz

Sqoop Overview

Import From MySQL to HDFS, Hive, HBase

Exporting to MySQL from HDFS

Concept - Unbounded Dataset Processing or Stream Processing

Flume Overview: Agents - Source, Sink, Channel

Example 1 - Data from Local network service into HDFS

Example 2 - Extracting Twitter Data

Quiz

Example 3 - Creating workflow with Oozie

Certificate

Earn your certificate

Our specialization is exhaustive, and the certificate we award is proof that you have taken a big leap in the Big Data domain.


Differentiate yourself

The knowledge you have gained from working on projects, videos, quizzes, hands-on assessments and case studies gives you a competitive edge.


Share your achievement

Highlight your new skills on your resume, LinkedIn, Facebook and Twitter. Tell your friends and colleagues about it.

 Course Certificate Sample
Enrollment
Self-paced Learning

Learn at your pace


99 149

High-quality videos, slides, hands-on examples, quizzes, automated assessments, case studies, real-world projects

Lifetime access to cutting-edge self-paced learning content

90 days of lab access for hands-on practice

24x7 support to answer your queries

Earn a certificate in Big Data with Hadoop & Apache Spark

Instructor
Sandeep Giri

Founder at CloudxLab, Amazon, InMobi, Founder @ tBits Global, D.E.Shaw

For the last 15 years, Sandeep has been building products and processing large amounts of data for various product companies. He has all-around experience in software development and big data analysis.

Apart from digging into data and technologies, Sandeep enjoys conducting interviews and explaining difficult concepts in simple ways.

Course Creators
Abhinav Singh

Co-Founder at CloudxLab, Byjus
Course Developer
Aditya Zutshi

Cisco, IMT Ghaziabad, IIT Roorkee
Program Manager
Jatin Shah

LinkedIn, Yahoo, Yale CS Ph.D., IIT-B
Course Advisor

FAQ
It will take 2-3 months with 6-8 hours of effort per week.
We understand that you might need the course material for a longer duration to make the most of your subscription. You will get lifetime access to the course material so that you can refer to it anytime.
At the end of the course, you will work on a real-time project. You will receive a problem statement along with a dataset to work on in CloudxLab. Once you are done with the project (it will be reviewed by an expert), you will be awarded a certificate which you can share on LinkedIn.
We will provide 90 days of access to CloudxLab so that you learn by practice in a real-time environment.
Yes. Java is generally required for understanding MapReduce. MapReduce is a programming paradigm for writing your logic in the form of mapper and reducer functions. We provide a self-paced course on Java for free. As soon as you sign up, it will be available in your account section.
For the self-paced course, we provide a 100% fee refund if the request is raised within 7 days of the enrolment date. Thereafter, no refund is provided.
For the instructor-led course, we provide a 100% refund if no more than 1 live session has been conducted, and a 50% refund if 2-4 live sessions have been conducted. If 5 or more live sessions have been conducted, no refund will be provided.

Have more questions? Please contact us at reachus@cloudxlab.com