As humans, we are immersed in data in our everyday lives. According to IBM, the amount of data in the world doubles every two years. The value that data holds can only be understood once we start to identify patterns and trends in it. Conventional computing approaches stop working when data becomes huge.
There is massive growth in the big data space, and job opportunities are skyrocketing, making this the perfect time to launch your career in this field.
In this course, you will learn Hadoop to drive better business decisions and solve real-world problems.
What is Big Data?
Big Data Use Cases
Overview of Hadoop Ecosystem
Spark Ecosystem Walkthrough
Understanding the CloudxLab
Hadoop & Spark Hands-on
Quiz and Assessment
Basics of Linux - Quick Hands-On
Understanding Regular Expressions
Quiz and Assessment
Setting up VM (optional)
Why do we need it?
Understanding Data Model
Quiz & Assessment
How does election happen - Paxos Algorithm?
When not to use
Quiz & Assessment
Why HDFS or Why not existing file systems?
Understanding the architecture
Advanced HDFS Concepts (HA, Federation)
Hands-on with HDFS (Upload, Download, SetRep)
Quiz & Assessment
Data Locality (Rack Awareness)
Computing - Why not existing tools?
Resource Management: YARN Architecture
Advanced Concepts - Speculative Execution
Understanding MapReduce Framework
Example 0 - Word Frequency Problem - Without MR
Example 1 - Only Mapper - Image Resizing
Example 2 - Word Frequency Problem
Example 3 - Temperature Problem
Example 4 - Multiple Reducer
Example 5 - Java MapReduce Walkthrough
Example 6 - Secondary Sorting (Word Recommendation)
Example 7 - Partitioner
Concept - Associative & Commutative
Example 8 - Combiner
Example 9 - Hadoop Streaming
Example 10 - Adv. Problem Solving - Anagrams
Example 11 - Adv. Problem Solving - Same DNA
Example 12 - Adv. Problem Solving - Similar DNA
Example 13 - Joins - Voting
Limitations of MapReduce
Basic Structure of Pig Latin
Example - NYSE Stock Exchange
Concept - Lazy Evaluation
Hive Architecture Overview
Loading Data in Hive (Tables)
Example: Movielens Data Processing
Advanced Concepts: Views
Connecting Tableau and HiveServer2
Connecting Microsoft Excel and HiveServer2
Project: Sentiment Analysis of Twitter Data
Advanced - Partition Tables
Understanding HCatalog & Impala
Case Study: The days before NoSQL
What is NoSQL?
HBase Architecture - Region Servers etc
HBase Data Model - Column-Family Orientation
Getting Started - Create table, Adding Data
Adv Example - Google Links Storage
Concept - Bloom Filter
Comparison of NoSQL Databases
Import From MySQL to HDFS, Hive, HBase
Exporting to MySQL from HDFS
Concept - Unbounded Dataset Processing or Stream Processing
Flume Overview: Agents - Source, Sink, Channel
Example 1 - Data from Local network service into HDFS
Example 2 - Extracting Twitter Data
Example 3 - Creating workflow with Oozie
1. Sentiment analysis of "Iron Man 3" movie using Hive and visualizing the sentiment data using BI tools such as Tableau
2. Process the NSE (National Stock Exchange) data using Hive for various insights
3. Analyze MovieLens data using Hive
Our Specialization is exhaustive, and the certificate we award is proof that you have taken a big leap in the Big Data domain.
The knowledge you have gained from working on projects, videos, quizzes, hands-on assessments and case studies gives you a competitive edge.
Highlight your new skills on your resume, LinkedIn, Facebook and Twitter. Tell your friends and colleagues about it.
The instructors for this course are industry experts with years of experience in mentoring students across the world.
The course takes 2-3 months at 6-8 hours of effort per week.
We understand that you might need the course material for a longer duration to make the most of your subscription. You will get lifetime access (for as long as the company is operational) to the course material so that you can refer to it anytime.
In online instructor-led training, Sandeep Giri and his team of experts will train you alongside a group of other course learners for 25+ hours over online conferencing software such as Zoom. Classes take place every Saturday and Sunday.
We offer our learners mentoring sessions with industry leaders and professionals, so you can get one-on-one help with any questions you may have, whether they are technical, job-related or anything else.
It is a paid service available exclusively to learners enrolled in the course. We will share subscription details after the course is launched.
At the end of the course, you will work on a real-time project. You will receive a problem statement along with a dataset to work on in CloudxLab. Once you are done with the project (it will be reviewed by an expert), you will be awarded a certificate which you can share on LinkedIn.
Enrollment in the self-paced course includes 90 days of free access to CloudxLab. Enrollment in the instructor-led course includes 90 days of free access to CloudxLab, depending on the date of enrollment.
Yes. Java is generally required for understanding MapReduce. MapReduce is a programming paradigm in which you write your logic in the form of mapper and reducer functions. We provide a self-paced course on Java for free. As soon as you sign up, it will be available in your account section.
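To give a feel for what "mapper and reducer functions" means, here is a minimal sketch of the word-frequency problem. It is written in plain Python for brevity and runs in a single process, so it does not use the Hadoop Java API. The names `mapper`, `reducer` and `run_job` are hypothetical and chosen only for this illustration; they are not taken from the course material.

```python
from collections import defaultdict


def mapper(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    """Combine all counts emitted for a single word into one total."""
    return word, sum(counts)


def run_job(lines):
    """Simulate the map, shuffle and reduce phases in a single process.

    Assumption: on a real cluster the framework does the grouping
    (shuffle) across machines; here a dictionary stands in for it.
    """
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)  # shuffle: group values by key
    return dict(reducer(word, counts) for word, counts in grouped.items())


if __name__ == "__main__":
    print(run_job(["big data big ideas", "data at scale"]))
```

The point of the split is that the mapper only ever sees one line and the reducer only ever sees one word, which is what lets a framework spread the work across many machines.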
The course requires a good internet connection (1 Mbps or more) and a browser to watch videos and do hands-on work in the lab. We've configured all the tools in the lab so that you can focus on learning and practicing in a real-world cluster.
For the self-paced course, we provide a 100% refund of fees if the request is raised within 7 days of the enrollment date. Thereafter, no refund is provided.
For the instructor-led course, we provide a 100% refund if no more than one live session has been conducted, and a 50% refund if 2-4 live sessions have been conducted. If 5 or more live sessions have been conducted, no refund is provided.
Yes, you can renew your subscription anytime. Please choose your desired plan for the lab and make the payment to renew your subscription.
Have more questions? Please contact us at email@example.com