Understanding Big Data Stack - Apache Hadoop and Spark

There are many Big Data solution stacks.

The first and most powerful stack is Apache Hadoop and Spark together. While Hadoop provides storage for structured and unstructured data, Spark provides the computational capability on top of Hadoop.
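To make the split between storage and computation concrete, here is a minimal sketch in plain Python. It is a toy model only and does not use the Hadoop or Spark APIs: the function names (`split_into_blocks`, `count_words`, `merge_counts`), the sample lines, and the block size are all assumptions made for illustration. The storage side cuts the data into blocks, and the compute side processes each block independently before merging the partial results.

```python
from collections import Counter
from functools import reduce


def split_into_blocks(lines, block_size):
    """Storage side (toy stand-in for Hadoop): cut the data into fixed-size blocks."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def count_words(block):
    """Compute side (toy stand-in for a Spark task): process one block on its own."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def merge_counts(left, right):
    """Combine the partial results produced for two blocks."""
    return left + right


# Hypothetical sample data; a real cluster would hold far larger files.
lines = [
    "Hadoop stores data",
    "Spark computes on data",
    "Hadoop and Spark work together",
]

blocks = split_into_blocks(lines, block_size=2)      # storage layer
partial_counts = [count_words(b) for b in blocks]    # one task per block
total = reduce(merge_counts, partial_counts, Counter())

print(total["hadoop"], total["spark"], total["data"])
```

In a real deployment the blocks would live on different machines and the per-block tasks would run in parallel; here everything runs in a single process just to show the division of responsibilities.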

A second option is to use Cassandra or MongoDB. A third is to use a cloud platform such as Google Compute Engine or Microsoft Azure. In that case, you would have to upload your data to Google or Microsoft, which may not be acceptable to your organization.

In this post, we will cover:

  • the basics of Apache Hadoop
  • the components of the Hadoop ecosystem
  • an overview of the Apache Spark ecosystem

Blog for this playlist: Understanding Big Data Stack - Apache Hadoop and Spark

Prerequisites: We highly recommend going through our previous post, Introduction to Big Data and Distributed Systems, where we discussed the basics of Big Data and its applications in various fields.

This topic is part of the courses listed below:

Big Data Blogs

Instructor:

Machine Learning Engineer @ CloudxLab