Learn Python, NumPy, Pandas, Scikit-learn, HDFS, ZooKeeper, Hive, HBase, NoSQL, Oozie, Flume, Sqoop, Spark, Spark RDD, Spark Streaming, Kafka, SparkR, SparkSQL, MLlib, Regression, Clustering, Classification, SVM, Random Forests, Decision Trees, Dimensionality Reduction, TensorFlow 2, Keras, Convolutional & Recurrent Neural Networks, Autoencoders, and Reinforcement Learning
Learn HDFS, ZooKeeper, Hive, HBase, NoSQL, Oozie, Flume, Sqoop, Spark, Spark RDD, Spark Streaming, Kafka, SparkR, SparkSQL, MLlib, and GraphX.
In this chapter, we learn the basics of Big Data which include various concepts, use-cases and understanding of the eco-system.
This chapter doesn't require any knowledge of programming or technology. We believe it is very useful for every to learn the basics of Big Data. So, jump in!
Happy Learning!
MapReduce is Framework as well as a paradigm of computing. By the way of map-reduce, we are able to break-down complex computation into distributed computing.
As part of this chapter, we are going to learning how to build MapReduce programmes using Java.
Please make sure you work along with the course instead of just sitting back and watching.
Happy Learning!
There are many Big Data Solution stacks.
The first and most powerful stack is Apache Hadoop and Spark together. While Hadoop provides storage for structured and unstructured data, Spark provides the computational capability on top of Hadoop.
The second way could be to use Cassandra or MongoDB. The third could be to use Google Compute Engine or Microsoft Azure. In such cases, you would have to upload your data to Google or Microsoft which may not be acceptable to your organization sometimes.
In this post, we will understand the basics of: