Introduction to Apache ZooKeeper

In the Hadoop ecosystem, Apache ZooKeeper plays an important role in coordinating distributed resources. Besides being an important component of Hadoop, it is also a useful concept to know for system design interviews.

If you would prefer videos with hands-on exercises, feel free to jump in here.

Alright, so let’s get started.

Goals

In this post, we will understand the following:

  • What is Apache ZooKeeper?
  • How does ZooKeeper achieve coordination?
  • ZooKeeper Architecture
  • ZooKeeper Data Model
  • Some Hands-on with ZooKeeper
  • Election & Majority in ZooKeeper
  • ZooKeeper Sessions
  • Applications of ZooKeeper
  • What kind of guarantees does ZooKeeper provide?
  • Operations provided by ZooKeeper
  • ZooKeeper APIs
  • ZooKeeper Watches
  • ACL in ZooKeeper
  • ZooKeeper Use Cases

Distributed Computing with Locks

Introduction

Now that we know how prevalent big data is in real-world scenarios, it’s time to understand how big data systems work. This topic is central to understanding the principles behind system design and coordination among machines in big data. So let’s dive in.

Scenario:

Consider a scenario with a data resource and a worker machine that has to accomplish some task using that resource. For example, the worker has to process the data by accessing the resource. Keep in mind that the data source holds a huge amount of data; that is, the data to be processed for the task is very large.
