Say there is an inbox whose emails we need to index. Indexing is a heavy process and might take a lot of time, so we have multiple machines indexing the emails. Every email has an id. We cannot delete any email; we can only read an email and mark it read or unread. Now, how would we handle the coordination between multiple indexer processes so that every email is indexed?
If the indexers were running as multiple threads of a single process, this would be easier: we could use the synchronization constructs of the programming language.
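As a rough illustration of the single-process case, the sketch below shares a work list between threads and guards it with a lock. The function name `index_inbox`, the thread names, and the idea of popping ids from a shared list are assumptions made for this example; the actual indexing work is left as a comment.

```python
import threading


def index_inbox(email_ids, num_workers=4):
    """Index every email once using several threads in one process."""
    pending = list(email_ids)   # shared work list of email ids
    indexed_by = {}             # email id -> name of the thread that took it
    lock = threading.Lock()     # guards both `pending` and `indexed_by`

    def worker():
        while True:
            with lock:
                if not pending:
                    return
                email_id = pending.pop()
            # The heavy indexing work would run here, outside the lock,
            # so that other threads are not blocked while it runs.
            with lock:
                indexed_by[email_id] = threading.current_thread().name

    threads = [
        threading.Thread(target=worker, name=f"indexer-{i}")
        for i in range(num_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return indexed_by


if __name__ == "__main__":
    result = index_inbox(range(10))
    print(len(result), "emails indexed")
```

The lock works only because all threads share the same memory, which is exactly what separate machines do not have.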
But since multiple processes running on multiple machines need to coordinate, we need central storage, and this central storage must be safe from concurrency-related problems. Providing such central storage is exactly the role of ZooKeeper.
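To make the idea of a concurrency-safe central store concrete, here is a minimal sketch that does not talk to a real ZooKeeper. `InMemoryStore`, `NodeExists`, `run_indexer`, and the `/indexed/<id>` path layout are all hypothetical names invented for this example, and threads stand in for the separate machines. The one property borrowed from ZooKeeper is that creating a node fails if the node already exists, so only one indexer can claim a given email.

```python
import threading


class NodeExists(Exception):
    """Raised when a node is created twice (hypothetical name)."""


class InMemoryStore:
    """Hypothetical stand-in for the central store, not a ZooKeeper client."""

    def __init__(self):
        self._nodes = {}
        self._lock = threading.Lock()

    def create(self, path, value):
        # Atomic create-if-absent: exactly one caller succeeds per path.
        with self._lock:
            if path in self._nodes:
                raise NodeExists(path)
            self._nodes[path] = value


def run_indexer(name, store, email_ids, claimed):
    """Try to claim each email; index only the ones this indexer claimed."""
    for email_id in email_ids:
        try:
            store.create(f"/indexed/{email_id}", name)
        except NodeExists:
            continue  # another indexer already claimed this email
        # The heavy indexing work would run here.
        claimed.append(email_id)


def simulate(email_ids, indexer_names):
    """Run one thread per indexer name against a shared store."""
    store = InMemoryStore()
    email_ids = list(email_ids)
    results = {name: [] for name in indexer_names}
    threads = [
        threading.Thread(
            target=run_indexer, args=(name, store, email_ids, results[name])
        )
        for name in indexer_names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    results = simulate(range(20), ["machine-a", "machine-b", "machine-c"])
    print(sum(len(ids) for ids in results.values()), "emails claimed")
```

Each indexer walks the whole inbox, but the store lets only the first claim for an id succeed, so no email is claimed twice and none is skipped. The sketch ignores failure handling: an indexer that crashes after claiming an email but before finishing would leave that email unindexed.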