[Flume - Agents]
Flume agents are independent daemon processes. Each agent consists of three components: a source, a channel, and a sink.
Source: examples - tweets, web server logs, click event data from any application, etc.
Channel: examples - file channel, memory channel, etc.
Sink: examples - HDFS, HBase, etc.
When data arrives from the source faster than the sink can write it to the destination, the Flume channel sits between the source and the sink and buffers the data.
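To make the three components concrete, here is a minimal sketch of a single-agent configuration file. The agent name `a1`, the component names `r1`, `c1`, `k1`, the log path, and the HDFS URL are all assumptions chosen for illustration; the capacity values are arbitrary and would need tuning.

```properties
# Hypothetical agent "a1": one source, one channel, one sink.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

# Source: follow a web server log (the path is an assumption).
a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/httpd/access_log

# Channel: in-memory buffer that absorbs bursts when the source
# produces events faster than the sink can write them.
a1.channels.c1.type = memory
a1.channels.c1.capacity = 10000
a1.channels.c1.transactionCapacity = 100

# Sink: write events to HDFS (the URL is an assumption).
a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/weblogs
a1.sinks.k1.hdfs.fileType = DataStream

# Wiring: a source can feed several channels, a sink drains exactly one.
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
```

A memory channel is fast but loses buffered events if the agent process dies; a file channel trades some throughput for durability.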
[Flume - Use Case - Agents]
Flume agents run on every machine from which we want to collect data.
As shown in the image, in our previous use case Flume agents are installed on every web server that produces data. A data collector gathers the data from these agents and pushes it to a centralized data store.
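The agent-plus-collector layout can be sketched with two configurations, one for each web server and one for the collector, linked by an Avro sink and an Avro source. The agent names `web` and `collector`, the host name `collector-host`, port 4545, and every file path below are assumptions for illustration only.

```properties
# --- Runs on every web server (hypothetical agent name "web") ---
web.sources = r1
web.channels = c1
web.sinks = k1

web.sources.r1.type = exec
web.sources.r1.command = tail -F /var/log/httpd/access_log
web.sources.r1.channels = c1

web.channels.c1.type = memory

# Avro sink forwards events to the collector (host and port are assumptions).
web.sinks.k1.type = avro
web.sinks.k1.hostname = collector-host
web.sinks.k1.port = 4545
web.sinks.k1.channel = c1

# --- Runs on the collector machine (hypothetical agent name "collector") ---
collector.sources = r1
collector.channels = c1
collector.sinks = k1

# Avro source listens for events sent by the web server agents.
collector.sources.r1.type = avro
collector.sources.r1.bind = 0.0.0.0
collector.sources.r1.port = 4545
collector.sources.r1.channels = c1

# File channel persists buffered events to local disk (paths are assumptions).
collector.channels.c1.type = file
collector.channels.c1.checkpointDir = /var/flume/checkpoint
collector.channels.c1.dataDirs = /var/flume/data

# HDFS sink plays the role of the centralized data store.
collector.sinks.k1.type = hdfs
collector.sinks.k1.hdfs.path = hdfs://namenode:8020/flume/weblogs
collector.sinks.k1.channel = c1
```

Each process would be started with `flume-ng agent --name web ...` or `flume-ng agent --name collector ...`, so that it picks up only the properties carrying its own prefix.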
[Flume - Multiple Agents]
Flume agents can be arranged in arbitrary topologies. As shown in the image, the source of one agent consumes the data emitted by the sink of another agent, and the same data can be consumed by the sources of multiple downstream agents.
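One way to deliver the same data to more than one downstream agent is sketched below. In a Flume configuration each sink has a single destination, so the fan-out is expressed as one source replicating every event into two channels, each drained by its own Avro sink. The agent name `a1`, the hosts `agent-b` and `agent-c`, and the ports are assumptions for illustration.

```properties
# Hypothetical agent "a1" that fans out to two downstream agents.
a1.sources = r1
a1.channels = c1 c2
a1.sinks = k1 k2

# Avro source receiving events from an upstream agent.
a1.sources.r1.type = avro
a1.sources.r1.bind = 0.0.0.0
a1.sources.r1.port = 4545

# Replicating selector copies each event into every listed channel.
a1.sources.r1.selector.type = replicating
a1.sources.r1.channels = c1 c2

a1.channels.c1.type = memory
a1.channels.c2.type = memory

# Each sink drains one channel and sends to one downstream Avro source.
a1.sinks.k1.type = avro
a1.sinks.k1.hostname = agent-b
a1.sinks.k1.port = 4545
a1.sinks.k1.channel = c1

a1.sinks.k2.type = avro
a1.sinks.k2.hostname = agent-c
a1.sinks.k2.port = 4545
a1.sinks.k2.channel = c2
```

Because each downstream path has its own channel, a slow or unavailable downstream agent fills only its own buffer and does not stall delivery to the other.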