{"id":1267,"date":"2018-04-18T07:39:20","date_gmt":"2018-04-18T07:39:20","guid":{"rendered":"http:\/\/blog.cloudxlab.com\/?p=1267"},"modified":"2019-01-08T12:35:10","modified_gmt":"2019-01-08T12:35:10","slug":"introduction-apache-flume","status":"publish","type":"post","link":"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/","title":{"rendered":"Introduction to Apache Flume in 30 minutes"},"content":{"rendered":"<h2><strong>What is Apache Flume?<\/strong><\/h2>\n<p>Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of data from many different sources to a centralized data store.<\/p>\n<p><img class=\"alignnone\" src=\"https:\/\/www.knowbigdata.com\/sites\/default\/files\/flume.png\" alt=\"\" width=\"406\" height=\"178\" \/><\/p>\n<h2><strong>Flume supports a large variety of sources, including:<\/strong><\/h2>\n<ul class=\"ili-indent\">\n<li>tail (like Unix tail -f)<\/li>\n<li>syslog<\/li>\n<li>log4j, which allows Java applications to write logs to HDFS via Flume<\/li>\n<\/ul>\n<h3><strong>Flume Nodes<\/strong><\/h3>\n<p>Flume nodes can be arranged in arbitrary topologies. Typically, there is a node running on each source machine, with tiers of aggregating nodes that the data flows through on its way to HDFS.<\/p>\n<div>\n<p><b><strong>Topics Covered<\/strong><\/b><\/p>\n<\/div>\n<ul class=\"ili-indent\">\n<li>What is Flume<\/li>\n<li>Flume: Use Case<\/li>\n<li>Flume: Agents<\/li>\n<li>Flume: Use Case &#8211; Agents<\/li>\n<li>Flume: Multiple Agents<\/li>\n<li>Flume: Sources<\/li>\n<li>Flume: Delivery Reliability<\/li>\n<li>Flume: Hands-on<\/li>\n<\/ul>\n<div>\n<p><u><strong>Introduction to Flume Presentation<\/strong><\/u><\/p>\n<div style=\"left: 0; width: 100%; height: 0; position: relative; padding-bottom: 56.25%; padding-top: 30px;\"><iframe title=\"H12. 
Flume\" src=\"https:\/\/docs.google.com\/presentation\/d\/1_WptpZRaFooKvAwWnZL9r1pZbAer9CmVN_UKyKXmK6I\/embed\" style=\"border: 0; top: 0; left: 0; width: 100%; height: 100%; position: absolute;\" allowfullscreen scrolling=\"no\" allow=\"encrypted-media\"><\/iframe><\/div>\n<p><script type=\"text\/javascript\">window.addEventListener(\"message\",function(e){\n                window.parent.postMessage(e.data,\"*\");\n            },false);<\/script><\/p>\n<p>&nbsp;<\/p>\n<p>Please feel free to leave your comments in the comment box so that we can improve the guide and serve you better. Also, follow\u00a0<a href=\"https:\/\/twitter.com\/CloudxLab\" target=\"_blank\" rel=\"noopener\">CloudxLab on Twitter<\/a> to get updates on new blog posts and videos.<\/p>\n<p>If you wish to learn Hadoop and Spark technologies such as MapReduce, Hive, HBase, Sqoop, Flume, Oozie, Spark RDD, Spark Streaming, Kafka, DataFrames, SparkSQL, SparkR, MLlib, and GraphX, and to build a career in the Big Data and Spark domain, check out our signature course on\u00a0<a href=\"https:\/\/cloudxlab.com\/course\/1\/big-data-with-hadoop-and-spark\" target=\"_blank\" rel=\"noopener\">Big Data with Apache Spark and Hadoop<\/a>,\u00a0which comes with:<\/p>\n<ul>\n<li>Online instructor-led training by professionals with years of experience in building world-class Big Data products<\/li>\n<li>High-quality learning content including videos and quizzes<\/li>\n<li>Automated hands-on assessments<\/li>\n<li>90 days of lab access so that you can learn by doing<\/li>\n<li>24\u00d77 support and forum access to answer all your queries throughout your learning journey<\/li>\n<li>Real-world projects<\/li>\n<li>A certificate which you can share on LinkedIn<\/li>\n<\/ul>\n<\/div>\n","protected":false},"excerpt":{"rendered":"<p>What is Apache Flume? Apache Flume is a distributed, reliable, and available system for efficiently collecting, aggregating, and moving large amounts of data from many different sources to a centralized data store. 
Flume supports a large variety of sources, including: tail (like Unix tail -f), syslog, log4j &#8211; allowing Java applications to write logs to HDFS via &hellip; <a href=\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;Introduction to Apache Flume in 30 minutes&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[24,14],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v16.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Introduction to Apache Flume in 30 minutes | CloudxLab Blog<\/title>\n<meta name=\"description\" content=\"Apache Flume is a distributed, reliable system for efficiently aggregating &amp; moving large data from many different sources to a centralized data store.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Introduction to Apache Flume in 30 minutes | CloudxLab Blog\" \/>\n<meta property=\"og:description\" content=\"Apache Flume is a distributed, reliable system for efficiently aggregating &amp; moving large data from many different sources to a centralized data store.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/\" \/>\n<meta property=\"og:site_name\" content=\"CloudxLab Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cloudxlab\" \/>\n<meta property=\"article:published_time\" content=\"2018-04-18T07:39:20+00:00\" \/>\n<meta property=\"article:modified_time\" 
content=\"2019-01-08T12:35:10+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.knowbigdata.com\/sites\/default\/files\/flume.png\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:site\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:label1\" content=\"Est. reading time\">\n\t<meta name=\"twitter:data1\" content=\"1 minute\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"CloudxLab Blog\",\"description\":\"Learn AI, Machine Learning, Deep Learning, Devops &amp; Big Data\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/cloudxlab.com\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/www.knowbigdata.com\/sites\/default\/files\/flume.png\",\"contentUrl\":\"https:\/\/www.knowbigdata.com\/sites\/default\/files\/flume.png\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#webpage\",\"url\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/\",\"name\":\"Introduction to Apache Flume in 30 minutes | CloudxLab Blog\",\"isPartOf\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#primaryimage\"},\"datePublished\":\"2018-04-18T07:39:20+00:00\",\"dateModified\":\"2019-01-08T12:35:10+00:00\",\"author\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/0efa3c54df68406de820ea466f002d3c\"},\"description\":\"Apache Flume is a distributed, reliable system for efficiently 
aggregating & moving large data from many different sources to a centralized data store.\",\"breadcrumb\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/introduction-apache-flume\/#webpage\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/0efa3c54df68406de820ea466f002d3c\",\"name\":\"Abhinav Singh\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/fc74fe31169bf872f6ab11bbab621d53?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/fc74fe31169bf872f6ab11bbab621d53?s=96&d=mm&r=g\",\"caption\":\"Abhinav Singh\"},\"sameAs\":[\"https:\/\/cloudxlab.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1267"}],"collection":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/comments?post=1267"}],"version-history":[{"count":5,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1267\/revisions"}],"predecessor-version":[{"id":1299,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/1267\/revisions\/1299"}],"wp:attachment":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media?parent=1267"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/categories?post=1267"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/tags?post=1267"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}