{"id":949,"date":"2017-11-29T14:06:51","date_gmt":"2017-11-29T14:06:51","guid":{"rendered":"http:\/\/blog.cloudxlab.com\/?p=949"},"modified":"2019-01-08T12:45:47","modified_gmt":"2019-01-08T12:45:47","slug":"big-data-introduction","status":"publish","type":"post","link":"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/","title":{"rendered":"What is Big Data? An Easy Introduction to Big Data Terminologies"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">Unless you&#8217;ve been living under a rock, you must have heard or read the term &#8211; Big Data. But many people don&#8217;t know what Big Data actually means, and even those who do often lack a clear definition. If you&#8217;re one of them, don&#8217;t be disheartened. By the time you finish reading this article, you will have a clear idea about Big Data and its terminology. <\/span><\/p>\n<h2>What is Big Data?<\/h2>\n<p><span style=\"font-weight: 400;\">In very simple words, Big Data is data so large that it cannot be processed with the usual tools like file systems &amp; relational databases. To process such data we need a distributed architecture. 
In other words, we need multiple systems to process the data to achieve a common goal.<\/span><\/p>\n<p><!--more--><\/p>\n<p>Generally, we classify the problems related to the handling of Big Data into three buckets:<\/p>\n<p><img class=\"alignnone wp-image-970 size-full\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2017\/11\/Screen-Shot-2017-11-30-at-6.50.16-PM.png\" alt=\"Characteristics of Big Data\" width=\"694\" height=\"485\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Screen-Shot-2017-11-30-at-6.50.16-PM.png 694w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Screen-Shot-2017-11-30-at-6.50.16-PM-300x210.png 300w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\n<h3 style=\"padding-left: 30px;\">1. Volume<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">When the problem we are solving is about how to store such huge data, we call it Volume. For example, Facebook stores 600 TB of data in just one day!<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">2. Velocity<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">When we are trying to handle many requests per second, we call this characteristic Velocity. For example, the number of requests received by Facebook or Google per second.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">3. Variety<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">If the problem at hand is complex or the data that we are processing is complex, we call this characteristic Variety. 
For example, problems involving complex data structures like Maps &amp; Social Graphs.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Data can be termed Big Data if any of Volume, Velocity or Variety becomes impossible to handle using traditional tools.<\/span><\/p>\n<h2>Why do we need Big Data now?<\/h2>\n<p><span style=\"font-weight: 400;\">You will see the answer to this question when we look at the huge transition from analog storage to digital storage.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">For your information, paper, tapes etc. are examples of analog storage, while CDs, DVDs and hard disk drives are considered digital storage.<\/span><\/p>\n<p><img class=\"alignnone size-full wp-image-972\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2017\/11\/Global-Information-Storage-Capacity.png\" alt=\"Global Information Storage Capacity\" width=\"800\" height=\"600\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Global-Information-Storage-Capacity.png 800w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Global-Information-Storage-Capacity-300x225.png 300w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Global-Information-Storage-Capacity-768x576.png 768w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/p>\n<p><span style=\"font-weight: 400;\">This graph shows that digital storage started increasing exponentially after 2002 while analog storage remained practically the same.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The year 2002 is called the beginning of the digital age. Why so? The answer is twofold: <em>Devices<\/em> &amp; <em>Connectivity<\/em>. 
Devices became cheaper, faster and smaller and, on the other hand, connectivity improved.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">This led to a lot of very useful applications, such as a very vibrant world wide web, social networks, and the Internet of Things, which in turn led to huge data generation.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">With such huge data generation, it became practically impossible to store &amp; process the data on a single machine. Let&#8217;s go through some basics to better understand the need for multiple systems to process big data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Roughly, a computer is made of four components.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><strong>1. CPU<\/strong> &#8211; Which executes instructions. A CPU is characterized by its speed. The more instructions it can execute per second, the faster it is considered.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><strong>2. RAM<\/strong> &#8211; Random access memory. While processing, we load data into RAM. If we can load more data into RAM, the CPU can perform better. So, RAM has two main attributes which matter: its size and its speed of reading and writing.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><strong>3. Storage Disk<\/strong> &#8211; To permanently store data, we need a hard disk drive or a solid-state drive. The SSD is faster but smaller and costlier. The faster and bigger the disk, the faster we can process data.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><strong>4. Network<\/strong> &#8211; Another component that we frequently forget while thinking about the speed of computation is the network. Why? 
Often our data is stored on different machines and we need to read it over a network to process it.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">While processing Big Data, at least one of these four components becomes the bottleneck. In fact, all of the following can impact the speed of computing &#8211; CPU, Memory Size, Memory Read Speed, Disk Speed, Disk Size, and Network Speed. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">This is why we need to move to multiple computers, that is, a distributed computing architecture.<\/span><\/p>\n<h2>Big Data Applications<\/h2>\n<p><span style=\"font-weight: 400;\">So far we have tried to establish that handling humongous data requires a new set of tools which can operate in a distributed fashion.<br \/>\nBut who would be generating such data or who would need to process such humongous data? A quick answer is everyone.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, let us take a few examples.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">1. E-Commerce Recommendation<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">In the e-commerce industry, recommendation is a great example of Big Data processing. Recommendation, commonly implemented with collaborative filtering, is the process of suggesting products to someone based on their preferences or behavior.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">The e-commerce website gathers a lot of data about its customers&#8217; behavior. In a very simplistic algorithm, we would try to find similar users and then cross-suggest products to them. So, the more users, the better the results.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">As per Amazon, a major chunk of their sales happens via recommendations on the website and emails. 
<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">Today, generating recommendations has become pretty simple. Engines such as MLlib or Mahout have made it very simple to generate recommendations on humongous data. All you have to do is format the data in a three-column format: user id, item id (for example, a movie id), and rating.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">2. A\/B Testing<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">A\/B Testing is a process to compare how users respond to two different variations.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><img class=\"aligncenter size-large wp-image-956\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1-1024x767.jpg\" alt=\"Big Data Customers\" width=\"840\" height=\"629\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1-1024x767.jpg 1024w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1-300x225.jpg 300w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1-768x576.jpg 768w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1-1200x899.jpg 1200w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Big-Data-with-Hadoop-Spark-Introduction-1.jpg 1365w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\"><br \/>\nAs you can see in the diagram, a randomly selected half of the users is shown variation A and the other half is shown variation B. 
We can clearly see that variation A is more effective because it yields twice as many conversions.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">This method is effective only if we have a significant number of users. Also, the ratio of the users need not be 50-50.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">To manage so many variations on such a high number of users, we generally need Big Data platforms.<\/span><\/p>\n<h2>Big Data Customers<\/h2>\n<h3 style=\"padding-left: 30px;\">1. Government<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">Since governments hold huge amounts of data about their citizens, any analysis of it would be Big Data analysis. The applications are many.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">First is <em>Fraud Detection<\/em>. Be it anti-money laundering or user identification, the amount of data processing required is really high. <\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">In Cyber Security, Welfare and Justice too, Big Data is being generated and Big Data tools are being adopted.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">2. Telecom<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">Telecom companies can use big data to understand why their customers are leaving and how they can prevent them from leaving. This is known as <em>customer churn prevention<\/em>. 
<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">The data that could help in customer churn prevention includes:<\/span><\/p>\n<ul class=\"ili-indent\">\n<li style=\"padding-left: 30px;\">How many calls did customers make to the call center?<\/li>\n<li style=\"padding-left: 30px;\">For how long were they out of the coverage area?<\/li>\n<li style=\"padding-left: 30px;\">What was the usage pattern?<\/li>\n<\/ul>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">The other use-case is <em>Network Performance Optimization<\/em>. Based on historical traffic, telecoms can forecast network traffic and optimize performance accordingly.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">The third most common use-case of Big Data in the telecommunication industry is <em>Call Detail Record Analysis<\/em>. Since a telecom company has millions of users and each user makes 100s of calls per day, analysing the call detail records is a Big Data problem.<\/span><\/p>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">It is quite possible to predict hardware failures based on the data points recorded when previous failures occurred. A seemingly impossible task becomes possible because of the sheer volume of data.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\">3. Healthcare<\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">Healthcare inherently has humongous data and complex problems to solve. 
Such problems can be tackled with the new Big Data technologies, whereas supercomputers could not solve most of them.<br \/>\nA few examples of such problems are health information exchange, gene sequencing, healthcare improvements and drug safety.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/cloudxlab.com\/course\/specialization\/3\/big-data-with-hadoop-and-spark\"><img class=\"alignnone size-full wp-image-974\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2017\/11\/Learn-Practice-Big-Data.png\" alt=\"Learn &amp; Practice Big Data\" width=\"784\" height=\"295\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Learn-Practice-Big-Data.png 784w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Learn-Practice-Big-Data-300x113.png 300w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Learn-Practice-Big-Data-768x289.png 768w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px\" \/><\/a><\/p>\n<h2>Data Variety<\/h2>\n<p><span style=\"font-weight: 400;\">The first term that you must know in Big Data is Data Variety. You will often come across this term as we move forward in the Big Data course. 
So let\u2019s quickly define the different kinds of data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Data is largely classified as <em>Structured<\/em>, <em>Semi-Structured<\/em> and <em>Unstructured<\/em>.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\"><img class=\"aligncenter size-large wp-image-954\" src=\"https:\/\/blog.cloudxlab.com\/wp-content\/uploads\/2017\/11\/Data-variety-1024x767.jpg\" alt=\"Data Variety\" width=\"840\" height=\"629\" srcset=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Data-variety-1024x767.jpg 1024w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Data-variety-300x225.jpg 300w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Data-variety-768x576.jpg 768w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Data-variety-1200x899.jpg 1200w, https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/Data-variety.jpg 1365w\" sizes=\"(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 1362px) 62vw, 840px\" \/><span style=\"font-size: 23px;\"><strong>1. Structured Data<\/strong><\/span><\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">If we clearly know the number of fields as well as their datatypes, then we call the data structured. More often than not, you will find structured data in tabular form. The data in relational databases such as MySQL, Oracle or Microsoft SQL Server is an example of structured data.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\"><strong>2. Semi-Structured Data<\/strong><\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">If we know the number of fields or columns but do not know their datatypes, we call the data semi-structured. For example, data in CSV (comma-separated values) files is semi-structured.<\/span><\/p>\n<h3 style=\"padding-left: 30px;\"><strong>3. 
Unstructured Data<\/strong><\/h3>\n<p style=\"padding-left: 30px;\"><span style=\"font-weight: 400;\">If the data doesn&#8217;t contain columns or fields, we call it unstructured data. Data in the form of plain text files or logs generated on a server are examples of unstructured data.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now that we know about data variety, we can discuss one of the significant problems in Big Data &#8211; ETL.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">ETL stands for Extract, Transform and Load. It is the process of extracting data from its sources, transforming it (for example, from unstructured into structured form) and loading it into a target system.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">ETL is a big problem in Big Data, which is why data engineers spend a significant amount of their time on it. We will learn more about it in later articles.<\/span><\/p>\n<h2>Distributed Systems<\/h2>\n<p><span style=\"font-weight: 400;\">The second term that you will see a lot while learning Big Data technologies is Distributed System.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">When networked computers are utilized to achieve a common goal, it is known as a distributed system. The work gets distributed amongst many computers.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Please note that a distributed system doesn&#8217;t mean that the systems are merely connected. The networked computers must work together to solve a problem; only then can they be called a distributed system. It is also important to note that Big Data is largely about distributed systems.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The branch of computing that studies distributed systems is known as distributed computing. <\/span><\/p>\n<p><span style=\"font-weight: 400;\">The purpose of distributed computing is to get the work done faster by utilizing many computers. 
Most, but not all, tasks can be performed using distributed computing.<\/span><\/p>\n<h2>Big Data Solutions<\/h2>\n<p><span style=\"font-weight: 400;\">There are many Big Data solution stacks. Some popular ones are listed below.<\/span><\/p>\n<ul class=\"ili-indent\">\n<li><span style=\"font-weight: 400;\">Apache Hadoop<\/span><\/li>\n<li><span style=\"font-weight: 400;\">Apache Spark<\/span><\/li>\n<li><span style=\"font-weight: 400;\">Cassandra<\/span><\/li>\n<li><span style=\"font-weight: 400;\">MongoDB<\/span><\/li>\n<li><span style=\"font-weight: 400;\">Google Compute Engine<\/span><\/li>\n<li><span style=\"font-weight: 400;\">Microsoft Azure<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400;\">The first and most powerful stack is Apache Hadoop and Spark together. While Hadoop provides storage for structured and unstructured data, Spark provides the computational capability on top of Hadoop.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The second option is to use Cassandra or MongoDB. \u00a0These are NoSQL databases which run on multiple computers to provide huge volume, handle high velocity and store data with complex structure.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The third option is to use Google Compute Engine or Microsoft Azure. In such cases, you would have to upload your data to Google or Microsoft, which may not always be acceptable to your organization.<\/span><\/p>\n<h2>Next Steps<\/h2>\n<p>If you liked this article and would love to know more about Big Data, then you can see our <a href=\"https:\/\/cloudxlab.com\/course\/specialization\/3\/big-data-with-hadoop-and-spark?utm_source=blog&amp;utm_medium=big-data-introduction\">Big Data course<\/a>. 
CloudxLab provides both self-paced &amp; online instructor-led training in Big Data technologies.<\/p>\n<p>The course comes with a free <a href=\"https:\/\/cloudxlab.com\/lab\/?utm_source=blog&amp;utm_medium=big-data-introduction\">lab subscription<\/a> which comes in handy for practicing Big Data technologies.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Unless you&#8217;ve been living under a rock, you must have heard or read the term &#8211; Big Data. But many people don&#8217;t know what Big Data actually means. Even if they do then the definition of the same is not clear to them. If you&#8217;re one of them then don&#8217;t be disheartened. By the time &hellip; <a href=\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;What is Big Data? An Easy Introduction to Big Data Terminologies&#8221;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":976,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[15],"tags":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v16.2 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>An Easy Introduction to Big Data Terminologies | CloudxLab Blog<\/title>\n<meta name=\"description\" content=\"Learning Big Data isn&#039;t hard. 
By the time you complete reading this very article, you will have a clear idea about Big Data and its terminologies.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"An Easy Introduction to Big Data Terminologies | CloudxLab Blog\" \/>\n<meta property=\"og:description\" content=\"Learning Big Data isn&#039;t hard. By the time you complete reading this very article, you will have a clear idea about Big Data and its terminologies.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/\" \/>\n<meta property=\"og:site_name\" content=\"CloudxLab Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/cloudxlab\" \/>\n<meta property=\"article:published_time\" content=\"2017-11-29T14:06:51+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2019-01-08T12:45:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/pexels-photo-669615.jpeg\" \/>\n\t<meta property=\"og:image:width\" content=\"1024\" \/>\n\t<meta property=\"og:image:height\" content=\"512\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:site\" content=\"@CloudxLab\" \/>\n<meta name=\"twitter:label1\" content=\"Est. 
reading time\">\n\t<meta name=\"twitter:data1\" content=\"9 minutes\">\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"WebSite\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"CloudxLab Blog\",\"description\":\"Learn AI, Machine Learning, Deep Learning, Devops &amp; Big Data\",\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":\"https:\/\/cloudxlab.com\/blog\/?s={search_term_string}\",\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#primaryimage\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/pexels-photo-669615.jpeg\",\"contentUrl\":\"https:\/\/cloudxlab.com\/blog\/wp-content\/uploads\/2017\/11\/pexels-photo-669615.jpeg\",\"width\":1024,\"height\":512,\"caption\":\"What is Big Data\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#webpage\",\"url\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/\",\"name\":\"An Easy Introduction to Big Data Terminologies | CloudxLab Blog\",\"isPartOf\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#primaryimage\"},\"datePublished\":\"2017-11-29T14:06:51+00:00\",\"dateModified\":\"2019-01-08T12:45:47+00:00\",\"author\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/0efa3c54df68406de820ea466f002d3c\"},\"description\":\"Learning Big Data isn't hard. 
By the time you complete reading this very article, you will have a clear idea about Big Data and its terminologies.\",\"breadcrumb\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"item\":{\"@type\":\"WebPage\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/\",\"url\":\"https:\/\/cloudxlab.com\/blog\/\",\"name\":\"Home\"}},{\"@type\":\"ListItem\",\"position\":2,\"item\":{\"@id\":\"https:\/\/cloudxlab.com\/blog\/big-data-introduction\/#webpage\"}}]},{\"@type\":\"Person\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#\/schema\/person\/0efa3c54df68406de820ea466f002d3c\",\"name\":\"Abhinav Singh\",\"image\":{\"@type\":\"ImageObject\",\"@id\":\"https:\/\/cloudxlab.com\/blog\/#personlogo\",\"inLanguage\":\"en-US\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/fc74fe31169bf872f6ab11bbab621d53?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/fc74fe31169bf872f6ab11bbab621d53?s=96&d=mm&r=g\",\"caption\":\"Abhinav Singh\"},\"sameAs\":[\"https:\/\/cloudxlab.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","_links":{"self":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/949"}],"collection":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/comments?post=949"}],"version-history":[{"count":21,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/949\/revisions"}],"predecessor-version":[{"id":1513,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/posts\/949\/revisions\/1513"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media\/976"}],"wp:attachment":[{"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/media?parent=949"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/categories?post=949"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cloudxlab.com\/blog\/wp-json\/wp\/v2\/tags?post=949"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}