Big Data Introduction


Big Data Introduction - Types of Data





Data is broadly classified as structured, semi-structured and unstructured.

If we know the fields as well as their datatypes, we call the data structured. The data in relational databases such as MySQL, Oracle or Microsoft SQL Server is an example of structured data.
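As a minimal sketch of what "knowing the fields and their datatypes" looks like in practice, the snippet below creates a table and reads back its declared schema. It uses SQLite only because it ships with Python; it stands in for the databases named above and is more lenient about enforcing column types than they are. The `employees` table, its columns and the helper `describe_table` are hypothetical names chosen for illustration.

```python
import sqlite3

def describe_table(conn, table):
    """Return (column name, declared type) pairs for a table."""
    # PRAGMA table_info yields rows of (cid, name, type, notnull, dflt_value, pk)
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return [(row[1], row[2]) for row in rows]

# In-memory database, so nothing is written to disk
conn = sqlite3.connect(":memory:")

# Every column is declared with a name and a datatype up front
conn.execute("CREATE TABLE employees (id INTEGER, name TEXT, salary REAL)")
conn.execute("INSERT INTO employees VALUES (?, ?, ?)", (1, "Asha", 52000.0))

schema = describe_table(conn, "employees")
print(schema)
```

The schema is available before any row is read, which is what makes it possible to build a data model on top of structured data.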

If we know the fields or columns but not their datatypes, we call the data semi-structured. For example, data in a CSV (comma-separated values) file is semi-structured: the header names the columns, but nothing in the file declares what type each column holds.
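The point about CSV can be sketched with Python's standard `csv` module. The sample content below is made up for illustration. The reader recovers the column names from the header, but every value comes back as a string, so the datatypes have to be decided by whoever consumes the file.

```python
import csv
import io

# Hypothetical CSV content; the header gives field names but no datatypes
raw = "id,name,salary\n1,Asha,52000.5\n2,Ravi,n/a\n"

reader = csv.DictReader(io.StringIO(raw))
fields = reader.fieldnames   # column names taken from the header row
rows = list(reader)

# Collect the Python type of every value that was read
value_types = {type(value).__name__ for row in rows for value in row.values()}

print(fields)
print(value_types)
```

Note that the `salary` column holds both a number-like value and the text `n/a`, and the file format accepts this without complaint.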

If our data doesn't contain columns or fields, we call it unstructured data. The data in the form of plain text files or logs generated on a server are examples of unstructured data.

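Translating unstructured data into structured data is typically done with a process known as ETL (Extract, Transform and Load): the data is extracted from its source, transformed into the required structure, and loaded into a target store such as a database.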
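The three ETL steps can be sketched as follows, assuming server log lines of the form `date time LEVEL message`. The log lines, the regular expression and the `logs` table are all hypothetical; real logs vary widely and a production pipeline would need far more error handling.

```python
import re
import sqlite3

# Hypothetical raw log lines (unstructured text)
LOG_LINES = [
    "2024-01-05 10:15:02 ERROR Disk usage at 91 percent",
    "2024-01-05 10:15:07 INFO Backup finished",
    "not a recognisable log line",
]

# Assumed line layout: date, time, level, then free-text message
PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2}) (\d{2}:\d{2}:\d{2}) (\w+) (.*)$")

def extract(lines):
    """Extract: read raw lines from the source."""
    for line in lines:
        yield line.strip()

def transform(lines):
    """Transform: turn each matching line into a (day, time, level, message) tuple."""
    for line in lines:
        match = PATTERN.match(line)
        if match:  # lines that do not fit the layout are skipped
            yield match.groups()

def load(records, conn):
    """Load: write the structured records into a table with named columns."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS logs (day TEXT, time TEXT, level TEXT, message TEXT)"
    )
    conn.executemany("INSERT INTO logs VALUES (?, ?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(LOG_LINES)), conn)

loaded = conn.execute("SELECT level, message FROM logs ORDER BY time").fetchall()
print(loaded)
```

Once loaded, the log data can be queried by column (for example, all rows where `level` is `ERROR`), which was not possible on the raw text.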



9 Comments

Great job publishing such a beneficial website. Your weblog isn’t only useful, it is really creative too.

 


I guess the definition of ETL is wrong.

 


Hi Rohan,

Why do you think so?


What is the full form of CSV?  


Comma Separated Values


I always thought CSV is structured data, while JSON and XML are semi-structured?


In relational databases, we create a column by specifying its data type. We can't create the column without specifying its data type. Is that the case when creating a CSV file?

Also, using structured data, we can make a data model. But in CSV files, we can store values of any data type in a single column. Can we make a data model with that kind of data?


I find this a very important thread for understanding advanced technology content.

Best wishes, 

Lahcene 
