To get a better understanding of the data, we plot a histogram for each numerical attribute. A histogram shows the number of instances whose values fall within each range (bin).

Let's plot a histogram for an arbitrarily chosen dataset:

Looking at the above histogram, we can conclude that:

- There are 100 instances in the dataset whose values lie between 0 and 1.
- There are 40 instances in the dataset whose values lie between 1 and 2.
- There are 20 instances in the dataset whose values lie between 2 and 3.
- There are 60 instances in the dataset whose values lie between 3 and 4.
- There are 80 instances in the dataset whose values lie between 4 and 5.
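
The counts listed above can also be checked numerically. The following is a minimal sketch, assuming a made-up dataset: the `values` array is hypothetical and is constructed only so that its bin counts match the list above. It uses `numpy.histogram` to count the instances in each range.

```python
import numpy as np

# Hypothetical data: counts per unit-wide range chosen to match the list above.
counts_per_bin = [100, 40, 20, 60, 80]
rng = np.random.default_rng(0)
values = np.concatenate([
    rng.uniform(low, low + 1, size=n)  # n random values between low and low + 1
    for low, n in enumerate(counts_per_bin)
])

# Count how many values fall in each range 0-1, 1-2, ..., 4-5.
counts, edges = np.histogram(values, bins=[0, 1, 2, 3, 4, 5])
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{count} instances between {left} and {right}")
```

Each printed line corresponds to the height of one bar in the histogram.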

We generally do this only for numerical attributes. For a categorical attribute, we can get the number of instances in each category with the `value_counts()` method, which we have used before and which gives the exact counts.
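
As a quick reminder of how `value_counts()` behaves on a categorical attribute, here is a minimal sketch. The DataFrame and its `color` column are hypothetical, and the method is called here on a single column.

```python
import pandas as pd

# Hypothetical DataFrame with one categorical column.
df = pd.DataFrame({"color": ["red", "blue", "red", "green", "red", "blue"]})

# value_counts() returns the exact number of instances per category,
# sorted from the most frequent category to the least frequent.
color_counts = df["color"].value_counts()
print(color_counts)
```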

We plot a histogram by calling the `hist()` method of the DataFrame object. Internally, it calls the `hist()` function of `matplotlib.pyplot` on each numerical attribute in the DataFrame, resulting in one histogram per column. Hence, Matplotlib must be installed, and we first import `matplotlib.pyplot` so that we can display and customize the plots.
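
A minimal sketch of this call is shown below. The DataFrame, its column names, and the output file name `histograms.png` are assumptions made for illustration; the `bins` and `figsize` arguments are optional.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical dataset with three numerical attributes.
rng = np.random.default_rng(42)
df = pd.DataFrame({
    "age": rng.integers(18, 80, size=200),
    "income": rng.normal(50_000, 12_000, size=200),
    "rooms": rng.integers(1, 8, size=200),
})

# One histogram per numerical column; returns an array of Axes objects.
axes = df.hist(bins=20, figsize=(10, 6))
plt.tight_layout()
plt.savefig("histograms.png")  # in a notebook, plt.show() would display the figure instead
```

Each subplot is titled with the name of the column it describes.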

Here, `matplotlib` is a module and `pyplot` is a sub-module of it. Most of the `matplotlib` utilities lie under `pyplot`. It is generally imported under the `plt` alias.
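
The import convention can be sketched as follows, assuming a small made-up list of values. The example calls `plt.hist()` directly, which is the function that the DataFrame's `hist()` method relies on.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt  # conventional alias

# Hypothetical values, binned into the ranges 0-1, 1-2, ..., 4-5.
values = [0.5, 0.7, 1.2, 2.5, 2.8, 2.9, 4.1]
counts, edges, patches = plt.hist(values, bins=[0, 1, 2, 3, 4, 5])

plt.xlabel("value")
plt.ylabel("number of instances")
```

Here `counts` holds the number of values in each bin and `edges` holds the bin boundaries.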
