Now we will explore the dataset. Here, we will use the hist method to plot a histogram of the data. A histogram visually represents the distribution of the data rather than the individual values; simply put, it summarizes discrete or continuous data.
The hist method takes a bins parameter. Bins, also sometimes referred to as classes, intervals, or buckets, are groups of equal width into which the data is separated. Each bin is plotted as a bar whose height corresponds to the number of data points in that bin.
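To make the idea of bins concrete, here is a minimal sketch that computes the counts a histogram would draw as bar heights. It uses NumPy's histogram function rather than the hist method, and the sample values are made up for illustration; they are not taken from the housing dataset.

```python
import numpy as np

# Hypothetical sample values, made up for illustration only.
values = np.array([1, 2, 2, 3, 5, 6, 6, 6, 9, 10])

# Split the range 0 to 10 into 5 equal-width bins and count the points in each.
# Every bin includes its left edge; only the last bin also includes its right edge.
counts, edges = np.histogram(values, bins=5, range=(0, 10))

# Each count is the height of the bar that would be drawn for that bin.
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:.0f} to {right:.0f}: {count} point(s)")
```

Changing the bins argument changes how coarse or fine the summary is: fewer bins smooth the picture, more bins show more detail.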
We will also use the cut function from Pandas, which segments and sorts data values into bins. cut is also helpful for converting a continuous variable into a categorical one. For example, cut could convert ages into groups of age ranges. It supports binning into a given number of equal-width bins or into a pre-specified array of bin edges. The labels parameter specifies the labels for the returned bins and must have the same length as the number of resulting bins. Also notice that we pass np.inf as the last bin edge. This is the floating-point representation of positive infinity, so the last bin has no upper limit.
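The age example mentioned above can be sketched as follows. The ages, bin edges, and label names are assumptions chosen for illustration only; they do not come from the housing dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical ages, made up for illustration only.
ages = pd.Series([4, 15, 17, 30, 45, 70, 92])

# Five edges define four bins, so four labels are needed.
# By default each bin excludes its left edge and includes its right edge.
# np.inf as the last edge leaves the final bin open-ended.
age_group = pd.cut(
    ages,
    bins=[0, 12, 18, 65, np.inf],
    labels=["child", "teen", "adult", "senior"],
)

# The result is a categorical Series; count how many ages fall in each group.
print(age_group.value_counts(sort=False))
```

Because the result is categorical, it can be passed to methods such as value_counts or used for grouping, which is harder to do with the raw continuous values.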
Use the info method to get more information on the dataset
housing.<<your code goes here>>
Get a better understanding of the mean, standard deviation, maximum value, and other summary statistics of the dataset by using the describe method
housing.<<your code goes here>>
Plot histograms of all the features using the hist method
housing.<<your code goes here>>(bins=50, figsize=(20,15))
plt.show()
Plot a histogram of the median income attribute of the dataset
housing["median_income"].<<your code goes here>>
Divide the median income attribute into labeled bins using the cut function, and then plot a histogram of the result
housing["income_cat"] = pd.<<your code goes here>>(housing["median_income"],
    bins=[0., 1.5, 3.0, 4.5, 6., np.inf],
    labels=[1, 2, 3, 4, 5])
housing["income_cat"].hist()