End-to-End ML Project- Beginner friendly


Visualizing geographical data

We have two attributes, latitude and longitude, that give the geographical location of each block.

We can visualize the data with a scatter plot of all blocks, using the DataFrame.plot() method of the pandas library. The general form of the call is DataFrame.plot(), where DataFrame is the name of the DataFrame.

Some of the important parameters of the plot() method for drawing a scatter plot are:

  1. kind: It specifies the kind of plot we want to produce. For example, we can set its value to 'bar' for a bar plot or to 'hist' for a histogram. For a scatter plot, we have to set its value to 'scatter'.
  2. x: It specifies the attribute plotted on the x-axis.
  3. y: It specifies the attribute plotted on the y-axis.
  4. alpha: It specifies the opacity of the plotted points. It ranges from 0 to 1, where 0 means fully transparent and 1 means fully opaque. We choose a value that lets us distinguish high-density regions from low-density regions in the plot.

By default, pandas uses matplotlib as the plotting backend. You can refer to the plot() documentation for more details about the method.
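As a rough illustration of how these parameters fit together, here is a minimal sketch of a scatter plot call. The DataFrame df and its columns height and weight are hypothetical stand-ins invented for this example; they are not part of the project's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: the column names and values are made up
# purely to demonstrate the call.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "height": rng.normal(170, 10, size=500),
    "weight": rng.normal(70, 12, size=500),
})

# kind selects the plot type, x and y name the columns for each axis,
# and alpha sets the opacity of each point (0 = transparent, 1 = opaque).
ax = df.plot(kind="scatter", x="height", y="weight", alpha=0.3)
```

The call returns a matplotlib Axes object, so the plot can be customized further with the usual matplotlib methods. In a notebook the figure is typically rendered inline; in a plain script you would likely need matplotlib.pyplot.show() to display it.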


Plot a scatterplot between the attributes latitude (represented in the x-axis) and longitude (represented in the y-axis) of the DataFrame train_copy.

Set the value of parameter alpha to 0.1. We experimented with many values of alpha and got the best result at 0.1.

Note: We can also specify longitude for the x-axis and latitude for the y-axis. Our goal is to visualize the data, and either orientation will accomplish the task.
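To see why a low alpha helps, the sketch below draws the same points twice, once fully opaque and once with alpha set to 0.1. The real train_copy is not reproduced here: the DataFrame built below is a synthetic stand-in with made-up coordinates (one dense cluster plus scattered background points), and the variable names ax_opaque and ax_faint are chosen only for this example.

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)

# Synthetic stand-in for train_copy (assumption: the real data is not used).
# One dense cluster plus sparse background points, as (latitude, longitude).
dense = rng.normal(loc=[34.0, -118.2], scale=0.2, size=(2000, 2))
sparse = rng.uniform(low=[32.5, -124.0], high=[42.0, -114.0], size=(500, 2))
train_copy = pd.DataFrame(np.vstack([dense, sparse]),
                          columns=["latitude", "longitude"])

# Draw the same scatter plot side by side with two opacity settings.
fig, (ax_opaque, ax_faint) = plt.subplots(1, 2, figsize=(10, 4))
train_copy.plot(kind="scatter", x="latitude", y="longitude",
                alpha=1.0, ax=ax_opaque)
train_copy.plot(kind="scatter", x="latitude", y="longitude",
                alpha=0.1, ax=ax_faint)
ax_opaque.set_title("alpha=1.0")
ax_faint.set_title("alpha=0.1")
```

With alpha=1.0 the dense cluster tends to merge into a solid blob, while with alpha=0.1 overlapping points accumulate into darker areas, so dense regions stand out from sparse ones.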

