End-to-End ML Project- Beginner friendly


Looking for correlations

Now let's compute the correlation coefficient. Refer to the video above for an explanation of what correlation is.

We compute the correlation between attributes to understand how the variables relate to one another, so that we can preprocess and model the data better. It also helps us spot data quirks, which we remove before feeding the data to an algorithm. Correlation can also be used to check for multicollinearity, which we'll study in the next chapter.
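As a rough illustration of the idea above, here is a minimal sketch that computes a correlation matrix with pandas and lists strongly correlated attribute pairs. The toy data, the column names (`rooms`, `area`, `age`), and the 0.9 threshold are all assumptions made up for this example; they are not taken from the course dataset.

```python
import pandas as pd

# Hypothetical toy data; the column names and values are illustrative only.
df = pd.DataFrame({
    "rooms": [2, 3, 4, 5, 6],
    "area": [50, 70, 95, 120, 140],
    "age": [30, 22, 15, 9, 2],
})

# Pairwise Pearson correlation between all numeric columns.
corr_matrix = df.corr()
print(corr_matrix.round(2))

# List attribute pairs whose absolute correlation exceeds a threshold.
# The threshold of 0.9 is an arbitrary choice for this sketch.
threshold = 0.9
cols = list(corr_matrix.columns)
high_pairs = [
    (a, b, corr_matrix.loc[a, b])
    for i, a in enumerate(cols)
    for b in cols[i + 1:]
    if abs(corr_matrix.loc[a, b]) > threshold
]

for a, b, r in high_pairs:
    print(f"{a} vs {b}: r = {r:.3f}")
```

Pairs that show up in such a list are candidates for a closer look, since highly correlated attributes can be a sign of multicollinearity.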

Note: A common misconception among new learners is that a positive correlation is stronger than a negative one. For example, they believe a correlation coefficient of 0.98 indicates a stronger relationship than -0.98. This is incorrect. The sign only tells us the direction of the correlation: with a positive correlation the values increase together, while with a negative correlation one value decreases as the other increases. The strength is given by the absolute value, so two variables with a correlation coefficient of 0.98 are just as strongly correlated as two with -0.98; only the direction differs.
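The point in the note can be checked with a small sketch. The arrays below are made-up data, and `np.corrcoef` is used here as one convenient way to get the Pearson coefficient; the two relationships are mirror images, so their coefficients should differ only in sign.

```python
import numpy as np

# Made-up data: y_up rises with x, y_down falls as x rises.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_up = 2 * x + 1
y_down = -2 * x + 1

# np.corrcoef returns a 2x2 matrix; the off-diagonal entry is r.
r_up = np.corrcoef(x, y_up)[0, 1]
r_down = np.corrcoef(x, y_down)[0, 1]

print("r_up:", r_up)
print("r_down:", r_down)

# Strength is compared on the absolute value, not the signed value.
print("same strength:", np.isclose(abs(r_up), abs(r_down)))
```

Comparing `abs(r_up)` with `abs(r_down)` rather than the signed values is the design choice that matters here: sorting or thresholding on the raw coefficient would wrongly rank the negative relationship as weaker.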
