Treating Outliers in Python

Introduction to treating outliers with Python

What is an outlier?

In a dataset, an outlier is a data point that differs significantly from other observations in that dataset.

What causes an outlier?

An outlier can have several causes. It can result from a data entry error made during data collection, in other words a human error. It can be caused by a faulty instrument, leading to a mechanical error. Or it can simply be a naturally occurring anomaly rather than an artificial one like the previous two categories. Most outliers in real-world data belong to this third category.

Why is detection of outliers important?

Many machine learning models, such as linear regression, are based on parametric statistics, so the presence of an outlier can distort the fitted model and lead to incorrect results.
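To make this concrete, here is a minimal sketch of how a single extreme value can pull a least-squares line away from the rest of the data. The data values and the helper name fit_line are hypothetical, chosen only for illustration, and the fit is computed by hand with the standard library rather than with any particular machine learning package.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept (hypothetical helper)."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Covariance-like and variance-like sums used by the closed-form OLS solution.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


# Made-up data: the clean series follows y = 2x exactly.
xs = [1, 2, 3, 4, 5, 6]
ys_clean = [2, 4, 6, 8, 10, 12]
# Same series, but the last observation is replaced by an extreme value.
ys_outlier = [2, 4, 6, 8, 10, 60]

clean_slope, clean_intercept = fit_line(xs, ys_clean)
outlier_slope, outlier_intercept = fit_line(xs, ys_outlier)

print("clean fit:       ", clean_slope, clean_intercept)
print("fit with outlier:", outlier_slope, outlier_intercept)
```

In this made-up data, five of the six points are identical in both series, yet the one changed point moves the slope from 2 to well above 8, so the fitted line no longer describes the bulk of the observations.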

Let's see a simple example of an outlier using Python.
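A minimal sketch of what such an example might look like. The data values are made up, and the 1.5 x IQR rule used to flag the outlier is an assumption made here for illustration; it is one common convention, not necessarily the method this tutorial goes on to use.

```python
from statistics import quantiles

# Hypothetical measurements: most values sit between 12 and 16, one does not.
data = [12, 14, 15, 13, 16, 14, 15, 13, 98]

# Quartiles of the data (statistics.quantiles with n=4 returns Q1, Q2, Q3).
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Assumed rule: flag anything more than 1.5 * IQR outside the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Here the value 98 differs significantly from the other observations, which matches the definition of an outlier given earlier, and it is the only point that falls outside the fences.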
