Treating Outliers in Python

Outlier example using Python

In this step, we will use Python to see how a single outlier affects common summary statistics.

INSTRUCTIONS
  • First, we will create 2 datasets: one without an outlier and one with an outlier (the value 700)

    data_without_outlier = [1, 2, 3, 3, 4, 5, 4]
    data_with_outlier = [1, 2, 3, 3, 4, 4, 700]
    
  • Now we will import the statistics module. This module provides functions for calculating mathematical statistics of numeric, real-valued data.

    import statistics
    
  • Now we will create a function calculate_statistics that prints the mean, median, and sample standard deviation of the dataset passed to it

    def calculate_statistics(data):
        # Print the mean, median, and sample standard deviation of data
        print("Mean:", statistics.mean(data))
        print("Median:", statistics.median(data))
        print("Standard deviation:", statistics.stdev(data))
    
  • Finally, we will call the above function on each of the 2 datasets created in the first step, in separate cells, to print their respective mean, median, and standard deviation

    calculate_statistics(data_without_outlier)
    
    calculate_statistics(data_with_outlier)
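
To make the comparison easier to read, the two sets of statistics can also be placed side by side. The following is only a minimal sketch: the helper name summarize is hypothetical (it is not part of this exercise), and it returns the values instead of printing them so that they can be compared.

```python
import statistics

data_without_outlier = [1, 2, 3, 3, 4, 5, 4]
data_with_outlier = [1, 2, 3, 3, 4, 4, 700]


def summarize(data):
    """Return the mean, median, and sample standard deviation as a dict.

    Hypothetical helper used for illustration only.
    """
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),
    }


clean = summarize(data_without_outlier)
noisy = summarize(data_with_outlier)

# Print each statistic without and with the outlier on one line
for name in clean:
    print(f"{name:>6}: {clean[name]:10.2f} -> {noisy[name]:10.2f}")
```

Each printed line shows one statistic for the dataset without the outlier, followed by the same statistic for the dataset with it.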
    

As you can see from the above observations, the mean and standard deviation change sharply in the presence of an outlier: the mean rises from about 3.14 to about 102.43, and the standard deviation from about 1.35 to about 263.5. The median, however, stays at 3 in both datasets, so it is largely unaffected by the outlier. This is a small dataset, so the outlier is easy to spot by eye. Detecting outliers becomes much harder in a larger, real-life dataset.
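
For larger datasets, outliers are usually flagged programmatically rather than by eye. The sketch below uses the interquartile range (IQR) rule, which is one common convention and not something this step prescribes. The function name find_outliers_iqr and the multiplier k=1.5 are assumptions made for illustration, and statistics.quantiles requires Python 3.8 or later.

```python
import statistics


def find_outliers_iqr(data, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles.

    Hypothetical helper; k=1.5 is a conventional choice, not a fixed rule.
    """
    # quantiles(..., n=4) returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lower or x > upper]


print(find_outliers_iqr([1, 2, 3, 3, 4, 4, 700]))  # [700]
print(find_outliers_iqr([1, 2, 3, 3, 4, 5, 4]))    # []
```

Because the rule is built on quartiles, which (like the median) are hardly moved by a single extreme value, the value 700 does not distort the thresholds used to detect it.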
