In this step, we will see an example of an outlier using Python.
First, we will create 2 datasets, one without an outlier and one with an outlier:
data_without_outlier = [1, 2, 3, 3, 4, 5, 4]
data_with_outlier = [1, 2, 3, 3, 4, 4, 700]
Now we will import the statistics module. This module provides functions for calculating mathematical statistics of numeric, real-valued data.
import statistics
Now we will create a function calculate_statistics that prints the mean, median, and standard deviation of a dataset:
def <<your code goes here>>(data):
    print("Mean: ", statistics.mean(data))
    print("Median: ", statistics.median(data))
    print("Standard deviation: ", statistics.stdev(data))
Finally, we will call the above function on each of the 2 datasets that we created in the first step, in separate cells, to print their respective mean, median, and standard deviation values:
calculate_statistics(data_without_outlier)
calculate_statistics(data_with_outlier)
As you can see from the above outputs, the mean and standard deviation change drastically in the presence of an outlier, while the median is largely unaffected. This is a very small dataset, so the outlier is easy to spot by eye. In a larger, real-life dataset, detecting outliers becomes a much harder task.
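To make the comparison easier to read, the two summaries can be printed side by side. The following is a minimal sketch, not part of the original exercise: the helper name summarize and the variables clean and skewed are hypothetical, and it uses only the statistics functions already shown above.

```python
import statistics

data_without_outlier = [1, 2, 3, 3, 4, 5, 4]
data_with_outlier = [1, 2, 3, 3, 4, 4, 700]


def summarize(data):
    """Hypothetical helper: return the three statistics as a dict."""
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        # statistics.stdev computes the sample standard deviation
        "stdev": statistics.stdev(data),
    }


clean = summarize(data_without_outlier)
skewed = summarize(data_with_outlier)

# Print each statistic without and with the outlier, side by side
for name in ("mean", "median", "stdev"):
    print(f"{name:>6}: {clean[name]:8.2f} -> {skewed[name]:8.2f}")
```

Here the mean moves from about 3.14 to about 102.43 and the standard deviation grows by more than a hundredfold, while the median stays at 3 in both cases.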