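As its name suggests, DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a density-based, unsupervised machine learning algorithm. It takes multi-dimensional data as input and clusters it according to the model parameters, such as epsilon (the neighborhood radius) and minimum samples (the number of neighbors a point needs within that radius to count as part of a dense region). Based on these parameters, the algorithm determines which points in the dataset are outliers: points that do not fall in or next to any dense region are labeled as noise.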
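To make the roles of the two parameters concrete, here is a minimal sketch of the noise rule only, not a full DBSCAN implementation. The function name `find_noise` and the toy points are hypothetical, and the sketch assumes Euclidean distance and that a point counts as its own neighbor (the convention scikit-learn uses for `min_samples`).

```python
import numpy as np

def find_noise(points, eps, min_samples):
    """Return a boolean mask marking the points that would be treated as noise."""
    points = np.asarray(points, dtype=float)
    # Pairwise Euclidean distances between all points.
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # A point's eps-neighborhood includes the point itself.
    within_eps = dists <= eps
    # Core points have at least min_samples neighbors within eps.
    is_core = within_eps.sum(axis=1) >= min_samples
    # A point is noise if no core point (itself included) lies within eps of it.
    near_core = within_eps[:, is_core].any(axis=1)
    return ~near_core

# Hypothetical toy data: four points packed together and one far away.
toy = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [5.0, 5.0]]
noise_mask = find_noise(toy, eps=0.5, min_samples=3)
print(noise_mask)
```

With these values, only the distant point is flagged. Shrinking `eps` or raising `min_samples` makes the density requirement stricter, so more points end up as noise.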
Scikit-learn provides a DBSCAN implementation in its sklearn.cluster module, alongside its other unsupervised learning algorithms. The algorithm has many real-life applications in outlier detection; for example, it can be used to flag suspicious credit card transactions in fraud detection. Here, we will demonstrate how to detect outliers in the Iris dataset.
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN

# Load the Iris dataset and preview the first rows
df = pd.read_csv("https://raw.githubusercontent.com/uiuc-cse/data-fa14/gh-pages/data/iris.csv")
print(df.head())

# Cluster on two features only: sepal length and sepal width
data = df[["sepal_length", "sepal_width"]]
model = DBSCAN(eps=0.4, min_samples=10).fit(data)

# Color each point by its cluster label; noise points have label -1
colors = model.labels_
plt.scatter(data["sepal_length"], data["sepal_width"], c=colors)
plt.show()

# Rows labeled -1 are the outliers
outliers = data[model.labels_ == -1]
print(outliers)
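The `labels_` array produced by the fitted model holds one cluster index per row, with -1 reserved for noise. As a small sketch of how such an array could be summarized, the helper below counts clusters and outliers. The name `summarize_labels` and the example labels are hypothetical and are not the output of the Iris run above.

```python
import numpy as np

def summarize_labels(labels):
    """Count clusters, noise points, and cluster sizes in a DBSCAN-style label array."""
    labels = np.asarray(labels)
    # Noise points carry the label -1.
    n_noise = int((labels == -1).sum())
    # Every other distinct label is a cluster index.
    cluster_ids = np.unique(labels[labels != -1])
    sizes = {int(c): int((labels == c).sum()) for c in cluster_ids}
    return {"n_clusters": len(cluster_ids), "n_noise": n_noise, "cluster_sizes": sizes}

# Hypothetical labels for seven samples: two clusters and two noise points.
example_labels = [0, 0, 1, -1, 1, 0, -1]
summary = summarize_labels(example_labels)
print(summary)
```

Passing `model.labels_` from the example above to this helper would give the same kind of summary for the Iris clustering.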