Project - Credit Card Fraud Detection using Machine Learning

ROC-AUC Curve

Let us now plot the ROC curve and compute its AUC. The Receiver Operating Characteristic (ROC) curve is an evaluation tool for binary classification problems. It plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold values. The Area Under the Curve (AUC) summarizes the ROC curve in a single number that measures how well a classifier separates the two classes: an AUC of 1.0 means perfect separation, while 0.5 is no better than random guessing. The higher the AUC, the better the model distinguishes between the positive and negative classes.
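To make the TPR and FPR definitions concrete, here is a minimal sketch that computes both rates by hand at a few thresholds. The labels and scores are small made-up values, not data from this project, and tpr_fpr_at_threshold is a hypothetical helper introduced only for illustration.

```python
import numpy as np

def tpr_fpr_at_threshold(y_true, scores, threshold):
    """Return (TPR, FPR) when every sample with score >= threshold is predicted positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(scores) >= threshold
    tp = np.sum(y_pred & (y_true == 1))   # positives correctly flagged
    fp = np.sum(y_pred & (y_true == 0))   # negatives wrongly flagged
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    return float(tp / n_pos), float(fp / n_neg)

# Made-up labels and scores, for illustration only.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]

# Each threshold yields one (FPR, TPR) point on the ROC curve.
for t in (0.8, 0.4, 0.35, 0.1):
    tpr_t, fpr_t = tpr_fpr_at_threshold(toy_labels, toy_scores, t)
    print(f"threshold={t:.2f}  TPR={tpr_t:.2f}  FPR={fpr_t:.2f}")
```

Lowering the threshold never decreases either rate, which is why the curve runs from (0, 0) toward (1, 1).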

Note:

  • decision_function returns a confidence score for each sample. For a linear model, this score is proportional to the signed distance of the sample from the separating hyperplane. Because the output is a continuous score rather than a class label, we can set our own decision threshold and derive new predictions for X_test that reach the desired precision or recall.

  • roc_curve computes the ROC curve from the true binary labels and the target scores, which can be probability estimates, confidence values, or non-thresholded decision values. It returns

    • fpr: increasing false positive rates, such that element i is the false positive rate of predictions with score >= thresholds[i]
    • tpr: increasing true positive rates, such that element i is the true positive rate of predictions with score >= thresholds[i]
    • thresholds: decreasing thresholds on the decision function, used to compute fpr and tpr
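The two notes above can be sketched on synthetic data. This is only an illustration under stated assumptions: the data comes from make_classification rather than the credit card dataset, and demo_model (a LogisticRegression) is a hypothetical stand-in for the model k used in this project. All variable names are invented for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import auc, recall_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (NOT the project's dataset).
X_demo, y_demo = make_classification(
    n_samples=500, n_features=5, weights=[0.9, 0.1], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, stratify=y_demo, random_state=0
)

# Hypothetical stand-in for the model k.
demo_model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

# One continuous confidence score per test sample.
demo_scores = demo_model.decision_function(X_te)

# For a binary linear model, predict() corresponds to thresholding the score at 0.
default_pred = demo_model.predict(X_te)

# A lower (arbitrarily chosen) threshold flags more samples as positive,
# so recall can only stay the same or rise, usually at the cost of precision.
lenient_pred = (demo_scores > -1.0).astype(int)
print("recall at threshold 0: ", recall_score(y_te, default_pred))
print("recall at threshold -1:", recall_score(y_te, lenient_pred))

# roc_curve sweeps over all useful thresholds at once.
demo_fpr, demo_tpr, demo_thresholds = roc_curve(y_te, demo_scores)
print("number of thresholds:", len(demo_thresholds))
print("AUC on the synthetic data:", auc(demo_fpr, demo_tpr))
```

The threshold of -1.0 is arbitrary; in practice one would scan the returned thresholds for the first value that meets the required recall or false positive rate.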

INSTRUCTIONS

  • Call the decision_function method of the model k, passing X_test as the argument. Store the resulting scores in y_k.

    y_k = k.<< your code comes here >>(X_test)
    
  • Call the roc_curve function with y_test and y_k as arguments, and store the returned values in fpr, tpr and thresholds.

    fpr, tpr, thresholds = << your code comes here >>(y_test, y_k)
    
  • Calculate the area under the curve by calling the auc function on the fpr and tpr returned by roc_curve.

    roc_auc = << your code comes here >>(fpr, tpr)
    
  • Print the roc_auc measure.

    print("ROC-AUC:", roc_auc)
    
  • Now plot the ROC curve.

    plt.title('Receiver Operating Characteristic')
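    # ROC curve of the model, with its AUC shown in the legend
    plt.plot(fpr, tpr, 'b', label='AUC = %0.3f' % roc_auc)
    plt.legend(loc='lower right')
    # Diagonal reference line: the ROC curve of a random classifier
    plt.plot([0, 1], [0, 1], 'r--')
    plt.xlim([-0.1, 1.0])
    plt.ylim([-0.1, 1.01])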
    plt.ylabel('True Positive Rate')
    plt.xlabel('False Positive Rate')
    plt.show()
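The auc step above amounts to integrating tpr over fpr with the trapezoidal rule. A minimal sketch with hand-picked curve points shows the arithmetic; the values are illustrative and not results from this project, and trapezoid_auc is a hypothetical helper, not part of scikit-learn.

```python
import numpy as np

def trapezoid_auc(fpr_points, tpr_points):
    """Area under a piecewise-linear curve, assuming fpr_points is sorted in increasing order."""
    x = np.asarray(fpr_points, dtype=float)
    y = np.asarray(tpr_points, dtype=float)
    widths = np.diff(x)                   # horizontal step of each segment
    mean_heights = (y[1:] + y[:-1]) / 2   # average height of each segment
    return float(np.sum(widths * mean_heights))

# Illustrative ROC points, from (0, 0) to (1, 1).
toy_fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
toy_tpr = [0.0, 0.5, 0.5, 1.0, 1.0]

print("AUC:", trapezoid_auc(toy_fpr, toy_tpr))  # 0.25 + 0.5 = 0.75
```

The diagonal from (0, 0) to (1, 1), drawn as the red dashed line in the plot, gives an area of 0.5 under the same rule, which is why 0.5 is the baseline for a random classifier.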
    