Free Guided Projects

  • Project - Building Spam Classifier

    NLP, Python, scikit-learn, Data Processing, Machine Learning, Predictive Modelling, NLTK, Free Guided Project

    13 Concepts | 2 Questions | 12 Assessments | 736 Learners

    Welcome to the Spam Classifier project. In this project, you will use Python and scikit-learn to build a Logistic Regression classifier and apply it to predict whether an email is spam or ham (not spam).

    Textual data is generated at a rapid pace every second. Before a machine learning algorithm can use it, the data must be preprocessed: accessed and cleaned, transformed into a refined form, and converted from text into a numerical representation. You will learn to carry out these preprocessing steps using NLTK, a widely used Natural Language Processing library for Python. You will build data transformers and use them in scikit-learn pipelines to preprocess the data effectively. Finally, you will build a Logistic Regression classifier to predict the class of an email.
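
To give a feel for the workflow described above, here is a minimal sketch of a custom transformer inside a scikit-learn pipeline that ends in a Logistic Regression classifier. It is not the project's actual solution: the `TextCleaner` class, the step names, and the six toy emails are hypothetical and exist only for illustration. It assumes NLTK and scikit-learn are installed, and it uses NLTK's `PorterStemmer` because that stemmer needs no extra corpus downloads.

```python
import re

from nltk.stem import PorterStemmer
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class TextCleaner(BaseEstimator, TransformerMixin):
    """Hypothetical transformer: lowercase, keep alphabetic tokens, stem them."""

    def fit(self, X, y=None):
        # Nothing to learn; the cleaning rules are fixed.
        return self

    def transform(self, X):
        stemmer = PorterStemmer()
        cleaned = []
        for text in X:
            # Lowercase the text and drop digits and punctuation.
            tokens = re.findall(r"[a-z]+", text.lower())
            cleaned.append(" ".join(stemmer.stem(token) for token in tokens))
        return cleaned


# Made-up toy data (1 = spam, 0 = ham), for illustration only.
emails = [
    "Win a FREE prize now!!!",
    "Claim your free cash reward today",
    "Cheap loans, click here to win",
    "Meeting moved to 3pm tomorrow",
    "Please review the attached project report",
    "Lunch on Friday with the team?",
]
labels = [1, 1, 1, 0, 0, 0]

spam_pipeline = Pipeline([
    ("clean", TextCleaner()),          # text -> cleaned, stemmed text
    ("vectorize", CountVectorizer()),  # text -> word-count vectors
    ("classify", LogisticRegression()),
])

spam_pipeline.fit(emails, labels)
prediction = spam_pipeline.predict(["Claim your free prize now"])
print(prediction)
```

Chaining the steps in a single `Pipeline` means the same cleaning and vectorizing are applied at both training and prediction time, which is the main reason to wrap preprocessing in transformers rather than run it by hand.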

    Skills you will develop:

    1. Textual Data Preprocessing
    2. Data Preprocessing Pipelines
    3. Data Transforming
    4. NLTK
    5. Python Programming
    6. Predictive Modeling
    7. Machine Learning
    8. scikit-learn