Welcome to the second chapter of the AI and ML course for managers.
As part of this chapter, we are going to first learn how the machine learning approach to solving problems is different from the traditional approach.
Then we will learn about the training and prediction phases. We will also learn about the various types of machine learning tasks, going through examples of supervised, unsupervised, and reinforcement learning, among others. Afterwards, we will learn how to represent our data and frame the problem. We will also learn about the training and testing phases, along with an understanding of various biases. Finally, with a very simple example, we will learn what underfitting and overfitting mean.
Alright, let's get started!
So what is the difference between the traditional approach and the machine learning based approach to solving problems and making machines intelligent? Let us understand this with a couple of examples.
We've already discussed the spam filter in the first chapter. A spam filter is a program that marks incoming emails as spam or ham. An email that is not spam is called ham. If an incoming email is not spam, it appears in the inbox; otherwise it appears in the spam folder. This helps keep the inbox clean by filtering out promotional or unwanted emails.
Now, the question is how would you build such a spam filter?
Let us first look at the spam filter built using the traditional approach, that is, the approach without AI or machine learning. This approach is also known as the rule-based approach.
Let us look at this diagram.
In this diagram, the first step usually is to study the problem and look at what makes an email a spam email. We study the various features or properties of an email.
We may observe that the subject line or the body of a spam email generally contains words like "credit card", "free", "amazing" and so on.
Also, we may observe that the emails sent from a particular email address are spam.
In step 2, we write an algorithm or a set of rules that detects whether an incoming email matches the above observations.
When an incoming email passes through the rules we have written, it is classified as spam or ham.
Once we have written the rules or conditions for classifying emails, we will evaluate our rules on the data that has already been marked as spam or ham.
If our algorithm or set of rules gives good accuracy, we launch. In other words, if it does well on the test dataset, we put it into production to do the real job.
If the accuracy is not good, we go back to studying the problem and write better or more rules.
This was the traditional approach, where we manually identify and code the rules.
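To make the rule-based approach concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the words come from the observations above, while the blocked address, the function names and the three sample emails are invented for this example.

```python
# Hand-written rules. The words come from the observations above;
# the blocked address is a hypothetical example.
SPAM_WORDS = ["credit card", "free", "amazing"]
BLOCKED_SENDERS = {"promo@example.com"}


def is_spam(subject, body, sender):
    """Return True if any hand-written rule matches the email."""
    if sender.lower() in BLOCKED_SENDERS:
        return True
    text = (subject + " " + body).lower()
    return any(word in text for word in SPAM_WORDS)


def accuracy(labelled_emails):
    """Fraction of already-labelled emails that the rules classify correctly."""
    correct = sum(
        is_spam(e["subject"], e["body"], e["sender"]) == e["is_spam"]
        for e in labelled_emails
    )
    return correct / len(labelled_emails)


# Toy, made-up evaluation set (not real data).
emails = [
    {"subject": "Amazing offer", "body": "Get a free credit card",
     "sender": "a@example.com", "is_spam": True},
    {"subject": "Meeting notes", "body": "See attached agenda",
     "sender": "b@example.com", "is_spam": False},
    {"subject": "Fantastic deal", "body": "Limited time only",
     "sender": "c@example.com", "is_spam": True},
]
print(accuracy(emails))
```

On this toy set the rules get two of the three emails right. The third email is spam but uses none of the listed words, so we would have to go back and write another rule.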
So what are the problems with this traditional approach?
We will have to manually write a long list of complex rules, because we do not know in advance which words or parameters are characteristic of a spam email.
Also, let's say we have written a rule to classify an email as spam if it contains the word "amazing". If spammers notice that all their emails that contain "amazing" are blocked, they might start writing "fantastic" instead of "amazing".
Now we will have to write one more rule to block emails containing the word "fantastic". Spammers will keep working around our spam filter, and we will need to keep writing new rules forever.
Let us take a look at how the same problem is solved using machine learning.
In the ML approach, instead of us writing the rules, the rules are inferred by the algorithm based on the data.
Unlike the traditional approach, a spam filter based on machine learning techniques automatically notices that "fantastic" has become unusually frequent in the emails flagged as spam by users, and it starts flagging such emails without our intervention.
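Here is a minimal sketch in Python of what "rules inferred from the data" can look like. It is a deliberately simple word-counting scheme chosen for illustration, not the method any real spam filter is known to use, and the function names and sample emails are invented.

```python
from collections import Counter


def train(labelled_emails):
    """Count in how many spam and ham emails each word appears."""
    spam_counts, ham_counts = Counter(), Counter()
    for text, is_spam in labelled_emails:
        words = set(text.lower().split())
        (spam_counts if is_spam else ham_counts).update(words)
    return spam_counts, ham_counts


def predict(model, text):
    """Flag an email if its words were seen more often in spam than in ham."""
    spam_counts, ham_counts = model
    words = set(text.lower().split())
    spam_score = sum(spam_counts[w] for w in words)
    ham_score = sum(ham_counts[w] for w in words)
    return spam_score > ham_score


# Toy, made-up emails with user-provided spam flags (not real data).
data = [
    ("amazing free credit card", True),
    ("fantastic prize waiting", True),
    ("fantastic offer inside", True),
    ("project meeting tomorrow", False),
    ("lunch tomorrow", False),
]
model = train(data)
print(predict(model, "fantastic news"))    # nobody wrote a rule for "fantastic"
print(predict(model, "meeting tomorrow"))
```

Notice that nobody wrote a rule for "fantastic". The word gets flagged only because it shows up in the emails that users marked as spam, so retraining on fresh data is enough to follow the spammers.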
Let us take the example of speech recognition. Speech recognition involves inferring what is being spoken. Basically, it is the conversion of sound into text.
How would we write a speech recognition program? For simplicity, let's assume that the program should be able to distinguish only two words: "one" and "two".
How would we write such a program using the traditional approach?
The word "two" starts with a high-pitch sound "T". So we can write an algorithm that measures the high-pitch sound intensity and hard code a rule that if the pitch is high, classify the word as "Two", else classify it as "one".
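As a rough sketch in Python, the hard-coded rule could look like this. The threshold value, the 0-to-1 intensity scale and the function name are all assumptions made for illustration; measuring the intensity from real audio is left out.

```python
# Hypothetical cut-off picked by hand; the value and the 0-to-1 scale are assumptions.
HIGH_PITCH_THRESHOLD = 0.5


def classify_word(high_pitch_intensity):
    """Hard-coded rule: strong high-pitch sound means "two", otherwise "one"."""
    return "two" if high_pitch_intensity > HIGH_PITCH_THRESHOLD else "one"


print(classify_word(0.9))
print(classify_word(0.1))
```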
What are the problems with the traditional approach in the speech recognition example?
The traditional approach will not work for the words spoken by millions of very different people in noisy environments.
Also, the rule-based approach will not work well for people who speak with different accents.
How does the machine learning approach solve this problem?
We collect recordings of each word spoken by millions of people in different accents and noisy environments. Then we feed the recordings to the algorithm, and the algorithm learns on its own from the collected data. The algorithm creates its own rules to recognize the spoken words. This has been found to perform better than human-created rules.
The other advantage is that the ML-based approach to solving problems can help us improve our understanding of a subject. Machine learning algorithms can be inspected to see what they have learned.
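To illustrate the idea of the algorithm creating its own rule, here is a toy sketch in Python that learns a pitch cut-off from labelled samples instead of having a person pick it. Real speech recognition uses far richer models. The single "high-pitch intensity" number per recording, the function name and the six samples are assumptions made up for this example.

```python
def learn_threshold(samples):
    """Pick the cut-off that classifies the most labelled samples correctly."""
    candidates = sorted(intensity for intensity, _ in samples)
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        # Rule being tried: intensity at or above t means "two", else "one".
        correct = sum(
            ("two" if intensity >= t else "one") == word
            for intensity, word in samples
        )
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


# Toy, made-up (high-pitch intensity, spoken word) pairs, not real recordings.
samples = [
    (0.2, "one"), (0.3, "one"), (0.35, "one"),
    (0.6, "two"), (0.7, "two"), (0.9, "two"),
]
threshold = learn_threshold(samples)
print(threshold)
```

With more recordings from more speakers, the learned cut-off would shift on its own, without anyone rewriting the rule.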
For example, in the case of a spam filter, our algorithm can reveal the list and combinations of words that it believes are the best predictors of spam. Sometimes this will reveal unsuspected correlations or new trends, and thereby lead to a better understanding of the problem.
Now we understand the difference between the machine learning approach and the traditional approach. Let us take a look at a typical machine learning process.
During the training phase, the algorithm creates the model. During the prediction phase, the model is used to make predictions.
The model is nothing but the list of rules we discussed earlier. The only difference is that these rules are created by an algorithm based on past data, not by humans.
Let us take a detailed look at the training phase.
Before we start training, we need to collect the data, clean the data and process the data.
The collection of data might involve building apps, installing sensors and writing programs that gather the data. Once we have collected the data, we clean it to bring it to a manageable format.
Afterwards, we perform various kinds of preprocessing tasks so that the data can be fed to the machine learning algorithm.
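As a small sketch in Python, cleaning and preprocessing for email text might look like the following. The specific steps (dropping empty records, lowercasing, splitting into words) are assumptions chosen for illustration, and the raw records are made up. Real projects usually need many more steps.

```python
import re


def clean(raw_emails):
    """Drop missing or empty records and trim stray whitespace."""
    return [e.strip() for e in raw_emails if e and e.strip()]


def preprocess(email):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", email.lower())


# Toy, made-up raw records as they might arrive from a collection step.
raw = ["  Amazing FREE offer!!  ", "", None, "Meeting at 10am"]
prepared = [preprocess(e) for e in clean(raw)]
print(prepared)
```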
To understand the training phase, let us take the example of the spam filter.
We have data containing historical emails, along with a label marking which emails are spam and which are not.
During the training phase, we feed these emails along with their labels to an algorithm, which generates the model.
Once we have the model, we can do the prediction.
For example, if we have a model trained to classify emails as spam or ham, we can feed it incoming emails in the real world, and it will predict whether each one is spam or not.
We can summarize the process in the following way. Using a machine learning algorithm on historical data, we generate the model, and with the model we make the predictions.
Let us understand the meaning of features, labels, and instances before diving deep into machine learning. You can frame a business problem as a machine learning problem if you can identify the features, labels, and instances. Let’s understand these with the spam filter example again.
Say our data contains the subject, the sender, and whether the email is spam or not. We feed this data to the algorithm to train the model. Later, we use the trained model to classify whether a new incoming email is spam or not.
Here subject and sender are “features”. You can think of features as attributes of an object. For example, color, size, weight and taste are the attributes of an apple.
A label is the column that the model has to predict for unknown data. In the case of the email filter, the algorithm classifies whether the incoming emails are spam or ham, so “Is Spam” is the label.
Each row of the input data is called an “instance”. In this data, there are 3 rows, so the number of instances is 3.
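The same idea can be written down as a short Python sketch. The three rows below are invented stand-ins for the data on the slide, and the variable names are only for illustration.

```python
# Hypothetical rows standing in for the data on the slide; the values are invented.
data = [
    {"subject": "Amazing free offer", "sender": "promo@example.com", "is_spam": True},
    {"subject": "Meeting agenda", "sender": "colleague@example.com", "is_spam": False},
    {"subject": "Credit card approved", "sender": "deals@example.com", "is_spam": True},
]

FEATURE_COLUMNS = ["subject", "sender"]  # inputs the model learns from
LABEL_COLUMN = "is_spam"                 # the column the model has to predict

features = [[row[c] for c in FEATURE_COLUMNS] for row in data]
labels = [row[LABEL_COLUMN] for row in data]
num_instances = len(data)                # one instance per row

print(num_instances, FEATURE_COLUMNS, labels)
```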
We will learn more about features, labels, and instances later in the chapter.