Use-cases of Machine Learning in E-Commerce

What computing once did to traditional industry, machine learning is now doing to traditional rule-based computing: it is eating into its market. Earlier, organizations had separate groups for image processing, audio processing, analytics and predictions. Now these groups are being merged, because machine learning overlaps with almost every domain of computing. Let us discuss how machine learning is impacting e-commerce in particular.

The first use case of machine learning that became really popular was Amazon's recommendations. Afterwards, Netflix launched a movie recommendation challenge, the Netflix Prize, which helped popularize machine learning competitions of the kind now hosted on Kaggle, an online platform for machine learning challenges.

Before I dive deeper into the details, let's quickly define the terms that are often found confusing. AI stands for Artificial Intelligence, which means being able to display human-like intelligence. AI is basically an objective. Machine learning is making computers learn from historical or empirical data instead of explicitly writing the rules. Artificial neural networks are computing constructs whose structure is loosely modelled on the animal brain. Deep learning is a branch of machine learning where we use a complex artificial neural network, one with many layers, for predictions.

Apart from product recommendation, there are many other use cases of machine learning in e-commerce, as follows.

Demand Forecasting

One of the key challenges for e-commerce giants is to predict the demand for a product. Demand is a function of many parameters, and the best way to predict it is to look at the history of various products along with the attributes of each product, such as category, brand, price, launch date, last month’s demand, demand in each month of the year, etc.

When historical data is available, machine learning is the most practical way to forecast demand precisely.
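As a minimal sketch of the idea, assume last month's demand is the only feature and fit a straight line to it with ordinary least squares. The numbers and the function names `fit_line` and `forecast` are invented for illustration; a real forecaster would use many more of the attributes listed above and a richer model.

```python
# Toy history: (last month's demand, this month's demand) per product.
# All numbers are invented for illustration.
history = [(100, 110), (200, 205), (300, 320), (400, 410), (500, 530)]


def fit_line(pairs):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope


def forecast(model, last_month_demand):
    """Predict this month's demand from last month's demand."""
    intercept, slope = model
    return intercept + slope * last_month_demand


model = fit_line(history)
print(forecast(model, 600))  # forecast for a product that sold 600 units last month
```

The "learning" here is just the estimation of the slope and intercept from history; nobody wrote a rule saying how much demand grows from month to month.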

Price forecasting

When deciding on the price of a product, we used to rely on classic inputs such as competitors' prices, MRP, etc. But rule-based price forecasting was not as effective as machine learning, which can make far more aggressive use of historical prices and purchase history.

Price forecasting is one of the core tasks of machine learning teams these days.

Selecting the seller

One way of selecting a seller is to compare the usual parameters, checking whether the seller meets the basic criteria and how well it compares with others.

But for a large e-commerce giant such as Amazon, there are just too many sellers for each product, so selecting a seller becomes quite an interesting problem.

The selection of a seller is done by training machine learning models on historical data and figuring out which seller is likely to perform better.

Review and Rating Quality

Any product that you see on an e-commerce site might have fake reviews and ratings. There are also various biases in ratings and reviews, such as volunteer bias: a person is much more likely to write a review if they are dissatisfied with the product.

So, figuring out the genuine quality score of the product based on its ratings and reviews becomes extremely difficult.

This is very similar to spam detection in many ways. The quality of a review can be measured based on the language, the IP address, whether the user really purchased the product, how many days have passed since the purchase, how often the user writes ratings, what the user's usual rating is, etc.
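To make this concrete, here is a rough sketch of how such signals might be turned into a numeric feature vector that a classifier can be trained on. The `Review` class, its fields and `review_features` are hypothetical names chosen for illustration, and the two text features are deliberately crude stand-ins for real language analysis.

```python
from dataclasses import dataclass


@dataclass
class Review:
    text: str
    rating: int               # 1 to 5 stars
    verified_purchase: bool   # did the user really buy the product?
    days_since_purchase: int
    reviews_by_user: int      # how many reviews this user has written
    user_mean_rating: float   # the user's usual rating


def review_features(review):
    """Turn a review into a numeric feature vector for a classifier."""
    words = review.text.split()
    return [
        1.0 if review.verified_purchase else 0.0,
        float(review.days_since_purchase),
        float(review.reviews_by_user),
        abs(review.rating - review.user_mean_rating),  # deviation from habit
        float(len(words)),                             # review length
        sum(w.isupper() for w in words) / max(len(words), 1),  # share of ALL-CAPS words
    ]


suspicious = Review("GREAT product BUY now", 5, False, 0, 40, 4.9)
print(review_features(suspicious))
```

Labelled examples of genuine and fake reviews, encoded this way, are what a spam-style classifier would then learn from.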

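To understand the language, a general technique that has proven to work really well in recent times is word embedding, which represents each word as a vector of numbers so that words with similar meanings get similar vectors.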

Such a complex task can realistically be solved only with machine learning techniques, not with manual analytics or rule-based systems.

Search Results Ordering / Ranking

If you look at the history of search results, before Google it seemed like a trivial problem: create an index of pages and list the results when someone searches for the exact same keywords. The only criterion was which page matched the keyword most.

But today, as the volume of data has increased, ranking the search results well has become critical.

Imagine you come to an e-commerce website, search for “Levis jeans”, and it throws up results from various books in which “levi” and “jean” were found. The search results would be practically useless, and the user would go away and buy elsewhere.

The search engine needs to be able to predict what it is that you are looking for. When I type “Twenty One Pilots”, it should understand that I am talking about a band.

This ranking can largely be learnt automatically from two signals: which results get clicked for which keyword, and how similar the search terms are to the products once both are converted into word embeddings.
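A small sketch of how the two signals might be combined is shown below. The three-dimensional embeddings, the click counts, the equal weighting and the names `score` and `rank` are all assumptions made up for illustration; real embeddings have hundreds of dimensions and are learned from text, and a production ranker would learn the weighting as well.

```python
import math

# Hypothetical hand-written embeddings, for illustration only.
EMBEDDINGS = {
    "levis jeans": [0.9, 0.1, 0.0],
    "denim trousers": [0.8, 0.2, 0.1],
    "levi strauss biography (book)": [0.3, 0.9, 0.1],
}

# Hypothetical click logs for the query "levis jeans": (clicks, impressions).
CLICKS = {
    "denim trousers": (40, 100),
    "levi strauss biography (book)": (2, 100),
}


def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def score(query, item, weight=0.5):
    """Blend embedding similarity with a smoothed click-through rate."""
    similarity = cosine(EMBEDDINGS[query], EMBEDDINGS[item])
    clicks, impressions = CLICKS[item]
    ctr = (clicks + 1) / (impressions + 2)  # smoothing avoids 0/0 for new items
    return weight * similarity + (1 - weight) * ctr


def rank(query, items):
    """Order items from most to least relevant for the query."""
    return sorted(items, key=lambda item: score(query, item), reverse=True)


print(rank("levis jeans", list(CLICKS)))
```

With these toy numbers the trousers rank above the book, both because their embedding is closer to the query and because shoppers clicked on them far more often.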

Malicious Returns

A problem that has emerged as e-commerce has expanded to a very large scale is malicious returns. Some users buy a product and then return the package with a fake item inside. In such cases, it is very hard to figure out whether it was the buyer or the seller who supplied the fake product.

Malicious returns are hard to identify. On one hand, an e-commerce company needs to be strongly biased towards the user; on the other, such returns can cause huge losses and create distrust of sellers.

To avoid such problems, we need to analyse user behaviour with respect to the purchases and then come up with a model which predicts the probability of a return being malicious. If you look closely, the problem is basically a machine learning problem.
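One simple way to get such a probability is logistic regression. The sketch below trains one with plain gradient descent on invented data; the two features (a user's overall return rate and the share of their returns where the seller reported a wrong item), the labels and the function names are assumptions for illustration, not a description of any real system.

```python
import math

# Invented training rows: (return rate, share of returns reported as "wrong item"),
# with label 1 if the return was later confirmed malicious, else 0.
rows = [
    ((0.05, 0.0), 0), ((0.10, 0.0), 0), ((0.15, 0.1), 0), ((0.20, 0.0), 0),
    ((0.60, 0.5), 1), ((0.70, 0.8), 1), ((0.80, 0.6), 1), ((0.90, 0.9), 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(rows, epochs=2000, lr=0.5):
    """Fit logistic regression weights with stochastic gradient descent."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for features, label in rows:
            pred = sigmoid(bias + sum(w * x for w, x in zip(weights, features)))
            error = pred - label  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


def malicious_probability(model, features):
    """Predicted probability that a return by this user is malicious."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


model = train(rows)
print(malicious_probability(model, (0.85, 0.75)))
print(malicious_probability(model, (0.05, 0.0)))
```

Because the output is a probability rather than a yes/no verdict, the company can choose a cautious threshold and send only the most suspicious returns for manual inspection, which helps it stay user-friendly.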

Product content quality

Product content refers to the content that is displayed on the product details page, such as the image that you see on the top left, the product description, the feature bullets and the marketing images. This content is usually provided by various sellers, and often there are many versions of it. The first challenge is to identify content that is possibly incorrect or copyrighted.

One important task I did at Amazon was to rank the images of each product and pick the best one. There were many criteria to consider.

Another task I did for one of the companies was to identify images carrying the logo of a competitor, because sellers were taking images from competitors’ sites and putting them on their own product pages.

This definitely cannot be done manually. Also, since the problem is fairly complex, using a rule-based system will not suffice. Instead, the only way to get it done at scale in a reasonable time and with good accuracy is to use machine learning in various ways.

Building Deals and Bundles

Earlier, building deals and bundles used to be the role of the merchandising team, but soon enough the merchandising team was being given suggestions by the analytics team, specifically from A/B analytics. Even the physical stores of Target did A/B analytics. In A/B analytics, the results of experiments are compared to see which combination works best.

Now, since most e-commerce companies have historical purchase data, creating deals and bundles based on purchase history has become the obvious answer.

This is a good fit for the unsupervised class of machine learning algorithms, which find patterns in data without needing labelled examples.
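One unsupervised approach, picked here purely as an example, is to count which products are bought together and rank the pairs by lift, the same idea that underlies association rule mining. The order data and the function name `bundle_candidates` are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Invented order history: each order is the set of products bought together.
orders = [
    {"phone", "case", "charger"},
    {"phone", "case"},
    {"phone", "charger"},
    {"laptop", "mouse"},
    {"laptop", "mouse", "case"},
    {"phone", "case", "screen guard"},
]


def bundle_candidates(orders, min_support=2):
    """Return pairs bought together at least min_support times, sorted by
    lift (how much more often they co-occur than chance would predict)."""
    item_counts = Counter(item for order in orders for item in order)
    pair_counts = Counter(
        pair for order in orders for pair in combinations(sorted(order), 2)
    )
    n = len(orders)
    results = []
    for (a, b), count in pair_counts.items():
        if count >= min_support:
            lift = (count / n) / ((item_counts[a] / n) * (item_counts[b] / n))
            results.append(((a, b), count, lift))
    return sorted(results, key=lambda r: r[2], reverse=True)


for pair, count, lift in bundle_candidates(orders):
    print(pair, count, round(lift, 2))
```

A lift above 1 means the two products are bought together more often than their individual popularity would suggest, which makes the pair a candidate for a bundle or a combined deal.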


Recommendations

The last, and the most prevalent, use case of machine learning is recommendations. I have kept it for last because it is the most common one. In fact, it is so common that it is almost considered to be the only use case for machine learning. The product recommendations that you see on any e-commerce platform’s website are mostly auto-generated. Only companies with few products or little history of user purchases hand-curate their recommendations.

Recommendation generation is an interesting problem. A common way is to find similar users and recommend to each of them the products that the others bought. There are various forms of recommendations, such as product-product (people who bought this also bought that) and user-product. Generating recommendations this way is also called collaborative filtering or user-product matrix completion.
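The similar-users approach can be sketched in a few lines. The purchase data, the user names and the functions `similarity` and `recommend` below are invented for illustration; real systems work with millions of users and typically use matrix factorization or neural models rather than this brute-force comparison.

```python
import math

# Invented purchase history: user -> set of products bought.
purchases = {
    "asha": {"jeans", "belt", "sneakers"},
    "ben": {"jeans", "belt", "jacket"},
    "chen": {"novel", "bookmark"},
    "dina": {"jeans", "sneakers"},
}


def similarity(a, b):
    """Cosine similarity between two users' binary purchase vectors."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(user, purchases, top_n=3):
    """Score products the user has not bought by the similarity of the
    users who did buy them, and return the best-scoring ones."""
    own = purchases[user]
    scores = {}
    for other, items in purchases.items():
        if other == user:
            continue
        weight = similarity(own, items)
        for item in items - own:
            scores[item] = scores.get(item, 0.0) + weight
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, score in ranked[:top_n] if score > 0]


print(recommend("dina", purchases))
```

Here dina shares purchases with asha and ben but none with chen, so she is recommended the belt and the jacket and never sees the novel or the bookmark.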

Closing Note

A general fear among technology managers used to be that if we use machine learning for everything, the system might become a black box. But today, as managers learn more ways of validating and trusting the models, the industry has opened up to machine learning.

At the same time, machine learning, and deep learning in particular, has shown far better performance than other means of prediction on many tasks.

So, my suggestion to every manager is to upskill with a course like AI for Managers.