Project- Exploring Web Scraping: Python Adventures on Wikipedia and Amazon

Scraping Amazon Product Reviews - Import the libraries

In the previous assessment, we scraped a Wikipedia page using Beautiful Soup. This time, we will use lxml to scrape the reviews from an Amazon product page.

lxml is a Pythonic binding for the C libraries libxml2 and libxslt. It is one of the fastest and most feature-rich libraries for processing XML and HTML in Python. With lxml, you can create, parse, and query XML and HTML documents.
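To make "created, parsed, and queried" concrete, here is a minimal sketch that runs on in-memory strings, so no network access is needed. The element names, class names, and review text are made up for illustration and do not reflect Amazon's actual markup.

```python
from lxml import etree
import lxml.html

# Create: build a small XML tree in memory (hypothetical element names).
root = etree.Element("reviews")
review = etree.SubElement(root, "review", rating="5")
review.text = "Great product"
xml_text = etree.tostring(root, encoding="unicode")

# Parse: turn an HTML string into an element tree (hypothetical markup).
page = lxml.html.fromstring(
    "<html><body>"
    "<div class='review'><span class='title'>Works well</span></div>"
    "<div class='review'><span class='title'>Too small</span></div>"
    "</body></html>"
)

# Query: an XPath expression ending in text() returns the matching strings.
titles = page.xpath("//div[@class='review']/span[@class='title']/text()")

print(xml_text)
print(titles)
```

The same parse-then-query pattern applies to a downloaded page: the HTML string would come from an HTTP response instead of a literal.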

You can find out more about lxml on its official page.

  • First, we will import the html module from the lxml library

    from <<your code goes here>> import html
  • Now, as we did in the previous assessment, we will import the requests module to make HTTP requests

    import <<your code goes here>>
  • Finally, we will import Pandas as pd

    import pandas as <<your code goes here>>
