Running PySpark in Jupyter / IPython notebook

We are glad to announce that you can now run PySpark code in a Jupyter notebook on CloudxLab.

What is the Jupyter notebook?

The IPython Notebook is now known as the Jupyter Notebook. It is an interactive computational environment, in which you can combine code execution, rich text, mathematics, plots and rich media. For more details on the Jupyter Notebook, please see the Jupyter website.

Follow the steps below to access the Jupyter notebook on CloudxLab.

Step 1 – Log in to the web console

Step 2 – Run the commands below on the web console

The commands above launch a Jupyter notebook and print its startup output in the console.

To access the notebook, go to http://webconsole:port, where

         webconsole – the domain of your web console

         port – the port on which your notebook is running

For example, if your web console is f.cloudxlab.com and your notebook is running on port 8890, open http://f.cloudxlab.com:8890 in your browser.
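The address rule above can be written as a one-line helper. This is only an illustrative sketch: `notebook_url` is a hypothetical name, not part of CloudxLab or Jupyter, and the values passed in are the ones from the example.

```python
def notebook_url(webconsole, port):
    """Build the notebook address from the web console domain and the notebook port."""
    return "http://{}:{}".format(webconsole, port)


# Values taken from the example in the text.
print(notebook_url("f.cloudxlab.com", 8890))  # prints http://f.cloudxlab.com:8890
```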

Step 3 – Set up environment variables
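A minimal sketch of what this step might look like in a notebook cell, assuming Spark ships its Python bindings as zip archives under `python/lib` inside the Spark directory. The helper name `configure_spark_env` and the fallback path `/usr/lib/spark` are hypothetical; use the Spark location that applies to your web console.

```python
import glob
import os
import sys


def configure_spark_env(spark_home):
    """Point this Python process at a Spark installation.

    Sets SPARK_HOME and puts Spark's bundled Python archives on sys.path.
    Returns the list of archive paths that were considered.
    """
    os.environ["SPARK_HOME"] = spark_home
    pylib = os.path.join(spark_home, "python", "lib")

    # The py4j version differs between Spark releases, so match it with a
    # glob instead of hard-coding a version number.
    archives = glob.glob(os.path.join(pylib, "py4j-*-src.zip"))
    archives.append(os.path.join(pylib, "pyspark.zip"))

    for archive in archives:
        if archive not in sys.path:
            sys.path.insert(0, archive)
    return archives


# Hypothetical default location; an existing SPARK_HOME takes precedence.
configure_spark_env(os.environ.get("SPARK_HOME", "/usr/lib/spark"))
```

Because the paths are added with `sys.path.insert`, they only affect the current notebook kernel; restarting the kernel means running the cell again.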

Step 4 – Load the PySpark module
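One way to check that the module loads is sketched below. It assumes the environment from Step 3 is already in place. The helper names `pyspark_importable` and `spark_smoke_test` are hypothetical, while `SparkContext.getOrCreate`, `parallelize` and `sum` are standard PySpark calls.

```python
import importlib.util


def pyspark_importable():
    """Return True if Python can locate the pyspark package."""
    return importlib.util.find_spec("pyspark") is not None


def spark_smoke_test():
    """Start (or reuse) a SparkContext and sum the integers 0..99 on it."""
    from pyspark import SparkContext

    sc = SparkContext.getOrCreate()
    return sc.parallelize(range(100)).sum()


if pyspark_importable():
    print("pyspark found; call spark_smoke_test() to start a SparkContext")
else:
    print("pyspark not found; re-check the environment variables from Step 3")
```

The smoke test is kept in a function so that importing the cell never starts a Spark context by accident; on a working setup it should return 4950.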

If the pyspark module loads properly, you should see a result like the one in the image below.

PySpark Jupyter Notebook