
Lab Features

Real World Experience

The lab setup is exactly how enterprises work on Machine Learning, Deep Learning and Big Data technologies.

Learn Through Practice

The best way to learn Machine Learning and Big Data technologies is to write code and experiment. Learn by writing code and executing it on the lab.

Connect From Anywhere

Start practising and learning Machine Learning and Big Data technologies from anywhere in the world. You just need an internet connection to access the lab.

No Installation and Compatibility Issues

The lab comes pre-installed with all the software you will need to learn and practice Machine Learning and Big Data technologies.

Centralized Datasets

Upload any dataset to the lab just once and use it anywhere, anytime. The lab also has publicly available datasets stored on the cluster so you can start practising right away.

Connect From Any Device

Connect to the lab using a browser or SSH from any device or operating system.

Free Trial
  • Access to all CloudxLab self-paced courses
  • Real-time cluster access for 3 days
  • No access to third-party courses and instructor-led trainings
Subscription for 1 month
  • Unlimited Access to all CloudxLab self-paced courses
  • Real-time cluster access
  • Earn Industry-relevant Certificates
  • No access to third-party courses and instructor-led trainings
  • Access to Job Portal
Subscription for 6 months
  • Unlimited Access to all CloudxLab self-paced courses
  • Real-time cluster access
  • Earn Industry-relevant Certificates
  • No access to third-party courses and instructor-led trainings
  • Access to Job Portal

Frequently Asked Questions and Answers

Do I get a dedicated cluster of my own?

CloudxLab is a shared cluster, so you share resources with other users. For dedicated cluster requests for multiple users, please contact us.

Can I install my own software?

CloudxLab is a managed service where the configurations and installations are taken care of by us. We have already set up most of the tools needed for practice.

Please contact us if you are looking for other tools, and we will try our best to make them available on the cluster. We aim to provide the best learning experience. If the tool or library you need is open source or free and could be useful to more than 5% of users, we would like to install it.

You can also install libraries in your own environment, such as a virtualenv, which does not require any administrator involvement.
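As a rough illustration of the per-user environment idea, here is a minimal Python sketch that uses the standard library `venv` module rather than the `virtualenv` package named above. The function names and any directory you pass in are hypothetical, and nothing here is specific to CloudxLab's setup.

```python
import sys
import venv
from pathlib import Path


def env_python(env_dir: Path) -> Path:
    """Return the interpreter path inside a virtual environment."""
    if sys.platform == "win32":
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"


def create_user_env(env_dir: Path, with_pip: bool = True) -> Path:
    """Create a virtual environment under env_dir and return its interpreter.

    Everything is written below env_dir, so no administrator rights are
    needed as long as you can write to that directory.
    """
    venv.create(env_dir, with_pip=with_pip)
    return env_python(env_dir)
```

Calling `create_user_env(Path.home() / "envs" / "demo")` and then running the returned interpreter with `-m pip install <package>` installs the package into that environment only. An environment under the home directory counts toward the home-directory storage quota.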

How are we different?

We are a dedicated Big Data and AI organization focused on lab services. The lab offers a Machine Learning ecosystem alongside Big Data, with libraries such as TensorFlow, scikit-learn, NumPy, SciPy and Pandas, and analytics tools such as R and Jupyter. We provide multiple versions of Spark, automated assessments, and email support.

What technologies can I practice on CloudxLab?

The tools and components available in the cluster include Hadoop, Spark, Kafka, Hive, Pig, HBase, Oozie, ZooKeeper, Flume, Sqoop, Mahout, R, Linux, Python, Scala, MongoDB, NumPy, SciPy, Pandas, scikit-learn, etc. Again, if you are looking for other tools, please contact us.

Which Hadoop distribution do you provide?

We provide the Hortonworks Data Platform. The Hadoop version on the cluster is 2.7. Please find the versions of all the software components installed on CloudxLab here.

How many nodes are in the cluster?

Currently, we have 5 nodes in the cluster. We automatically scale up and down based on the cluster load. Three nodes have 8 cores and 32 GB of RAM each, and the other two have 16 cores and 60 GB of RAM each, depending on the services running on them.

What are the limits on the usage of the lab, or what is the fair usage policy (FUP)?

The CloudxLab cluster is used for educational and PoC purposes. The reason we are able to provide the cluster at a very low cost is that we are able to share the systems.

The system resources are limited. If you use more than your share, it hurts other users. We have avoided putting hard limits on resource consumption because we do not want to put roadblocks in our users' learning path.

Here are the limits as per the fair usage policy:

  • HDFS - We provide 4.5 GB of raw storage space on HDFS, and the default replication factor is 3. With a replication factor of 3, you can store up to 1.5 GB of data; with a replication factor of 1, you can store up to 4.5 GB. This is a hard limit, meaning HDFS will throw an error if you try to go beyond it.

  • Local Storage on the Linux console - The allowed storage is 3 GB on the web console in your home directory.

    • If you exceed this 3 GB quota, you are given a 7-day grace period to reduce your usage to less than 3 GB.

    • During this grace period you can have a maximum of 4GB of data in your home directory.

    • Once the grace period expires or you exceed the 4GB limit, whichever is earlier, you will no longer be able to create any new files in your home directory, and also your Jupyter server will stop working until you reduce your usage to less than 3GB.

    • Our scripts also monitor storage consumption continuously. Our bots will automatically delete your data if your storage exceeds 4 GB.

    • To clean the unnecessary files please follow the instructions given here.

  • Hive - Please ensure that you do not create too many databases in Apache Hive. The permissible number of databases in Apache Hive is one.

  • RAM - Please ensure that your programs do not consume more than 2 GB of memory (RAM), because doing so hurts other users. Our bots will automatically kill your processes if your RAM usage exceeds 2 GB.

  • Duration - Please do not run long-lived processes such as Hive, pyspark, spark-shell, or a Jupyter notebook. Our bots will kill your process if 1) it runs for more than 3 hours, 2) your notebook is idle for more than 60 minutes, or 3) you use more than one YARN container at a time. While Hive, pyspark, and spark-shell consume containers from YARN, the Jupyter notebook consumes local memory.

  • CPU - Please do not run CPU-intensive tasks such as bitcoin mining or an infinite loop.

  • Bandwidth - Please do not download more than 5 GB of data a month.

  • MySQL - Please note that you will not be able to create new databases in MySQL. Managing MySQL becomes difficult if everyone is allowed to create databases.
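The storage and process limits above reduce to a few comparisons. Below is a minimal Python sketch that assumes the thresholds quoted in the list. The function names, the rule labels, and the handling of exact boundary values (for example, exactly 3 GB) are illustrative assumptions, not CloudxLab's actual enforcement scripts.

```python
from typing import List

# Figures taken from the fair usage policy above.
HDFS_RAW_QUOTA_GB = 4.5
HOME_SOFT_LIMIT_GB = 3.0
HOME_HARD_LIMIT_GB = 4.0
GRACE_PERIOD_DAYS = 7
MAX_RAM_GB = 2.0
MAX_RUNTIME_HOURS = 3.0
MAX_IDLE_MINUTES = 60.0
MAX_YARN_CONTAINERS = 1


def hdfs_usable_gb(replication_factor: int) -> float:
    """Data you can store when every block is kept replication_factor times."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    return HDFS_RAW_QUOTA_GB / replication_factor


def home_dir_status(usage_gb: float, days_over_quota: int = 0) -> str:
    """Classify home-directory usage as 'ok', 'grace', or 'blocked'."""
    if usage_gb <= HOME_SOFT_LIMIT_GB:  # assumption: exactly 3 GB is allowed
        return "ok"
    if usage_gb > HOME_HARD_LIMIT_GB or days_over_quota > GRACE_PERIOD_DAYS:
        return "blocked"  # no new files until usage drops below the quota
    return "grace"


def process_violations(
    ram_gb: float,
    runtime_hours: float,
    notebook_idle_minutes: float = 0.0,
    yarn_containers: int = 1,
) -> List[str]:
    """Return the labels of the process rules that a workload would break."""
    broken = []
    if ram_gb > MAX_RAM_GB:
        broken.append("ram")
    if runtime_hours > MAX_RUNTIME_HOURS:
        broken.append("runtime")
    if notebook_idle_minutes > MAX_IDLE_MINUTES:
        broken.append("idle_notebook")
    if yarn_containers > MAX_YARN_CONTAINERS:
        broken.append("yarn_containers")
    return broken
```

For example, `hdfs_usable_gb(3)` gives the 1.5 GB figure quoted in the HDFS item, and `home_dir_status(3.5, days_over_quota=2)` falls in the grace period.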

Please note that violating these terms is an offense, and your account might be disabled as a result.

Are bulk discounts available on lab subscriptions?

An additional 10% discount is available when more than 100 subscriptions are paid for upfront. For more details, reach out to us.

Are there sample datasets in the cluster?

We have tried to keep everything available so you can start practicing without delay. So, yes, we do provide sample datasets. Please find the list of available datasets here

Will I get support?

Yes! Please feel free to ask your questions on the CloudxLab forum, and our community and team of experts will answer them. We believe the forum will add better perspectives, ideas, and solutions to your questions.

What is your refund policy?

If you are unhappy with the product for any reason, let us know within 7 days of purchasing or upgrading your account, and we will cancel your account and issue a full refund. Please contact us to request a refund within the stipulated time. We will be sorry to see you go though!

Can I share my lab account with other people?

No. The lab is like a buffet: you can use as much as you want under the fair usage policy, but you cannot share it with others.

Also, the lab is very personalized. The experience points are rewarded based on the lab usage and therefore it should be used individually.

If you share your login with others, our bot will automatically disable your account, and this cannot be reversed. In such cases, you might not be able to use CloudxLab services in the future.

How do I run a Scala IDE for Spark jobs on CloudxLab?

CloudxLab provides the Jupyter notebook and the Unix text editors, which you can use to write code. It is not possible to install software such as IntelliJ or Eclipse in a cloud-based environment. You can use such IDEs on your desktop or laptop and then upload the code to CloudxLab for execution. See this chapter to learn about it in more detail:

Is there a way to subscribe my whole team or class?

Yes! Your whole team or class can opt for our Corporate Training Program. We provide assessment platforms too for your current employees and new candidates.

As an instructor, how can I monitor the work done by my students on the lab?

We can help you create assessments in our assessment engine. With this engine, you will be able to track whether the students have completed the given hands-on exercises. For more details, let us know.

I'm an instructor. How should I provide CloudxLab to my students?

Please sign up here as an instructor and we will provide you the details.