You will receive your credentials right after subscribing to CloudxLab. You can log in to the cluster and start practicing right away. If you have not received the details within an hour, something has gone wrong and we would like to hear from you at firstname.lastname@example.org.
CloudxLab is a shared cluster, so you will be sharing resources with other users. To request a dedicated cluster for multiple users, write to us at email@example.com.
CloudxLab is a managed service where the configurations and installations are taken care of by us. We have already set up most of the tools needed for practice.
Please contact us at firstname.lastname@example.org if you are looking for other tools and we will try our best to make them available on the cluster. We aim to provide the best experience for our learners. If the tool or library you need is open source or free and could be useful to more than 5% of users, we will be glad to install it.
You can also install libraries yourself, without any administrator rights, in your own environment such as a virtualenv.
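As an illustration of the point above, here is a minimal sketch that creates a personal Python environment without administrator rights. It assumes Python 3 and uses the standard library `venv` module as a stand-in for virtualenv. The function name `create_user_env` and the directory name are hypothetical, and the exact setup on the cluster may differ.

```python
import venv


def create_user_env(env_dir):
    """Create an isolated Python environment under env_dir and return its path.

    Hypothetical helper for illustration only; not part of CloudxLab.
    """
    # with_pip=False keeps creation fully offline.
    # Pass with_pip=True if you want pip bootstrapped into the environment.
    venv.create(env_dir, with_pip=False)
    return env_dir
```

For example, `create_user_env("my_env")` creates the environment in the current directory. On Linux you would then typically activate it with `source my_env/bin/activate`. Libraries installed into the environment live in your own storage, so they are likely to count towards your local storage allowance.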
We are a dedicated Big Data and AI organization focused on lab services. Alongside the Big Data stack, the lab offers a Machine Learning ecosystem (TensorFlow, scikit-learn, NumPy, SciPy, Pandas) and analytics tools such as R and Jupyter. We have multiple versions of Spark. We also provide automated assessments and email support.
The tools and components available in the cluster include Hadoop, Spark, Kafka, Hive, Pig, HBase, Oozie, ZooKeeper, Flume, Sqoop, Mahout, R, Linux, Python, Scala, MongoDB, NumPy, SciPy, Pandas, scikit-learn, etc. Again, if you are looking for other tools, please contact us at email@example.com.
We provide the Hortonworks Data Platform. The Hadoop version on the cluster is 2.7. We also provide Hue from Cloudera so that you can analyze data through a web interface. Please find the versions of all the software components installed on CloudxLab here.
Currently, we have 4 nodes in the cluster, and we automatically scale up and down based on the cluster load. The node sizes depend on the services running on them: three nodes have 8 cores and 32 GB RAM each, and the fourth has 2 cores and 4 GB RAM.
The CloudxLab cluster is used for educational and PoC purposes. We are able to provide the cluster at a very low cost because the systems are shared.
The system resources are limited. If you use more than your share, it hurts other users. We have avoided putting hard limits on resource consumption because we do not want to put roadblocks in our users' learning path.
Here are the limits as per the fair usage policy:
HDFS - We provide 4.5 GB of raw storage space on HDFS, and every replica counts towards it. With the default replication factor of 3, you can store up to 1.5 GB of data. With a replication factor of 1, you can store up to 4.5 GB of data. This is a hard limit, meaning HDFS will throw an error if you try to go beyond it.
Local Storage on the Linux console - The allowed storage on the web console is 1 GB. Our scripts continuously monitor storage consumption, and our bots will automatically delete your data if you use more than 1 GB.
Hive - Please do not create multiple databases in Apache Hive. The permissible number of databases per user is one.
RAM - Please ensure that your programs do not consume more than 2 GB of memory (RAM), as this hurts other users. Our bots will automatically kill your processes if your RAM usage exceeds 2 GB.
Duration - Please do not keep a long-running process such as hive, pyspark, spark-shell or a Jupyter notebook open beyond 60 minutes. Hive, pyspark and spark-shell consume containers from YARN, while a Jupyter notebook consumes local memory.
CPU - Please do not run CPU-intensive tasks such as bitcoin mining or an infinite loop.
Bandwidth - Please do not download more than 5 GB of data a month.
Please note that violating these terms is an offence, and your account might be disabled as a result.
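The storage figures in the policy can be sanity-checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, the 4.5 GB and 1 GB figures are taken from the policy above, and treating 1 GB as 1024**3 bytes is an assumption. It does not claim to reproduce how the CloudxLab bots actually measure usage.

```python
import os

# Limits quoted in the fair usage policy.
RAW_HDFS_QUOTA_GB = 4.5
LOCAL_STORAGE_LIMIT_GB = 1.0


def usable_hdfs_gb(replication_factor, raw_quota_gb=RAW_HDFS_QUOTA_GB):
    """Return how much user data fits in the raw quota at a given replication factor."""
    if replication_factor < 1:
        raise ValueError("replication factor must be at least 1")
    # Each block is stored replication_factor times, so the quota is divided by it.
    return raw_quota_gb / replication_factor


def directory_size_gb(path):
    """Sum the sizes of regular files under path, in GB (assuming 1 GB = 1024**3 bytes)."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so that linked data is not counted.
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)
    return total_bytes / 1024 ** 3


def within_local_limit(path, limit_gb=LOCAL_STORAGE_LIMIT_GB):
    """Return True if the files under path fit in the local storage allowance."""
    return directory_size_gb(path) <= limit_gb
```

For instance, `usable_hdfs_gb(3)` gives 1.5 and `usable_hdfs_gb(1)` gives 4.5, matching the figures in the HDFS item. Calling `within_local_limit(os.path.expanduser("~"))` gives a rough check of your home directory against the 1 GB allowance.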
We have tried to keep everything available so you can start practicing without delay. So, yes, we do provide sample datasets. Please find the list of available datasets here.
Yes! Please feel free to ask your questions on the CloudxLab forum, and our community and team of experts will answer them. We believe the forum will add better perspectives, ideas, and solutions to your questions.
If you are unhappy with the product for any reason, let us know within 3 days of purchasing or upgrading your account, and we'll cancel your account and issue a full refund. Please contact us at firstname.lastname@example.org to request a refund within the stipulated time. We will be sorry to see you go though!
No. The lab is like a buffet. You can use as much as you want under the fair usage policy, but you cannot share it with others.
Also, the lab is highly personalized. Experience points are awarded based on lab usage, so each account should be used by one individual.
If you share your login with others, our bot will automatically disable your account, and this is irrevocable. In such cases, you might not be able to use CloudxLab services in the future.
Yes! Your whole team or class can practice in real time on CloudxLab. You only need to share the name and email address of every attendee. Please contact us at email@example.com to know more.
We can help you create assessments in our assessment engine. With this engine, you will be able to track whether the students have completed the given hands-on exercises. For more details, let us know at firstname.lastname@example.org.
Please sign up here as an instructor and we will send you the details.
Absolutely! Please contact us here.