10 Things to Look for When Choosing a Big Data Course / Institute


Every now and then, I see a new company offering Hadoop classes or courses, and my friends keep asking me which of these courses is worth taking. I gave them a few tips for choosing the course that suits them best. Here are those tips to help you decide which course to attend:

1. Does the instructor have domain expertise?

Know your instructor. You must know about the instructor’s background. Has (s)he done any big data related work? I have seen a lot of instructors who just attend a course somewhere and become instructors.

If the instructor has never worked in the domain, do not take the class. Also, avoid training institutes that do not share details about the instructor.

2. Is the instructor hands-on? When did she/he last code?

In technology, there is a huge difference between an instructor who codes hands-on and one who teaches from theoretical knowledge alone. Also, find out when the instructor last worked on code. If the instructor has never coded, do not attend the class.

3. Does the instructor encourage & answer your questions?

There are many recorded free videos available across the internet. The only reason you would go for live classes would be to get your questions answered and doubts cleared immediately.

If the instructor does not encourage questions and answer them, such classes are fairly useless.

4. Do they provide a cloud-based lab?

A cloud is basically a computer setup at someone else’s place. When I say my data is in the cloud, it means my data is on a computer that is remotely available.

In the old days, people used to have a physical laboratory of computers for learning basic computer skills. Today, while learning advanced technologies, we need a similar setup but on the cloud, i.e. at a remote location. A cloud-based lab provides the following benefits:

  • Instantaneously available – you do not have to wait for your computer to boot or install something.
  • Accessible from everywhere – whether you want to work on problems from your office or from home, the lab should be reachable from either.
  • Easy to get your code debugged through the instructors – While working on assignments, you might get stuck and need to show the assignments to your instructor and seek review. If your environment, code, error log and history of commands are available to the instructor immediately, the instructor will be able to test and debug your program right away.

Why multiple computer setups on cloud-based labs?

Because Big Data technologies are all about distributed computing, i.e. tools that run on multiple computers simultaneously, the lab needs more than one computer.

If you go through the following list of tools related to big data, you would understand that Big Data is all about multiple computers working together to solve a problem:

  • Hadoop Distributed File System – A file system that utilizes multiple computers’ disk space and disk IO to provide really high performance and huge space.
  • Hadoop YARN / MapReduce – a compute engine that utilizes multiple computers’ processors and IO (disk read/write) to solve computing problems without transferring too much data over the network.
  • NoSQL (HBase, Cassandra, MongoDB etc) – Databases that run on multiple computers (nodes) simultaneously to provide really high performance when dealing with a huge number of read-writes per second. Such databases provide really huge storage using the storage space of multiple computers.
  • Apache Spark – Utilizes the memory (RAM) and CPU of multiple computers to provide really high throughput.

So, it is very important to have a setup that has multiple computers. It does not make any sense to have a setup with only one computer.
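To make the idea of "multiple computers working together" concrete, here is a minimal sketch in plain Python of the split, count and merge pattern that tools like MapReduce follow. It is only an illustration under stated assumptions: threads stand in for separate computers, and the function names (split_into_partitions, count_words, merge_counts, distributed_word_count) are hypothetical and not part of Hadoop or Spark.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(lines, num_nodes):
    """Deal the lines out round-robin, one partition per simulated node."""
    return [lines[i::num_nodes] for i in range(num_nodes)]


def count_words(partition):
    """Map step: a node counts words in its own partition only."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def merge_counts(partial_counts):
    """Reduce step: combine the small per-node summaries into one result."""
    total = Counter()
    for counts in partial_counts:
        total.update(counts)
    return total


def distributed_word_count(lines, num_nodes=3):
    partitions = split_into_partitions(lines, num_nodes)
    # Assumption: threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    return merge_counts(partial_counts)


lines = [
    "big data needs many computers",
    "many computers share the work",
    "spark and hadoop split the data",
]
result = distributed_word_count(lines, num_nodes=3)
print(result["computers"], result["data"], result["the"])  # prints: 2 2 2
```

Each simulated node sees only its own slice of the data, and only the small per-node counts are combined at the end. On a single computer you can imitate this pattern, but you cannot practice what happens when the slices really live on different machines, which is why a multi-computer lab matters.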

5. Do they not promise jobs?

If you find an institute promising jobs or providing job guarantees, stay away from it. An institute can at most try to connect you with employers; it cannot guarantee you a job. If you are still considering an institute that promises you a job, please enquire about the details of that promise before joining the course.

6. What is the refund policy?

What if you find, after attending the first few classes, that the course is not up to your expectations and you want a refund? Check whether the institute has a proper refund policy in place.

7. Is it online?

Finding instructors in advanced technologies is difficult, and finding good ones in your own locality is even more difficult. So, the chances of getting a good instructor for classroom training are very low. Getting a great instructor for online training is easier.

So, always prefer online training over offline training for Big Data. By online, I mean live online training, not recorded sessions.

8. Are the founders of the institute from a technology background?

In a good institution or university, even an administrator or a PR person is a professor or a lecturer.

Therefore, an institute providing Big Data or Hadoop training, whether online or offline, big or small, cannot sustain itself if its founders are not from a technology background. Founders without a technology background may hire a sub-par instructor and may not be able to address the real problems that students face.

So, always go for an institute where the founders have a good background in technology.

9. Has this institute published something useful in the big data world?

If the institute has a strong technology-based foundation, it will innovate and/or publish articles and research papers from time to time. These could appear as blog posts or in venues such as ACM publications. Such institutes can be considered.

If the institute’s blog is filled only with marketing material and no substantially useful information, the institute is not putting enough effort into having good instructors or subject matter experts. Such institutes are more focused on marketing themselves than on adding any value in their domain.

10. Are they asking for a direct transfer?

If an institute accepts payments through net banking, it must have signed up with a payment gateway such as PayPal. Payment gateways also generally make sure there is a refund policy. However, if the institute asks you to pay it directly and not through any payment gateway, stay away from it.

Please feel free to leave your comments in the comment box so that we can improve the guide and serve you better. Also, follow CloudxLab on Twitter to get updates on new blogs and videos.

If you wish to learn Hadoop and Spark technologies such as MapReduce, Hive, HBase, Sqoop, Flume, Oozie, Spark RDD, Spark Streaming, Kafka, DataFrames, SparkSQL, SparkR, MLlib and GraphX, and to build a career in the Big Data and Spark domain, check out our signature course on Big Data with Apache Spark and Hadoop, which comes with:

  • Online instructor-led training by professionals having years of experience in building world-class BigData products
  • High-quality learning content including videos and quizzes
  • Automated hands-on assessments
  • 90 days of lab access so that you can learn by doing
  • 24×7 support and forum access to answer all your queries throughout your learning journey
  • Real-world projects
  • A certificate which you can share on LinkedIn