Projects in self-paced course
Twitter Dataset (Monday - 16 Sept) https://cloudxlab.com/assessment/playlist-intro/283/analytics-vidhya-hive-project?course_id=65&playlist_id=283 The main objective is to process Twitter messages using Hive. The detailed objective is given on that page.
Apache Log Unstructured Data (23 Sept) https://cloudxlab.com/assessment/displayslide/3795/example-objective?course_id=65&playlist_id=275 This problem is about processing unstructured data using Apache Spark. It requires an understanding of regular expressions in Python and of RDD and DataFrame operations. The available solution is in Scala; learners have to build the same solution in Python.
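As a starting point for the Python port, the regular-expression step could look roughly like the sketch below. It assumes the logs follow the Apache Common Log Format; the pattern, the `parse_log_line` helper, and the sample lines are illustrative assumptions, not taken from the course solution.

```python
import re
from collections import Counter

# Assumed layout (Apache Common Log Format):
# host ident user [timestamp] "method path protocol" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # Apache writes "-" when no bytes were sent; treat that as 0.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


# Hypothetical sample lines, including one malformed entry.
sample_lines = [
    '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326',
    '192.168.1.5 - - [10/Oct/2000:13:56:01 -0700] "GET /missing.html HTTP/1.0" 404 -',
    'this line is not a valid log entry',
]

# Keep only the lines that parsed, then count responses per status code.
parsed = [r for r in map(parse_log_line, sample_lines) if r is not None]
status_counts = Counter(r["status"] for r in parsed)
```

In Spark, the same function could be applied to an RDD of lines with `rdd.map(parse_log_line).filter(lambda r: r is not None)` before any aggregation.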
Movie Recommendation (23 Sept) https://github.com/cloudxlab/bigdata/blob/master/spark/examples/mllib/movie-recommendations.py This is the code to generate movie recommendations. After going through it, learners are supposed to generate movie recommendations for themselves. To do so, they first need to find a unique ID to assign to themselves, then fill in their own movie ratings, and finally train a model using the ratings in the dataset together with their own ratings.
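The first two steps (choosing an unused ID and adding your own ratings) can be sketched in plain Python before any model is trained. This sketch assumes a MovieLens-style ratings file with `userId::movieId::rating::timestamp` lines; the actual file used by the script may differ, and the helper names and sample data here are hypothetical.

```python
def next_free_user_id(rating_lines, sep="::"):
    """Pick an ID one larger than any user ID already present in the ratings."""
    existing = {int(line.split(sep)[0]) for line in rating_lines if line.strip()}
    return max(existing, default=0) + 1


def format_my_ratings(user_id, my_ratings, sep="::", timestamp=0):
    """Turn a {movie_id: rating} dict into lines in the assumed dataset format."""
    return [
        sep.join([str(user_id), str(movie_id), str(rating), str(timestamp)])
        for movie_id, rating in sorted(my_ratings.items())
    ]


# Hypothetical excerpt of existing ratings (assumed format, not real course data).
existing = [
    "1::1193::5::978300760",
    "1::661::3::978302109",
    "2::1357::5::978298709",
]

my_id = next_free_user_id(existing)
my_lines = format_my_ratings(my_id, {1193: 4.0, 260: 5.0})

# The combined ratings are what the model would then be trained on.
combined = existing + my_lines
```

Taking the maximum existing ID plus one is only one way to guarantee uniqueness; any ID that does not appear in the dataset would serve equally well.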
DataFrame Notebook (Monday - 16 Sept) Analytics using Spark DataFrame - https://github.com/cloudxlab/bigdata/blob/master/spark/python/pyspark_dataframe_problems.ipynb These are multiple problems involving different kinds of data formats and simple analytics. Most of the work is already done. The candidate has to follow the basics and answer the remaining questions.