Analytics Vidhya - Live Class


Assessment and Projects Plan

  1. Exercises in the self-paced course (23 Sept)
  2. Projects in the self-paced course

    • Twitter Dataset (Monday - 16 Sept) https://cloudxlab.com/assessment/playlist-intro/283/analytics-vidhya-hive-project?course_id=65&playlist_id=283 The main objective is to process Twitter messages using Hive. The detailed objective is given on the linked page.

    • Apache Log Unstructured Data (23 Sept) https://cloudxlab.com/assessment/displayslide/3795/example-objective?course_id=65&playlist_id=275 This problem involves processing unstructured data using Apache Spark. It requires an understanding of regular expressions in Python and of RDD and DataFrame operations. The available solution is in Scala; learners have to build the same in Python.

    • Movie Recommendation (23 Sept) https://github.com/cloudxlab/bigdata/blob/master/spark/examples/mllib/movie-recommendations.py This is the code to generate movie recommendations. Using it as a starting point, learners are supposed to generate movie recommendations for themselves. To do so, they first need to pick a unique user ID for themselves, then fill in their own movie ratings, and then train a model on the ratings in the dataset combined with their own ratings.
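For the Apache log project above, the regular-expression step can be tried out in plain Python before moving to Spark. The following is a minimal sketch, assuming the logs follow the Common Log Format; the pattern, the `parse_log_line` helper, and the sample lines are illustrative assumptions, not taken from the course solution.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [timestamp] "request" status size
LOG_PATTERN = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\d+|-)$'
)


def parse_log_line(line):
    """Return a dict of fields, or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    if match is None:
        return None
    host, _ident, _user, timestamp, request, status, size = match.groups()
    # The request is normally "METHOD URL PROTOCOL" but may be malformed.
    parts = request.split()
    well_formed = len(parts) == 3
    return {
        "host": host,
        "timestamp": timestamp,
        "method": parts[0] if well_formed else None,
        "url": parts[1] if well_formed else None,
        "status": int(status),
        "size": 0 if size == "-" else int(size),  # "-" means no body sent
    }


# Made-up sample lines, for illustration only.
SAMPLE_LINES = [
    '192.0.2.10 - - [01/Aug/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1839',
    '192.0.2.11 - - [01/Aug/1995:00:00:07 -0400] "GET /missing.gif HTTP/1.0" 404 -',
    'this line is not a valid log entry',
]

parsed = [parse_log_line(line) for line in SAMPLE_LINES]
valid = [row for row in parsed if row is not None]
status_counts = Counter(row["status"] for row in valid)
```

In PySpark, a function like this could be applied with `rdd.map(parse_log_line)` followed by a `filter` that drops the `None` results, so malformed lines do not break later aggregations.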

  3. DataFrame Notebook (Monday - 16 Sept) Analytics using Spark DataFrame - https://github.com/cloudxlab/bigdata/blob/master/spark/python/pyspark_dataframe_problems.ipynb These are multiple problems involving different kinds of data formats and simple analytics. Most of the work is already done. The candidate has to follow the basics and answer the remaining questions.

  4. Loans dataset - [Classification Random Forest] - Spark ML - MLlib
    • Try other approaches
    • Hyperparameter tuning in MLlib https://github.com/cloudxlab/bigdata/blob/master/spark/examples/mllib/mllib_random_forrest.ipynb This is an example project for binary classification. Candidates have to follow it, run it, and improve it.
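The idea behind hyperparameter tuning can be sketched without Spark as an exhaustive grid search: try every combination of candidate settings and keep the best-scoring one. In the sketch below, the parameter names, the candidate values, and especially the `evaluate` function are assumptions made up for illustration; in the actual project, `evaluate` would train a random forest and return a validation metric.

```python
from itertools import product

# Hypothetical search space; names mirror common random forest settings.
PARAM_GRID = {
    "num_trees": [10, 50, 100],
    "max_depth": [3, 5, 10],
}


def evaluate(params):
    """Stand-in for 'train a model and return a validation score'.

    The formula is invented so the sketch runs without Spark; it is not
    a real model and says nothing about the loans dataset.
    """
    return 0.70 + 0.001 * params["num_trees"] - 0.002 * abs(params["max_depth"] - 5)


def grid_search(param_grid, score_fn):
    """Score every combination in the grid and return the best one."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(PARAM_GRID, evaluate)
```

Spark's DataFrame-based ML API offers the same pattern through `ParamGridBuilder` and `CrossValidator` in `pyspark.ml.tuning`, which additionally average the score over cross-validation folds rather than relying on a single evaluation.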
