In this project, we will parse Apache logs to extract meaningful insights from them.
We have already done part of this in the Writing Spark Applications topic.
Extend the same project: write unit test cases and code for the next set of problems, and answer the questions.
Data set -
The dataset is located at /data/spark/project/NASA_access_log_Aug95.gz in HDFS.
It is the access log of the NASA Kennedy Space Center WWW server in Florida.
The logs are an ASCII file with one line per request, with the following columns:
Note that from 01/Aug/1995:14:52:01 until 03/Aug/1995:04:36:13 there are no accesses recorded, as the Web server was shut down, due to Hurricane Erin.
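As a starting point for the parsing and unit-testing work, here is a minimal sketch of a per-line parser in plain Python. It assumes each request line follows the Common Log Format (host, two unused fields, a bracketed timestamp, the quoted request, a status code, and a byte count). The names `LOG_PATTERN`, `LogRecord`, and `parse_line` are hypothetical and not part of the original project, and the sample line in the usage note is made up for illustration.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# Assumed layout (Common Log Format):
#   host ident user [timestamp] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(\S+) \S+ \S+ \[([^\]]+)\] "(.*)" (\d{3}) (\d+|-)\s*$'
)


class LogRecord(NamedTuple):
    host: str
    timestamp: datetime
    request: str
    status: int
    bytes_sent: int


def parse_line(line: str) -> Optional[LogRecord]:
    """Parse one log line; return None if it does not match the assumed format."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    host, raw_ts, request, status, size = match.groups()
    try:
        # Timestamps are assumed to look like 01/Aug/1995:00:00:01 -0400
        timestamp = datetime.strptime(raw_ts, "%d/%b/%Y:%H:%M:%S %z")
    except ValueError:
        return None
    # A "-" byte count means no body was sent; treat it as 0 here.
    bytes_sent = 0 if size == "-" else int(size)
    return LogRecord(host, timestamp, request, int(status), bytes_sent)
```

For example, `parse_line('host.example.com - - [01/Aug/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1839')` returns a record with the host, timestamp, request, status, and byte count as typed fields. Returning `None` for malformed lines, rather than raising, keeps the function easy to unit test and lets a Spark job drop bad records with a simple filter after mapping each line through it.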
Based on the above data, please answer the following questions
Note - If you are stuck, please take inspiration from this solution by one of our students.
No hints are available for this assessment.
No answer is available for this assessment.