The following steps are needed in order to set up a Windows dev machine:
cd c:\Users\MY_WINDOWS_USERNAME\Documents\GitHub\bigdata\spark\projects\apache-log-parsing_sbt
sbt test
Run the commands above on the command prompt. The output should show that some test cases have failed, which means there is an error in the code that we need to fix. The following steps demonstrate the software development process of fixing the error using Eclipse.
Install Eclipse.
Create the Eclipse project using the sbt eclipse plugin: https://github.com/typesafehub/sbteclipse
With a text editor such as Sublime Text, edit C:/Users/myuser/.sbt/1.0/plugins/plugins.sbt
and add the following to it (or whatever is currently mentioned on the sbteclipse GitHub homepage):
addSbtPlugin("com.typesafe.sbteclipse" % "sbteclipse-plugin" % "5.2.2")
Re-open the command prompt and go to the project folder:
cd c:\Users\MY_WINDOWS_USERNAME\Documents\GitHub\bigdata\spark\projects\apache-log-parsing_sbt
sbt eclipse
In the file log-parser.scala (located in src/main/scala/com/cloudxlab/logparsing directory), add the following function inside the Utils class:
def isClassA(ip:String):Boolean = { ip.split('.')(0).toInt < 127 }
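To see what this check does before wiring it into the project, it can be tried on its own. The following is a minimal standalone sketch, not the project's code: the Utils class here is a stand-in that contains only the function above, and isClassASafe is a hypothetical helper added purely for illustration.

```scala
import scala.util.Try

// Stand-in for the project's Utils class; it holds only the function from the step above.
class Utils {
  // True when the first octet of the address is below 127.
  def isClassA(ip: String): Boolean = {
    ip.split('.')(0).toInt < 127
  }

  // Hypothetical defensive variant (not part of the project):
  // returns false instead of throwing when the first field is not a number.
  def isClassASafe(ip: String): Boolean =
    Try(ip.trim.split('.')(0).toInt).toOption.exists(_ < 127)
}

object IsClassADemo {
  def main(args: Array[String]): Unit = {
    val utils = new Utils
    println(utils.isClassA("121.242.40.10"))   // true
    println(utils.isClassA("191.242.40.10"))   // false
    println(utils.isClassA("126.0.0.1"))       // true
    println(utils.isClassA("127.0.0.1"))       // false
    println(utils.isClassASafe("not-an-ip"))   // false
  }
}
```

Note that isClassA as written throws a NumberFormatException when the first field is not numeric, so it relies on its input already being a well-formed IP address.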
In log-parser-test.scala (located in test/scala/com/cloudxlab/logparsing directory), add a unit test case:
"CLASSA" should "Return true if class is A" in {
    val utils = new Utils
    assert(utils.isClassA("121.242.40.10 "))
    assert(!utils.isClassA("212.242.40.10 "))
    assert(!utils.isClassA("239.242.40.10 "))
    assert(!utils.isClassA("191.242.40.10 "))
}
In log-parser.scala (located in src/main/scala/com/cloudxlab/logparsing directory), add a filter after extracting the IP. Change the line
var cleanips = ipaccesslogs.map(extractIP(_))
to
var cleanips = ipaccesslogs.map(extractIP(_)).filter(isClassA)
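The effect of chaining filter after map can be tried without Spark. The sketch below makes several assumptions: a plain Scala List stands in for ipaccesslogs (which is presumably a Spark RDD in the project), extractIP is a hypothetical stand-in that takes the first space-separated field of a line, and the log lines are made-up samples.

```scala
object FilterDemo {
  // Hypothetical stand-in for the project's extractIP:
  // takes the first space-separated field of a log line.
  def extractIP(line: String): String = line.split(" ")(0)

  // Same check as the isClassA function added to the Utils class.
  def isClassA(ip: String): Boolean = ip.split('.')(0).toInt < 127

  // Made-up sample lines, used only to exercise the pipeline.
  val ipaccesslogs: List[String] = List(
    "121.242.40.10 - - \"GET / HTTP/1.1\" 200",
    "212.242.40.10 - - \"GET /a HTTP/1.1\" 200",
    "191.242.40.10 - - \"GET /b HTTP/1.1\" 404",
    "4.26.51.74 - - \"GET /c HTTP/1.1\" 200"
  )

  // map extracts one IP per line, filter then keeps only the class A addresses.
  def cleanips: List[String] = ipaccesslogs.map(extractIP(_)).filter(isClassA)

  def main(args: Array[String]): Unit = {
    println(cleanips) // List(121.242.40.10, 4.26.51.74)
  }
}
```

Because the filter runs after the extraction, isClassA only ever sees the extracted field, never the whole log line.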
Run the job with spark-submit:
spark-submit apache-log-parsing_2.10-0.0.1.jar com.cloudxlab.logparsing.EntryPoint 10 /data/spark/project/access/access.log.2.gz
You should see something like the following on the screen after lots of log messages:
===== TOP 10 IP Addresses =====
(107.170.18.142,142072)
(106.216.188.163,4584)
(69.65.19.184,1259)
(106.216.154.50,1187)
(78.46.22.138,1093)
(106.216.189.5,638)
(59.97.17.204,478)
(72.195.144.124,436)
(4.26.51.74,393)
(122.172.105.180,387)