Writing Spark Applications


Tutorial - Understand Code

Now let's try to understand the code on GitHub. Note that this repository contains code for multiple projects and examples. Let's browse to spark -> projects -> apache-log-parsing_sbt

README.md

By default, the description shown on the project page comes from README.md, which is written in a format called Markdown. You can click README.md and edit it if you want.

Clicking Edit displays the file like this: [Image: readme-edit]

Now, let's go back.

build.sbt

The file build.sbt contains the instructions for compiling this code with sbt, a build tool commonly used for Scala projects (the name is often expanded as "Scala build tool"). It plays the same role for Scala projects that Maven and Ant play for Java projects and that make plays for C / C++ projects. To learn more about sbt, please visit the sbt documentation.

If you click on this file, it looks like this: [Image: build_sbt]

It defines the name and version of the app. It also specifies the Scala version used to build the project. Finally, libraryDependencies is the list of libraries our project depends upon.
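
As a rough illustration of the settings described above, a minimal build.sbt might look like the sketch below. The project name and all version numbers here are assumptions chosen for the example; they are not copied from the repository, so check the actual file for the real values.

```scala
// Hypothetical build.sbt sketch; names and versions are illustrative assumptions.

// Name and version of the application
name := "apache-log-parsing"
version := "0.1.0"

// Scala version sbt uses to compile the project
scalaVersion := "2.12.18"

// Libraries the project depends on; sbt downloads them during the build
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.8"
)
```

Each line of the form key := value sets one build setting, while ++= appends a sequence of entries to a setting that is already a list.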

Here we are adding two libraries to the list: Spark and testing base. These libraries are downloaded automatically during the build. Each definition has the form:

groupID %% artifactID % revision % configuration
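
To make the form above more concrete, here is a small sketch of a single dependency line. The version numbers and the "provided" configuration are assumptions picked for illustration, and the sketch assumes scalaVersion is set to a 2.12.x release elsewhere in build.sbt.

```scala
// Hypothetical dependency entry; versions and configuration are assumptions.
//
//   groupID            %% artifactID   % revision % configuration
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8" % "provided"

// With %%, sbt appends the project's Scala binary version to the artifact
// name. Assuming Scala 2.12.x, the line above resolves the same artifact as
// this single-% form with the suffix written out by hand:
//
//   libraryDependencies += "org.apache.spark" % "spark-core_2.12" % "2.4.8" % "provided"
```

The configuration part is optional. When it is omitted, the library is available on the regular compile classpath; a value such as "test" restricts it to the test classpath.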

See more at: Library Dependencies Documentation