Scala Project - Churn Email Inbox with Scala

Scala Project - Churn Emails - Find Average Spam Confidence

In the previous exercise, we saw a couple of examples of startsWith. Let's do one more hands-on exercise with it. Here we will select the lines that start with a given prefix, split each of them, and calculate an average from the second element of each split line.

You can split a line using the split function as follows:

val splittedContent = stringToSplit.split(" ")
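
As a minimal sketch of how split applies to this exercise, the snippet below splits a single line at the colon and converts the second element to a Float. The sample line and its value are hypothetical, chosen only to match the shape of the X-DSPAM-Confidence: lines; it is written for a notebook cell or the Scala REPL.

```scala
// Hypothetical sample line, shaped like the ones in mbox-short.txt
val line = "X-DSPAM-Confidence: 0.8475"

// split returns an Array[String]; index 1 holds the text after the colon
val parts = line.split(":")

// trim removes the leading space before the number is parsed
val confidence = parts(1).trim.toFloat

println(parts.length) // 2
println(confidence)   // 0.8475
```
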
  • Define a function average_spam_confidence that calculates the average spam confidence and returns it
  • Open the file mbox-short.txt, which is located at /cxldata/datasets/project/mbox-short.txt
  • Loop through the lines of the file
  • Select only the lines that start with X-DSPAM-Confidence:
  • Split each of those lines at : and take the second element, which is the spam confidence as a string
  • Convert that string value to a Float
  • Find the average of the spam confidence values across the entire file and return it
  • Finally, print the average spam confidence by calling the function from a new cell

Note: If your logic is correct, the average spam confidence should be 0.7507186.
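
The steps above can be sketched roughly as follows. This is one possible approach, not the reference answer. The helper averageOf is a hypothetical name introduced here so that the averaging logic can be tried on in-memory lines without the file, and returning 0.0f when no line matches is an assumption of this sketch rather than part of the exercise.

```scala
import scala.io.Source

// Hypothetical helper: averages the values found on X-DSPAM-Confidence: lines.
def averageOf(lines: Iterator[String]): Float = {
  val prefix = "X-DSPAM-Confidence:"
  val values = lines
    .filter(_.startsWith(prefix))       // keep only the matching lines
    .map(_.split(":")(1).trim.toFloat)  // second element, converted to Float
    .toList
  // Assumption: return 0.0f when no line matches, to avoid dividing by zero
  if (values.isEmpty) 0.0f else values.sum / values.length
}

// Function name and file path are the ones given in the exercise.
def average_spam_confidence(): Float = {
  val source = Source.fromFile("/cxldata/datasets/project/mbox-short.txt")
  try averageOf(source.getLines()) finally source.close()
}
```

Calling println(average_spam_confidence()) from a new cell prints the result. Because the values are accumulated as Float, the last digits can depend on how the sum is carried out, so a small difference from the expected score may point to the accumulation type rather than the selection logic.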
