In the last slide we saw that, since averaging is not associative, we could not use `reduce` directly to calculate the average of a set of numbers. So how do we calculate the average using `reduce` in that case? Let's see.
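To see why a pairwise average cannot simply be passed to `reduce`, here is a minimal sketch in plain Python (no Spark needed). The helper name `pair_avg` is hypothetical and used only for illustration, and `functools.reduce` stands in for the RDD `reduce`.

```python
from functools import reduce

def pair_avg(x, y):
    # Hypothetical helper: average of two numbers
    return (x + y) / 2

nums = [1.0, 2, 3, 4, 5, 6, 7]

# Grouping changes the result, so pair_avg is not associative
left = pair_avg(pair_avg(1.0, 2), 3)    # (1.5 + 3) / 2
right = pair_avg(1.0, pair_avg(2, 3))   # (1.0 + 2.5) / 2

# Folding pair_avg over the list does not give the true mean
folded = reduce(pair_avg, nums)
true_mean = sum(nums) / len(nums)

print(left, right)
print(folded, true_mean)
```

The two groupings disagree, and the folded value differs from the true mean, which is why the sum and the count have to be tracked separately.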

First, let's define a set of elements for which we will calculate the average, and store them in an RDD named `rdd`. The second argument to `parallelize`, `3`, is the number of partitions.

`<<your code goes here>> = sc.parallelize([1.0, 2, 3, 4, 5, 6, 7], 3)`

Now let's calculate the average by using `reduce` to compute the sum of the elements and `count` to get the number of elements. We then divide the sum by the count and store the result in a variable called `avg`:

`avg = rdd.<<your code goes here>>(lambda x, y: x + y) / rdd.count()`
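The same two-step computation can be sketched locally in plain Python. Here `functools.reduce` stands in for the RDD `reduce` and `len` stands in for `count`; both are stand-ins for illustration, not the Spark API.

```python
from functools import reduce

nums = [1.0, 2, 3, 4, 5, 6, 7]

# First pass over the data: sum the elements pairwise, as reduce does
total = reduce(lambda x, y: x + y, nums)

# Second pass over the data: count the elements
n = len(nums)

avg = total / n
print(avg)
```

Note that the data is walked twice, once for the sum and once for the count.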

The average given here is `4.0`, which is correct. However, this approach is inefficient because it computes the RDD twice: once during `reduce` and once during `count`. So, we will move to the next approach.

With the next approach, we first transform every value into a composite value, so that each element of the RDD represents a value along with how many elements have been summed up to reach it. We transform each element into a tuple of the value and `1`, where `1` represents how many numbers have been added to reach the value (initially just the element itself). We use `map` for this, as shown below:

`rdd_count = rdd.<<your code goes here>>(lambda x: (x, 1))`

Next, we will define a function `add_tuples` that takes two such tuples, adds their sums and adds their counts, and returns the resulting tuple:

`def <<your code goes here>>(x, y): return (x[0] + y[0], x[1] + y[1])`
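Unlike a pairwise average, this component-wise addition gives the same result however the elements are grouped or ordered, which is what `reduce` needs. A quick check in plain Python, assuming the function is named `add_tuples` as in the text:

```python
def add_tuples(x, y):
    # Add the running sums and the running counts component-wise
    return (x[0] + y[0], x[1] + y[1])

a, b, c = (1.0, 1), (2, 1), (3, 1)

# Same result regardless of grouping (associative)
grouped_left = add_tuples(add_tuples(a, b), c)
grouped_right = add_tuples(a, add_tuples(b, c))

# Same result regardless of order (commutative)
swapped = add_tuples(b, a)

print(grouped_left, grouped_right)
print(add_tuples(a, b), swapped)
```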

Now, we will use `reduce` with this function to get the sum of the values and the number of values in a single tuple. We will store these in the variables `sum` and `count`:

`(sum, count) = rdd_count.<<your code goes here>>(add_tuples)`

Finally, we will calculate the average using these values:

`avg = sum / <<your code goes here>>`

Because this approach traverses the RDD only once instead of twice, it can take significantly less time than the previous one, especially when the RDD is large.
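Putting the steps together, here is a minimal local sketch in plain Python. It imitates data split into 3 partitions, reduces each partition separately, and then merges the partial results. The particular split shown is an assumption for illustration, not necessarily how Spark distributes these elements, and `functools.reduce` stands in for the RDD `reduce`.

```python
from functools import reduce

def add_tuples(x, y):
    # Add the running sums and the running counts component-wise
    return (x[0] + y[0], x[1] + y[1])

# Assumed split into 3 partitions (illustrative only)
partitions = [[1.0, 2], [3, 4], [5, 6, 7]]

# Map each value to (value, 1), then reduce within each partition
partials = [
    reduce(add_tuples, [(v, 1) for v in part])
    for part in partitions
]

# Merge the per-partition results into one (sum, count) tuple
total, n = reduce(add_tuples, partials)

avg = total / n
print(partials)
print(avg)
```

Each partition yields its own `(sum, count)` pair, and merging those pairs gives the overall sum and count in one pass over the data.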
