Apache Spark Basics

Apache Spark - Lazy Evaluation & Lineage Graph




INSTRUCTIONS
  • Actions: Lazy Evaluation - Optimization - Scala

    Individual map transformations

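    // Two separate transformations, each applied with its own map call
    def Map1(x: String): String = x.trim()
    def Map2(x: String): String = x.toUpperCase()
    var lines = sc.textFile(...)
    val lines1 = lines.map(Map1)
    var lines2 = lines1.map(Map2)
    // Nothing has run so far; collect() is the action that triggers execution
    lines2.collect()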
    

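    Because transformations are lazy, Spark computes nothing until collect() is called. It can then pipeline both maps into one pass over the data, which is equivalent to a single combined transformation: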

    // Same result as Map1 followed by Map2, written as one function
    def Map3(x: String): String = x.trim().toUpperCase()
    lines = sc.textFile(...)
    lines2 = lines.map(Map3)
    lines2.collect()
    
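  • Lineage graph

    Each RDD records the transformations that derive it from its parent RDDs. This chain is the lineage graph, and Spark evaluates it only when an action such as count() runs.

    val lines = sc.textFile("myfile")
    val fewlines = lines.filter(...)
    val uppercaselines = fewlines.map(...)
    uppercaselines.count()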
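
The lazy-evaluation point above can be tried without a cluster. The following is a minimal sketch that uses a plain Scala collection view as a stand-in for an RDD; it is an analogy, not Spark's implementation. Map1 and Map2 are taken from the snippet above, while LazyFusionSketch, run, and map1Calls are hypothetical names introduced only for this illustration.

```scala
object LazyFusionSketch {
  def Map1(x: String): String = x.trim()
  def Map2(x: String): String = x.toUpperCase()

  // Counts how often Map1 actually runs (illustration only).
  var map1Calls = 0

  // Returns the number of Map1 calls made before the "action", and the result.
  def run(input: List[String]): (Int, List[String]) = {
    map1Calls = 0
    // .view makes the collection lazy: chaining maps only records them.
    val pipeline = input.view
      .map { x => map1Calls += 1; Map1(x) }
      .map(Map2)
    val callsBeforeAction = map1Calls // still 0, nothing has run yet
    // toList plays the role of collect(): each element passes through
    // both functions in a single traversal of the input.
    val result = pipeline.toList
    (callsBeforeAction, result)
  }

  def main(args: Array[String]): Unit = {
    val (before, result) = run(List("  spark ", " lazy  "))
    println(s"Map1 calls before action: $before") // prints 0
    println(result)                               // prints List(SPARK, LAZY)
  }
}
```

Because no work happens while the maps are being chained, the two steps can be applied back to back per element, which is why writing Map1 and Map2 separately costs about the same as writing Map3.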
    

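
The lineage idea can also be sketched with a toy model. ToyDataset below is a hypothetical class written for this illustration and is not part of Spark's API: each instance stores the step that produced it, a link to its parent, and a deferred computation. The in-memory input and the filter and map bodies are made-up placeholders. In Spark itself, calling toDebugString on an RDD prints its recorded lineage.

```scala
object LineageSketch {
  // Toy stand-in for an RDD (hypothetical, not Spark's API).
  final class ToyDataset[A](
      val step: String,
      val parent: Option[ToyDataset[_]],
      compute: () => Iterator[A]) {

    // Transformations only wrap the parent's deferred computation.
    def filter(p: A => Boolean): ToyDataset[A] =
      new ToyDataset[A]("filter", Some(this), () => compute().filter(p))

    def map[B](f: A => B): ToyDataset[B] =
      new ToyDataset[B]("map", Some(this), () => compute().map(f))

    // Action: only here is the deferred chain actually evaluated.
    def count(): Long = compute().size.toLong

    // Walks the parent links from this dataset back to the source.
    def lineage: List[String] =
      step :: parent.map(_.lineage).getOrElse(Nil)
  }

  // Placeholder data standing in for the contents of "myfile".
  val lines = new ToyDataset[String]("textFile", None, () => Iterator(" a ", "bb", " c"))
  val fewlines = lines.filter(_.trim.length == 1)
  val uppercaselines = fewlines.map(_.trim.toUpperCase)

  def main(args: Array[String]): Unit = {
    println(uppercaselines.lineage.mkString(" <- ")) // prints map <- filter <- textFile
    println(uppercaselines.count())                  // prints 2
  }
}
```

Since every dataset knows how it was derived, a lost result can be recomputed by replaying the chain from the source, which is the role the lineage graph plays for RDDs.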