Spark Streaming


24 Comments

Hi,

I am getting this error when I try to clone the bigdata repository. Could you please help me with this issue?


Hi,

There is no streaming-related code inside the folder; I checked after running the pull command.


Hi Suma,

Both of the repositories above contain the code for Spark Streaming. You can see the path of the code file by opening the repo in the browser itself and then navigating to the respective path.


Hi,

I am trying Kafka + Spark Streaming but am getting the error below.

This is my script:

import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StructType, StructField, IntegerType, StringType, FloatType, DateType, TimestampType}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.streaming.{ProcessingTime, Trigger}
import java.util.Properties

// Create a local SparkSession
val spark = SparkSession.builder
  .master("local[*]")
  .appName("spark_kafka_streaming")
  .getOrCreate()

// Read the Kafka topic as a streaming DataFrame
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "cxln2.c.thelab-240901.internal:6667")
  .option("subscribe", "ranveer_second_topic")
  .load()

df.printSchema()
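Note: printSchema() only prints the schema; the streaming query does not actually start until a sink is attached with writeStream and start() is called. A minimal continuation, using the console sink purely for illustration:

// Hypothetical continuation: cast the Kafka payload to strings and stream it to the console
val query = df.selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)")
  .writeStream
  .format("console")
  .start()

query.awaitTermination()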

My build.sbt file:

name := "producer_consumer"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.8" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.8" % Test

 

I have also tried with below command but got the same error

spark-submit --class "KafkaStreaming" target/scala-2.11/producer_consumer_2.11-1.0.jar

Any update here?

Why am I getting the above error?



Hi, 

Make sure the jar file contains the required class KafkaStreaming.
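If the class is present, a likely cause is the dependency scoping in your build.sbt: spark-sql-kafka-0-10 is scoped to Test, so the Kafka source is missing at runtime and load() fails with an error like "Failed to find data source: kafka". Since sbt package does not bundle dependencies into the jar, the simplest fix is to pull the connector at submit time; a sketch, assuming Scala 2.11 and Spark 2.4.8 as in your build.sbt:

spark-submit --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.4.8 --class "KafkaStreaming" target/scala-2.11/producer_consumer_2.11-1.0.jar

Alternatively, drop the Test scope and build a fat jar (e.g. with sbt-assembly):

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.8" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.8" % "provided"
libraryDependencies += "org.apache.spark" %% "spark-sql-kafka-0-10" % "2.4.8"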


Hi,

It has already found the class. The error is coming from the line below:

val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "cxln2.c.thelab-240901.internal:6667")
  .option("subscribe", "ranveer_second_topic")
  .load()

This is from the Scala file I used to create the jar file.


How do I install Kafka in Scala?


name := "producer_consumer"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.kafka" %% "kafka" % "2.3.1"


I am referring to that only. This is my build.sbt file:

name := "producer_consumer"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.kafka" %% "kafka" % "2.3.1"

I am getting the error below:

[ranveersachin1433053@cxln5 new_test]$ vi build.sbt
[ranveersachin1433053@cxln5 new_test]$ sbt console
[info] Updated file /home/ranveersachin1433053/new_test/project/build.properties: set sbt.version to 1.2.8
[info] Loading project definition from /home/ranveersachin1433053/new_test/project
[info] Updating ProjectRef(uri("file:/home/ranveersachin1433053/new_test/project/"), "new_test-build")...
Waiting for lock on /home/ranveersachin1433053/.ivy2/.sbt.ivy.lock to be available...

For a simple Kafka installation I have had to follow up a number of times. Why can't you preinstall it? I want the statement below to work in Scala:

import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}


Hi, 

As the error suggests, the file is locked by some other process. The ".sbt.ivy.lock" file is a lock file used to ensure that only one process at a time can modify the Ivy cache.

You can wait a few moments and then try again. If the problem persists, try terminating any other processes that may be using the file, or manually delete the lock file to release it. However, be careful when deleting lock files, as doing so while another process is still writing can corrupt the Ivy cache.
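For example (the lock path is the one shown in the log above):

ps aux | grep sbt                                    # look for another sbt/Ivy process holding the lock
rm /home/ranveersachin1433053/.ivy2/.sbt.ivy.lock    # release the lock manually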


It is not working; I have tried again after a one-day gap. I have pasted my build.sbt content below. Let me know whether you have successfully installed Kafka or not.

name := "producer_consumer"
version := "1.0"
scalaVersion := "2.11.8"
libraryDependencies += "org.apache.kafka" %% "kafka" % "2.3.1"


Why do we have to install Kafka? Why can't you preinstall Kafka like Spark?



Hi Sachin,

Kafka is pre-installed in our lab. You can check it at https://cloudxlab.com/assessment/displayslide/6682/kafka. But it is a standalone installation that is not associated with your Scala project.

That's why, while building a Scala project, you need to add Kafka's dependency.
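For example, once the dependency resolves, a minimal producer sketch (the broker address and topic name are taken from earlier in this thread; adjust them to your setup):

import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerConfig, ProducerRecord}

// Minimal producer configuration; broker address as used earlier in this thread
val props = new Properties()
props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "cxln2.c.thelab-240901.internal:6667")
props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer")
props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)

// Send one record to the topic, then release the producer's resources
producer.send(new ProducerRecord[String, String]("ranveer_second_topic", "key", "hello from Scala"))
producer.close()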

I deleted the lock files and cleaned up your project a little. After that, it was working fine.

If the lock issue still persists, feel free to contact us.



How is ZooKeeper helping to locate the IP? What exactly is ZooKeeper's role here?
