This post covers how to handle a spark-submit failure when running the Spark Streaming wordcount Python code. It may be a useful reference for anyone hitting the same problem; follow along below.

Problem Description

I just copied the Spark Streaming wordcount Python code and used spark-submit to run it on a Spark cluster, but it shows the following errors:

py4j.protocol.Py4JJavaError: An error occurred while calling o23.loadClass.
: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)

I did build the jar spark-streaming-kafka-assembly_2.10-1.4.0-SNAPSHOT.jar, and I used the following command to submit: bin/spark-submit /data/spark-1.3.0-bin-hadoop2.4/wordcount.py --master spark://192.168.100.6:7077 --jars /data/spark-1.3.0-bin-hadoop2.4/kafka-assembly/target/spark-streaming-kafka-assembly_*.jar
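For context, the wordcount script being submitted is presumably similar to the Spark 1.3-era kafka_wordcount.py example, whose createStream call is backed on the JVM side by the KafkaUtilsPythonHelper class from the assembly jar; that is why its absence from the classpath produces the ClassNotFoundException above. A minimal sketch, assuming a hypothetical ZooKeeper quorum at localhost:2181 and a topic named "test":

```python
# Minimal Spark Streaming Kafka wordcount (Spark 1.3-era API).
# The ZooKeeper address, consumer group, and topic name are
# hypothetical placeholders, not taken from the question.
from pyspark import SparkContext
from pyspark.streaming import StreamingContext
from pyspark.streaming.kafka import KafkaUtils

sc = SparkContext(appName="PythonStreamingKafkaWordCount")
ssc = StreamingContext(sc, 1)  # 1-second batch interval

# createStream is implemented via KafkaUtilsPythonHelper on the JVM side,
# which is why the kafka-assembly jar must be on spark-submit's classpath.
kvs = KafkaUtils.createStream(ssc, "localhost:2181", "wordcount-group", {"test": 1})
lines = kvs.map(lambda kv: kv[1])
counts = (lines.flatMap(lambda line: line.split(" "))
               .map(lambda word: (word, 1))
               .reduceByKey(lambda a, b: a + b))
counts.pprint()

ssc.start()
ssc.awaitTermination()
```

Running this sketch requires a live Kafka broker and ZooKeeper, so it is illustrative only.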

Thanks in advance!

Recommended Answer

Actually, I just realized you have included --jars after the script. The jar files will not be included unless the jars are specified before the script name. So use spark-submit --jars spark-streaming-kafka-assembly_2.10-1.3.1.jar Script.py instead of spark-submit Script.py --jars spark-streaming-kafka-assembly_2.10-1.3.1.jar.
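Applied to the command from the question, the corrected invocation would look like this (paths, master URL, and jar name are taken from the question; the key point is that spark-submit options must come before the application script, since everything after the script name is passed to the script as application arguments):

```shell
# Options (--master, --jars) must precede the application script;
# anything placed after wordcount.py is forwarded to the script itself,
# which is why the --jars flag was silently ignored before.
bin/spark-submit \
  --master spark://192.168.100.6:7077 \
  --jars /data/spark-1.3.0-bin-hadoop2.4/kafka-assembly/target/spark-streaming-kafka-assembly_*.jar \
  /data/spark-1.3.0-bin-hadoop2.4/wordcount.py
```

This command needs a running Spark cluster at the given master URL, so it is shown as a configuration sketch rather than something runnable in isolation.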

That concludes this article on the spark-submit failure with the Spark Streaming wordcount Python code. Hopefully the recommended answer helps; thanks for your support!
