Problem Description
I just copied the Spark Streaming wordcount Python code and used spark-submit to run it on a Spark cluster, but it shows the following errors:
py4j.protocol.Py4JJavaError: An error occurred while calling o23.loadClass.
: java.lang.ClassNotFoundException: org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
I did build the jar spark-streaming-kafka-assembly_2.10-1.4.0-SNAPSHOT.jar, and I used the following command to submit:

bin/spark-submit /data/spark-1.3.0-bin-hadoop2.4/wordcount.py --master spark://192.168.100.6:7077 --jars /data/spark-1.3.0-bin-hadoop2.4/kafka-assembly/target/spark-streaming-kafka-assembly_*.jar
Thanks in advance!
Recommended Answer
Actually, I just realized you have placed --jars after the script. The jar files will not be included unless they are specified before the script name, because everything after the script path is passed to the script as application arguments rather than interpreted as spark-submit options. So use spark-submit --jars spark-streaming-kafka-assembly_2.10-1.3.1.jar Script.py instead of spark-submit Script.py --jars spark-streaming-kafka-assembly_2.10-1.3.1.jar.
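Applied to the command in the question, the corrected invocation would look something like the sketch below. It reuses the paths and master URL from the question; adjust them for your own setup. This is a cluster-submission command, so it only runs where Spark and the assembly jar are installed:

```shell
# All spark-submit options (--master, --jars, ...) must come BEFORE the
# application script; anything after wordcount.py is passed to the script itself.
bin/spark-submit \
  --master spark://192.168.100.6:7077 \
  --jars /data/spark-1.3.0-bin-hadoop2.4/kafka-assembly/target/spark-streaming-kafka-assembly_*.jar \
  /data/spark-1.3.0-bin-hadoop2.4/wordcount.py
```

With the jar listed before the script, the driver and executors can load org.apache.spark.streaming.kafka.KafkaUtilsPythonHelper, which resolves the ClassNotFoundException above.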