This article describes how to deal with "Spark 2.3 - Minikube - Kubernetes - Windows - Demo - SparkPi not found".

Problem Description

I am trying to follow this but I am encountering an error.

Specifically, when I run:

spark-submit.cmd --master k8s://https://192.168.1.40:8443 --deploy-mode cluster --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=1 --conf spark.kubernetes.container.image=spark:spark --conf spark.kubernetes.driver.pod.name=spark-pi-driver local:///opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar

I get:

2018-03-17 02:09:00 INFO  LoggingPodStatusWatcherImpl:54 - State changed, new state:
         pod name: spark-pi-driver
         namespace: default
         labels: spark-app-selector -> spark-798e78e46c5c4a11870354b4b89602c0, spark-role -> driver
         pod uid: c6de9eb7-297f-11e8-b458-00155d735103
         creation time: 2018-03-17T01:09:00Z
         service account name: default
         volumes: default-token-m4k7h
         node name: minikube
         start time: 2018-03-17T01:09:00Z
         container images: spark:spark
         phase: Failed
         status: [ContainerStatus(containerID=docker://5c3a1c81333b9ee42a4e41ef5c83003cc110b37b4e0b064b0edffbfcd3d823b8, image=spark:spark, imageID=docker://sha256:92e664ebc1612a34d3b0cc7522615522805581ae10b60ebf8c144854f4207c06, lastState=ContainerState(running=null, terminated=null, waiting=null, additionalProperties={}), name=spark-kubernetes-driver, ready=false, restartCount=0, state=ContainerState(running=null, terminated=ContainerStateTerminated(containerID=docker://5c3a1c81333b9ee42a4e41ef5c83003cc110b37b4e0b064b0edffbfcd3d823b8, exitCode=1, finishedAt=Time(time=2018-03-17T01:09:01Z, additionalProperties={}), message=null, reason=Error, signal=null, startedAt=Time(time=2018-03-17T01:09:01Z, additionalProperties={}), additionalProperties={}), waiting=null, additionalProperties={}), additionalProperties={})]

kubectl logs -f spark-pi-driver tells me:

C:\spark-2.3.0-bin-hadoop2.7>kubectl logs -f spark-pi-driver
++ id -u
+ myuid=0
++ id -g
+ mygid=0
++ getent passwd 0
+ uidentry=root:x:0:0:root:/root:/bin/ash
+ '[' -z root:x:0:0:root:/root:/bin/ash ']'
+ SPARK_K8S_CMD=driver
+ '[' -z driver ']'
+ shift 1
+ SPARK_CLASSPATH=':/opt/spark/jars/*'
+ env
+ grep SPARK_JAVA_OPT_
+ sed 's/[^=]*=\(.*\)/\1/g'
+ readarray -t SPARK_JAVA_OPTS
+ '[' -n '/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' ']'
+ SPARK_CLASSPATH=':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar'
+ '[' -n '' ']'
+ case "$SPARK_K8S_CMD" in
+ CMD=(${JAVA_HOME}/bin/java "${SPARK_JAVA_OPTS[@]}" -cp "$SPARK_CLASSPATH" -Xms$SPARK_DRIVER_MEMORY -Xmx$SPARK_DRIVER_MEMORY -Dspark.driver.bindAddress=$SPARK_DRIVER_BIND_ADDRESS $SPARK_DRIVER_CLASS $SPARK_DRIVER_ARGS)
+ exec /sbin/tini -s -- /usr/lib/jvm/java-1.8-openjdk/bin/java -Dspark.executor.instances=1 -Dspark.driver.port=7078 -Dspark.driver.blockManager.port=7079 -Dspark.submit.deployMode=cluster -Dspark.jars=/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar,/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar -Dspark.app.id=spark-798e78e46c5c4a11870354b4b89602c0 -Dspark.kubernetes.container.image=spark:spark -Dspark.master=k8s://https://192.168.1.40:8443 -Dspark.kubernetes.executor.podNamePrefix=spark-pi-fb36460b4e853cc78f4f7ec4d9ec8d0a -Dspark.app.name=spark-pi -Dspark.driver.host=spark-pi-fb36460b4e853cc78f4f7ec4d9ec8d0a-driver-svc.default.svc -Dspark.kubernetes.driver.pod.name=spark-pi-driver -cp ':/opt/spark/jars/*:/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar;/opt/spark/examples/jars/spark-examples_2.11-2.3.0.jar' -Xms1g -Xmx1g -Dspark.driver.bindAddress=172.17.0.4 org.apache.spark.examples.SparkPi
Error: Could not find or load main class org.apache.spark.examples.SparkPi

It cannot find the SparkPi class. Yet, when I explore the spark:spark container, the JAR is inside:

\opt\spark\examples\jars:
spark-examples_2.11-2.3.0.jar

So the image was built correctly...
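
A quick sanity check (just a sketch on my side, assuming the image is tagged spark:spark locally and that java is on its PATH) is to try loading the class directly with a colon-separated classpath, bypassing the image's entrypoint:

# load SparkPi with an explicit, colon-separated classpath (bypasses the entrypoint script)
docker run --rm --entrypoint java spark:spark -cp "/opt/spark/jars/*:/opt/spark/examples/jars/*" org.apache.spark.examples.SparkPi

If the class is reachable this way, the command should fail with a Spark error about a missing master URL rather than "Could not find or load main class", which would point away from the image contents.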

Any ideas?

Help!

EDIT

I have been doing some more testing. I set up an AKS cluster in Azure and launched the same Docker image, and got the same error. I was following these instructions, but used the same Docker image as locally, pushed through ACR.

Also, for the AKS case the .JAR was uploaded to Blob Storage and referenced by its URL. I still got the exact same error.

This somehow makes me think the error lies in the way I build the image itself, or in the way I build the .JAR, rather than in some configuration of the cluster itself.

But so far, no cigar.

Any ideas - or even a URL to a working Spark 2.3 image - would be welcome. I build the image on Windows; I will try to build it on Linux shortly, maybe that has been the problem all along...

Thx

Recommended Answer

I know the topic is 3 months old, but since I had a similar issue and didn't find any valid answer, I'll post mine; maybe it'll help others:

As pointed out here http://mail-archives.apache.org/mod_mbox/spark-user/201804.mbox/%3cCAAOnQ7v-oeWeW-VMtV5fuonjPau8vafzQPheypzjv+2M8aEp=Q@mail.gmail.com%3e, the problem may come from a different classpath separator. To test, I ended up modifying /kubernetes/dockerfiles/spark/Dockerfile from the official Spark-Hadoop package. I added these two lines directly before ENV SPARK_HOME /opt/spark and my job could start:

COPY examples/jars/spark-examples_2.11-2.3.0.jar /opt/spark/jars
COPY examples/jars/scopt_2.11-3.7.0.jar /opt/spark/jars

It's a workaround rather than a proper solution, but at least it lets me run the tests.
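
For what it's worth, the reason the two COPY lines help is visible in the driver log above: the entrypoint starts from SPARK_CLASSPATH=':/opt/spark/jars/*', so anything placed in /opt/spark/jars is picked up even though the extra entry containing the Windows-style ';' separator is useless on Linux. Another possible angle, which I only sketch here and have not tested, would be to normalize the separator inside the image's entrypoint script before the classpath reaches java:

# hypothetical tweak to kubernetes/dockerfiles/spark/entrypoint.sh (untested sketch):
# turn Windows-style ';' separators into ':' before SPARK_CLASSPATH is handed to java
SPARK_CLASSPATH=$(echo "$SPARK_CLASSPATH" | tr ';' ':')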

The spark-submit command looks like this:

./bin/spark-submit.cmd  --master k8s://localhost:6445  --deploy-mode cluster  --name spark-pi --class org.apache.spark.examples.SparkPi --conf spark.executor.instances=2  --conf spark.kubernetes.container.image=spark:latest --conf spark.app.name=spark-pi   local:///opt/spark/jars/spark-examples_2.11-2.3.0.jar

And I build the Docker image like this: docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .
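
Note that this docker build has to be run from the root of the unpacked Spark distribution so that the relative paths in the command (kubernetes/dockerfiles/spark/Dockerfile) and in the Dockerfile's COPY lines (examples/jars/...) resolve. For the Minikube case the image also has to be visible to Minikube's Docker daemon; one common way (a sketch, assuming a bash-style shell; minikube docker-env itself shows how to apply it in other shells) is to point the local Docker client at Minikube before building:

# build the image directly against Minikube's Docker daemon
eval $(minikube docker-env)
docker build -t spark:latest -f kubernetes/dockerfiles/spark/Dockerfile .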
