pyspark not working on Windows (py4j error)

Problem description

I installed Zeppelin on Windows using this tutorial and this one. I also installed Java 8 to avoid problems.

I'm now able to start the Zeppelin server, and I'm trying to run this code:

%pyspark
a=5*4
print("value = %i" % (a))
sc.version

I'm getting an error related to py4j. I had other problems with this library before (same as here), and to avoid them I replaced the py4j library in both Zeppelin and Spark on my computer with the latest version, py4j 0.10.7.

This is the error I get:

Traceback (most recent call last):
  File "C:\Users\SHIRM~1.ARG\AppData\Local\Temp\zeppelin_pyspark-1240802621138907911.py", line 309, in <module>
    sc = _zsc_ = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
  File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 118, in __init__
    conf, jsc, profiler_cls)
  File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 189, in _do_init
    self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port, auth_token)
  File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__
  File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\protocol.py", line 332, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
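The traceback shows py4j being loaded from Zeppelin's interpreter\spark\pyspark directory while context.py comes from the Spark install, so it may help to confirm which py4j archive each side actually contains. The following is only a sketch under stated assumptions: the helper name find_py4j_versions is made up for illustration, the SPARK_HOME and ZEPPELIN_HOME environment variables are assumed to point at the two installs, and the subdirectories are taken from the traceback (Zeppelin) and the usual Spark layout (python\lib).

```python
import glob
import os
import re


def find_py4j_versions(directory):
    """Return the versions of all py4j-<version>-src.zip archives in a directory."""
    versions = []
    for path in sorted(glob.glob(os.path.join(directory, "py4j-*-src.zip"))):
        match = re.match(r"py4j-(.+)-src\.zip$", os.path.basename(path))
        if match:
            versions.append(match.group(1))
    return versions


if __name__ == "__main__":
    # Assumption: both variables are set; adjust the paths to your own machine.
    candidates = {
        "Spark": ("SPARK_HOME", os.path.join("python", "lib")),
        "Zeppelin": ("ZEPPELIN_HOME", os.path.join("interpreter", "spark", "pyspark")),
    }
    for label, (env_var, subdir) in candidates.items():
        home = os.environ.get(env_var)
        if home is None:
            print(label + ": " + env_var + " is not set")
            continue
        print(label + ":", find_py4j_versions(os.path.join(home, subdir)))
```

If the two lists differ, or a directory holds more than one archive, that mismatch is worth ruling out before looking further.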

I googled it, but couldn't find anyone else this had happened to.

Does anyone have an idea how I can solve this?

Thanks

Recommended answer

I suspect you have installed Java 9 or 10. Uninstall whichever of those versions you have and install a fresh copy of Java 8 from here: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

Then set JAVA_HOME in hadoop_env.cmd (open it with any text editor).

Note: Java 8 and 7 are the stable versions to use; uninstall any other existing Java versions. Make sure JAVA_HOME points to a JDK (not a JRE).
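To see which Java the machine actually picks up before and after reinstalling, a short check along these lines may help. It is a minimal sketch, not part of the original answer: the function names parse_java_major and report_java_setup are made up for illustration, and it assumes the `java` on PATH is the one Zeppelin and Spark will use, which may not hold if JAVA_HOME points elsewhere.

```python
import os
import re
import subprocess


def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Java 8 and earlier report themselves as "1.8", "1.7", and so on.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first


def report_java_setup():
    """Print JAVA_HOME and the major version of the `java` found on PATH."""
    print("JAVA_HOME =", os.environ.get("JAVA_HOME", "<not set>"))
    try:
        # `java -version` writes its banner to stderr rather than stdout.
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except FileNotFoundError:
        print("No `java` executable found on PATH")
        return
    major = parse_java_major(result.stderr + result.stdout)
    print("Java major version:", major)
    if major is not None and major > 8:
        print("The answer above suggests switching to JDK 8 for this setup")


if __name__ == "__main__":
    report_java_setup()
```

If JAVA_HOME is unset or the reported major version is 9 or 10, that matches the situation the answer describes.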

08-05 08:39