pyspark: "Could not locate executable null\bin\winutils.exe in the Hadoop binaries" error despite adding winutils to HADOOP_HOME

Problem description

I set the path to winutils.exe in the HADOOP_HOME environment variable. I also added the other paths pyspark needs (Python, Spark, Java) to the PATH variable. When I run pyspark from the command prompt, I still get this error:

ERROR Shell: Failed to locate the winutils binary in the hadoop binary path
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
        at org.apache.hadoop.util.Shell.getQualifiedBinPath(Shell.java:379)
        at org.apache.hadoop.util.Shell.getWinUtilsPath(Shell.java:394)
        at org.apache.hadoop.util.Shell.<clinit>(Shell.java:387)
        at org.apache.hadoop.hive.conf.HiveConf$ConfVars.findHadoopBinary(HiveConf.java:2327)
        at org.apache.hadoop.hive.conf.HiveConf$ConfVars.<clinit>(HiveConf.java:365)
        at org.apache.hadoop.hive.conf.HiveConf.<clinit>(HiveConf.java:105)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Unknown Source)
        at py4j.reflection.CurrentThreadClassLoadingStrategy.classForName(CurrentThreadClassLoadingStrategy.java:40)
        at py4j.reflection.ReflectionUtil.classForName(ReflectionUtil.java:51)
        at py4j.reflection.TypeUtil.forName(TypeUtil.java:243)
        at py4j.commands.ReflectionCommand.getUnknownMember(ReflectionCommand.java:175)
        at py4j.commands.ReflectionCommand.execute(ReflectionCommand.java:87)
        at py4j.GatewayConnection.run(GatewayConnection.java:214)
        at java.lang.Thread.run(Unknown Source)
.
.
.
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"

How can I get rid of this error?

Recommended answer

The HADOOP_HOME variable should not point to winutils.exe directly, but to the folder that contains bin\winutils.exe.

For example, if you have C:\hadoop\bin\winutils.exe, then set HADOOP_HOME to C:\hadoop.
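
A quick way to check the setting before launching pyspark is a small Python script like the sketch below. The function name check_hadoop_home and its messages are made up for illustration and are not part of Spark or Hadoop; the script only tests the layout described above (HADOOP_HOME is a folder with bin\winutils.exe inside it). It is an assumption that this covers your case; other causes of the error are not checked.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def check_hadoop_home(env: Optional[Mapping[str, str]] = None) -> str:
    """Diagnose the HADOOP_HOME setting (hypothetical helper, not a Spark API).

    Returns "ok" when HADOOP_HOME is a folder containing bin/winutils.exe,
    otherwise a short description of what looks wrong.
    """
    env = os.environ if env is None else env
    value = env.get("HADOOP_HOME")
    if not value:
        return "HADOOP_HOME is not set"

    home = Path(value)
    if home.is_file():
        # The mistake from the question: pointing at winutils.exe itself.
        return f"HADOOP_HOME points to a file ({home}); use the folder that contains bin"
    if not home.is_dir():
        return f"HADOOP_HOME folder does not exist: {home}"
    if not (home / "bin" / "winutils.exe").is_file():
        return f"winutils.exe not found under {home / 'bin'}"
    return "ok"


if __name__ == "__main__":
    # Run this from the same command prompt you use to start pyspark,
    # so it sees the same environment variables.
    print(check_hadoop_home())
```

If the script reports that HADOOP_HOME points to a file, change the variable to the parent folder of bin (C:\hadoop in the example above) and open a new command prompt so the change is picked up.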

