This article describes how to handle a java.lang.NoSuchMethodError that appears when deploying a program with spark-submit. It should be a useful reference for anyone hitting the same problem.

Problem Description

I am writing a program to upload data to an s3a:// link. The program is compiled through mvn install. Running the program locally (as in java -jar jarfile.jar) returned no error. However, when I used spark-submit (as in spark-submit jarfile.jar), it returned an error like this:

java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V

The error log traced back to this portion of my source code:

sparkDataset
        .write()
        .format("parquet")
        .mode(SaveMode.Overwrite)
        .save("some s3a:// link");

where sparkDataset is an instance of org.apache.spark.sql.Dataset.
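
For context, here is a minimal, self-contained sketch of that write path; the session setup, input path, and bucket name are hypothetical placeholders, not taken from the original program:

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SaveMode;
import org.apache.spark.sql.SparkSession;

public class S3aWriteSketch {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("s3a-write-sketch") // hypothetical app name
                .getOrCreate();

        // Hypothetical input; the question does not show how
        // sparkDataset was built, only that it is a Dataset.
        Dataset<Row> sparkDataset = spark.read().json("input.json");

        sparkDataset
                .write()
                .format("parquet")
                .mode(SaveMode.Overwrite)
                .save("s3a://some-bucket/some-key/"); // hypothetical bucket

        spark.stop();
    }
}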

Trying the suggestions from How to access s3a:// files from Apache Spark? was unsuccessful and returned a different error.

The problem from java.lang.NoSuchMethodError: org.apache.hadoop.conf.Configuration.reloadExistingConfigurations()V is also unlikely to apply, because I can run locally, where compatibility is not a problem.

In addition, these are the versions of the related libraries that I used:

  • aws-java-sdk-bundle:1.11.199
  • hadoop-aws:3.0.0
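
For reference, a sketch of how these two artifacts would typically be declared in the pom.xml; the groupIds shown are the standard Maven coordinates for these libraries (hadoop-aws 3.0.0 itself depends on aws-java-sdk-bundle 1.11.199, which matches the versions above):

<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>3.0.0</version>
</dependency>
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk-bundle</artifactId>
    <version>1.11.199</version>
</dependency>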

I expect files to be written through the s3a:// links. I think the dependencies are not the issue because I can run locally; I only face this problem when using spark-submit to run this program. Does anyone have any ideas on how to resolve this?

In addition, I have checked that the Spark version used by spark-submit is said to be built for Hadoop 2.7 and above, while I am strictly using Hadoop 3.0.0. Could this be a clue to why this error happened in my program?
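
One way to confirm which Hadoop version the runtime actually sees is a tiny probe class, a sketch assuming nothing about the original program; running it through both java -jar and spark-submit shows whether the two launchers resolve different Hadoop versions:

import org.apache.hadoop.util.VersionInfo;

public class PrintHadoopVersion {
    public static void main(String[] args) {
        // Prints the version of hadoop-common found first on the classpath.
        System.out.println("Hadoop version: " + VersionInfo.getVersion());
    }
}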

Recommended Answer

The answer from Run spark-submit with my own build of hadoop seemed to guide me toward finding my own solution.

Based on my understanding, for some unknown reason*, the spark-submit provided by the 'spark-2.4.0-bin-hadoop2.7.tgz' distribution will exclude any Hadoop packages that were compiled together into your application.

The reason the NoSuchMethodError was raised is that the method reloadExistingConfigurations does not exist until Hadoop version 2.8.x, and writing a Parquet file apparently invokes this particular method somewhere along the way.
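
A quick way to check this explanation, as a sketch, is to probe the Configuration class on the classpath with reflection; since the method only exists from Hadoop 2.8.x onward, it should be absent on a Hadoop 2.7 classpath:

import org.apache.hadoop.conf.Configuration;

public class ReloadMethodProbe {
    public static void main(String[] args) {
        try {
            // reloadExistingConfigurations() is the method named in the error.
            Configuration.class.getMethod("reloadExistingConfigurations");
            System.out.println("Method present: Hadoop 2.8.x or later on the classpath");
        } catch (NoSuchMethodException e) {
            System.out.println("Method absent: pre-2.8.x Hadoop on the classpath");
        }
    }
}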

My solution was to use the separate 'spark-2.4.0-without-hadoop.tgz' distribution and connect it to Hadoop 3.0.0, so that the correct version of Hadoop is used even though spark-submit excludes the packages in your application during execution.
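
For the 'without-hadoop' build, Spark's documentation on the 'Hadoop free' distribution has it pick up an external Hadoop through SPARK_DIST_CLASSPATH in conf/spark-env.sh; a sketch, with the Hadoop installation path as a placeholder:

# conf/spark-env.sh
# Point the 'Hadoop free' Spark build at an external Hadoop 3.0.0 install.
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop-3.0.0/bin/hadoop classpath)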

In addition, since the packages would be excluded by spark-submit anyway, I no longer create a fat jar during compilation through Maven. Instead, I use the --packages flag during execution to specify the dependencies required to run my application.
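
As a sketch, the submission then looks something like this; the main class name is a placeholder, and --packages takes Maven coordinates and resolves transitive dependencies (such as the AWS SDK bundle) automatically:

spark-submit \
  --class com.example.MyUploader \
  --packages org.apache.hadoop:hadoop-aws:3.0.0 \
  jarfile.jar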

This concludes this article on the java.lang.NoSuchMethodError that appears when deploying a program with spark-submit. We hope the recommended answer is helpful.
