I am building Spark 2.4.3 so that it works with the latest Hadoop 3.2.0.
The source was downloaded from https://www.apache.org/dyn/closer.lua/spark/spark-2.4.3/spark-2.4.3.tgz
The build command is: ./build/mvn -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.0 -DskipTests clean package
The build output is:
[INFO] Spark Project Parent POM ........................... SUCCESS [ 1.761 s]
[INFO] Spark Project Tags ................................. SUCCESS [ 1.221 s]
[INFO] Spark Project Sketch ............................... SUCCESS [ 0.551 s]
[INFO] Spark Project Local DB ............................. SUCCESS [ 0.608 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 1.558 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 0.631 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 0.444 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 2.501 s]
[INFO] Spark Project Core ................................. SUCCESS [ 13.536 s]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 0.549 s]
[INFO] Spark Project GraphX ............................... SUCCESS [ 1.614 s]
[INFO] Spark Project Streaming ............................ SUCCESS [ 3.332 s]
[INFO] Spark Project Catalyst ............................. SUCCESS [ 14.271 s]
[INFO] Spark Project SQL .................................. SUCCESS [ 13.008 s]
[INFO] Spark Project ML Library ........................... SUCCESS [ 7.923 s]
[INFO] Spark Project Tools ................................ SUCCESS [ 0.187 s]
[INFO] Spark Project Hive ................................. SUCCESS [ 6.664 s]
[INFO] Spark Project REPL ................................. SUCCESS [ 1.285 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 4.824 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 3.020 s]
[INFO] Spark Project Assembly ............................. SUCCESS [ 1.558 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 1.411 s]
[INFO] Kafka 0.10+ Source for Structured Streaming ........ SUCCESS [ 1.573 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 1.702 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [ 5.969 s]
[INFO] Spark Avro ......................................... SUCCESS [ 0.702 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:32 min
[INFO] Finished at: 2019-07-31T18:56:24+08:00
[INFO] ------------------------------------------------------------------------
[WARNING] The requested profile "hadoop-3.2" could not be activated because it does not exist.
I expected the build to produce a single archive under the build directory, something like
spark-2.4.3-bin-hadoop3.2.tgz
just like the official binary release https://www.apache.org/dyn/closer.lua/spark/spark-2.4.3/spark-2.4.3-bin-hadoop2.7.tgz. How do I get rid of the warning
The requested profile "hadoop-3.2" could not be activated because it does not exist
and what does it mean? Best answer
Note: if you don't fully know what you are doing, what you are attempting may leave you with a very unstable environment.
That said, the stable Spark 2.4.x release has no hadoop-3.2
profile; the closest it ships is hadoop-3.1
. That is exactly what the Maven warning means: you asked it to activate a profile that is not defined anywhere in the POM, so Maven simply ignored it.
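One way to confirm which Hadoop profiles the 2.4.3 source tree actually defines is to ask Maven itself, and then rebuild with the profile that does exist. This is a sketch: the `help:all-profiles` goal is standard maven-help-plugin, but forcing `-Dhadoop.version=3.2.0` on top of the hadoop-3.1 profile is an untested assumption and is not guaranteed to yield a compatible build.

```shell
# List every profile defined in the Spark POMs; on the 2.4.x
# source you should see hadoop-3.1 but no hadoop-3.2.
./build/mvn help:all-profiles | grep -i hadoop

# Rebuild with the profile that actually exists. Overriding the
# Hadoop version to 3.2.0 here is an assumption, not something
# the 2.4.x branch officially supports.
./build/mvn -Pyarn -Phadoop-3.1 -Dhadoop.version=3.2.0 -DskipTests clean package
```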
To get what you are after, you would need to pull code from master.
If your only purpose is to make spark 2.4.3
compatible with hadoop 3.2
, you can look at the hadoop-3.2 profile in master along with its related changes, and cherry-pick them into your own workspace.
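If you do go the cherry-pick route, the rough workflow looks like the sketch below. The commit SHA is a placeholder you would have to locate in master's history yourself. Note also that a plain `mvn package` never produces a `-bin-*.tgz`; the `dev/make-distribution.sh` script shipped in the Spark source tree is what builds that kind of archive, which addresses the expectation in the question.

```shell
# Clone the Spark source and branch off the 2.4.3 release tag.
git clone https://github.com/apache/spark.git
cd spark
git checkout -b spark-2.4.3-hadoop3.2 v2.4.3

# Cherry-pick the master commits that introduce the hadoop-3.2
# profile. <commit-sha> is a placeholder; find the relevant
# commits in master's history yourself.
git cherry-pick <commit-sha>

# Build a distributable archive. This script, not `mvn package`,
# is what produces a file like spark-2.4.3-bin-hadoop3.2.tgz.
./dev/make-distribution.sh --name hadoop3.2 --tgz \
    -Pyarn -Phadoop-3.2 -Dhadoop.version=3.2.0
```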
Regarding apache-spark - unable to build spark 2.4.3 against hadoop 3.2.0, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57292328/