This article explains how to deal with java.lang.IllegalArgumentException: Can not create a Path from an empty string when running spark-submit. It should serve as a useful reference for anyone hitting the same problem; follow along below.
Problem description
I get this error when running spark-submit: java.lang.IllegalArgumentException: Can not create a Path from an empty string. I am using Spark 2.4.7, Hadoop 3.3.0, IntelliJ IDEA, and JDK 8. At first I got a class-not-found error, which I fixed; now I get this one. Is it caused by the dataset or by something else? Dataset link: https://www.kaggle.com/datasnaek/youtube-new?select=INvideos.csv
Error:
C:\spark\spark-2.4.7-bin-hadoop2.7\bin>spark-submit --class org.example.TopViewedCategories --master local C:\Users\Piyush\IdeaProjects\BDA\target\BDA-1.0-SNAPSHOT.jar
Started Processing
21/05/04 06:56:04 INFO SparkContext: Running Spark version 2.4.7
21/05/04 06:56:04 INFO SparkContext: Submitted application: YouTubeDM
21/05/04 06:56:04 INFO SecurityManager: Changing view acls to: Piyush
21/05/04 06:56:04 INFO SecurityManager: Changing modify acls to: Piyush
21/05/04 06:56:04 INFO SecurityManager: Changing view acls groups to:
21/05/04 06:56:04 INFO SecurityManager: Changing modify acls groups to:
21/05/04 06:56:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(Piyush); groups with view permissions: Set(); users with modify permissions: Set(Piyush); groups with modify permissions: Set()
21/05/04 06:56:04 INFO Utils: Successfully started service 'sparkDriver' on port 63708.
21/05/04 06:56:04 INFO SparkEnv: Registering MapOutputTracker
21/05/04 06:56:04 INFO SparkEnv: Registering BlockManagerMaster
21/05/04 06:56:04 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/05/04 06:56:04 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/05/04 06:56:04 INFO DiskBlockManager: Created local directory at C:\Users\Piyush\AppData\Local\Temp\blockmgr-9f91b0fe-b655-422e-b0bf-38172b70dff0
21/05/04 06:56:05 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
21/05/04 06:56:05 INFO SparkEnv: Registering OutputCommitCoordinator
21/05/04 06:56:05 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/05/04 06:56:05 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://DESKTOP-IBFFKH9:4040
21/05/04 06:56:05 INFO SparkContext: Added JAR file:/C:/Users/Piyush/IdeaProjects/BDA/target/BDA-1.0-SNAPSHOT.jar at spark://DESKTOP-IBFFKH9:63708/jars/BDA-1.0-SNAPSHOT.jar with timestamp 1620091565160
21/05/04 06:56:05 INFO Executor: Starting executor ID driver on host localhost
21/05/04 06:56:05 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 63723.
21/05/04 06:56:05 INFO NettyBlockTransferService: Server created on DESKTOP-IBFFKH9:63723
21/05/04 06:56:05 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/05/04 06:56:05 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, DESKTOP-IBFFKH9, 63723, None)
21/05/04 06:56:05 INFO BlockManagerMasterEndpoint: Registering block manager DESKTOP-IBFFKH9:63723 with 366.3 MB RAM, BlockManagerId(driver, DESKTOP-IBFFKH9, 63723, None)
21/05/04 06:56:05 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, DESKTOP-IBFFKH9, 63723, None)
21/05/04 06:56:05 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, DESKTOP-IBFFKH9, 63723, None)
Exception in thread "main" java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
at org.apache.hadoop.fs.Path.<init>(Path.java:183)
at org.apache.hadoop.fs.Path.getParent(Path.java:356)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:517)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:504)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:531)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:504)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:531)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:504)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:531)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:504)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirsWithOptionalPermission(RawLocalFileSystem.java:531)
at org.apache.hadoop.fs.RawLocalFileSystem.mkdirs(RawLocalFileSystem.java:504)
at org.apache.hadoop.fs.ChecksumFileSystem.mkdirs(ChecksumFileSystem.java:694)
at org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.setupJob(FileOutputCommitter.java:313)
at org.apache.hadoop.mapred.FileOutputCommitter.setupJob(FileOutputCommitter.java:131)
at org.apache.hadoop.mapred.OutputCommitter.setupJob(OutputCommitter.java:265)
at org.apache.spark.internal.io.HadoopMapReduceCommitProtocol.setupJob(HadoopMapReduceCommitProtocol.scala:162)
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:74)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1096)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:1067)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply$mcV$sp(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:957)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply$mcV$sp(RDD.scala:1544)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1523)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1523)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:385)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1523)
at org.apache.spark.api.java.JavaRDDLike$class.saveAsTextFile(JavaRDDLike.scala:550)
at org.apache.spark.api.java.AbstractJavaRDDLike.saveAsTextFile(JavaRDDLike.scala:45)
at org.example.TopViewedCategories.main(TopViewedCategories.java:46)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
21/05/04 06:56:06 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Piyush\AppData\Local\Temp\spark-2bac840b-8170-477d-a9ec-dd5f1f9283c2
java.io.IOException: Failed to delete: C:\Users\Piyush\AppData\Local\Temp\spark-2bac840b-8170-477d-a9ec-dd5f1f9283c2\userFiles-897873ea-324a-432c-85a1-786e5797243a\BDA-1.0-SNAPSHOT.jar
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1062)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
21/05/04 06:56:06 ERROR ShutdownHookManager: Exception while deleting Spark temp dir: C:\Users\Piyush\AppData\Local\Temp\spark-2bac840b-8170-477d-a9ec-dd5f1f9283c2\userFiles-897873ea-324a-432c-85a1-786e5797243a
java.io.IOException: Failed to delete: C:\Users\Piyush\AppData\Local\Temp\spark-2bac840b-8170-477d-a9ec-dd5f1f9283c2\userFiles-897873ea-324a-432c-85a1-786e5797243a\BDA-1.0-SNAPSHOT.jar
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:144)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursivelyUsingJavaIO(JavaUtils.java:128)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:118)
at org.apache.spark.network.util.JavaUtils.deleteRecursively(JavaUtils.java:91)
at org.apache.spark.util.Utils$.deleteRecursively(Utils.scala:1062)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:65)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1$$anonfun$apply$mcV$sp$3.apply(ShutdownHookManager.scala:62)
at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:186)
at org.apache.spark.util.ShutdownHookManager$$anonfun$1.apply$mcV$sp(ShutdownHookManager.scala:62)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:214)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1$$anonfun$apply$mcV$sp$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1945)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply$mcV$sp(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anonfun$runAll$1.apply(ShutdownHookManager.scala:188)
at scala.util.Try$.apply(Try.scala:192)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:188)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:178)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:54)
Here is the code:
package org.example;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;
import java.util.List;
public class TopViewedCategories {
    public static void main(String[] args) throws Exception {
        long timeElapsed = System.currentTimeMillis();
        System.out.println("Started Processing");

        SparkConf conf = new SparkConf()
                .setMaster("local")
                .setAppName("YouTubeDM");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // Valid log levels include: ALL, DEBUG, ERROR, FATAL, INFO, OFF, TRACE, WARN
        sc.setLogLevel("ERROR");

        JavaRDD<String> mRDD = sc.textFile("C:/Users/Piyush/Desktop/bda/INvideos"); // directory where the files are
        JavaPairRDD<Double, String> sortedRDD = mRDD
                // .filter(line -> line.split(" ").length > 6)
                .mapToPair(line -> {
                    String[] lineArr = line.split(" ");
                    String category = lineArr[5];
                    Double views = Double.parseDouble(lineArr[1]);
                    Tuple2<Double, Integer> viewsTuple = new Tuple2<>(views, 1);
                    return new Tuple2<>(category, viewsTuple);
                })
                .reduceByKey((x, y) -> new Tuple2<>(x._1 + y._1, x._2 + y._2))
                .mapToPair(x -> new Tuple2<>(x._1, (x._2._1 / x._2._2)))
                .mapToPair(Tuple2::swap)
                .sortByKey(false);
        // .take(10);

        long count = sortedRDD.count();
        List<Tuple2<Double, String>> topTenTuples = sortedRDD.take(10);
        JavaPairRDD<Double, String> topTenRdd = sc.parallelizePairs(topTenTuples);

        String output_dir = "C:output/spark/TopViewedCategories";
        // remove output directory if already there
        FileSystem fs = FileSystem.get(sc.hadoopConfiguration());
        fs.delete(new Path(output_dir), true); // delete dir, true for recursive
        topTenRdd.saveAsTextFile(output_dir);

        timeElapsed = System.currentTimeMillis() - timeElapsed;
        System.out.println("Done.Time taken (in seconds): " + timeElapsed / 1000f);
        System.out.println("Processed Records: " + count);
        sc.stop();
        sc.close();
    }
}
Please help me solve this problem.
Recommended answer
It seems that the output_dir variable contains an invalid path:
String output_dir = "C:output/spark/TopViewedCategories";
and because of it, fs.delete(new Path(output_dir), true) and the later saveAsTextFile(output_dir) end up raising
java.lang.IllegalArgumentException: Can not create a Path from an empty string
at org.apache.hadoop.fs.Path.checkPathArg(Path.java:126)
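
A minimal fix sketch (not part of the original answer, and reusing the variable names from the question; the target directory itself is only an example): give the drive letter a proper separator, or spell the location out as a file:/// URI, so that Hadoop can resolve every parent of the output path. The exists() check is optional; it simply avoids calling delete() on a directory that is not there yet.

// Hypothetical sketch: adjust the output directory to whatever you actually want.
String output_dir = "C:/output/spark/TopViewedCategories";
// or, fully qualified as a URI:
// String output_dir = "file:///C:/output/spark/TopViewedCategories";

FileSystem fs = FileSystem.get(sc.hadoopConfiguration());
Path outPath = new Path(output_dir);
if (fs.exists(outPath)) {
    fs.delete(outPath, true); // remove output from a previous run
}
topTenRdd.saveAsTextFile(output_dir);

The stack trace supports this reading: FileOutputCommitter.setupJob walks up the output path with Path.getParent() while creating directories, and with a string like "C:output/spark/TopViewedCategories" that walk appears to end in an empty path component, which checkPathArg rejects.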
This concludes the article on the spark-submit java.lang.IllegalArgumentException: Can not create a Path from an empty string error. We hope the recommended answer helps, and thank you for your continued support!