I am unable to connect to the Hortonworks Sandbox from my Spark code written in Eclipse.

The code is below; please advise me on what to do. Any help is appreciated.

    import java.util.Arrays;

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    import scala.Tuple2;

    public class SparkTest {

        public static void main(String[] args) {

            SparkConf conf = new SparkConf()
                    .setAppName("JD Word Counter").setMaster("local");

            JavaSparkContext sc = new JavaSparkContext(conf);

            // Input file on HDFS.
            JavaRDD<String> inputFile = sc.textFile("hdfs://localhost:8020/user/root/textfile/test.txt");

            System.out.println("Hello start");
            System.out.println(inputFile.collect());

            // Split each line into individual words.
            JavaRDD<String> wordsFromFile = inputFile.flatMap(content ->
                    Arrays.asList(content.split(" ")).iterator());

            System.out.println("hello end");

            // Count the occurrences of each word and write the result back to HDFS.
            JavaPairRDD<String, Integer> countData = wordsFromFile
                    .mapToPair(t -> new Tuple2<>(t, 1))
                    .reduceByKey((x, y) -> x + y);
            countData.saveAsTextFile("hdfs://localhost:8020/user/root/fileTest/");

            System.out.println(" This java program is complete");
        }
    }

Error:
> I/O error constructing remote block reader.
> org.apache.hadoop.net.ConnectTimeoutException: 60000 millis timeout
> while waiting for  channel to be ready for connect. ch :
> java.nio.channels.SocketChannel[connection-pending
> remote=/172.18.0.2:50010] at org.apache.hadoop.net.NetUtils.c

Best answer

Change `localhost` to the IP address of the HDP sandbox, or put the hdfs-site.xml file on your classpath, and make sure all the required ports are open and reachable from the external machine. The stack trace shows the client timing out against 172.18.0.2:50010, the sandbox-internal address of a DataNode (50010 is the default DataNode transfer port), which is not routable from your host.
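As a minimal sketch of the first suggestion, assuming the sandbox is reachable from your machine under the default HDP hostname sandbox-hdp.hortonworks.com (an assumption; substitute your sandbox's actual IP or hostname), the client can be pointed at the sandbox and told to reach DataNodes by hostname rather than by the internal IP seen in the stack trace:

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;

    public class SandboxConnectTest {

        public static void main(String[] args) {
            SparkConf conf = new SparkConf()
                    .setAppName("JD Word Counter").setMaster("local");
            JavaSparkContext sc = new JavaSparkContext(conf);

            // Make the HDFS client connect to DataNodes by hostname instead of
            // the sandbox-internal IP (172.18.0.2) that the NameNode reports.
            sc.hadoopConfiguration().set("dfs.client.use.datanode.hostname", "true");

            // Use the sandbox host instead of localhost. The hostname below is
            // the HDP sandbox default and is assumed here; replace it with your
            // sandbox's IP or hostname.
            JavaRDD<String> inputFile =
                    sc.textFile("hdfs://sandbox-hdp.hortonworks.com:8020/user/root/textfile/test.txt");

            System.out.println(inputFile.collect());
            sc.close();
        }
    }

For the hostname to resolve, it usually has to be mapped to the sandbox's IP in the client machine's hosts file. Alternatively, copying the sandbox's hdfs-site.xml and core-site.xml into src/main/resources puts them on the classpath, which is the second option the answer describes.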

Regarding "eclipse - Unable to connect Spark with Hortonworks Sandbox from Eclipse", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62108472/
