Apache Spark Function2 not declared correctly

Problem description

(This is based on trying to map an Integer RDD to a TholdDropResult RDD. We need to initialize a single SparkDoDrop to generate all (10^8) TholdDropResults, hence the use of mapPartitionsWithIndex, which I believe is the only flavor of mapPartitions in Java that provides the type of function we need.)

Question: using org.apache.spark.api.java.function.Function2

I am not able to figure out how to work the "boolean" into a new Function2.

When I try this code, the new Function2 declaration appears to be giving me trouble (builder-style formatting added from the answer):

JavaRDD<TholdDropResult> dropResultsN = dataSetN.mapPartitionsWithIndex(
                                      new Function2<Integer, 
                                      Iterator<Integer>, 
                                      Iterator<TholdDropResult>>(){

        @Override
        public Iterator<TholdDropResult> call(Integer partitionID, Iterator<Integer> integerIterator) throws Exception {
            //
            SparkDoDrop standin = makeNewSparkDoDrop();
            standin.initializeLI();
            List<TholdDropResult> rddToReturn = new ArrayList<>();
            while (integerIterator.hasNext()){
                rddToReturn.add(standin.call(integerIterator.next()));
            }
            return rddToReturn.iterator();

        }});
    dropResultsN.persist(StorageLevel.MEMORY_ONLY());

Here's the full error when I run gradle build:

JavaRDD<TholdDropResult> dropResultsN = dataSetN.mapPartitionsWithIndex(new Function2<Integer, Iterator<Integer>, Iterator<TholdDropResult>>(){
required: Function2<Integer,Iterator<Integer>,Iterator<R>>,boolean
  found: <anonymous Function2<Integer,Iterator<Integer>,Iterator<TholdDropResult>>>
  reason: cannot infer type-variable(s) R
    (actual and formal argument lists differ in length)
  where R,T,This are type-variables:
    R extends Object declared in method <R>mapPartitionsWithIndex(Function2<Integer,Iterator<T>,Iterator<R>>,boolean)
    T extends Object declared in class AbstractJavaRDDLike
    This extends JavaRDDLike<T,This> declared in class AbstractJavaRDDLike

When I try to place the Boolean arg in there like so: new Function2<Integer, Iterator<Integer>, Iterator<TholdDropResult>, Boolean>(), I get an error:

error: wrong number of type arguments; required 3
            JavaRDD<TholdDropResult> dropResultsN = dataSetN.mapPartitionsWithIndex(new Function2<Integer, Iterator<Integer>, Iterator<TholdDropResult>, Boolean>(){

Finally, if I use boolean instead of Boolean, I get another error:

error: unexpected type
            JavaRDD<TholdDropResult> dropResultsN = dataSetN.mapPartitionsWithIndex(new Function2<Integer, Iterator<Integer>, Iterator<TholdDropResult>, boolean>(){
                                                                                                                                                         ^
  required: reference
  found:    boolean

error: wrong number of type arguments; required 3
            JavaRDD<TholdDropResult> dropResultsN = dataSetN.mapPartitionsWithIndex(new Function2<Integer, Iterator<Integer>, Iterator<TholdDropResult>, boolean>(){

Recommended answer

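You need to close the Function2 type argument list with its final > before the boolean. The boolean is a second argument to mapPartitionsWithIndex, not a fourth type argument of Function2:

JavaRDD<TholdDropResult> dropResultsN =
   dataSetN.mapPartitionsWithIndex(new Function2<Integer,
                                                 Iterator<Integer>,
                                                 Iterator<TholdDropResult>>() {
       // call(...) body as in the question
   }, false);  // preservesPartitioning; false is an assumed value here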

The signature of mapPartitionsWithIndex is as follows:

<R> JavaRDD<R> mapPartitionsWithIndex(Function2<java.lang.Integer,
                                                java.util.Iterator<T>,
                                                java.util.Iterator<R>> f,
                                                boolean preservesPartitioning)

The Function2 takes an Integer and an Iterator<T> and returns an Iterator<R>. The expected boolean is a separate parameter of mapPartitionsWithIndex (preservesPartitioning); it is not defined inside the Function2.
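
To make the distinction concrete, here is a minimal sketch that runs without Spark. The Function2 interface and the mapPartitionsWithIndex method below are hypothetical stand-ins written only to mirror the shape of the signature above; they are not Spark's classes. The class name Function2Sketch, the demo method, and the list-of-lists "partitions" are likewise assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class Function2Sketch {

    // Hypothetical stand-in for Spark's Function2: exactly three type
    // parameters (two argument types, one return type). No slot for a boolean.
    interface Function2<T1, T2, R> {
        R call(T1 v1, T2 v2) throws Exception;
    }

    // Hypothetical stand-in with the same parameter shape as
    // mapPartitionsWithIndex: the boolean is a second METHOD argument.
    // It is accepted but unused here; this sketch only shows where it goes.
    static <T, R> List<R> mapPartitionsWithIndex(
            List<List<T>> partitions,
            Function2<Integer, Iterator<T>, Iterator<R>> f,
            boolean preservesPartitioning) {
        List<R> out = new ArrayList<>();
        try {
            for (int i = 0; i < partitions.size(); i++) {
                Iterator<R> results = f.call(i, partitions.get(i).iterator());
                while (results.hasNext()) {
                    out.add(results.next());
                }
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return out;
    }

    static List<String> demo() {
        // Two fake "partitions" standing in for an RDD's partitions.
        List<List<Integer>> partitions =
                Arrays.asList(Arrays.asList(1, 2), Arrays.asList(3));

        return mapPartitionsWithIndex(
                partitions,
                new Function2<Integer, Iterator<Integer>, Iterator<String>>() {
                    @Override
                    public Iterator<String> call(Integer partitionId,
                                                 Iterator<Integer> values) {
                        // Per-partition setup would go here, once per partition.
                        List<String> results = new ArrayList<>();
                        while (values.hasNext()) {
                            results.add(partitionId + ":" + values.next());
                        }
                        return results.iterator();
                    }
                },      // the Function2 ends here, with three type arguments
                false); // the boolean follows as a separate argument
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With real Spark, the same shape applies: the anonymous Function2 keeps its three type arguments, and the boolean is passed after it in the call to mapPartitionsWithIndex.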
