Problem description
I have worked with the RDD.flatMap function in Java. Now I am trying my hand at DataFrames.
The documentation says:
public <R> RDD<R> flatMap(scala.Function1<org.apache.spark.sql.Row,
scala.collection.TraversableOnce<R>> f, scala.reflect.ClassTag<R> evidence$4)
Specified by: flatMap in interface RDDApi
But when I tried this, Function1 forced me to override a long list of unimplemented methods. This is what I got:
RDD<Row> res = df.flatMap(new Function1<Row, TraversableOnce<Row>>() {
@Override
public <A> Function1<Row, A> andThen(
Function1<TraversableOnce<Row>, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcDD$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcDF$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcDI$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcDJ$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcFD$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcFF$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcFI$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcFJ$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcID$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcIF$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcII$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcIJ$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcJD$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcJF$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcJI$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcJJ$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcVD$sp(
Function1<BoxedUnit, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcVF$sp(
Function1<BoxedUnit, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcVI$sp(
Function1<BoxedUnit, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcVJ$sp(
Function1<BoxedUnit, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcZD$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcZF$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcZI$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<Object, A> andThen$mcZJ$sp(
Function1<Object, A> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public TraversableOnce<Row> apply(Row arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public double apply$mcDD$sp(double arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public double apply$mcDF$sp(float arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public double apply$mcDI$sp(int arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public double apply$mcDJ$sp(long arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public float apply$mcFD$sp(double arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public float apply$mcFF$sp(float arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public float apply$mcFI$sp(int arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public float apply$mcFJ$sp(long arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public int apply$mcID$sp(double arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public int apply$mcIF$sp(float arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public int apply$mcII$sp(int arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public int apply$mcIJ$sp(long arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public long apply$mcJD$sp(double arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public long apply$mcJF$sp(float arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public long apply$mcJI$sp(int arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public long apply$mcJJ$sp(long arg0) {
// TODO Auto-generated method stub
return 0;
}
@Override
public void apply$mcVD$sp(double arg0) {
// TODO Auto-generated method stub
}
@Override
public void apply$mcVF$sp(float arg0) {
// TODO Auto-generated method stub
}
@Override
public void apply$mcVI$sp(int arg0) {
// TODO Auto-generated method stub
}
@Override
public void apply$mcVJ$sp(long arg0) {
// TODO Auto-generated method stub
}
@Override
public boolean apply$mcZD$sp(double arg0) {
// TODO Auto-generated method stub
return false;
}
@Override
public boolean apply$mcZF$sp(float arg0) {
// TODO Auto-generated method stub
return false;
}
@Override
public boolean apply$mcZI$sp(int arg0) {
// TODO Auto-generated method stub
return false;
}
@Override
public boolean apply$mcZJ$sp(long arg0) {
// TODO Auto-generated method stub
return false;
}
@Override
public <A> Function1<A, TraversableOnce<Row>> compose(
Function1<A, Row> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcDD$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcDF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcDI$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcDJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcFD$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcFF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcFI$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcFJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcID$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcIF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcII$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcIJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcJD$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcJF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcJI$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcJJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, BoxedUnit> compose$mcVD$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, BoxedUnit> compose$mcVF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, BoxedUnit> compose$mcVI$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, BoxedUnit> compose$mcVJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcZD$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcZF$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcZI$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
@Override
public <A> Function1<A, Object> compose$mcZJ$sp(
Function1<A, Object> arg0) {
// TODO Auto-generated method stub
return null;
}
}, evidence$4);
This looks weird, but I went on to define evidence$4 as:
ClassTag<Row> evidence$4 = scala.reflect.ClassTag$.MODULE$.apply(Row.class);
My intention is just to play around with flatMap (on DataFrames, of course, not on RDDs). So I don't need to change the Row at all; I can return the input as is.
So I guess I only need to change the apply method:
@Override
public TraversableOnce<Row> apply(Row arg0) {
// TODO Auto-generated method stub
return null;
}
But then, how do I get a TraversableOnce<Row> from a Row?
Also, is the approach I am trying correct, or am I missing something?
I am using Apache Spark 1.3.1.
Recommended answer
You should do the following:
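import org.apache.spark.rdd.RDD;
import org.apache.spark.sql.Row;
import scala.collection.TraversableOnce;
import scala.collection.immutable.ListSet;
import scala.runtime.AbstractFunction1;
RDD<Row> res = df.flatMap(new AbstractFunction1<Row, TraversableOnce<Row>>() {
    public TraversableOnce<Row> apply(Row row) {
        return new ListSet<Row>().$plus(row); // ListSet is immutable: $plus() returns a new set containing the row
    }
}, evidence$4);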
This works much like map, just with more freedom. For example, to filter a row out, return an empty new ListSet<Row>(); to keep it, return a set containing the row, as in the code above. flatMap is very flexible.
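To make the "zero, one, or many outputs per input" idea concrete, here is a minimal sketch of the same three behaviours using plain Java 8 streams rather than Spark. It is only an analogy: the class name FlatMapSketch and its methods are hypothetical, strings stand in for Row objects, and the java.util.stream API used here is not the DataFrame API from the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class FlatMapSketch {
    // Identity: every input yields exactly one output, so flatMap behaves like map.
    public static List<String> keepAll(List<String> rows) {
        return rows.stream()
                .flatMap(r -> Stream.of(r))
                .collect(Collectors.toList());
    }

    // Filter: returning an empty stream drops the input.
    public static List<String> dropEmpty(List<String> rows) {
        return rows.stream()
                .flatMap(r -> r.isEmpty() ? Stream.<String>empty() : Stream.of(r))
                .collect(Collectors.toList());
    }

    // Expand: one input may yield several outputs.
    public static List<String> splitWords(List<String> rows) {
        return rows.stream()
                .flatMap(r -> Arrays.stream(r.split(" ")))
                .collect(Collectors.toList());
    }
}
```

In the Spark answer, the empty ListSet plays the role of the empty stream and the one-element ListSet plays the role of Stream.of(r).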
(It seems that converting Java collections to Scala collections is not trivial.)
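As for why AbstractFunction1 removes the long list of overrides from the question: with the Scala versions Spark 1.3.1 is built against, Function1 appears to Java as an interface whose methods (andThen, compose, and their specialized variants) are all abstract, while scala.runtime.AbstractFunction1 is an abstract class that already supplies them and leaves only apply. The sketch below imitates that arrangement in plain Java. The names Fn, AbstractFn, and AdapterSketch are hypothetical stand-ins for illustration, not Scala or Spark types.

```java
public class AdapterSketch {
    // Hypothetical stand-in for Function1 as Java sees it: every method is abstract.
    public interface Fn<T, R> {
        R apply(T t);
        <A> Fn<T, A> andThen(Fn<R, A> g);
        <A> Fn<A, R> compose(Fn<A, T> g);
    }

    // Hypothetical stand-in for AbstractFunction1: implements everything except apply.
    public static abstract class AbstractFn<T, R> implements Fn<T, R> {
        @Override
        public <A> Fn<T, A> andThen(final Fn<R, A> g) {
            final Fn<T, R> self = this;
            return new AbstractFn<T, A>() {
                @Override
                public A apply(T t) {
                    return g.apply(self.apply(t)); // run this function first, then g
                }
            };
        }

        @Override
        public <A> Fn<A, R> compose(final Fn<A, T> g) {
            final Fn<T, R> self = this;
            return new AbstractFn<A, R>() {
                @Override
                public R apply(A a) {
                    return self.apply(g.apply(a)); // run g first, then this function
                }
            };
        }
    }

    // With the adapter, a caller overrides apply and nothing else.
    public static Fn<String, Integer> length() {
        return new AbstractFn<String, Integer>() {
            @Override
            public Integer apply(String s) {
                return s.length();
            }
        };
    }
}
```

Implementing Fn directly would require all three methods, just as implementing Function1 directly required every method listed in the question.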