我正在尝试创建一个动态 map 缩减应用程序,该应用程序从外部属性文件获取尺寸。主要问题在于以下事实:变量即键将是复合的,并且可以是任意数字,例如,一对3个键,一对4个键等。
我的 map 绘制者:
public void map(AvroKey<flumeLogs> key, NullWritable value, Context context) throws IOException, InterruptedException{
Configuration conf = context.getConfiguration();
int dimensionCount = Integer.parseInt(conf.get("dimensionCount"));
String[] dimensions = conf.get("dimensions").split(","); //this gets the dimensions from the run method in main
Text[] values = new Text[dimensionCount]; //This is supposed to be my composite key
for (int i=0; i<dimensionCount; i++){
switch(dimensions[i]){
case "region": values[i] = new Text("-");
break;
case "event": values[i] = new Text("-");
break;
case "eventCode": values[i] = new Text("-");
break;
case "mobile": values[i] = new Text("-");
}
}
context.write(new StringArrayWritable(values), new IntWritable(1));
}
这些值稍后将具有良好的逻辑。
我的StringArrayWritable:
public class StringArrayWritable extends ArrayWritable {
public StringArrayWritable() {
super(Text.class);
}
public StringArrayWritable(Text[] values){
super(Text.class, values);
Text[] texts = new Text[values.length];
for (int i = 0; i < values.length; i++) {
texts[i] = new Text(values[i]);
}
set(texts);
}
@Override
public String toString(){
StringBuilder sb = new StringBuilder();
for(String s : super.toStrings()){
sb.append(s).append("\t");
}
return sb.toString();
}
}
我得到的错误:
Error: java.io.IOException: Initialization of all the collectors failed. Error in last collector was :class StringArrayWritable
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:414)
at org.apache.hadoop.mapred.MapTask.access$100(MapTask.java:81)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.<init>(MapTask.java:698)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:770)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.lang.ClassCastException: class StringArrayWritable
at java.lang.Class.asSubclass(Class.java:3165)
at org.apache.hadoop.mapred.JobConf.getOutputKeyComparator(JobConf.java:892)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1005)
at org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:402)
... 9 more
任何帮助将不胜感激。
非常感谢。
最佳答案
您正在尝试使用可写对象作为键。在mapreduce中,密钥必须实现WritableComparable
接口(interface)。 ArrayWritable
仅实现Writable
接口(interface)。
两者之间的区别在于,comaprable接口(interface)要求您实现compareTo
方法,以便mapreduce能够正确地对键进行排序和分组。
关于hadoop - ArrayWritable作为Hadoop MapReduce中的键,我们在Stack Overflow上找到一个类似的问题:https://stackoverflow.com/questions/39248620/