Generating permutations with MapReduce


Question

Is there a way to generate permutations with MapReduce?

Input file:

1  title1
2  title2
3  title3

My goal:

1,2  title1,title2
1,3  title1,title3
2,3  title2,title3

Recommended answer

Since the file has n input lines, pairing every line with every line gives on the order of n^2 outputs (the unordered pairs in your example are n(n-1)/2 of them). It therefore makes sense to have n tasks each perform n of those operations. I believe you could do this (assuming only one file):

Put your input file into the DistributedCache so that it is accessible read-only to your Mappers/Reducers. Split the input on each line of the file (as in WordCount), so that each map call receives one line (e.g. the title1 line in your example). Then read the lines of the file from the DistributedCache and emit your key/value pairs: the key is the input line, and the values are each line of the file from the DistributedCache.

In this model, you should only need a Map step.
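The pairing itself can be tried without a cluster. The following is only a minimal plain-Java sketch of what one map call might produce, not the answerer's code: the class name PairSketch, the helpers splitLine and pairsFor, and the onlyLater flag are all hypothetical, and it assumes each line is an integer id followed by whitespace and a title. With onlyLater set to false every line is paired with every line (n^2 outputs over the whole file); with it set to true only partners with a larger id are kept, which matches the goal shown in the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PairSketch {

    // Splits "1  title1" into {"1", "title1"} on the first run of whitespace.
    // Assumption: every line has exactly this "id title" shape.
    static String[] splitLine(String line) {
        return line.trim().split("\\s+", 2);
    }

    // What one map call could emit for inputLine, given all lines of the
    // cached copy of the file. Assumption: ids are integers.
    static List<String> pairsFor(String inputLine, List<String> cachedLines,
                                 boolean onlyLater) {
        String[] left = splitLine(inputLine);
        List<String> out = new ArrayList<>();
        for (String other : cachedLines) {
            String[] right = splitLine(other);
            // Skip self-pairs and mirrored pairs when only i < j is wanted.
            if (onlyLater
                    && Integer.parseInt(left[0]) >= Integer.parseInt(right[0])) {
                continue;
            }
            out.add(left[0] + "," + right[0] + "\t" + left[1] + "," + right[1]);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("1  title1", "2  title2", "3  title3");
        // Each outer iteration stands in for one map call.
        for (String line : file) {
            for (String pair : pairsFor(line, file, true)) {
                System.out.println(pair);
            }
        }
    }
}
```

Because each call only needs its own line plus the cached file, the calls are independent of each other, which is why no Reduce step is required.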

The mapper would look something like this:

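  import java.io.IOException;
  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.filecache.DistributedCache;
  import org.apache.hadoop.fs.Path;
  import org.apache.hadoop.io.Text;
  import org.apache.hadoop.mapreduce.Mapper;

  public static class PermuteMapper
       extends Mapper<Object, Text, Text, Text>{

    private static final String IN_FILENAME = "file.txt";

    @Override
    public void map(Object key, Text value, Context context
                    ) throws IOException, InterruptedException {

      String inputLine = value.toString();

      // set the property mapred.cache.files in your
      // configuration for the file to be available
      // (DistributedCache is deprecated in newer Hadoop releases)
      Configuration conf = context.getConfiguration();
      Path[] cachedPaths = DistributedCache.getLocalCacheFiles(conf);
      if ( cachedPaths != null && cachedPaths.length > 0
           && cachedPaths[0].getName().equals(IN_FILENAME) ) {
         // function defined elsewhere
         String[] cachedLines = getLinesFromPath(cachedPaths[0]);
         // pair the current input line with every line of the cached file
         for (String line : cachedLines)
           context.write(new Text(inputLine), new Text(line));
      }
    }
  }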
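The mapper leaves getLinesFromPath as a function defined elsewhere. One possible shape for it, offered only as a sketch under assumptions: the cached file is taken to be a UTF-8 text file on the local disk of the task, and the class name CachedLines and method name getLinesFromLocalPath are hypothetical. It takes a plain string path so it can be run without Hadoop; inside the mapper one could pass cachedPaths[0].toString().

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

class CachedLines {

    // Hypothetical stand-in for getLinesFromPath: reads a local text file
    // and returns its non-blank lines in file order.
    // Assumption: the file is UTF-8 and small enough to hold in memory.
    static String[] getLinesFromLocalPath(String localPath) throws IOException {
        List<String> lines =
            Files.readAllLines(Paths.get(localPath), StandardCharsets.UTF_8);
        return lines.stream()
                    .filter(l -> !l.trim().isEmpty())
                    .toArray(String[]::new);
    }
}
```

Reading the file on every map call repeats the same work n times per task, so in a real job it would probably be worth loading the lines once in the mapper's setup(Context) method and keeping them in a field.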
