Collecting a stream using grouping, counting, and filtering operations

Problem description

I'm trying to collect a stream while throwing away rarely used items, as in this example:

import java.util.*;
import java.util.function.Function;
import org.junit.Test;
import static java.util.stream.Collectors.*;
import static org.hamcrest.MatcherAssert.assertThat;
import static org.hamcrest.Matchers.containsInAnyOrder;

// The enclosing class name is illustrative; the original snippet showed only the test method.
public class CommonlyUsedWordsTest {

    @Test
    public void shouldFilterCommonlyUsedWords() {
        // given
        List<String> allWords = Arrays.asList(
            "call", "feel", "call", "very", "call", "very", "feel", "very", "any");

        // when: count occurrences per word, then keep words seen more than twice
        Set<String> commonlyUsed = allWords.stream()
                .collect(groupingBy(Function.identity(), counting()))
                .entrySet().stream()
                .filter(e -> e.getValue() > 2)
                .map(Map.Entry::getKey)
                .collect(toSet());

        // then
        assertThat(commonlyUsed, containsInAnyOrder("call", "very"));
    }
}

I have a feeling that it is possible to do this much more simply. Am I right?

Recommended answer

Update (2015/05/31): I added the distinct(atLeast) method to StreamEx 0.3.1. It is implemented using a custom spliterator. Benchmarks showed that, for sequential streams, this implementation is significantly faster than the stateful filtering described above, and in many cases it is also faster than the other solutions proposed in this topic. It also works correctly if null is encountered in the stream (the groupingBy collector does not support null as a key, so groupingBy-based solutions fail with a NullPointerException if the stream contains null).
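The null-handling point can be illustrated with the standard library alone. The following is a minimal sketch, not the StreamEx implementation: the class name CommonItems and both helper methods are hypothetical names introduced here. It shows that the groupingBy pipeline from the question throws on a null element, while a plain HashMap (which permits a null key) counts it like any other item.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collectors;

class CommonItems {

    // Hypothetical helper: returns the items that occur at least `atLeast` times.
    // HashMap permits a null key, so a null element is counted like any other.
    static <T> Set<T> occurringAtLeast(List<T> items, long atLeast) {
        Map<T, Long> counts = new HashMap<>();
        for (T item : items) {
            counts.merge(item, 1L, Long::sum);
        }
        Set<T> result = new HashSet<>();
        for (Map.Entry<T, Long> entry : counts.entrySet()) {
            if (entry.getValue() >= atLeast) {
                result.add(entry.getKey());
            }
        }
        return result;
    }

    // Hypothetical helper: reports whether the groupingBy-based counting
    // step throws a NullPointerException for the given items.
    static boolean groupingByRejects(List<String> items) {
        try {
            items.stream()
                 .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList(
            "call", null, "call", "very", null, "call", "very", null, "very", "any");

        System.out.println("groupingBy fails on null: " + groupingByRejects(words));
        System.out.println("items seen at least 3 times: " + occurringAtLeast(words, 3));
    }
}
```

With StreamEx on the classpath, the distinct(atLeast) method mentioned above would presumably replace the helper with a single call along the lines of StreamEx.of(allWords).distinct(3).toSet(); that form is not exercised by this sketch, so check it against the StreamEx documentation before relying on it.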
