Question
I have a list of 1 million objects that I need to load into a Map. To reduce the time this takes, I am planning to use Java 8's parallelStream() like this:
List<Person> list = new LinkedList<>();
Map<String, String> map = new HashMap<>();
list.parallelStream().forEach(person -> {
    map.put(person.getName(), person.getAge());
});
Is it safe to populate a Map like this from parallel threads? Could there be concurrency issues, so that some data gets lost in the Map?
Recommended answer
It is perfectly safe to use parallelStream() to collect into a HashMap. However, it is not safe to use parallelStream() with forEach and a consumer that adds elements to a HashMap.
HashMap is not a synchronized class, and putting elements into it concurrently will not work properly. That is exactly what forEach does here: it invokes the given consumer, which puts elements into the HashMap, from multiple threads, possibly at the same time. Here is a simple piece of code demonstrating the issue:
List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
Map<Integer, Integer> map = new HashMap<>();
list.parallelStream().forEach(i -> {
    map.put(i, i);
});
System.out.println(list.size());
System.out.println(map.size());
Make sure to run it a couple of times. There's a very good chance (the joy of concurrency) that the printed map size after the operation is not 10000, which is the size of the list, but slightly less.
The solution here, as always, is not to use forEach, but to use a mutable reduction approach with the collect method and the built-in toMap collector:
Map<Integer, Integer> map = list.parallelStream().collect(Collectors.toMap(i -> i, i -> i));
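Applied to the question's scenario, the same approach might look like the sketch below. The Person class here is a hypothetical stand-in (the question does not show it), and the merge function is an assumption added for illustration: the two-argument toMap throws IllegalStateException if two elements map to the same key, so a list that may contain duplicate names needs some rule for resolving them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PersonMapExample {

    // Hypothetical stand-in for the question's Person class.
    public static final class Person {
        private final String name;
        private final String age;

        public Person(String name, String age) {
            this.name = name;
            this.age = age;
        }

        public String getName() { return name; }
        public String getAge() { return age; }
    }

    // Collects name -> age in parallel without any shared mutable map.
    // The merge function (an assumption for this sketch) keeps the first
    // of the two values it is given when two persons share a name.
    public static Map<String, String> toNameAgeMap(List<Person> people) {
        return people.parallelStream()
                .collect(Collectors.toMap(
                        Person::getName,
                        Person::getAge,
                        (first, second) -> first));
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            people.add(new Person("person-" + i, String.valueOf(i % 100)));
        }
        Map<String, String> map = toNameAgeMap(people);
        System.out.println(map.size());
    }
}
```

The sketch uses an ArrayList rather than the LinkedList from the question because an ArrayList generally splits more evenly for parallel processing; whether the parallel version is actually faster for a given workload is something to measure rather than assume.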
Use that line of code in the sample code above, and you can rest assured that the map size will always be 10000. The Stream API ensures that it is safe to collect into a non-thread-safe container, even in parallel. This also means that you don't need toConcurrentMap to be safe; that collector is needed only if you specifically want a ConcurrentMap as the result rather than a general Map. As far as the thread safety of collect is concerned, you can use either.
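To see the two collectors side by side, here is a minimal sketch reusing the 10000-element list from the demonstration above. The class and method names are made up for this example; Collectors.toMap and Collectors.toConcurrentMap are the standard JDK collectors.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentMap;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class CollectorComparison {

    // Parallel collect into a general Map.
    public static Map<Integer, Integer> collectToMap(List<Integer> list) {
        return list.parallelStream()
                .collect(Collectors.toMap(i -> i, i -> i));
    }

    // Parallel collect when a ConcurrentMap result is specifically wanted.
    public static ConcurrentMap<Integer, Integer> collectToConcurrentMap(List<Integer> list) {
        return list.parallelStream()
                .collect(Collectors.toConcurrentMap(i -> i, i -> i));
    }

    public static void main(String[] args) {
        List<Integer> list = IntStream.range(0, 10000).boxed().collect(Collectors.toList());
        System.out.println(collectToMap(list).size());           // prints 10000
        System.out.println(collectToConcurrentMap(list).size()); // prints 10000
    }
}
```

Both calls give a complete map on every run, unlike the forEach version; the choice between them comes down to the result type you want, not to thread safety.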