Question
There is a case where a map will be constructed, and once it is initialized, it will never be modified again. It will, however, be accessed (via get(key) only) from multiple threads. Is it safe to use a java.util.HashMap in this way?
(Currently, I'm happily using a java.util.concurrent.ConcurrentHashMap and have no measured need to improve performance, but I am simply curious whether a simple HashMap would suffice. Hence, this question is not "Which one should I use?" nor is it a performance question. Rather, the question is "Would it be safe?")
Recommended answer
Your idiom is safe if and only if the reference to the HashMap is safely published. Safe publication has nothing to do with the internals of HashMap itself; it deals with how the constructing thread makes the reference to the map visible to other threads.
Basically, the only possible race here is between the construction of the HashMap and any reading threads that may access it before it is fully constructed. Most of the discussion is about what happens to the state of the map object, but this is irrelevant since you never modify it, so the only interesting part is how the HashMap reference is published.
For example, imagine you publish the map like this:
class SomeClass {
    public static HashMap<Object, Object> MAP;

    public synchronized static void setMap(HashMap<Object, Object> m) {
        MAP = m;
    }
}
... and at some point setMap() is called with a map, while other threads are using SomeClass.MAP to access the map and check for null like this:
HashMap<Object, Object> map = SomeClass.MAP;
if (map != null) {
    // ... use the map
} else {
    // ... some default behavior
}
This is not safe, even though it probably appears as though it is. The problem is that there is no happens-before relationship between the write to SomeClass.MAP and the subsequent read on another thread (setMap() is synchronized, but the readers take no lock), so the reading thread is free to see a partially constructed map. This can do pretty much anything, and even in practice it does things like put the reading thread into an infinite loop.
To safely publish the map, you need to establish a happens-before relationship between the writing of the reference to the HashMap (i.e., the publication) and the subsequent readers of that reference (i.e., the consumption). Conveniently, there are only a few easy-to-remember ways to accomplish that:
1. Exchange the reference through a properly locked field (JLS 17.4.5).
2. Use a static initializer to do the initializing stores (JLS 12.4).
3. Exchange the reference via a volatile field (JLS 17.4.5) or, as a consequence of this rule, via the AtomicX classes.
4. Initialize the value into a final field (JLS 17.5).
The ones most interesting for your scenario are (2), (3) and (4). In particular, (3) applies directly to the code I have above: if you change the declaration of MAP to:
public static volatile HashMap<Object, Object> MAP;
then everything is kosher: readers who see a non-null value necessarily have a happens-before relationship with the store to MAP and hence see all the stores associated with the map initialization.
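A minimal sketch of the volatile approach, assuming a hypothetical holder class (VolatileMapHolder, publish and lookup are illustrative names, not from the original answer). The map is fully populated before the single volatile store, and each reader copies the reference into a local with one volatile read:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder class used only for illustration.
class VolatileMapHolder {
    // The volatile write/read pair is what creates the happens-before edge.
    static volatile Map<String, Integer> map;

    static void publish() {
        Map<String, Integer> m = new HashMap<>();
        m.put("one", 1);
        m.put("two", 2);
        // Volatile store: the puts above happen-before any read that observes m.
        map = m;
    }

    static int lookup(String key, int fallback) {
        Map<String, Integer> m = map; // one volatile read into a local
        if (m == null) {
            return fallback;          // map not published yet
        }
        Integer value = m.get(key);
        return value != null ? value : fallback;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread reader = new Thread(() -> {
            while (map == null) {
                Thread.onSpinWait();  // wait until the reference is published
            }
            System.out.println(lookup("two", -1)); // prints 2
        });
        reader.start();
        publish();
        reader.join();
    }
}
```

Note that the map must not be modified after the store to the volatile field; the guarantee covers only the writes made before publication.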
The other methods change the semantics of your method, since both (2) (using the static initializer) and (4) (using final) imply that you cannot set MAP dynamically at runtime. If you don't need to do that, then just declare MAP as a static final HashMap<> and you are guaranteed safe publication.
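As a rough illustration of the static final variant, the sketch below fills the map in a static initializer. The class name StaticFinalMapDemo and its contents are assumptions made for the example. Class initialization (JLS 12.4) completes before any thread can read MAP, so every reader sees the fully populated map:

```java
import java.util.HashMap;

// Hypothetical class used only for illustration.
class StaticFinalMapDemo {
    static final HashMap<String, Integer> MAP = new HashMap<>();

    static {
        // These stores run during class initialization, before MAP is visible to readers.
        MAP.put("a", 1);
        MAP.put("b", 2);
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] readers = new Thread[4];
        for (int i = 0; i < readers.length; i++) {
            // Read-only access from several threads; the map is never modified again.
            readers[i] = new Thread(() -> System.out.println(MAP.get("a") + MAP.get("b")));
            readers[i].start();
        }
        for (Thread t : readers) {
            t.join();
        }
    }
}
```

The trade-off is the one described above: the map cannot be replaced at runtime.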
In practice, the rules are simple for safe access to "never-modified objects":
If you are publishing an object which is not inherently immutable (as in all fields declared final) and:
- You can already create the object that will be assigned at the moment of declaration: just use a final field (including static final for static members).
- You want to assign the object later, after the reference is already visible: use a volatile field.
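For the instance-field case, a minimal sketch might look like the following. ImmutableLookup is a hypothetical class introduced for illustration. The final field is assigned in the constructor, so the final field semantics of JLS 17.5 apply as long as `this` does not escape during construction:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class used only for illustration.
class ImmutableLookup {
    // Assigned exactly once, in the constructor.
    private final Map<String, Integer> table;

    ImmutableLookup(Map<String, Integer> source) {
        // Defensive copy: later changes to 'source' cannot leak into this object,
        // and the copy is never modified after construction.
        this.table = new HashMap<>(source);
    }

    Integer get(String key) {
        return table.get(key);
    }

    public static void main(String[] args) {
        Map<String, Integer> source = new HashMap<>();
        source.put("x", 10);
        ImmutableLookup lookup = new ImmutableLookup(source);
        source.put("y", 20);                  // does not affect the copy
        System.out.println(lookup.get("x"));  // prints 10
        System.out.println(lookup.get("y"));  // prints null
    }
}
```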
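That's it!
In practice, it is very efficient. The use of a static final field, for example, allows the JVM to assume the value is unchanged for the life of the program and optimize it heavily. The use of a final member field allows most architectures to read the field in a way equivalent to a normal field read and does not inhibit further optimizations.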
Finally, the use of volatile does have some impact: no hardware barrier is needed on many architectures (such as x86, specifically those that don't allow reads to pass reads), but some optimization and reordering may not occur at compile time. This effect is generally small. In exchange, you actually get more than what you asked for: not only can you safely publish one HashMap, you can store as many more not-modified HashMaps as you want to the same reference and be assured that all readers will see a safely published map.
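The "store as many maps as you want" point can be sketched as below. ReloadableTable and reload are hypothetical names chosen for the example. Each call builds a fresh map, finishes populating it, and only then stores it to the volatile field; no map is touched after it has been stored:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class used only for illustration.
class ReloadableTable {
    // Starts with an empty map so readers never see null.
    private static volatile Map<String, String> current = new HashMap<>();

    static void reload(Map<String, String> fresh) {
        // Build the replacement completely before publishing it.
        Map<String, String> copy = new HashMap<>(fresh);
        current = copy; // volatile store publishes the new, never-again-modified map
    }

    static String get(String key) {
        return current.get(key); // volatile read, then a plain read-only lookup
    }

    public static void main(String[] args) {
        reload(Map.of("mode", "v1"));
        System.out.println(get("mode")); // prints v1
        reload(Map.of("mode", "v2"));
        System.out.println(get("mode")); // prints v2
    }
}
```

A reader that performs several lookups which must be mutually consistent would need to copy the reference into a local first, since the field may be replaced between two calls.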
For more gory details, refer to Shipilev or this FAQ by Manson and Goetz.
[1] Quoted directly from Shipilev.
Note on "at the moment of declaration": that sounds complicated, but what I mean is that you can assign the reference at construction time, either at the declaration point, in the constructor (member fields), or in a static initializer (static fields).
Note on volatile: optionally, you can use a synchronized method to get/set, or an AtomicReference or something, but we're talking about the minimum work you can do.
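The two alternatives mentioned in this note could be sketched as follows. Both class names are hypothetical. In the synchronized version, the getter must be synchronized as well as the setter; locking only the setter, as in the SomeClass example earlier, does not establish happens-before for the readers:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical holder using AtomicReference, which has volatile read/write semantics.
class AtomicMapHolder {
    private static final AtomicReference<Map<String, Integer>> REF = new AtomicReference<>();

    static void set(Map<String, Integer> m) {
        REF.set(m);
    }

    static Map<String, Integer> get() {
        return REF.get();
    }
}

// Hypothetical holder using a lock: both accessors synchronize on the same monitor.
class SynchronizedMapHolder {
    private static Map<String, Integer> map;

    static synchronized void set(Map<String, Integer> m) {
        map = m;
    }

    static synchronized Map<String, Integer> get() {
        return map;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("k", 7);
        AtomicMapHolder.set(m);
        SynchronizedMapHolder.set(m);
        System.out.println(AtomicMapHolder.get().get("k"));       // prints 7
        System.out.println(SynchronizedMapHolder.get().get("k")); // prints 7
    }
}
```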
Note on final reads: some architectures with very weak memory models (I'm looking at you, Alpha) may require some type of read barrier before a final read, but these are very rare today.