问题描述
我需要一个空间有效的概率数据结构来存储我已经计算的值。对我来说,计算价格便宜但空间不大 - 所以如果这个数据结构返回一个假的负数,我可以在一段时间内每隔一段时间重做一些工作,但是误报是不可接受的。所以我正在寻找的是与相反的。
I need a space efficient probabilistic data structure to store values that I have already computed. For me computation is cheap but space is not - so if this data structure returns a false negative, I am okay with redoing some work every once in a while but false positives are unacceptable. So what I am looking for is sort of the opposite of a Bloom filter.
推荐答案
对于假阴性,您可以使用有损哈希表或LRUCache。
它是一个具有快速O(1)查找的数据结构,只会产生错误的否定。
如果您问是否运行测试X,它会告诉您是的,你肯定有,或我不记得了。
For false negative you can use lossy hash table or a LRUCache.It is a data structure with fast O(1) look-up that will only give false negatives.if you ask if "Have I run test X", it will tell you either "Yes, you definitely have", or "I can't remember".
伪码:
setup_test_table():
create test_table( some large number of entries )
clear each entry( test_table, NEVER )
return test_table
has_test_been_run_before( new_test_details, test_table ):
index = hash( test_details , test_table.length )
old_details = test_table[index].detail
// unconditionally overwrite old details with new details, LRU fashion.
// perhaps some other collision resolution technique might be better.
test_table[index].details = new_test_details
if ( old_details === test_details ) return YES
else if ( old_details === NEVER ) return NEVER
else return PERHAPS
main()
test_table = setup_test_table();
loop
test_details = generate_random_test()
status = has_test_been_run_before( test_details, test_table )
case status of
YES: do nothing;
NEVER: run test (test_details);
PERHAPS: if( rand()&1 ) run test (test_details);
next loop
end.
类似的Bloom过滤器假阳性
Similarly Bloom filter for false positive
这篇关于有没有任何概率数据结构假阴性而不是假阳性?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!