Problem description
I want to find the columns that have not been updated for more than a specific time period.
So I want to scan the columns with a time range. The normal behaviour of HBase is that you then get the latest value within that time range (which is not what I want).
As far as I understand, HBase should work like this: if you set the maximum number of versions for a column family to 1, it retains only the last value that was put into the cell.
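To make that expectation concrete, here is a minimal toy model in Python. It is not HBase code and does not describe how HBase stores data internally; the `Cell` class and its methods are hypothetical names invented for this illustration. It only encodes the semantics assumed above: at most `max_versions` values are kept, and a time-range read covers the half-open interval [min, max).

```python
# Toy model of the semantics the question expects; this is NOT HBase code.
# The class and method names are hypothetical and exist only for illustration.
class Cell:
    """A single column cell that retains at most `max_versions` values (>= 1)."""

    def __init__(self, max_versions=1):
        self.max_versions = max_versions
        self.versions = {}  # timestamp -> value

    def put(self, value, timestamp):
        self.versions[timestamp] = value
        # Evict the oldest timestamps once the version limit is exceeded.
        for ts in sorted(self.versions)[:-self.max_versions]:
            del self.versions[ts]

    def get(self, min_ts=0, max_ts=float("inf")):
        """Return (timestamp, value) of the newest version in [min_ts, max_ts), or None."""
        hits = [ts for ts in self.versions if min_ts <= ts < max_ts]
        if not hits:
            return None
        newest = max(hits)
        return newest, self.versions[newest]


cell = Cell(max_versions=1)
cell.put("One", 1000)
cell.put("Two", 2000)
cell.put("Three", 3000)

print(cell.get())         # (3000, 'Three')
print(cell.get(0, 1500))  # None under this model
```

Under this model, a read restricted to timestamps 0 to 1500 finds nothing, because the value written at timestamp 1000 was discarded as soon as a newer one arrived.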
What I found in practice is different.
If I run the following commands in the hbase shell:
create 't1', {NAME => 'c1', VERSIONS => 1}
put 't1', 'r1', 'c1', 'One', 1000
put 't1', 'r1', 'c1', 'Two', 2000
put 't1', 'r1', 'c1', 'Three', 3000
get 't1', 'r1'
get 't1', 'r1' , {TIMERANGE => [0,1500]}
The result looks like this:
get 't1', 'r1'
COLUMN CELL
c1: timestamp=3000, value=Three
1 row(s) in 0.0780 seconds
get 't1', 'r1' , {TIMERANGE => [0,1500]}
COLUMN CELL
c1: timestamp=1000, value=One
1 row(s) in 0.1390 seconds
Why does the second query return a value even though I've set the max versions to only 1?
The HBase version I currently have installed is HBase 0.94.6-cdh4.4.0.
Recommended answer
It turns out to be a bug in HBase: https://issues.apache.org/jira/browse/HBASE-10102