I'm going through some HBase architecture notes here: https://mapr.com/blog/in-depth-look-hbase-architecture/ and saw it said:

My question is two-fold.

1. Why do we flush all MemStores at once? Couldn't we just flush the MemStore that's full? Let's say we have two MemStores: 1 and 2. If 1 is flushed, then for future Gets we can still check 2 before checking disk (HFiles) for 2's column family, right?
2. What does "last written sequence number" mean? I'm trying to visualize how flushing MemStores would happen, but maybe a visual example would help. Let's say I have MemStore 1 with row keys a, b, and d and I flush them. What's the "last written sequence number"?

Solution

Let's start with how write operations are handled by HBase. When you perform a write to HBase, it does the following (simplified view):

1. append the KV write to the WAL
2. fsync the WAL
3. apply the write to the MemStore

Each write operation is marked by a 'sequence number'. This is a sort of MVCC transaction ID.

Quote from HBase docs:

The sequence number is written into the WAL as part of the write operation, along with the new KV. After a successful write into the WAL, HBase applies the change to the MemStore and responds to the client that the write succeeded. From this point on, the new KV is persisted and will not be lost if the RegionServer dies.

Because each write increases the size of the WAL, HBase has to truncate it to reduce disk usage. To do that, the WAL must be sure that the changes described by its entries are durably persisted to disk (so that no updates are lost if the server crashes). For that purpose, the WAL tracks the aforementioned "last written sequence number" (LWSN) of each region that belongs to the RegionServer.

The LWSN represents the most recent write that was flushed to disk. All write operations with a greater seqnum reside only in the MemStore, not on disk yet. The WAL uses the region's LWSN to find entries whose seqnum is not greater than that LWSN.
Such entries can be removed from the WAL because they have been flushed to disk and will not be lost in a server crash.

Let's look at an example of how HBase tracks the LWSN. Suppose you have two column families, 'a' and 'b'. You perform 200 write operations: the first 100 go to 'a' and the other 100 to 'b'. The seqnums of the operations on column family 'a' are in the range [1..100], and those for 'b' are in [101..200]. Suppose the writes to 'b' are larger and trigger a flush of the MemStore of 'b', but not of 'a'. During this flush, HBase should update the LWSN of the region. It would not be correct to set it to 200, because the writes with seqnums [1..100] are not persisted yet (and must not be truncated from the WAL).

That's why HBase flushes the MemStores of all column families at once: if it flushed only the full MemStore, it could not advance the region's LWSN and would delay WAL truncation (which can mean a long recovery after a crash, since more of the WAL has to be replayed).
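
To make the 200-write example concrete, here is a small toy model in Python. It is only a sketch of the bookkeeping described in this answer, not HBase's actual implementation: the names ToyRegion, put, flush and lwsn are made up for illustration, the fsync step is skipped, and "HFiles" are just in-memory lists.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRegion:
    """Toy model of one region: a WAL plus one MemStore per column family.

    All names here are hypothetical; this is not HBase's API.
    """
    families: tuple
    next_seqnum: int = 1
    lwsn: int = 0  # every edit with seqnum <= lwsn is considered flushed
    wal: list = field(default_factory=list)        # (seqnum, family, key, value)
    memstores: dict = field(default_factory=dict)  # family -> [(seqnum, key, value)]
    hfiles: dict = field(default_factory=dict)     # family -> list of flushed batches

    def __post_init__(self):
        for fam in self.families:
            self.memstores[fam] = []
            self.hfiles[fam] = []

    def put(self, family, key, value):
        seqnum = self.next_seqnum
        self.next_seqnum += 1
        self.wal.append((seqnum, family, key, value))        # append to WAL (fsync elided)
        self.memstores[family].append((seqnum, key, value))  # apply to MemStore
        return seqnum

    def flush(self, families):
        # Move the chosen MemStores to "disk".
        for fam in families:
            if self.memstores[fam]:
                self.hfiles[fam].append(self.memstores[fam])
                self.memstores[fam] = []
        # The LWSN can only advance up to just below the oldest edit
        # that still lives in some MemStore.
        unflushed = [s for m in self.memstores.values() for s, _, _ in m]
        self.lwsn = min(unflushed) - 1 if unflushed else self.next_seqnum - 1
        # Truncate WAL entries that are now covered by the LWSN.
        self.wal = [entry for entry in self.wal if entry[0] > self.lwsn]


def demo(flush_all):
    region = ToyRegion(families=("a", "b"))
    for i in range(100):
        region.put("a", f"row{i}", "small")   # seqnums 1..100
    for i in range(100):
        region.put("b", f"row{i}", "large")   # seqnums 101..200
    region.flush(("a", "b") if flush_all else ("b",))
    return region.lwsn, len(region.wal)


print(demo(flush_all=False))  # (0, 200): LWSN stuck, nothing truncated
print(demo(flush_all=True))   # (200, 0): LWSN advances, WAL emptied
```

In this model, flushing only 'b' leaves the edits with seqnums 1..100 in memory, so the LWSN cannot move and the whole WAL has to be kept. Flushing both families lets the LWSN jump to 200 and the WAL shrink to nothing.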