问题描述
我正在一个对象存储项目中,我需要了解
I am working on a object storage project where I need to understand Reed Solomon error correction algorithm,I have gone through this Doc as a starter and also some thesis paper.
1. content.sakai.rutgers.edu
2. theseus.fi
but I can't seem to understand the lower part of the identity matrix (red box), where it is coming from. How this calculation is done?
任何人都可以解释一下.
Can anyone please explain this.
推荐答案
编码矩阵是一个6 x 4 Vandermonde矩阵,使用评估点{0 1 2 3 4 5}进行了修改,以使矩阵是单位矩阵.要创建矩阵,将生成一个6 x 4的Vandermonde矩阵(其中matrix [r] [c] = pow(r,c)),然后乘以矩阵的上4 x 4部分的逆,以生成编码矩阵.这等效于图2的系统编码".里德·所罗门(Reed Solomon)的原始观点"如您在上面链接到的Wikipedia文章中所述,这与链接1.和2的Reed Solomon的"BCH视图"不同.Wikipedia的示例系统编码矩阵是问题中使用的编码矩阵的转置版本.
The encoding matrix is a 6 x 4 Vandermonde matrix using the evaluation points {0 1 2 3 4 5} modified so that the upper 4 x 4 portion of the matrix is the identity matrix. To create the matrix, a 6 x 4 Vandermonde matrix is generated (where matrix[r][c] = pow(r,c) ), then multiplied with the inverse of the upper 4 x 4 portion of the matrix to produce the encoding matrix. This is the equivalent of "systematic encoding" with Reed Solomon's "original view" as mentioned in the Wikipedia article you linked to above, which is different than Reed Solomon's "BCH view", which links 1. and 2. refer to. The Wikipedia's example systematic encoding matrix is a transposed version of the encoding matrix used in the question.
https://en.wikipedia.org/wiki/Vandermonde_matrix
生成编码矩阵的代码在此github源文件的底部附近:
The code to generate the encoding matrix is near the bottom of this github source file:
Vandermonde inverse upper encoding
matrix part of matrix matrix
01 00 00 00 01 00 00 00
01 01 01 01 01 00 00 00 00 01 00 00
01 02 04 08 x 7b 01 8e f4 = 00 00 01 00
01 03 05 0f 00 7a f4 8e 00 00 00 01
01 04 10 40 7a 7a 7a 7a 1b 1c 12 14
01 05 11 55 1c 1b 14 12
解码分为两个步骤.首先,恢复所有丢失的数据行,然后使用现在恢复的数据重新生成所有丢失的奇偶校验行.
Decoding is performed in two steps. First, any missing rows of data are recovered, then any missing rows of parity are regenerated using the now recovered data.
对于2个丢失的数据行,将删除编码矩阵的2个相应行,并反转4 x 4个子矩阵,并使用4个无丢失数据行和奇偶校验乘以恢复2个丢失的行数据的.如果只有1个数据丢失行,则选择第二个数据行就好像它丢失一样,以便生成倒置矩阵.实际的数据再生仅需要将倒置矩阵的相应行用于矩阵乘法.
For 2 missing rows of data, the 2 corresponding rows of the encoding matrix are removed, and the 4 x 4 sub-matrix inverted and used the multiply the 4 non-missing rows of data and parity to recover the 2 missing rows of data. If there is only 1 missing row of data, then a second data row is chosen as if it was missing, in order to generate the inverted matrix. The actual regeneration of data only requires the corresponding rows of the inverted matrix to be used for a matrix multiply.
一旦恢复了数据,就使用编码矩阵的相应行从现在恢复的数据中重新生成所有丢失的奇偶校验行.
Once the data is recovered, then any missing parity rows are regenerated from the now recovered data, using the corresponding rows of the encoding matrix.
基于所显示的数据,数学基于有限域GF(2 ^ 8)模0x11D.例如,使用编码矩阵的最后一行和数据矩阵的最后一列进行编码是(0x1c·0x44)+(0x1b·0x48)+(0x14·0x4c)+(0x12·0x50)= 0x25(使用有限域数学).
Based on the data shown, the math is based on finite field GF(2^8) modulo 0x11D. For example, encoding using the last row of the encoding matrix with the last column of the data matrix is (0x1c·0x44)+(0x1b·0x48)+(0x14·0x4c)+(0x12·0x50) = 0x25 (using finite field math).
问题示例并未明确说明6 x 4编码矩阵可以对4 x n矩阵进行编码,其中n是每行的字节数.n == 8的示例:
The question example doesn't make it clear that the 6 x 4 encode matrix can encode a 4 x n matrix, where n is the number of bytes per row. Example where n == 8:
encode data encoded data
01 00 00 00 31 32 33 34 35 36 37 38
00 01 00 00 31 32 33 34 35 36 37 38 41 42 43 44 45 46 47 48
00 00 01 00 x 41 42 43 44 45 46 47 48 = 49 4a 4b 4c 4d 4e 4f 50
00 00 00 01 49 4a 4b 4c 4d 4e 4f 50 51 52 53 54 55 56 57 58
1b 1c 12 14 51 52 53 54 55 56 57 58 e8 eb ea ed ec ef ee dc
1c 1b 14 12 f5 f6 f7 f0 f1 f2 f3 a1
assume rows 0 and 4 are erasures and deleted from the matrices:
00 01 00 00 41 42 43 44 45 46 47 48
00 00 01 00 49 4a 4b 4c 4d 4e 4f 50
00 00 00 01 51 52 53 54 55 56 57 58
1c 1b 14 12 f5 f6 f7 f0 f1 f2 f3 a1
invert encode sub-matrix:
inverse encode identity
46 68 8f a0 00 01 00 00 01 00 00 00
01 00 00 00 x 00 00 01 00 = 00 01 00 00
00 01 00 00 00 00 00 01 00 00 01 00
00 00 01 00 1c 1b 14 12 00 00 00 01
reconstruct data using sub-matrices:
inverse encoded data restored data
46 68 8f a0 41 42 43 44 45 46 47 48 31 32 33 34 35 36 37 38
01 00 00 00 x 49 4a 4b 4c 4d 4e 4f 50 = 41 42 43 44 45 46 47 48
00 01 00 00 51 52 53 54 55 56 57 58 49 4a 4b 4c 4d 4e 4f 50
00 00 01 00 f5 f6 f7 f0 f1 f2 f3 a1 51 52 53 54 55 56 57 58
The actual process only uses the rows of the matrices that correspond
to the erased rows that need to be reconstructed.
First data is reconstructed:
sub-inverse encoded data reconstructed data
41 42 43 44 45 46 47 48
46 68 8f a0 x 49 4a 4b 4c 4d 4e 4f 50 = 31 32 33 34 35 36 37 38
51 52 53 54 55 56 57 58
f5 f6 f7 f0 f1 f2 f3 a1
Once data is reconstructed, reconstruct parity
sub-encode data reconstruted parity
31 32 33 34 35 36 37 38
1b 1c 12 14 x 41 42 43 44 45 46 47 48 = e8 eb ea ed ec ef ee dc
49 4a 4b 4c 4d 4e 4f 50
51 52 53 54 55 56 57 58
这种方法的一种替代方法是使用 BCH 查看 Reed Solomon.对于奇数个奇偶校验,例如RS(20,17)(3个奇偶校验),需要2个矩阵乘法,并且需要一个XOR进行编码,而对于单个擦除,仅需要XOR.对于e> 1擦除,进行(e-1)x n矩阵乘法,然后进行XOR.对于偶数个奇偶校验,如果将XOR用作编码的一部分,则需要e x n矩阵乘法来校正,或者需要e x n矩阵用于编码,从而允许一个XOR进行校正.
One alternate to this approach is to use BCH view Reed Solomon. For an odd number of parities, such as RS(20,17) (3 parities), 2 matrix multiplies and one XOR is needed for encoding, and for a single erasure only XOR is needed. For e>1 erasures, a (e-1) by n matrix multiply is done, followed by an XOR. For an even number of parities, if an XOR is used as part of the encode, then a e by n matrix multiply is needed to correct, or the e x n matrix used for encode, allowing one XOR for the correction.
另一种选择是Raid 6,其中综合征"是指.会附加到数据矩阵中,但不会形成正确的代码字.校正子行中的一个称为P,仅是XOR,另一行称为Q,是GF(2 ^ 8)中2的连续幂.对于3奇偶校验Raid 6,第三行称为R,是GF(2 ^ 8)中4的连续幂.与标准 BCH 视图不同,如果 Q 或 R 行丢失,则必须重新计算(而不是使用 XOR 来纠正它).通过使用对角线模式,如果n个磁盘中有1个发生故障,那么更换磁盘时仅需要重新生成Q和R的1/n.
Another alternative is Raid 6, where "syndromes" are appended to the matrix of data, but do not form a proper code word. One of the syndrome rows, called P, is just XOR, the other row called Q, is successive powers of 2 in GF(2^8). For a 3 parity Raid 6, the third row is called R, is successive powers of 4 in GF(2^8). Unlike standard BCH view, if a Q or R row is lost, it has to be recalculated (as opposed to using XOR to correct it). By using a diagonal pattern, if 1 of n disks fail, then only 1/n of the Q's and R's need to be regenerated when the disk is replaced.
http://alamos.math.arizona.edu/RTG16/ECC/raid6.pdf
请注意,此pdf文件的替代方法使用与上述方法相同的有限域,即GF(2 ^ 8)mod 0x11D,这可能使比较这些方法更加容易.
Note that this pdf file's alternate method uses the same finite field as the method above, GF(2^8) mod 0x11D, which may make it easier to compare the methods.
这篇关于您能解释一下Reed Solomon编码部分的身份矩阵吗?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!