问题描述
您好,我正在尝试实现规范霍夫曼编码,但我不理解Wiki和Google指南,
我需要更抽象地说明...
Hello I am trying to implement Canonical huffman encoding but i dont understand wiki and google guides,I need explain more abstractly...
此:
1.获取常规霍夫曼编码长度的代码列表。像这样:
I tried this:1. Get list of regular huffman encoding length's codes. like this:
A - code: 110, length: 3.
B - code: 111, length: 3.
C - code: 10, length 2.
D - code: 01, length 2.
E - code: 00, length 2.
- 我通过符号和长度对表格进行排序,如下所示:
C - code: 10, length 2.
D - code: 01, length 2.
E - code: 00, length 2.
A - code: 110, length: 3.
B - code: 111, length: 3.
现在我不知道该如何进行...
now i dont know how to proceed...
tnx很多
推荐答案
扔掉从霍夫曼算法中获得的代码。您不需要那些。只需保持长度即可。
Throw out the codes you get from the Huffman algorithm. You don't need those. Just keep the lengths.
现在,根据长度和符号分配代码。按长度排序,从最短到最长,并在每个长度内,按升序对符号进行排序。 (只要每个符号严格小于或大于任何其他符号,并且编码器和解码器都同意如何操作,您怎么做就没关系。)
Now assign the codes based on the lengths and the symbols. Sort by length, from shortest to longest, and within each length, sort the symbols in ascending order. (How you do that exactly doesn't matter, so long as every symbol is strictly less than or greater than any other symbol, and the encoder and decoder agree on how to do it.)
所以我们进行订购:
C - 2
D - 2
E - 2
A - 3
B - 3
二人先于三人,
Now 我们在2的顺序中,C,D,E顺序一致,在3的顺序中,A,B顺序。在每个长度内按整数顺序分配代码,每次增加长度时在末尾添加一个零位:
Now we assign the code in integer order within each length, adding a zero bit at the end each time we go up a length:
C - 2 - 00
D - 2 - 01
E - 2 - 10
A - 3 - 110 <- after incrementing to 11, a zero was added to make 110
B - 3 - 111
这是规范代码。
如果您愿意并且仍然可以规范,则可以使用其他方式进行操作,例如从11开始倒数,只要编码器和解码器同意该方法即可。重点是只需要将每个符号的长度从编码器传输到解码器,而不必传输占用更多空间的代码本身。
You could do it other ways if you like and still be canonical, e.g. counting backwards from 11, so long as the encoder and decoder agree on the approach. The whole point is to only have to transmit the lengths for each symbol from the encoder to the decoder, so as to not have to transmit the codes themselves which take more space.
这篇关于规范霍夫曼编码算法的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!