我的霍夫曼树代码有问题。在main方法中,我输入了符号字符串,并且还输入了包含符号频率的Integer数组。它应该打印出每个Symbol及其Huffman代码,但是我认为它是错误的...
这是代码:
package huffman;
import java.util.*;
abstract class HuffmanTree implements Comparable<HuffmanTree> {
public final int frequency; // the frequency of this tree
public HuffmanTree(int freq) { frequency = freq; }
// compares on the frequency
public int compareTo(HuffmanTree tree) {
return frequency - tree.frequency;
}
}
class HuffmanLeaf extends HuffmanTree {
public final char value; // the character this leaf represents
public HuffmanLeaf(int freq, char val) {
super(freq);
value = val;
}
}
class HuffmanNode extends HuffmanTree {
public final HuffmanTree left, right; // subtrees
public HuffmanNode(HuffmanTree l, HuffmanTree r) {
super(l.frequency + r.frequency);
left = l;
right = r;
}
}
public class Huffman {
// input is an array of frequencies, indexed by character code
public static HuffmanTree buildTree(int[] charFreqs, char[] test2) {
PriorityQueue<HuffmanTree> trees = new PriorityQueue<HuffmanTree>();
// initially, we have a forest of leaves
// one for each non-empty character
for (int i = 0; i < charFreqs.length; i++)
if (charFreqs[i] > 0)
trees.offer(new HuffmanLeaf(charFreqs[i], test2[i]));
assert trees.size() > 0;
// loop until there is only one tree left
while (trees.size() > 1) {
// two trees with least frequency
HuffmanTree a = trees.poll();
HuffmanTree b = trees.poll();
// put into new node and re-insert into queue
trees.offer(new HuffmanNode(a, b));
}
return trees.poll();
}
public static void printCodes(HuffmanTree tree, StringBuffer prefix) {
assert tree != null;
if (tree instanceof HuffmanLeaf) {
HuffmanLeaf leaf = (HuffmanLeaf)tree;
// print out character, frequency, and code for this leaf (which is just the prefix)
System.out.println(leaf.value + "\t" + leaf.frequency + "\t" + prefix);
} else if (tree instanceof HuffmanNode) {
HuffmanNode node = (HuffmanNode)tree;
// traverse left
prefix.append('0');
printCodes(node.left, prefix);
prefix.deleteCharAt(prefix.length()-1);
// traverse right
prefix.append('1');
printCodes(node.right, prefix);
prefix.deleteCharAt(prefix.length()-1);
}
}
public static void main(String[] args) {
//Symbols:
String str = "12345678";
char[] test2 = str.toCharArray();
//Frequency (of the symbols above):
int[] charFreqs = {36,18,12,9,7,6,5,4};
// build tree
HuffmanTree tree = buildTree(charFreqs,test2);
// print out results
System.out.println("SYMBOL\tFREQ\tHUFFMAN CODE");
printCodes(tree, new StringBuffer());
}
}
我得到的输出是:
SYMBOL FREQ HUFFMAN CODE
1 36 0
3 12 100
6 6 1010
5 7 1011
2 18 110
4 9 1110
8 4 11110
7 5 11111
太奇怪了,例如,符号7应该是:11110,符号8应该是:11111
你能帮我吗?
最佳答案
位模式的分配与代码的优化无关紧要。您的作业将正常进行。没有什么奇怪的。您可能还对2:110、3:100或4:1110、5:1011表示了担忧,但也可以。
在代码上施加顺序的唯一原因是减少了将代码从压缩器传输到解压缩器所需的位数。除了发送代码外,您还可以发送每个符号的代码长度,只要代码的长度和长度在两侧相同。
在那种情况下,方法通常是按数字顺序将代码分配给已排序的符号列表。然后,如果分配的顺序确实是符号7,则其代码“值”比符号8小。
对于您的示例,这样的规范代码将是:
1: 1 - 0
2: 3 - 100
3: 3 - 101
4: 4 - 1100
5: 4 - 1101
6: 4 - 1110
7: 5 - 11110
8: 5 - 11111
您只需取长度,并在相同的长度内对符号进行排序。然后分配以0开头并递增的代码,随着长度的增加,在末尾添加位。
请注意,这是一个不寻常的示例,其中符号顺序也是频率顺序。通常情况并非如此。