Problem description
I have implemented the Huffman coding algorithm in C++, and it's working fine. I want to create a text compression algorithm.
Behind every file or piece of data in the digital world, there are 0s and 1s.
I want to persist the sequence of bits (0/1) generated by the Huffman encoding algorithm in a file.
My goal is to reduce the number of bits stored in the file. I'm storing the metadata needed for decoding in a separate file. I want to write the data to a file bit by bit, and then read it back bit by bit in C++.
The problem I'm facing with binary mode is that it doesn't let me write data bit by bit. I want to write "10101" to the file as five individual bits, but instead it writes the ASCII value of each character, 8 bits at a time.
#include <fstream>
#include <iostream>
using namespace std;

int main() {
    ofstream f;
    f.open("./one.bin", ios::out | ios::binary);
    f << "10101";  // writes five characters (5 bytes), not five bits
    f.close();
    return 0;
}
Any help or pointers would be appreciated. Thank you.
Recommended answer
"Binary mode" means only that you have requested that the actual bytes you write are not corrupted by end-of-line conversions. (This is only a problem on Windows. No other system has the need to deliberately corrupt your data.)
You are still writing a byte at a time in binary mode.
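As a small illustration of that point, the sketch below compares streaming the text "10101" with writing one packed byte. The function names are made up for this example, and a std::ostringstream stands in for the file so the result can be inspected in memory.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns the bytes produced by streaming the text "10101".
std::string stream_text_bits() {
    std::ostringstream out(std::ios::out | std::ios::binary);
    out << "10101";  // writes the characters '1','0','1','0','1'
    return out.str();
}

// Hypothetical helper: returns the byte produced by packing the same five bits
// into the most significant positions, followed by three zero pad bits.
std::string stream_packed_bits() {
    std::ostringstream out(std::ios::out | std::ios::binary);
    out.put(static_cast<char>(0xA8));  // binary 10101000
    return out.str();
}
```

The first function produces five bytes and the second produces one, which is the saving the question is after.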
To write bits, you accumulate them in an integer. For convenience, use an unsigned integer. This is your bit buffer. You need to decide whether to accumulate them from the least to the most significant positions, or from the most to the least. Once you have eight or more bits accumulated, you write out one byte to your file, and remove those eight bits from the buffer.
When you're done, if there are bits left in your buffer, you write out those last one to seven bits to one byte. You need to carefully consider how exactly you do that, and how to know how many bits there were, so that you can properly decode the bits on the other end.
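One possible way to handle this, offered only as a sketch, is to record the total number of data bits alongside the other decoding metadata. The decoder can then work out how many bytes to expect and how many pad bits to ignore in the last byte. The helper names below are assumptions made for this example.

```cpp
#include <cstdint>

// Hypothetical helper: number of bytes needed to hold total_bits bits.
std::uint64_t bytes_needed(std::uint64_t total_bits) {
    return (total_bits + 7) / 8;
}

// Hypothetical helper: number of unused pad bits in the final byte (0..7).
unsigned pad_bits(std::uint64_t total_bits) {
    return static_cast<unsigned>((8 - total_bits % 8) % 8);
}
```

For the five bits "10101", this gives one byte with three pad bits. Other schemes exist, such as storing only the pad count or using an end-of-stream symbol in the Huffman code itself.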
The accumulation and extraction are done using the bit operations in your language. In C++ (and many other languages), those are & (and), | (or), >> (right shift), and << (left shift).
For example, to insert one bit, x, into your buffer, and later three bits in y, ending up with the earliest bits in the most significant positions:
unsigned buf = 0, bits = 0;
...
// some loop
{
    ...
    // write one bit (don't need the & if you know x is 0 or 1)
    buf = (buf << 1) | (x & 1);
    bits++;
    ...
    // write three bits
    buf = (buf << 3) | (y & 7);
    bits += 3;
    ...
    // write bytes from the buffer before it fills the integer length
    if (bits >= 8) {  // the if could be a while if you expect 16 or more bits
        // out is an ostream -- must be in binary mode if on Windows
        bits -= 8;
        out.put(buf >> bits);
    }
    ...
}
...
// write any leftover bits (it is assumed here that bits is in 0..7 --
// if not, first repeat the if or while from above to clear out bytes)
if (bits) {
    out.put(buf << (8 - bits));
    bits = 0;
}
...
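The fragments above can be assembled into a small writer and a matching reader. The following is a minimal sketch under stated assumptions, not a definitive implementation: the class names BitWriter and BitReader and the round_trip helper are invented for this example, bits are packed most significant first, the total bit count is passed to the reader directly (in practice it would come from the metadata file), and a std::stringstream stands in for the file.

```cpp
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: packs bits MSB-first into bytes written to an ostream.
class BitWriter {
public:
    explicit BitWriter(std::ostream& out) : out_(out) {}

    // Append the low `count` bits of `value` (count must be in 1..24).
    void write(unsigned value, unsigned count) {
        buf_ = (buf_ << count) | (value & ((1u << count) - 1u));
        bits_ += count;
        total_ += count;
        while (bits_ >= 8) {  // emit every complete byte
            bits_ -= 8;
            out_.put(static_cast<char>((buf_ >> bits_) & 0xFFu));
        }
    }

    // Pad the last partial byte with zero bits; returns the number of data bits.
    std::uint64_t finish() {
        if (bits_ > 0) {
            out_.put(static_cast<char>((buf_ << (8 - bits_)) & 0xFFu));
            bits_ = 0;
        }
        return total_;
    }

private:
    std::ostream& out_;
    std::uint32_t buf_ = 0;    // bit buffer
    unsigned bits_ = 0;        // number of pending bits in buf_
    std::uint64_t total_ = 0;  // data bits written so far
};

// Hypothetical helper: reads back exactly total_bits bits, MSB-first.
class BitReader {
public:
    BitReader(std::istream& in, std::uint64_t total_bits)
        : in_(in), remaining_(total_bits) {}

    // Stores the next bit in `bit`; returns false when the data bits run out.
    bool read(unsigned& bit) {
        if (remaining_ == 0) return false;
        if (bits_ == 0) {
            int c = in_.get();
            if (c == std::char_traits<char>::eof()) return false;
            buf_ = static_cast<unsigned>(c);
            bits_ = 8;
        }
        --bits_;
        --remaining_;
        bit = (buf_ >> bits_) & 1u;
        return true;
    }

private:
    std::istream& in_;
    std::uint64_t remaining_;  // data bits not yet returned
    unsigned buf_ = 0;         // current byte
    unsigned bits_ = 0;        // unread bits left in buf_
};

// Packs a string of '0'/'1' characters, then unpacks it again.
std::string round_trip(const std::string& bit_text) {
    std::stringstream file(std::ios::in | std::ios::out | std::ios::binary);
    BitWriter writer(file);
    for (char ch : bit_text) writer.write(ch == '1' ? 1u : 0u, 1);
    std::uint64_t total = writer.finish();

    BitReader reader(file, total);
    std::string result;
    unsigned bit = 0;
    while (reader.read(bit)) result += bit ? '1' : '0';
    return result;
}
```

With a real file, the same classes would take a std::ofstream and std::ifstream opened with std::ios::binary. Because the reader stops after the recorded number of bits, the zero pad bits in the final byte are never mistaken for data.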