Problem description
I'm a bit confused by how zlib compresses a string held in a char array. Below is the output of the code as posted; what I noticed is that the compressed output is longer, in bytes, than the input string.
The uncompressed size was 8 bytes and the compressed size is 12? Am I reading this correctly?
Here is the code.
#include <stdio.h>
#include <string.h>
#include <assert.h>
#include <iostream>
#include "zlib.h"

// Print every byte of the buffer, including any trailing zero bytes.
void print( char *array, int length)
{
    for(int index = 0; index < length; index++)
        std::cout<<array[index];
    std::cout<<std::endl;
}

// Zero out the buffer.
void clear( char *array, int length)
{
    for(int index = 0; index < length; index++)
        array[index] = 0;
}

int main()
{
    const int length = 30;
    char a[length] = "HHHHHHH";   // original data
    char b[length] = "";          // compressed data
    char c[length] = "";          // decompressed data
    print( a, length);
    std::cout<<std::endl;

    uLong ucompSize = strlen(a)+1; // string plus its terminating NUL byte
    std::cout<<"ucompSize: "<<ucompSize<<std::endl;
    uLong compSize = compressBound(ucompSize); // worst-case compressed size
    std::cout<<"compSize: "<<compSize<<std::endl;
    std::cout<<std::endl;

    // Deflate: on return, compSize holds the actual compressed size
    compress((Bytef *)b, &compSize, (Bytef *)a, ucompSize);
    std::cout<<"ucompSize: "<<ucompSize<<std::endl;
    std::cout<<"compSize: "<<compSize<<std::endl;
    print( b, length);
    std::cout<<std::endl;

    // Inflate: on return, ucompSize holds the actual uncompressed size
    uncompress((Bytef *)c, &ucompSize, (Bytef *)b, compSize);
    std::cout<<"ucompSize: "<<ucompSize<<std::endl;
    std::cout<<"compSize: "<<compSize<<std::endl;
    print( c, length);
    return 0;
}
Here is the output.
HHHHHHH
ucompSize: 8
compSize: 21
ucompSize: 8
compSize: 12
x�� ��
ucompSize: 8
compSize: 12
HHHHHHH
Process returned 0 (0x0) execution time : 0.013 s
Press ENTER to continue.
Recommended answer
The compress() function uses the zlib format, which wraps the raw compressed (deflate) data in a two-byte header and a four-byte trailer (an Adler-32 checksum of the uncompressed data). Even if the raw compressed data is smaller than the original string, the wrapper adds six more bytes. For an empty input, no bytes at all, the raw compressed data is two bytes, so the minimum size of a zlib stream is eight bytes. Eight repeated input bytes can result in raw compressed data as short as four bytes, so the minimum zlib-wrapped result for that input is ten bytes.
In general, you need much larger inputs for lossless compression to be effective.