Problem description
I am currently developing a proprietary file format based on the PNG file format. I am done so far, except it doesn't work :-p The deflate decompressor I implemented works like a charm, but the PNG decoder doesn't want to perform nicely, so I took a look at the original PNG file.
The standard says that after an IDAT header, the compressed data follows immediately. So, as the data is a deflate stream, the first byte after IDAT is 0x78 == 01111000, which means a mode one block (uncompressed), and it's not the final one.
Strange though: it's hard for me to imagine that a PNG encoder doesn't use dynamic Huffman coding for compressing the filtered raw image data. The deflate standard says that the rest of the current byte is skipped in mode one.
So the next four bytes indicate the size of the uncompressed block and its one's complement. But 0x59FD is not the one's complement of 0xECDA. Even if I screwed up the byte ordering: 0xFD59 is not the one's complement of 0xDAEC either.
Well, the knockout byte follows right after. 0x97 would be the first byte of the uncompressed but still filtered raw PNG image data, and as such must be the filter type. But 0x97 == 10010111 is not a valid filter type. Even if I screwed up the bit packing order, 11101001 == 0xE9 is still not a valid filter type.
I didn't focus on RFC 1951 much anymore, as I am able to inflate all kinds of files so far using my implementation of the deflate decompressor, so I suspect some misunderstanding on my part concerning the PNG standard.
I read RFC 2083 over and over again, but the data I see here doesn't match the RFC. It doesn't make sense to me; there must be a missing piece!
When I look at the following bytes, I actually cannot see a valid filter type byte anywhere nearby, which makes me think that the filtered PNG data stream is compressed after all.
It would make sense if 0x78 (the first byte after IDAT) were read from MSB to LSB, but RFC 1951 says otherwise. Another idea (more likely to me) is that there is some data between the IDAT string and the start of the compressed deflate stream, but RFC 2083 says otherwise. The layout is clear:
4 bytes: size
4 bytes: chunk name (IDAT)
[size] bytes: compressed deflate stream
4 bytes: CRC checksum
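The chunk layout described here can be walked with a few lines of Python. This is only a minimal sketch: the function name `iter_chunks` is made up for illustration, it assumes the whole file is already in memory as bytes, and it uses the standard `struct` and `zlib` modules. The CRC covers the chunk name and the payload, but not the size field.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

def iter_chunks(data):
    """Yield (chunk_name, payload) for every chunk in a PNG byte string."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(data):
        # 4 bytes big-endian size, then the 4-byte chunk name
        size, name = struct.unpack(">I4s", data[pos:pos + 8])
        payload = data[pos + 8:pos + 8 + size]
        # 4 bytes big-endian CRC over name + payload
        (crc,) = struct.unpack(">I", data[pos + 8 + size:pos + 12 + size])
        if zlib.crc32(name + payload) & 0xFFFFFFFF != crc:
            raise ValueError("bad CRC in chunk %r" % name)
        yield name, payload
        pos += 12 + size
```

Joining the payloads of all IDAT chunks gives a single stream, which is what the decompressor should be fed; its first bytes are the ones under discussion here.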
So the first byte after IDAT must be the first byte of the compressed deflate stream, which indicates a mode 1 uncompressed data block. Which means that 0x97 must be the first byte of uncompressed but filtered PNG image data, which means 0x97 is the filter type for the first row, which is invalid...
I just don't get it, am I stupid or what??
Summary:
Possibility 1: There is some other data between IDAT and the effective start of the compressed deflate stream which, if that turns out to be true, is mentioned neither in RFC 2083 nor in any book I have read about image compression.
Possibility 2: The number 0x78 is interpreted MSB -> LSB, which would indicate a mode 3 block (dynamic Huffman coding), but this contradicts RFC 1951, which is very clear about bit packing (LSB -> MSB).
I know already that the missing piece must be something very stupid, and I will feel the urgent need to sell my soul if only there were a delete button on Stack Overflow :-p
Recommended answer
Two corrections which may help you on your way:
- The number of bytes in the zlib flags is 2, not 1; see RFC 1950. The first is CMF, the next FLG.
In your data:
       78        DA
    ---CMF--- ---FLG---
    0111.1000 1101.1010
    CINF -CM- +-||
              | |+- FCHECK
              | +-- FDICT
              +---- FLEVEL
CINF (called CINFO in RFC 1950) is 7, indicating the standard 32 KiB compression window.
CM is 8, indicating the compression algorithm is, indeed, DEFLATE.
FCHECK is just a checksum; I didn't check if it's correct (but I'd bet it is).
FDICT is clear, meaning there is no preset dictionary stored.
FLEVEL is 3, indicating Maximum Compression.
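The field breakdown above can be reproduced with a short Python sketch. The function name `parse_zlib_header` and the dictionary keys are made up for illustration; the bit positions and the rule that CMF*256 + FLG must be a multiple of 31 come from RFC 1950.

```python
def parse_zlib_header(cmf, flg):
    """Split the two zlib header bytes into their RFC 1950 fields."""
    cinfo = cmf >> 4
    return {
        "CM": cmf & 0x0F,                 # low nibble: compression method, 8 = DEFLATE
        "CINFO": cinfo,                   # high nibble: log2(window size) - 8
        "window_size": 1 << (cinfo + 8),  # only meaningful when CM == 8
        "FCHECK": flg & 0x1F,             # bits 0-4: check bits
        "FDICT": (flg >> 5) & 1,          # bit 5: preset dictionary flag
        "FLEVEL": flg >> 6,               # bits 6-7: compression level hint
        # RFC 1950: CMF*256 + FLG must be a multiple of 31
        "check_ok": ((cmf << 8) | flg) % 31 == 0,
    }

fields = parse_zlib_header(0x78, 0xDA)
```

For the bytes 78 DA the check passes, since 0x78DA = 30938 = 31 * 998.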
See also "Trying to understand zlib/deflate in PNG files", esp. Dr. Adler's answer.
- LEN and NLEN are only set for uncompressed blocks; that's why you didn't find them. (Also, partially, because you were looking at the wrong byte(s).)
The next byte in the stream is EC; bitwise, this is 1110 1100, but remember to read bits from low to high. So the next bit read is 0, meaning not FINAL, and the next 2 bits read are 0 and then 1. Since the first bit read is the least significant one, that gives BTYPE = 10 in binary, indicating a regular dynamic Huffman encoded data block.
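The low-to-high bit order can be checked with a small Python sketch. `BitReader`, `BLOCK_TYPES` and `read_block_header` are hypothetical names, not part of any library; the packing rule (least significant bit first, with the first bit read becoming the lowest bit of the value) and the BTYPE codes are those of RFC 1951.

```python
class BitReader:
    """Reads bits from a byte string, least significant bit of each byte first."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # absolute bit position in the stream

    def read_bits(self, n):
        value = 0
        for i in range(n):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (self.pos & 7)) & 1
            value |= bit << i  # the first bit read is the lowest bit of the result
            self.pos += 1
        return value

BLOCK_TYPES = {0: "stored", 1: "fixed Huffman", 2: "dynamic Huffman", 3: "reserved"}

def read_block_header(reader):
    """Return (BFINAL, block type name) for the deflate block starting at the reader."""
    bfinal = reader.read_bits(1)
    btype = reader.read_bits(2)
    return bfinal, BLOCK_TYPES[btype]

stream = bytes([0x78, 0xDA, 0xEC, 0xFD, 0x59])
# Skip the two zlib header bytes (CMF, FLG) before decoding deflate blocks.
header = read_block_header(BitReader(stream[2:]))
```

Feeding 0x78 to the same reader instead yields a non-final stored block, which is exactly the misreading in the question.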