H.264 frames encoded with NVIDIA NVENC (Media Foundation) are not decoded correctly with VideoToolbox

Problem description

I am facing the same problem as described here when trying to decode a frame on an iPad Pro running iPadOS 14.3 (I am also using Olivia Stork's example):

Only 25% of the picture data is decoded correctly; the rest of the picture is just green.

The decoded image on the iPad Pro looks like this (the image was converted and saved in the decoder callback as described here, so it is not just a display problem).

The original image looks like this.

The image is encoded with NVIDIA NVENC (Media Foundation) on Windows 10.

I searched the frame picture data for additional 4-byte NALU start codes as described in the link, but there are only the three expected ones for the SPS, PPS and IDR picture data.
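A search like this is easy to get wrong if it only looks for the 4-byte form. As a minimal sketch (the function name `find_start_codes` is made up for illustration and is not taken from any of the linked examples), the following scans a buffer for both 3-byte and 4-byte Annex B start codes and reports the NAL unit type that follows each one. It assumes that no NAL unit ends in a zero byte, so a zero directly in front of `00 00 01` is counted as part of a 4-byte start code.

```python
def find_start_codes(data: bytes):
    """Return (offset, start_code_length, nal_unit_type) for each start code."""
    hits = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        # 00 00 01 preceded by another zero byte is treated as a 4-byte code.
        four = i > 0 and data[i - 1] == 0
        header = i + 3  # index of the 1-byte NAL header
        if header < len(data):
            nal_type = data[header] & 0x1F  # low 5 bits of the NAL header
            hits.append((i - 1 if four else i, 4 if four else 3, nal_type))
        i = data.find(b"\x00\x00\x01", i + 3)
    return hits


if __name__ == "__main__":
    # Hypothetical frame: SPS (7), PPS (8), then two IDR slices (5),
    # the second one introduced by a 3-byte start code.
    frame = bytes.fromhex(
        "00000001 67aa 00000001 68bb 00000001 651122 000001 6533"
    )
    for offset, length, nal_type in find_start_codes(frame):
        print(f"offset={offset} start_code_bytes={length} nal_type={nal_type}")
```

Several entries of type 5 (or 1) in one frame mean the picture was encoded as multiple slices.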

I have another Media Foundation decoder application running on Windows 10 which decodes the frames from exactly the same source correctly.

I have been struggling for days now to find the cause of the problem. Does anyone have any ideas?

Thanks in advance. Rob

- Edit 2021-01-11:

I figured out that there are actually three additional 3-byte start codes (0x000001) within the IDR picture data block of NALU type 5.

I tried to replace these start codes with the length of the following data block (big endian), as described here, but with the same result.

I also tried adding Emulation Prevention Bytes (0x000001 => 0x000301) as described here, but that also made no difference.

Maybe I am misled and these start codes have nothing to do with the issue. At least they are not just random image data, because they always appear at the same position (index) in the picture data block. I am running out of ideas. Any hints, anybody?

- Edit 2021-01-14:

I figured out a few things:

Out of sheer lack of ideas I copied the picture data following the last start code to the beginning of the block (right after the 4-byte NALU start code). I had expected, if that would work at all, to see the last quarter of the original image at the top of the decoded image, but to my surprise the decoded image looked like this.

I tried the same with the picture data coming after the second and third start codes, and the decoded images looked like this and this: the image data is decoded correctly and it is even at the correct position (compare to the original image).

Even if I strip off all 3-byte start codes and copy the concatenated picture data after the 4-byte start code, the result is the same: only 25% of the image is decoded. So the additional 3-byte start codes are apparently not the problem. There must be some setting somewhere that tells the decoder to decode only 25% of the image. My guess would be the CMVideoFormatDescription, but as far as I can see it looks okay.

I am also wondering how the decoder knows where to display the different picture data blocks. Either there is an offset defined somewhere within the picture data, or the encoder somehow adds the xy-position of every pixel.

Recommended answer

I managed to find the cause of the problem: the 3-byte start codes in the IDR picture data block must be replaced by 4-byte start codes.

So first replace all 3-byte start codes with 4-byte start codes. Then replace each 4-byte start code with the length of the following data block (big endian). The slices should be arranged like this (as mentioned here by 'Blackie'):

[4-byte slice1 size] [slice1 data] [4-byte slice2 size] [slice2 data] ... [4-byte slice4 size] [slice4 data]

Remember not to include the start code length in the slice size.
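The two steps can be collapsed into one pass. The following is a minimal sketch, not the code from the answer: the function name `annexb_to_length_prefixed` is hypothetical, and it assumes the buffer holds only the NAL units that belong in the sample data, each introduced by a 3-byte or 4-byte start code. Every start code is dropped and replaced by a 4-byte big-endian length that counts only the NAL unit bytes.

```python
import re
import struct

# Matches 00 00 01 as well as 00 00 00 01 (two or more zero bytes, then 01).
START_CODE = re.compile(rb"\x00\x00+\x01")


def annexb_to_length_prefixed(data: bytes) -> bytes:
    """Replace every start code with a 4-byte big-endian NAL unit length."""
    matches = list(START_CODE.finditer(data))
    out = bytearray()
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(data)
        nal = data[current.end():end]        # NAL unit without its start code
        out += struct.pack(">I", len(nal))   # size excludes the start code
        out += nal
    return bytes(out)


if __name__ == "__main__":
    # Two hypothetical slices: one after a 4-byte, one after a 3-byte start code.
    frame = bytes.fromhex("00000001 65aabb 000001 65cc")
    print(annexb_to_length_prefixed(frame).hex(" "))
```

Because the length is computed from the NAL unit bytes alone, it does not matter whether a slice originally had a 3-byte or a 4-byte start code.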

After changing that, my frame was displayed completely.

By the way: the information about where to display the different picture data blocks is specified in the header data of each slice NALU (parameter 'first_mb_in_slice').
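In H.264, 'first_mb_in_slice' is the first field of the slice header and is stored as an unsigned Exp-Golomb code, ue(v), directly after the 1-byte NAL header. A minimal sketch of reading it follows. This is not the C# example referred to in this answer, the helper names are made up for illustration, and the input is assumed to be one NAL unit with its start code or length prefix already removed.

```python
def read_ue(data: bytes, bit_pos: int = 0):
    """Decode one unsigned Exp-Golomb value; return (value, next_bit_pos)."""
    def bit(pos: int) -> int:
        return (data[pos >> 3] >> (7 - (pos & 7))) & 1

    zeros = 0
    while bit(bit_pos) == 0:   # count the leading zero bits
        zeros += 1
        bit_pos += 1
    bit_pos += 1               # skip the terminating 1 bit
    suffix = 0
    for _ in range(zeros):     # read as many suffix bits as leading zeros
        suffix = (suffix << 1) | bit(bit_pos)
        bit_pos += 1
    return (1 << zeros) - 1 + suffix, bit_pos


def first_mb_in_slice(nal: bytes) -> int:
    """Return first_mb_in_slice of a coded slice NAL unit (type 1 or 5)."""
    nal_type = nal[0] & 0x1F
    if nal_type not in (1, 5):
        raise ValueError("not a coded slice NAL unit")
    # Drop emulation prevention bytes (00 00 03 -> 00 00) before reading bits.
    rbsp = nal[1:].replace(b"\x00\x00\x03", b"\x00\x00")
    value, _ = read_ue(rbsp)
    return value


if __name__ == "__main__":
    # Hypothetical IDR slice whose header starts with a single 1 bit.
    print(first_mb_in_slice(bytes([0x65, 0x88])))
```

The value is a macroblock address in raster order, so a slice that starts a quarter of the way down the picture carries a value of roughly a quarter of the total macroblock count.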

There is a very good C# example here of how to extract the NALU header data. You can copy it almost 1:1.

