I managed to create an application that receives a live H.264-encoded video stream and then decodes and displays the video using Video Toolbox and an AVSampleBufferDisplayLayer. This works as expected, but I want to be able to apply filters to the rendered output, so I switched to decoding with Video Toolbox and displaying/rendering the decoded video with MetalKit. My only problem is that the output rendered with MetalKit is noticeably blurrier than the output coming through the AVSampleBufferDisplayLayer, and I have not been able to figure out why.
Here is the output from the AVSampleBufferDisplayLayer:
Here is the output from MetalKit:
I tried skipping MetalKit and rendering directly to a CAMetalLayer, but the same problem persists. I am also trying to convert the CVImageBufferRef into a UIImage that I can display with a UIView; if that turns out blurry as well, the problem is probably in my VTDecompressionSession rather than on the Metal side.
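A minimal sketch of that conversion, using Core Image purely as a quick way to eyeball the decoder output; pixelBuffer stands for the CVImageBufferRef handed back by the decompression session, and UIImageFromPixelBuffer is a hypothetical helper name:

#import <CoreImage/CoreImage.h>
#import <UIKit/UIKit.h>

// Debug helper only: renders a decoded pixel buffer into a UIImage so it can be
// shown in a plain UIImageView and compared against the Metal output.
static UIImage *UIImageFromPixelBuffer(CVPixelBufferRef pixelBuffer)
{
    CIImage *ciImage = [CIImage imageWithCVPixelBuffer:pixelBuffer];
    CIContext *context = [CIContext contextWithOptions:nil];
    CGImageRef cgImage = [context createCGImage:ciImage fromRect:[ciImage extent]];
    UIImage *image = [UIImage imageWithCGImage:cgImage];
    CGImageRelease(cgImage);
    return image;
}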
The decoding part is very similar to How to use VideoToolbox to decompress H.264 video stream.
I will try to paste only the interesting snippets of the code.
These are the options I give to my VTDecompressionSession:
NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithInteger:kCVPixelFormatType_420YpCbCr8BiPlanarVideoRange],
        (id)kCVPixelBufferPixelFormatTypeKey,
        nil];
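For context, this is roughly how those attributes get passed to VTDecompressionSessionCreate; formatDescription (the CMVideoFormatDescriptionRef built from the SPS/PPS), decompressionCallback, and streamView are placeholders that do not appear above, and a sketch of the callback itself follows the StreamView interface below:

VTDecompressionOutputCallbackRecord callbackRecord;
callbackRecord.decompressionOutputCallback = decompressionCallback;
callbackRecord.decompressionOutputRefCon = (__bridge void *)streamView; // the view that will render the frames

VTDecompressionSessionRef session = NULL;
OSStatus status = VTDecompressionSessionCreate(kCFAllocatorDefault,
                                               formatDescription,
                                               NULL, // let Video Toolbox pick the decoder
                                               (__bridge CFDictionaryRef)destinationImageBufferAttributes,
                                               &callbackRecord,
                                               &session);
if (status != noErr) {
    NSLog(@"VTDecompressionSessionCreate failed with status %d", (int)status);
}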
This is my view, which subclasses MTKView:
@interface StreamView : MTKView
@property id<MTLCommandQueue> commandQueue;
@property id<MTLBuffer> vertexBuffer;
@property id<MTLBuffer> colorConversionBuffer;
@property id<MTLRenderPipelineState> pipeline;
@property CVMetalTextureCacheRef textureCache;
@property CFMutableArrayRef imageBuffers;
-(id)initWithRect:(CGRect)rect withDelay:(int)delayInFrames;
-(void)addToRenderQueue:(CVPixelBufferRef)image renderAt:(int)frame;
@end
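Not shown above is how decoded frames reach addToRenderQueue:renderAt:. A hedged sketch of that glue, assuming the VTDecompressionSession output callback simply forwards every frame to the view; the frameCounter bookkeeping is a simplification:

static void decompressionCallback(void *decompressionOutputRefCon,
                                  void *sourceFrameRefCon,
                                  OSStatus status,
                                  VTDecodeInfoFlags infoFlags,
                                  CVImageBufferRef imageBuffer,
                                  CMTime presentationTimeStamp,
                                  CMTime presentationDuration)
{
    if (status != noErr || imageBuffer == NULL) {
        return;
    }
    // The refCon was set to the StreamView when the session was created (see the sketch above).
    StreamView *streamView = (__bridge StreamView *)decompressionOutputRefCon;
    static int frameCounter = 0;
    [streamView addToRenderQueue:imageBuffer renderAt:frameCounter++];
}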
This is how I initialize the view from the view controller. The video I receive has the same size, i.e. 666x374.
streamView = [[StreamView alloc] initWithRect:CGRectMake(0, 0, 666, 374) withDelay:0];
[self.view addSubview:streamView];
This is the content of StreamView's initWithRect method:
id<MTLDevice> device = MTLCreateSystemDefaultDevice();
self = [super initWithFrame:rect device:device];
self.colorPixelFormat = MTLPixelFormatBGRA8Unorm;
self.commandQueue = [self.device newCommandQueue];
[self buildTextureCache];
[self buildPipeline];
[self buildVertexBuffers];
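buildTextureCache and buildVertexBuffers are called here but not shown above. A minimal sketch of what they might look like; the interleaved float4-position / float2-texcoord vertex layout is an assumption and has to match whatever vertex_main actually expects:

- (void)buildTextureCache
{
    CVReturn status = CVMetalTextureCacheCreate(kCFAllocatorDefault, NULL, self.device, NULL, &_textureCache);
    if (status != kCVReturnSuccess) {
        NSLog(@"CVMetalTextureCacheCreate failed with status %d", status);
    }
}

- (void)buildVertexBuffers
{
    // Full-screen quad drawn as a 4-vertex triangle strip:
    // x, y, z, w, u, v
    static const float quad[] = {
        -1.0, -1.0, 0.0, 1.0,   0.0, 1.0,
        -1.0,  1.0, 0.0, 1.0,   0.0, 0.0,
         1.0, -1.0, 0.0, 1.0,   1.0, 1.0,
         1.0,  1.0, 0.0, 1.0,   1.0, 0.0,
    };
    self.vertexBuffer = [self.device newBufferWithBytes:quad
                                                 length:sizeof(quad)
                                                options:MTLResourceStorageModeShared];
}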
This is the buildPipeline method:
- (void)buildPipeline
{
    NSBundle *bundle = [NSBundle bundleForClass:[self class]];
    id<MTLLibrary> library = [self.device newDefaultLibraryWithBundle:bundle error:NULL];
    id<MTLFunction> vertexFunc = [library newFunctionWithName:@"vertex_main"];
    id<MTLFunction> fragmentFunc = [library newFunctionWithName:@"fragment_main"];
    MTLRenderPipelineDescriptor *pipelineDescriptor = [MTLRenderPipelineDescriptor new];
    pipelineDescriptor.vertexFunction = vertexFunc;
    pipelineDescriptor.fragmentFunction = fragmentFunc;
    pipelineDescriptor.colorAttachments[0].pixelFormat = self.colorPixelFormat;
    self.pipeline = [self.device newRenderPipelineStateWithDescriptor:pipelineDescriptor error:NULL];
}
This is how I actually draw the texture:
CVImageBufferRef image = (CVImageBufferRef)CFArrayGetValueAtIndex(_imageBuffers, 0);
id<MTLTexture> textureY = [self getTexture:image pixelFormat:MTLPixelFormatR8Unorm planeIndex:0];
id<MTLTexture> textureCbCr = [self getTexture:image pixelFormat:MTLPixelFormatRG8Unorm planeIndex:1];
if (textureY == NULL || textureCbCr == NULL)
    return;
id<CAMetalDrawable> drawable = self.currentDrawable;
id<MTLCommandBuffer> commandBuffer = [_commandQueue commandBuffer];
MTLRenderPassDescriptor *renderPass = self.currentRenderPassDescriptor;
renderPass.colorAttachments[0].clearColor = MTLClearColorMake(0.5, 1, 0.5, 1);
id<MTLRenderCommandEncoder> commandEncoder = [commandBuffer renderCommandEncoderWithDescriptor:renderPass];
[commandEncoder setRenderPipelineState:self.pipeline];
[commandEncoder setVertexBuffer:self.vertexBuffer offset:0 atIndex:0];
[commandEncoder setFragmentTexture:textureY atIndex:0];
[commandEncoder setFragmentTexture:textureCbCr atIndex:1];
[commandEncoder setFragmentBuffer:_colorConversionBuffer offset:0 atIndex:0];
[commandEncoder drawPrimitives:MTLPrimitiveTypeTriangleStrip vertexStart:0 vertexCount:4 instanceCount:1];
[commandEncoder endEncoding];
[commandBuffer presentDrawable:drawable];
[commandBuffer commit];
This is how I convert the CVPixelBufferRef into an MTLTexture:
- (id<MTLTexture>)getTexture:(CVPixelBufferRef)image pixelFormat:(MTLPixelFormat)pixelFormat planeIndex:(int)planeIndex {
    id<MTLTexture> texture;
    size_t width, height;
    if (planeIndex == -1)
    {
        width = CVPixelBufferGetWidth(image);
        height = CVPixelBufferGetHeight(image);
        planeIndex = 0;
    }
    else
    {
        width = CVPixelBufferGetWidthOfPlane(image, planeIndex);
        height = CVPixelBufferGetHeightOfPlane(image, planeIndex);
        NSLog(@"texture %d, %zu, %zu", planeIndex, width, height);
    }
    CVMetalTextureRef textureRef = NULL;
    CVReturn status = CVMetalTextureCacheCreateTextureFromImage(NULL, _textureCache, image, NULL, pixelFormat, width, height, planeIndex, &textureRef);
    if (status == kCVReturnSuccess)
    {
        texture = CVMetalTextureGetTexture(textureRef);
        CFRelease(textureRef);
    }
    else
    {
        NSLog(@"CVMetalTextureCacheCreateTextureFromImage failed with return status %d", status);
        return NULL;
    }
    return texture;
}
This is my fragment shader:
fragment float4 fragment_main(Varyings in [[ stage_in ]],
                              texture2d<float, access::sample> textureY [[ texture(0) ]],
                              texture2d<float, access::sample> textureCbCr [[ texture(1) ]],
                              constant ColorConversion &colorConversion [[ buffer(0) ]])
{
    constexpr sampler s(address::clamp_to_edge, filter::linear);
    float3 ycbcr = float3(textureY.sample(s, in.texcoord).r, textureCbCr.sample(s, in.texcoord).rg);
    float3 rgb = colorConversion.matrix * (ycbcr + colorConversion.offset);
    return float4(rgb, 1.0);
}
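Neither the ColorConversion struct nor the contents of _colorConversionBuffer are shown above. A sketch of one possible setup, assuming BT.601 video-range coefficients and a simd layout that matches a Metal-side struct { float3x3 matrix; float3 offset; }; buildColorConversionBuffer is a hypothetical helper on the StreamView:

#import <simd/simd.h>

typedef struct {
    matrix_float3x3 matrix;
    vector_float3 offset;
} ColorConversion;

- (void)buildColorConversionBuffer
{
    // BT.601 video-range YCbCr -> RGB; columns hold the Y, Cb and Cr contributions,
    // and the offset removes the 16/255 luma bias and re-centers chroma.
    ColorConversion conversion = {
        .matrix = {
            .columns[0] = { 1.164f,  1.164f, 1.164f },
            .columns[1] = { 0.0f,   -0.392f, 2.017f },
            .columns[2] = { 1.596f, -0.813f, 0.0f   },
        },
        .offset = { -16.0f / 255.0f, -0.5f, -0.5f },
    };
    self.colorConversionBuffer = [self.device newBufferWithBytes:&conversion
                                                          length:sizeof(conversion)
                                                         options:MTLResourceStorageModeShared];
}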
Since both the view and the video I encode are 666x374, I tried changing the sampling type in the fragment shader to filter::nearest. I thought it would match the pixels 1:1, but it was still blurry. Another thing I noticed is that if you open the uploaded images in a new tab, you can see they are much larger than 666x374. I doubt I made a mistake on the encoding side, and even if I did, the AVSampleBufferDisplayLayer still manages to display the video without blurring, so it must be doing something right that I am missing.
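One way to see whether the view scale is involved (screenshots larger than 666x374 would be consistent with the drawable being contentScaleFactor times the frame, which is specified in points) is to log the drawable size against the bounds. A hypothetical check, with streamView as created above; the final assignment only matters if the layer did not pick up the screen scale on its own:

NSLog(@"bounds %@, contentScaleFactor %f, drawableSize %@",
      NSStringFromCGRect(streamView.bounds),
      streamView.contentScaleFactor,
      NSStringFromCGSize(streamView.drawableSize));

// A bare CAMetalLayer defaults its contentsScale to 1.0; matching the view's scale
// to the screen's native scale keeps one drawable pixel per device pixel.
streamView.contentScaleFactor = [UIScreen mainScreen].nativeScale;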
Best answer
It looks like you have the most serious issue, the view scale, already worked out; the remaining issues are proper YCbCr rendering (which it sounds like you will avoid by outputting BGRA pixels when decoding) and then scaling the original movie to match the dimensions of the view. When you request BGRA pixel data, the data is encoded as sRGB, so you should treat the data in the texture as sRGB. Metal will automatically do the non-linear to linear conversion for you when reading from an sRGB texture, but you have to tell Metal that it is sRGB pixel data (by using MTLPixelFormatBGRA8Unorm_sRGB). To implement the scaling, all you need to do is render from the BGRA data into the view with linear resampling. If you want to have a look at the source code of MetalBT709Decoder, see the SO question linked above; it is my own project that implements proper rendering of BT.709.
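A minimal sketch of the two changes described here, under the assumption that switching the decoder output to BGRA is acceptable (it gives up the two-plane YCbCr shader path): request kCVPixelFormatType_32BGRA from the VTDecompressionSession, and mark both the view and the texture as sRGB so Metal linearizes on sampling.

// Decoder side: ask for BGRA output instead of biplanar YCbCr.
NSDictionary *destinationImageBufferAttributes = [NSDictionary dictionaryWithObjectsAndKeys:
        [NSNumber numberWithInteger:kCVPixelFormatType_32BGRA],
        (id)kCVPixelBufferPixelFormatTypeKey,
        nil];

// StreamView init: tell Metal the drawable holds sRGB-encoded pixels.
self.colorPixelFormat = MTLPixelFormatBGRA8Unorm_sRGB;

// Draw path: the single BGRA plane becomes one texture (planeIndex -1 in the
// existing getTexture: method), also tagged as sRGB.
id<MTLTexture> textureBGRA = [self getTexture:image pixelFormat:MTLPixelFormatBGRA8Unorm_sRGB planeIndex:-1];

The fragment shader then no longer needs the YCbCr matrix and can simply return the sampled color, while the sampler's linear filter handles the scaling to the view size.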