MF SinkWriter: the MP4 file plays back in half the expected time once audio samples are added, and the images play twice as fast

Problem description

I created a managed C++ library for my C# project to encode images and audio into an MP4 container, based on the MSDN Sink Writer tutorial. To test whether the result is correct, I created a method that provides 600 frames. These frames represent a 10-second video at 60 frames per second.

The images I provide change every second, and my audio file contains a voice that counts to 10.

The problem I am facing is that the output video is actually only 5 seconds long. The video's metadata reports a duration of 10 seconds, but playback does not last that long, and the voice barely counts up to 5.

If I write only the image samples, without the audio part, the video has the expected duration of 10 seconds.

What am I missing here?

Here are some parts of my application.

This is the C# part I use to create the 600 frames; the PushFrame method is also called from the C# side.

var videoFrameCount = 10 * FPS;
SetBinaryImage();

for (int i = 0; i < videoFrameCount; i++)
{
    // New picture every second
    if (i > 0 &&  i % FPS == 0)
    {
        SetBinaryImage();
    }

    PushFrame();
}

The PushFrame method copies the image and audio data to the pointers provided by the SinkWriter. Then I call the PushFrame method of the SinkWriter.

private void PushFrame()
{
    try
    {
        encodeStopwatch.Reset();
        encodeStopwatch.Start();

        // Video
        var frameBufferHandler = GCHandle.Alloc(frameBuffer, GCHandleType.Pinned);
        frameBufferPtr = frameBufferHandler.AddrOfPinnedObject();
        CopyImageDataToPointer(BinaryImage, ScreenWidth, ScreenHeight, frameBufferPtr);

        // Audio
        var audioBufferHandler = GCHandle.Alloc(audioBuffer, GCHandleType.Pinned);
        audioBufferPtr = audioBufferHandler.AddrOfPinnedObject();
        var readLength = audioBuffer.Length;

        if (BinaryAudio.Length - (audioOffset + audioBuffer.Length) < 0)
        {
            readLength = BinaryAudio.Length - audioOffset;
        }

        if (!EndOfFile)
        {
            Marshal.Copy(BinaryAudio, audioOffset, (IntPtr)audioBufferPtr, readLength);
            audioOffset += audioBuffer.Length;

        }

        if (readLength < audioBuffer.Length && !EndOfFile)
        {
            EndOfFile = true;
        }

        unsafe
        {
            // Copy video data
            var yuv = SinkWriter.VideoCapturerBuffer();
            SinkWriter.Encode((byte*)frameBufferPtr, ScreenWidth, ScreenHeight, (int)SWPF.SWPF_RGB, yuv);

            // Copy audio data
            var audioDestPtr = SinkWriter.AudioCapturerBuffer();
            SinkWriter.EncodeAudio((byte*)audioBufferPtr, audioDestPtr);

            SinkWriter.PushFrame();
        }

        // Release the pinned handles; the native side has already copied the data.
        frameBufferHandler.Free();
        audioBufferHandler.Free();

        encodeStopwatch.Stop();
        Console.WriteLine($"YUV frame generated in: {encodeStopwatch.TakeTotalMilliseconds()} ms");
    }
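    catch (Exception ex)
    {
        // Report the failure instead of silently swallowing it.
        Console.WriteLine($"PushFrame failed: {ex}");
    }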
}

Here are some parts I added to the SinkWriter in C++. I assume the media types for the audio part are fine, because audio playback works.

rtStart and rtDuration are defined like this:

LONGLONG rtStart = 0;
UINT64 rtDuration;
MFFrameRateToAverageTimePerFrame(fps, 1, &rtDuration);
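
For reference, Media Foundation expresses sample times and durations in 100-nanosecond units. A minimal sketch of the arithmetic behind rtStart and rtDuration for a fixed frame rate follows. The helper names frameDurationHns and frameStartHns are hypothetical, and the value returned by MFFrameRateToAverageTimePerFrame may be rounded differently from the plain integer division used here.

```cpp
#include <cstdint>

// Number of 100-nanosecond units in one second.
constexpr std::int64_t kHnsPerSecond = 10000000;

// Duration of one video frame in 100-ns units (hypothetical helper,
// plain integer division, so the result is truncated).
std::int64_t frameDurationHns(std::uint32_t fps)
{
    return kHnsPerSecond / fps;
}

// Start time of the frame with the given index in 100-ns units.
// Computed from the index so that truncation does not accumulate.
std::int64_t frameStartHns(std::uint32_t frameIndex, std::uint32_t fps)
{
    return static_cast<std::int64_t>(frameIndex) * kHnsPerSecond / fps;
}
```

At 60 fps this gives 166666 units per frame. Adding that truncated duration 600 times yields 99,999,600 units, which is 40 microseconds short of 10 seconds, whereas deriving each start time from the frame index avoids the drift.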

The two buffers from the encoders are used like this:

int SinkWriter::Encode(Byte * rgbBuf, int w, int h, int pxFormat, Byte * yufBuf)
{
    const LONG cbWidth = 4 * VIDEO_WIDTH;
    const DWORD cbBuffer = cbWidth * VIDEO_HEIGHT;

    // Create a new memory buffer.
    HRESULT hr = MFCreateMemoryBuffer(cbBuffer, &pFrameBuffer);

    // Lock the buffer and copy the video frame to the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pFrameBuffer->Lock(&yufBuf, NULL, NULL);
    }

    if (SUCCEEDED(hr))
    {
        // Calculate the stride
        DWORD bitsPerPixel = GetBitsPerPixel(pxFormat);
        DWORD bytesPerPixel = bitsPerPixel / 8;
        DWORD stride = w * bytesPerPixel;

        // Copy image in yuv pointer
        hr = MFCopyImage(
            yufBuf,                      // Destination buffer.
            stride,                    // Destination stride.
            rgbBuf,     // First row in source image.
            stride,                    // Source stride.
            stride,                    // Image width in bytes.
            h                // Image height in pixels.
        );
    }

    if (pFrameBuffer)
    {
        pFrameBuffer->Unlock();
    }

    // Set the data length of the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pFrameBuffer->SetCurrentLength(cbBuffer);
    }

    return SUCCEEDED(hr) ? 0 : -1;
}

int SinkWriter::EncodeAudio(Byte * src, Byte * dest)
{
    DWORD samplePerSecond = AUDIO_SAMPLES_PER_SECOND * AUDIO_BITS_PER_SAMPLE * AUDIO_NUM_CHANNELS;
    DWORD cbBuffer = samplePerSecond / 1000;

    // Create a new memory buffer.
    HRESULT hr = MFCreateMemoryBuffer(cbBuffer, &pAudioBuffer);

    // Lock the buffer and copy the audio data to the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pAudioBuffer->Lock(&dest, NULL, NULL);
    }

    CopyMemory(dest, src, cbBuffer);

    if (pAudioBuffer)
    {
        pAudioBuffer->Unlock();
    }

    // Set the data length of the buffer.
    if (SUCCEEDED(hr))
    {
        hr = pAudioBuffer->SetCurrentLength(cbBuffer);
    }

    return SUCCEEDED(hr) ? 0 : -1;
}

This is the PushFrame method of the SinkWriter, which passes the sink writer, streamIndex, audioIndex, rtStart and rtDuration to the WriteFrame method.

int SinkWriter::PushFrame()
{
    if (initialized)
    {
        HRESULT hr = WriteFrame(ptrSinkWriter, stream, audio, rtStart, rtDuration);
        if (FAILED(hr))
        {
            return -1;
        }

        rtStart += rtDuration;

        return 0;
    }

    return -1;
}

And here is the WriteFrame method, which writes the video and audio samples.

HRESULT SinkWriter::WriteFrame(IMFSinkWriter *pWriter, DWORD streamIndex, DWORD audioStreamIndex, const LONGLONG& rtStart, const LONGLONG& rtDuration)
{
    IMFSample *pVideoSample = NULL;

    // Create a media sample and add the buffer to the sample.
    HRESULT hr = MFCreateSample(&pVideoSample);

    if (SUCCEEDED(hr))
    {
        hr = pVideoSample->AddBuffer(pFrameBuffer);
    }
    if (SUCCEEDED(hr))
    {
        pVideoSample->SetUINT32(MFSampleExtension_Discontinuity, FALSE);
    }
    // Set the time stamp and the duration.
    if (SUCCEEDED(hr))
    {
        hr = pVideoSample->SetSampleTime(rtStart);
    }
    if (SUCCEEDED(hr))
    {
        hr = pVideoSample->SetSampleDuration(rtDuration);
    }

    // Send the sample to the Sink Writer.
    if (SUCCEEDED(hr))
    {
        hr = pWriter->WriteSample(streamIndex, pVideoSample);
    }

    // Audio
    IMFSample *pAudioSample = NULL;

    if (SUCCEEDED(hr))
    {
        hr = MFCreateSample(&pAudioSample);
    }

    if (SUCCEEDED(hr))
    {
        hr = pAudioSample->AddBuffer(pAudioBuffer);
    }

    // Set the time stamp and the duration.
    if (SUCCEEDED(hr))
    {
        hr = pAudioSample->SetSampleTime(rtStart);
    }
    if (SUCCEEDED(hr))
    {
        hr = pAudioSample->SetSampleDuration(rtDuration);
    }
    // Send the sample to the Sink Writer.
    if (SUCCEEDED(hr))
    {
        hr = pWriter->WriteSample(audioStreamIndex, pAudioSample);
    }


    SafeRelease(&pVideoSample);
    SafeRelease(&pFrameBuffer);
    SafeRelease(&pAudioSample);
    SafeRelease(&pAudioBuffer);
    return hr;
}

Recommended answer

The problem was that the buffer size for the audio was calculated incorrectly. This is the correct calculation:

var avgBytesPerSecond = sampleRate * 2 * channels;
var avgBytesPerMillisecond = avgBytesPerSecond / 1000;
var bufferSize = avgBytesPerMillisecond * (1000 / 60);
audioBuffer = new byte[bufferSize];
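
If sampleRate and channels are integers, the expression above truncates twice: avgBytesPerSecond / 1000 drops the fractional bytes per millisecond, and 1000 / 60 evaluates to 16 rather than 16.67. The following sketch, written in C++ to match the SinkWriter side, computes the per-frame size directly from the sample rate instead. It assumes uncompressed PCM input and a sample size that is a whole number of bytes; both function names are hypothetical.

```cpp
#include <cstdint>

// Bytes of PCM audio that cover one video frame, as a whole number of
// sample blocks (one block = one sample for every channel).
std::uint32_t audioBytesPerFrame(std::uint32_t sampleRate,
                                 std::uint32_t bitsPerSample,
                                 std::uint32_t channels,
                                 std::uint32_t fps)
{
    const std::uint32_t blockAlign = channels * (bitsPerSample / 8);
    const std::uint32_t blocksPerFrame = sampleRate / fps;
    return blocksPerFrame * blockAlign;
}

// The answer's expression evaluated with integer arithmetic and
// 2 bytes per sample, shown only for comparison.
std::uint32_t truncatedBytesPerFrame(std::uint32_t sampleRate,
                                     std::uint32_t channels,
                                     std::uint32_t fps)
{
    const std::uint32_t avgBytesPerSecond = sampleRate * 2 * channels;
    const std::uint32_t avgBytesPerMillisecond = avgBytesPerSecond / 1000;
    return avgBytesPerMillisecond * (1000 / fps);
}
```

Assuming 44.1 kHz, 16-bit stereo audio at 60 fps (values the question does not state), audioBytesPerFrame returns 2940 bytes, while the integer form of the original expression gives 2816 bytes, about 4% less audio per frame. When the sample rate is not a multiple of the frame rate, the remainder would need to be carried over to later frames to avoid drift.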

In my question, the buffer held only one millisecond of audio. It seems the MF framework then speeds up the images so that the audio still sounds fine. After I fixed the buffer size, the video has exactly the duration I expected and the sound has no errors.
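
One way to check a buffer size is to convert it back into playback time. The sketch below does that for the formula used in the question's EncodeAudio, which multiplies by bits per sample rather than bytes per sample. The helper names are hypothetical, and the numbers in the note assume 44.1 kHz, 16-bit stereo PCM, since the question does not give the values of the AUDIO_* constants.

```cpp
#include <cstdint>

// Playback time, in 100-nanosecond units, of a PCM buffer of the given size.
std::int64_t pcmDurationHns(std::uint64_t bytes,
                            std::uint32_t sampleRate,
                            std::uint32_t bitsPerSample,
                            std::uint32_t channels)
{
    const std::uint64_t bytesPerSecond =
        static_cast<std::uint64_t>(sampleRate) * channels * (bitsPerSample / 8);
    return static_cast<std::int64_t>(bytes * 10000000ULL / bytesPerSecond);
}

// Buffer size as computed in the question's EncodeAudio:
// sample rate * bits per sample * channels / 1000.
std::uint32_t questionBufferBytes(std::uint32_t sampleRate,
                                  std::uint32_t bitsPerSample,
                                  std::uint32_t channels)
{
    return sampleRate * bitsPerSample * channels / 1000;
}
```

Under those assumptions each audio buffer holds 1411 bytes, roughly 8 ms of sound, yet it is stamped with the 16.7 ms duration of a video frame. Over 600 frames that adds up to about 4.8 seconds of audio, which would be consistent with the voice only reaching 5.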


07-24 09:25