This article looks at how to set up the audio format and the render callback of an Audio Unit for interleaved PCM audio, and may be a useful reference if you are working on the same problem.

Problem Description

I'm currently attempting to play back audio which I receive in a series of UDP packets. These are decoded into PCM frames with the following properties:


  • 2 channels

  • Interleaved

  • 2 bytes per sample in a single channel (so 4 bytes per frame)

  • A sample rate of 48000.

Every UDP packet contains 480 frames, so the buffer's size is 480 * 2 (channels) * 2 (bytes per channel).

I need to set up an Audio Unit to play back these packets. So, my first question is, how should I set up the AudioStreamBasicDescription struct for the Audio Unit? Looking at the documentation I'm not even sure if interleaved PCM is an acceptable format.

This is what I have so far:

struct AudioStreamBasicDescription {
   Float64 mSampleRate;                 //48000
   UInt32  mFormatID;                   //?????
   UInt32  mFormatFlags;                //?????
   UInt32  mBytesPerPacket;             //Not sure what "packet" means here
   UInt32  mFramesPerPacket;            //Same as above
   UInt32  mBytesPerFrame;              //Same
   UInt32  mChannelsPerFrame;           //2?
   UInt32  mBitsPerChannel;             //16?
   UInt32  mReserved;                   //???
};
typedef struct AudioStreamBasicDescription  AudioStreamBasicDescription;

Secondly, after setting it up, I'm not sure how to get the frames from the UDP callback to the actual Audio Unit rendering function.

I currently have a callback function from the socket listener in which I generate the int16 * buffers that contain the audio I want to play. As I understand it, I also have to implement a render callback for the audio unit of the following form:

OSStatus RenderFrames(
    void                        *inRefCon,
    AudioUnitRenderActionFlags  *ioActionFlags,
    const AudioTimeStamp        *inTimeStamp,
    UInt32                      inBusNumber,
    UInt32                      inNumberFrames,
    AudioBufferList             *ioData)
{
    //No idea what I should do here.
    return noErr;
}

Putting it all together, I think what my socket reception callback should do is decode the frames and put them in a buffer structure, so that the RenderFrames callback can fetch the frames from that buffer and play them back. Is this correct? And if it is, once I fetch the next frame in the RenderFrames function, how do I actually "submit it" for playback?

Recommended Answer

Taking this one section at a time:

Apple's documentation for the ASBD is here. To clarify:


  • A frame of audio is a time-coincident set of audio samples. In other words, one sample per channel. For stereo this is therefore 2.
  • For PCM formats there is no packetisation. Supposedly mBytesPerPacket = mBytesPerFrame and mFramesPerPacket = 1, but I'm not sure whether this is actually ever used.
  • mReserved isn't used and must be 0.
  • Refer to the documentation for mFormatID and mFormatFlags. There is a handy helper function, CalculateLPCMFlags, in CoreAudioTypes.h for computing the latter of these.
  • Multi-channel audio is generally interleaved (you can set a bit in mFormatFlags if you really don't want it to be).
  • There's another helper function that can fill out the entire ASBD, FillOutASBDForLPCM(), for the common cases of linear PCM.
  • Lots of combinations of mFormatID and mFormatFlags are not supported by remoteIO units - I found experimentation to be necessary on iOS.
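For illustration only (this is not part of the original answer), a minimal sketch of the FillOutASBDForLPCM() helper applied to the format in the question might look like the following. The parameter order shown is how I recall the C++ inline in CoreAudioTypes.h being declared, so check it against your SDK's header:

#include <CoreAudio/CoreAudioTypes.h>  // the FillOutASBDForLPCM() C++ inline lives here

AudioStreamBasicDescription asbd = {0};

// 48 kHz, stereo, 16 valid bits in a 16-bit sample, signed integer,
// little-endian, interleaved (the final argument requests non-interleaved).
FillOutASBDForLPCM(asbd,
                   48000.0,  // sample rate
                   2,        // channels per frame
                   16,       // valid bits per channel
                   16,       // total bits per channel
                   false,    // is float
                   false,    // is big-endian
                   false);   // is non-interleaved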

Here's some working code from one of my projects:

AudioStreamBasicDescription inputASBL = {0};

inputASBL.mSampleRate =          static_cast<Float64>(sampleRate);
inputASBL.mFormatID =            kAudioFormatLinearPCM;
inputASBL.mFormatFlags =         kAudioFormatFlagIsPacked | kAudioFormatFlagIsSignedInteger;
inputASBL.mFramesPerPacket =     1;
inputASBL.mChannelsPerFrame =    2;
inputASBL.mBitsPerChannel =      sizeof(short) * 8;
inputASBL.mBytesPerPacket =      sizeof(short) * 2;
inputASBL.mBytesPerFrame =       sizeof(short) * 2;
inputASBL.mReserved =            0;
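The original answer stops at the ASBD itself. As a hedged sketch of how that struct is typically handed to the audio unit (the standard RemoteIO setup on iOS, reusing the inputASBL above and the RenderFrames callback from the question; error checking of the OSStatus returns is omitted for brevity):

#include <AudioToolbox/AudioToolbox.h>

// Find and create the RemoteIO output unit.
AudioComponentDescription desc = {0};
desc.componentType         = kAudioUnitType_Output;
desc.componentSubType      = kAudioUnitSubType_RemoteIO;
desc.componentManufacturer = kAudioUnitManufacturer_Apple;

AudioComponent component = AudioComponentFindNext(NULL, &desc);
AudioUnit audioUnit = NULL;
AudioComponentInstanceNew(component, &audioUnit);

// Tell the output element (bus 0) what format we will feed it on its input scope.
AudioUnitSetProperty(audioUnit, kAudioUnitProperty_StreamFormat,
                     kAudioUnitScope_Input, 0,
                     &inputASBL, sizeof(inputASBL));

// Register the render callback through which CoreAudio will pull samples.
AURenderCallbackStruct callback;
callback.inputProc       = RenderFrames;
callback.inputProcRefCon = NULL;   // e.g. a pointer to your FIFO (see below)
AudioUnitSetProperty(audioUnit, kAudioUnitProperty_SetRenderCallback,
                     kAudioUnitScope_Input, 0,
                     &callback, sizeof(callback));

AudioUnitInitialize(audioUnit);
AudioOutputUnitStart(audioUnit);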

Render Callbacks

CoreAudio operates what Apple describes as a pull model. That is to say, the render callback is called from a real-time thread when CoreAudio needs a buffer filled. From your question it appears you are expecting the opposite - pushing the data to the audio output.

There are essentially two implementation choices:


  1. Perform non-blocking reads from the UDP socket in the render callback (as a general rule, anything you do here should be fast and non-blocking).

  2. Maintain an audio FIFO into which samples are inserted when they are received, and from which the render callback consumes them.

The second is probably the better choice, but you are going to need to manage buffer over- and under-runs yourself.
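To make the FIFO option concrete, here is a minimal single-producer/single-consumer ring buffer sketch (not from the original answer; the name PCMRingBuffer is made up, the socket thread writes and the render callback reads, and a production version would need to handle atomics and memory ordering more carefully than this plain-C version):

#include <stdint.h>
#include <string.h>

#define RING_CAPACITY (48000 * 2)   // one second of stereo int16 samples

typedef struct {
    int16_t  samples[RING_CAPACITY];
    volatile uint32_t head;   // written by the socket thread
    volatile uint32_t tail;   // read by the render callback
} PCMRingBuffer;

// Socket thread: push decoded samples, dropping the remainder on overflow.
static void ring_write(PCMRingBuffer *rb, const int16_t *src, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t next = (rb->head + 1) % RING_CAPACITY;
        if (next == rb->tail) return;          // buffer full: over-run, drop the rest
        rb->samples[rb->head] = src[i];
        rb->head = next;
    }
}

// Render thread: pop samples; returns how many were actually available.
static uint32_t ring_read(PCMRingBuffer *rb, int16_t *dst, uint32_t count)
{
    uint32_t n = 0;
    while (n < count && rb->tail != rb->head) {
        dst[n++] = rb->samples[rb->tail];
        rb->tail = (rb->tail + 1) % RING_CAPACITY;
    }
    return n;
}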

The ioData argument points to a scatter-gather control structure. In the simplest case, it points to one buffer containing all of the frames, but it could contain several that between them have sufficient frames to satisfy inNumberFrames. Normally, one pre-allocates a buffer big enough for inNumberFrames, copies samples into it and then modifies the AudioBufferList object pointed to by ioData to point to it.
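Continuing the sketch (again, not the original author's code): assuming the PCMRingBuffer above is passed in via inRefCon and the unit uses the interleaved 16-bit stereo format shown earlier, a render callback could fill ioData roughly like this, padding with silence on an under-run:

static OSStatus RenderFrames(void                        *inRefCon,
                             AudioUnitRenderActionFlags  *ioActionFlags,
                             const AudioTimeStamp        *inTimeStamp,
                             UInt32                      inBusNumber,
                             UInt32                      inNumberFrames,
                             AudioBufferList             *ioData)
{
    PCMRingBuffer *rb  = (PCMRingBuffer *)inRefCon;
    int16_t       *out = (int16_t *)ioData->mBuffers[0].mData;

    // Interleaved stereo: 2 samples per frame.
    UInt32 wanted = inNumberFrames * 2;
    UInt32 got    = ring_read(rb, out, wanted);

    // Under-run: fill the remainder with silence rather than stale data.
    if (got < wanted) {
        memset(out + got, 0, (wanted - got) * sizeof(int16_t));
    }
    return noErr;
}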

In your application you could potentially use a scatter-gather approach on your decoded audio packets, allocating buffers as they are decoded. However, you don't always get the latency you want, and you might not be able to arrange for inNumberFrames to be the same as your decoded UDP frames of audio.

That concludes this article on setting up the audio format and render callback of an Audio Unit for interleaved PCM audio. We hope the recommended answer above is helpful.
