Question
I'm trying to set up an FFT for a project and don't have a clear picture of how the pieces fit together. Basically, I am using Audio Units to get data from the device's microphone, and I then want to run an FFT on that data. This is what I understand so far: I need to set up a circular buffer for my data. On each filled buffer, I apply a Hann window and then do an FFT. However, I still need some help with overlapping. To get more precise results, I understand I need to use it, especially since I am windowing. However, I can't find anything on this.
Here's what I have so far (used for pitch detection):
// Setup -------------
UInt32 log2N = 10; // 1024 samples
UInt32 N = (1 << log2N);
FFTSetup FFTSettings = vDSP_create_fftsetup(log2N, kFFTRadix2);
COMPLEX_SPLIT FFTData;
FFTData.realp = (float *) malloc(sizeof(float) * N/2);
FFTData.imagp = (float *) malloc(sizeof(float) * N/2);
float * hannWindow = (float *) malloc(sizeof(float) * N);
// create an array of floats to represent a hann window
vDSP_hann_window(hannWindow, N, 0);
// FFT Time ----------
// Moving data from A to B via hann window
vDSP_vmul(A, 1, hannWindow, 1, B, 1, N);
// Converting data in B into split complex form
vDSP_ctoz((COMPLEX *) B, 2, &FFTData, 1, N/2);
// Doing the FFT
vDSP_fft_zrip(FFTSettings, &FFTData, 1, log2N, kFFTDirection_Forward);
// calculating square of magnitude for each value
vDSP_zvmags(&FFTData, 1, FFTData.realp, 1, N/2);
// Inverse FFT
vDSP_fft_zrip(FFTSettings, &FFTData, 1, log2N, kFFTDirection_Inverse);
// Storing the autocorrelation results in B
vDSP_ztoc(&FFTData, 1, (COMPLEX *)B, 2, N/2);
vDSP_Length lastZeroCrossing;
vDSP_Length zeroCrossingCount;
vDSP_nzcros(B, 1, N, &lastZeroCrossing, &zeroCrossingCount, N);
// Cleanup -----------
vDSP_destroy_fftsetup(FFTSettings);
free(FFTData.realp);
free(FFTData.imagp);
free(hannWindow);
So where and how would I include overlapping? Also, any code snippets would be more than welcome. Thanks.
Update:
The final goal for this project is to fingerprint the audio as close to real time as possible, so I need the results to be as accurate as possible, hence the overlapping. For this purpose I think I could actually drop everything from the inverse FFT up to the cleanup.
Recommended answer
You don't actually need to overlap. Frames are typically overlapped to give higher resolution on the time axis, e.g. for plotting spectrograms or for estimating note onset times. You could just get your code working without overlapping for now, as it's less complicated, and then decide later whether you need higher resolution on the time axis.
If you decide you do want to add overlapping, then you will need to save a chunk of the previous buffer (e.g. 50%), and for each new buffer you will process two full-length frames as follows:
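- process the last 50% of the old buffer + the first 50% of the new buffer
- process 100% of the new buffer
- save the last 50% of the new buffer for the next iteration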
For different overlap percentages a similar logic applies.
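The 50% case can be sketched in plain C as below. This is only an illustration of the buffering logic, not Accelerate code: `OverlapState`, `FrameFn`, `overlap_push`, `FrameLog` and `log_frame` are hypothetical names, and `FRAME_LEN` is set to 8 so the behaviour is easy to follow (the question uses 1024). The callback is where the Hann window and FFT would go.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define FRAME_LEN 8                 /* assumed for illustration; 1024 in the question */
#define HALF_LEN  (FRAME_LEN / 2)

/* Hypothetical state: the last 50% of the previous buffer. */
typedef struct {
    float tail[HALF_LEN];
    int   has_tail;                 /* 0 until the first buffer has been pushed */
} OverlapState;

/* Hypothetical per-frame callback; windowing and the FFT would live here. */
typedef void (*FrameFn)(const float *frame, size_t n, void *ctx);

/* Feed one full new buffer of FRAME_LEN samples. */
static void overlap_push(OverlapState *st, const float *buf, FrameFn fn, void *ctx)
{
    assert(st != NULL && buf != NULL && fn != NULL);

    if (st->has_tail) {
        /* Last 50% of the old buffer + first 50% of the new buffer. */
        float frame[FRAME_LEN];
        memcpy(frame, st->tail, HALF_LEN * sizeof(float));
        memcpy(frame + HALF_LEN, buf, HALF_LEN * sizeof(float));
        fn(frame, FRAME_LEN, ctx);
    }

    /* 100% of the new buffer. */
    fn(buf, FRAME_LEN, ctx);

    /* Save the last 50% of the new buffer for the next call. */
    memcpy(st->tail, buf + HALF_LEN, HALF_LEN * sizeof(float));
    st->has_tail = 1;
}

/* Example callback: records the first sample of every frame it receives. */
typedef struct {
    float  first[16];
    size_t count;
} FrameLog;

static void log_frame(const float *frame, size_t n, void *ctx)
{
    FrameLog *seen = (FrameLog *)ctx;
    (void)n;
    if (seen->count < 16) {
        seen->first[seen->count++] = frame[0];
    }
}
```

The very first buffer produces one frame; every later buffer produces two, so the hop between frames is half a buffer. Copying the tail rather than keeping a pointer to it matters if the audio callback reuses its buffer memory between calls.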
Note that increasing overlap beyond a certain point can become counterproductive, as the required processing bandwidth increases greatly with little gain in resolution.
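As a rough illustration of that cost, take the frame length N = 1024 from the question and assume a sample rate of 44.1 kHz (an assumption; the question does not state one). With an overlap fraction r, the hop H between successive frames and the number of FFTs per second are:

```latex
H = N\,(1 - r), \qquad \text{FFTs per second} = \frac{f_s}{H}
```

Under those assumptions, r = 0 gives H = 1024 and about 43 FFTs per second, r = 0.5 gives H = 512 and about 86, r = 0.75 gives H = 256 and about 172, and r = 0.875 gives H = 128 and about 345. Each halving of the hop doubles the work, while the frequency resolution, which depends only on N, stays the same.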