利用帧之间的相位变化从FFT箱提取precise频率

本文介绍了利用帧之间的相位变化从FFT箱提取precise频率的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我一直在寻找通过这个梦幻般的文章：http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/

I have been looking through this fantastic article: http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/

虽然是太棒了，这是非常辛苦和艰难的事。这种材料实在是拉伸了我。

While being fantastic, it is extremely hard and heavy going. This material is really stretching me.

我已经提取从计算给定仓的精确频率Stefan的code模块的数学。但我不明白，最后的计算。有人可以在最后给我解释的数学结构？

I have extracted the maths from Stefan's code module that calculates the exact frequency for a given bin. But I don't understand the last calculation. Can someone explain to me the mathematical construction at the end?

在挖掘到code，让我现场设置：

Before digging into the code, let me set the scene:

比方说，我们设置fftFrameSize = 1024，所以我们处理的512 + 1箱

Let's say we set fftFrameSize = 1024, so we are dealing with 512+1 bins

作为一个例子，滨[1]的理想频率装配在框架上的单波。在40KHz的采样率，tOneFrame = 1024 / 40K秒= 1/40秒，所以斌[1]理想地收集40Hz的信号。

As an example, Bin[1]'s ideal frequency fits a single wave in the frame. At a sample rate of 40KHz, tOneFrame = 1024/40K seconds = 1/40s, so Bin[1] would ideally be collecting a 40Hz signal.

设置OSAMP（过采样）= 4，我们沿着256步我们的输入信号进展，以便先分析到1023字节检查零，然后通过256 1279等。注意每个浮点得到处理的4倍。

Setting osamp (overSample) = 4, we progress along our input signal in steps of 256. So the first analysis examines bytes zero through 1023, then 256 through 1279, etc. Note each float gets processed 4 times.

...

void calcBins(
              long fftFrameSize,
              long osamp,
              float sampleRate,
              float * floats,
              BIN * bins
              )
{
    /* initialize our static arrays */
    static float gFFTworksp[2*MAX_FRAME_LENGTH];
    static float gLastPhase[MAX_FRAME_LENGTH/2+1];

    static long gInit = 0;
    if (! gInit)
    {
        memset(gFFTworksp, 0, 2*MAX_FRAME_LENGTH*sizeof(float));
        memset(gLastPhase, 0, (MAX_FRAME_LENGTH/2+1)*sizeof(float));
        gInit = 1;
    }

    /* do windowing and re,im interleave */
    for (long k = 0; k < fftFrameSize; k++)
    {
        double window = -.5*cos(2.*M_PI*(double)k/(double)fftFrameSize)+.5;
        gFFTworksp[2*k] = floats[k] * window;
        printf("sinValue: %f", gFFTworksp[2*k]);
        gFFTworksp[2*k+1] = 0.;
    }

    /* do transform */
    smbFft(gFFTworksp, fftFrameSize, -1);

    printf("\n");

    /* this is the analysis step */
    for (long k = 0; k <= fftFrameSize/2; k++)
    {
        /* de-interlace FFT buffer */
        double real = gFFTworksp[2*k];
        double imag = gFFTworksp[2*k+1];

        /* compute magnitude and phase */
        double magn = 2.*sqrt(real*real + imag*imag);
        double phase = atan2(imag,real);

        /* compute phase difference */
        double phaseDiff = phase - gLastPhase[k];
        gLastPhase[k] = phase;

        /* subtract expected phase difference */
        double binPhaseOffset = M_TWOPI * (double)k / (double)osamp;
        double deltaPhase = phaseDiff - binPhaseOffset;

        /* map delta phase into [-Pi, Pi) interval */
        // better, but obfuscatory...
        //    deltaPhase -= M_TWOPI * floor(deltaPhase / M_TWOPI + .5);

        while (deltaPhase >= M_PI)
            deltaPhase -= M_TWOPI;
        while (deltaPhase < -M_PI)
            deltaPhase += M_TWOPI;

（)现在的我有点不明白：

() Now the bit I don't get:

        // Get deviation from bin frequency from the +/- Pi interval
        // Compute the k-th partials' true frequency

        // Start with bin's ideal frequency
        double bin0Freq = (double)sampleRate / (double)fftFrameSize;
        bins[k].idealFreq = (double)k * bin0Freq;

        // Add deltaFreq
        double sampleTime = 1. / (double)sampleRate;
        double samplesInStep = (double)fftFrameSize / (double)osamp;
        double stepTime = sampleTime * samplesInStep;
        double deltaTime = stepTime;

        // Definition of frequency is rate of change of phase, i.e. f = dϕ/dt
        // double deltaPhaseUnit = deltaPhase / M_TWOPI; // range [-.5, .5)
        double freqAdjust = (1. / M_TWOPI) * deltaPhase / deltaTime;

        // Actual freq <-- WHY ???
        bins[k].freq = bins[k].idealFreq + freqAdjust;
    }
}

我无法看清楚，即使它似乎在脸上盯着。可能有人请解释一步从无到有，一步这一过程？

I just can't see it clearly, even though it seems to be staring in the face. Could someone please explain this process from scratch, step by step?

推荐答案

最后，我已经想通了这一点;我真的不得不从头得到它。我知道会有推导出它的一些简单的方法，我的（平时）错误是试图跟着其他人的逻辑，而不是用自己的常识。

Finally I have figured this out; really I had to derive it from scratch. I knew there would be some simple way to derive it, my (usual) mistake was to attempt to follow other people's logic rather than use my own common sense.

这个谜题需要的两个键即可解锁。

This puzzle takes two keys to unlock it.

第一个关键是了解如何过采样介绍了仓阶段的旋转。

The first key is to understand how over-sampling introduces a rotation on bin phase.

第二个关键来自图3.3和放大器; 3.4在这里：http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/

The second key comes from Graph 3.3 & 3.4 here: http://www.dspdimension.com/admin/pitch-shifting-using-the-ft/

...

for (int k = 0; k <= fftFrameSize/2; k++)
{
    // compute magnitude and phase
    bins[k].mag = 2.*sqrt(fftBins[k].real*fftBins[k].real + fftBins[k].imag*fftBins[k].imag);
    bins[k].phase = atan2(fftBins[k].imag, fftBins[k].real);

    // Compute phase difference Δϕ fo bin[k]
    double deltaPhase;
    {
        double measuredPhaseDiff = bins[k].phase - gLastPhase[k];
        gLastPhase[k] = bins[k].phase;

        // Subtract expected phase difference <-- FIRST KEY
        // Think of a single wave in a 1024 float frame, with osamp = 4
        //   if the first sample catches it at phase = 0, the next will
        //   catch it at pi/2 ie 1/4 * 2pi
        double binPhaseExpectedDiscrepancy = M_TWOPI * (double)k / (double)osamp;
        deltaPhase = measuredPhaseDiff - binPhaseExpectedDiscrepancy;

        // Wrap delta phase into [-Pi, Pi) interval
        deltaPhase -= M_TWOPI * floor(deltaPhase / M_TWOPI + .5);
    }

    // say sampleRate = 40K samps/sec, fftFrameSize = 1024 samps in FFT giving bin[0] thru bin[512]
    // then bin[1] holds one whole wave in the frame, ie 44 waves in 1s ie 44Hz ie sampleRate / fftFrameSize
    double bin0Freq = (double)sampleRate / (double)fftFrameSize;
    bins[k].idealFreq = (double)k * bin0Freq;

    // Consider Δϕ for bin[k] between hops.
    // write as 2π / m.
    // so after m hops, Δϕ = 2π, ie 1 extra cycle has occurred   <-- SECOND KEY
    double m = M_TWOPI / deltaPhase;

    // so, m hops should have bin[k].idealFreq * t_mHops cycles.  plus this extra 1.
    //
    // bin[k].idealFreq * t_mHops + 1 cycles in t_mHops seconds
    //   => bins[k].actualFreq = bin[k].idealFreq + 1 / t_mHops
    double tFrame = fftFrameSize / sampleRate;
    double tHop = tFrame / osamp;
    double t_mHops = m * tHop;

    bins[k].freq = bins[k].idealFreq + 1. / t_mHops;
}

这篇关于利用帧之间的相位变化从FFT箱提取precise频率的文章就介绍到这了，希望我们推荐的答案对大家有所帮助，也希望大家多多支持！