问题描述
我看过几个例子,表明如果输入长度是2、3、5、7等的乘积,那么numpy的fft实现很快.但是,这里仍然被认为是小"的最大素数是什么?
I've seen several examples showing that if the input length is a product of 2,3,5,7 etc. then numpy's fft implementation is fast. But what is the largest prime number that is still considered "small" here?
推荐答案
请注意scipy的FFT的半径为2、3、4和5(参考).我认为numpy可能有类似的实现方式,这将使FFT长度中的5是最大的有效素数.
Note that scipy's FFT has radices of 2, 3, 4, and 5 (reference) . I assume numpy may have a similar implementation, which would make 5 the largest efficient prime factor in FFT lengths.
根据经验,出于FFT性能的考虑,我认为最大的小"素数是11.但是,任何小于30的输入长度出于实用目的都会非常快. Python的执行开销肯定会使任何算法性能的提高相形见war.随着输入长度的增加,事情变得越来越有趣.
Empirically, the largest prime I'd consider "small" for the purpose of FFT performance is 11. But any input length of less than about 30 is going to be pretty fast for practical purposes. Any algorithmic performance gains are certainly going to be dwarved by Python's execution overhead. Things are getting more interesting for higher input lengths.
以下是一些小FFT的性能结果(中位执行时间超过500个批次,每个批次1000个FFT):
Here are some performance results for small FFTs (median execution time over 500 batches of 1000 FFTs each):
我用红色标记了素数n
,用绿色标记了2的幂.
I have marked prime valued n
in red and power-of-twos in green.
标记以下观察结果:
-
通常,FFT的质数较慢,但幂为2时较快.这几乎是可以预期的,并且可以验证结果.
in general the FFT is slow for primes but fast for power of twos. This is pretty much expected and validates the results.
n <=11
的性能差异没有可测量的.这可能是由于FFT的实现或执行开销.
no performance difference for n <=11
was measurable. This may be due to FFT implementation or execution overhead.
底价31(可能为29)或更高的底币显然比附近的其他值慢.
Primes of 31 (maybe 29) and higher are clearly slower than other nearby values.
有些非2的幂的值也可以提供良好的性能.这可能是高度合成的数字.
There are some non-power-of-two values that also give good performance. This are probably highly composite numbers.
测量是这样进行的:
import numpy as np
import matplotlib.pyplot as plt
from time import time
N = np.arange(2, 65)
times = np.empty((500, N.size))
for i, n in enumerate(N):
for r in range(times.shape[0]):
x = np.random.randn(1000, n)
t = time()
y = np.fft.fft(x, axis=-1)
t = time() - t
times[r, i] = t
med = np.median(times, axis=0)
plt.plot(N, med, 'k')
primes = np.array([2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61])
plt.plot(primes, med[primes-2]+0.0005, 'rx', label='n = prime')
ptwos = np.array([2, 4, 8, 16, 32, 64])
plt.plot(ptwos, med[ptwos-2]-0.0005, 'gx', label='n = 2**k')
plt.legend(loc='best')
plt.xlabel('n')
plt.ylabel('time')
plt.grid()
plt.show()
这篇关于numpy fft对于小的素数乘积的长度很快,但是又有多小呢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!