Problem description
I've been getting invalid input errors when working with the scipy interp2d function. It turns out the problem comes from the bisplrep function, as shown here:
import numpy as np
from scipy import interpolate
# Case 1
x = np.linspace(0,1)
y = np.zeros_like(x)
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z) # or interp2d
This returns: ValueError: Invalid input
It turned out the test data I was giving interp2d contained only one distinct value for the 2nd axis, as in the test sample above. The bisplrep function inside interp2d treats this as invalid input. This may be considered acceptable behaviour: interp2d and bisplrep expect a 2D grid, and I'm only giving them values along one line.
On a side note, I find the error message quite unclear. One could include a test in interp2d to deal with such cases. Something along the lines of
if len(np.unique(x)) == 1 or len(np.unique(y)) == 1:
    raise ValueError("Can't build 2D splines if x or y values are all the same")
may be enough to detect this kind of invalid input and raise a more explicit error message, or even directly call the more appropriate interp1d function (which works perfectly here).
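A rough sketch of what such a check with a 1D fallback could look like is given below. The helper name build_interpolant and its return convention are made up for illustration; nothing like it exists in scipy, and it only covers the degenerate cases where all x or all y values coincide (Case 1), not data along a slanted line.

```python
import numpy as np
from scipy import interpolate

def build_interpolant(x, y, z):
    """Hypothetical helper (not part of scipy): pick 1D or 2D fitting."""
    x, y, z = (np.asarray(a, dtype=float).ravel() for a in (x, y, z))
    x_const = np.unique(x).size == 1
    y_const = np.unique(y).size == 1
    if x_const and y_const:
        raise ValueError("Can't interpolate: all points share the same (x, y)")
    if y_const:
        # Data lie on a line parallel to the x axis: interpolate in x only
        return "1d-x", interpolate.interp1d(x, z)
    if x_const:
        # Data lie on a line parallel to the y axis: interpolate in y only
        return "1d-y", interpolate.interp1d(y, z)
    # Otherwise hand the data to the 2D spline fit
    return "2d", interpolate.bisplrep(x, y, z)

# Case 1 from above: y is constant, so the helper falls back to interp1d
x = np.linspace(0, 1)
kind, f = build_interpolant(x, np.zeros_like(x), np.ones_like(x))
value = float(f(0.5))
print(kind, value)
```

Returning a tag alongside the interpolant is just one way to let the caller know which branch was taken; raising an explicit error instead of falling back would be equally defensible.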
I thought I had correctly understood the problem. However, consider the following code sample:
# Case 2
x = np.linspace(0,1)
y = x
z = np.ones_like(x)
tck = interpolate.bisplrep(x,y,z)
In that case, with y proportional to x, I'm also feeding bisplrep data along a single line. But, surprisingly, bisplrep is able to compute a 2D spline interpolation in that case. I plotted it:
# Plot
def plot_0to1(tck):
    import matplotlib.pyplot as plt
    from matplotlib import cm  # colormaps (cm.coolwarm is used below)
    from mpl_toolkits.mplot3d import Axes3D  # registers the 3D projection on older matplotlib
    X = np.linspace(0,1,10)
    Y = np.linspace(0,1,10)
    Z = interpolate.bisplev(X,Y,tck)
    X,Y = np.meshgrid(X,Y)
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.plot_surface(X, Y, Z,rstride=1, cstride=1, cmap=cm.coolwarm,
                    linewidth=0, antialiased=False)
    plt.show()
plot_0to1(tck)
The result is the following:
where bisplrep seems to fill the gaps with 0's, as shown more clearly when I extend the plot below:
Regardless of whether adding 0's is expected, my real question is: why does bisplrep work in Case 2 but not in Case 1?
Or, in other words: when 2D interpolation is fed with input along one direction only, do we want it to return an error (Cases 1 and 2 should both fail), or not (Cases 1 and 2 should both return something, even if unpredictable)?
Recommended answer
I was originally going to show you how much of a difference it makes for 2d interpolation if your input data are oriented along the coordinate axes rather than in some general direction, but it turns out that the result would be even messier than I had anticipated. I tried using a random dataset over an interpolated rectangular mesh, and comparing that to a case where the same x and y coordinates were rotated by 45 degrees for interpolation. The result was abysmal.
I then tried doing a comparison with a smoother dataset: it turns out scipy.interpolate.interp2d has quite a few issues. So my bottom line will be "use scipy.interpolate.griddata".
For instructive purposes, here's my (quite messy) code:
import numpy as np
import scipy.interpolate as interp  # provides interp2d and griddata used below
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import matplotlib.cm as cm
n = 10 # rough number of points
dom = np.linspace(-2,2,n+1) # 1d input grid
x1,y1 = np.meshgrid(dom,dom) # 2d input grid
z = np.random.rand(*x1.shape) # ill-conditioned sample
#z = np.cos(x1)*np.sin(y1) # smooth sample
# first interpolator with interp2d:
fun1 = interp.interp2d(x1,y1,z,kind='linear')
# construct twice finer plotting and interpolating mesh
plotdom = np.linspace(-1,1,2*n+1) # for interpolation and plotting
plotx1,ploty1 = np.meshgrid(plotdom,plotdom)
plotz1 = fun1(plotdom,plotdom) # interpolated points
# construct 45-degree rotated input and interpolating meshes
rotmat = np.array([[1,-1],[1,1]])/np.sqrt(2) # 45-degree rotation
x2,y2 = rotmat.dot(np.vstack([x1.ravel(),y1.ravel()])) # rotate input mesh
plotx2,ploty2 = rotmat.dot(np.vstack([plotx1.ravel(),ploty1.ravel()])) # rotate plotting/interp mesh
# interpolate on rotated mesh with interp2d
# (reverse rotate by using plotx1, ploty1 later!)
fun2 = interp.interp2d(x2,y2,z.ravel(),kind='linear')
# I had to generate the rotated points element-by-element
# since fun2() accepts only rectangular meshes as input
plotz2 = np.array([fun2(xx,yy) for (xx,yy) in zip(plotx2.ravel(),ploty2.ravel())])
# try interpolating with griddata
plotz3 = interp.griddata(np.array([x1.ravel(),y1.ravel()]).T,z.ravel(),np.array([plotx1.ravel(),ploty1.ravel()]).T,method='linear')
plotz4 = interp.griddata(np.array([x2,y2]).T,z.ravel(),np.array([plotx2,ploty2]).T,method='linear')
# function to plot a surface
def myplot(X,Y,Z):
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.plot_surface(X, Y, Z,rstride=1, cstride=1,
                    linewidth=0, antialiased=False,cmap=cm.coolwarm)
    plt.show()
# plot interp2d versions
myplot(plotx1,ploty1,plotz1) # Cartesian meshes
myplot(plotx1,ploty1,plotz2.reshape(2*n+1,-1)) # rotated meshes
# plot griddata versions
myplot(plotx1,ploty1,plotz3.reshape(2*n+1,-1)) # Cartesian meshes
myplot(plotx1,ploty1,plotz4.reshape(2*n+1,-1)) # rotated meshes
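If you only want the recommended route without the comparison machinery, a stripped-down griddata call might look like the sketch below. The scattered sample points and the planar test function are invented for illustration; the plane is chosen because linear interpolation should reproduce it up to rounding, which makes the result easy to check by hand.

```python
import numpy as np
from scipy.interpolate import griddata

# Made-up scattered input: 200 random points in the square [-2, 2] x [-2, 2]
rng = np.random.default_rng(0)
points = rng.uniform(-2, 2, size=(200, 2))
values = 2.0 * points[:, 0] - 3.0 * points[:, 1] + 1.0  # planar test function

# Two query points inside the data region, one far outside it
query = np.array([[0.0, 0.0], [0.5, -0.25], [10.0, 10.0]])
result = griddata(points, values, query, method='linear')
print(result)
```

Note that griddata takes the input points as an (n, 2) array of coordinates rather than as separate mesh arrays, and that with method='linear' queries outside the convex hull of the input come back as nan (the fill_value default).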
So here's a gallery of the results. Using random input z data and interp2d, Cartesian (left) vs rotated interpolation (right):
Note the horrible scale on the right side, bearing in mind that the input z values are between 0 and 1. Even its mother wouldn't recognize the data set. Note that there are runtime warnings during the evaluation of the rotated data set, so we're being warned that it's all crap.
Now let's do the same with griddata:
We should note that these figures are much closer to each other, and they seem to make way more sense than the output of interp2d. For instance, note the overshoot in the scale of the very first figure.
These artifacts always arise between input data points. Since it's still interpolation, the input points have to be reproduced by the interpolating function, but it's pretty weird that a linear interpolating function overshoots between data points. It's clear that griddata doesn't suffer from this issue.
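The no-overshoot claim can be checked numerically rather than by eye. The sketch below reuses the mesh sizes from my code above with a seeded random generator (the seed and variable names are my own choices here). It relies on linear griddata interpolating within the triangles of a Delaunay triangulation, so each interpolated value is a weighted average of the three corner values and should stay inside the range of the input.

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(42)
n = 10
dom = np.linspace(-2, 2, n + 1)           # 1d input grid
x1, y1 = np.meshgrid(dom, dom)            # 2d input grid
z = rng.random(x1.shape)                  # random sample, values in [0, 1)

plotdom = np.linspace(-1, 1, 2 * n + 1)   # twice finer evaluation mesh
px, py = np.meshgrid(plotdom, plotdom)

# Linear interpolation of the scattered (x, y, z) triples onto the fine mesh
zi = griddata((x1.ravel(), y1.ravel()), z.ravel(), (px, py), method='linear')

# Range of the interpolated surface compared with the range of the input
zmin, zmax = np.nanmin(zi), np.nanmax(zi)
print(zmin >= z.min() - 1e-12, zmax <= z.max() + 1e-12)
```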
Consider an even clearer case: the other set of z values, which are smooth and deterministic. The surfaces with interp2d:
HELP! Call the interpolation police! Already the Cartesian input case has inexplicable (well, at least to me) spurious features in it, and the rotated input case poses the threat of summoning Zalgo.
So let's do the same with griddata:
The day is saved, thanks to scipy.interpolate.griddata. Homework: check the same with cubic interpolation.
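One possible starting point for the homework, on the unrotated mesh only, is sketched here. It uses the smooth sample from my code above and compares method='linear' with method='cubic' by their maximum deviation from the exact function on the finer mesh. The error dictionary is just my bookkeeping for this sketch; the rotated case is left to you.

```python
import numpy as np
from scipy.interpolate import griddata

n = 10
dom = np.linspace(-2, 2, n + 1)           # 1d input grid
x1, y1 = np.meshgrid(dom, dom)            # 2d input grid
z = np.cos(x1) * np.sin(y1)               # smooth sample

plotdom = np.linspace(-1, 1, 2 * n + 1)   # twice finer evaluation mesh
px, py = np.meshgrid(plotdom, plotdom)
exact = np.cos(px) * np.sin(py)           # reference values on the fine mesh

pts = (x1.ravel(), y1.ravel())
err = {}
for method in ('linear', 'cubic'):
    zi = griddata(pts, z.ravel(), (px, py), method=method)
    # Largest absolute deviation from the exact function
    err[method] = np.nanmax(np.abs(zi - exact))
print(err)
```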
By the way, a very short answer to your original question is in help(interp.interp2d):
| Notes
| -----
| The minimum number of data points required along the interpolation
| axis is ``(k+1)**2``, with k=1 for linear, k=3 for cubic and k=5 for
| quintic interpolation.
For linear interpolation you need at least 4 points along the interpolation axis, i.e. at least 4 unique x and y values have to be present to get a meaningful result. Check these:
nvals = 3 # -> RuntimeWarning
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
nvals = 4 # -> no problem here
x = np.linspace(0,1,10)
y = np.random.randint(low=0,high=nvals,size=x.shape)
z = x
interp.interp2d(x,y,z)
And of course this all ties in to your question like this: it makes a huge difference whether your geometrically 1d data set lies along one of the Cartesian axes, or along a general direction such that both coordinates take on many different values. It's probably meaningless (or at least very ill-defined) to try 2d interpolation from a geometrically 1d data set, but at least the algorithm shouldn't break if your data lie along a general direction in the x,y plane.