问题描述
我努力让这个code工作,我想通过一个numpy的数组迭代,并根据结果,指数在另一numpy的数组中的值,然后保存在基于价值的新位置。
I am struggling to get this code to work I want to iterate through an numpy array and based on the result, index to a value in another numpy array and then save that in a new position based on that value.
# Convert the sediment transport and the flow direction rasters into Numpy arrays
sediment_transport_np = arcpy.RasterToNumPyArray(sediment_transport_convert, '#', '#', '#', -9999)
flow_direction_np = arcpy.RasterToNumPyArray(flow_direction_convert, '#', '#', '#', -9999)
[rows,cols]= sediment_transport_np.shape
elevation_change = np.zeros((rows,cols), np.float)
# Main body for calculating elevation change
# Attempt 1
for [i, j], flow in np.ndenumerate(flow_direction_np):
if flow == 32:
elevation_change[i, j] = sediment_transport_np[i - 1, j - 1]
elif flow == 16:
elevation_change[i, j] = sediment_transport_np[i, j - 1]
elif flow == 8:
elevation_change[i, j] = sediment_transport_np[i + 1, j - 1]
elif flow == 4:
elevation_change[i, j] = sediment_transport_np[i + 1, j]
elif flow == 64:
elevation_change[i, j] = sediment_transport_np[i - 1, j]
elif flow == 128:
elevation_change[i, j] = sediment_transport_np[i - 1, j + 1]
elif flow == 1:
elevation_change[i, j] = sediment_transport_np[i, j + 1]
elif flow == 2:
elevation_change[i, j] = sediment_transport_np[i + 1, j + 1]
转换的numpy的阵列回到光栅
elevation_change_raster = arcpy.NumPyArrayToRaster(elevation_change, bottom_left_corner, raster_cell_width, raster_cell_height, -9999)
elevation_change_raster.save(output_raster)
我得到的错误是:
The error I get is:
行书elevation_change ...
Running script elevation_change...
回溯(最近通话最后一个):文件,行606,在执行IndexError:指数(655)超出范围(0℃=指数< 655)的尺寸0
Traceback (most recent call last): File "", line 606, in execute IndexError: index (655) out of range (0<=index<655) in dimension 0
未能执行(elevation_change)
Failed to execute (elevation_change)
推荐答案
这个错误是因为你试图超越 sediment_transport
栅格的界限指标(例如第i + 1和j + 1的部分)。现在,你正在试图获得,当你在网格的边界是不存在的价值。此外,它不是引发错误,但你抓住当前相对边缘,当你在I = 0或j = 0(由于I-1和J-1的部分)。
Cause of the Problem
The error is because you're trying to index beyond the bounds of the sediment_transport
grid (e.g. the i+1 and j+1 portions). Right now, you're trying to get a value that doesn't exist when you're at a boundary of the grid. Also, it's not raising an error, but you're currently grabbing the opposite edge when you're at i=0 or j=0 (due to the i-1 and j-1 parts).
您提到,您想 elevation_change
的值是在边界0(这当然似乎是合理的)。另一种常见的边界条件是包装的价值观和抓住从相对边缘的值。它可能没有多大意义,在这种情况下,但我会在一两个例子说明它,因为它很容易用一些方法来实现。
You mentioned that you wanted the values of elevation_change
to be 0 at the boundaries (which certainly seems reasonable). Another common boundary condition is to "wrap" the values and grab a value from the opposite edge. It probably doesn't make much sense in this case, but I'll show it in a couple of examples because it's easy to implement with some of the methods.
这是诱人的,只是捕获异常并将其值设置为0。例如:
It's tempting to just catch the exception and set the value to 0. For example:
for [i, j], flow in np.ndenumerate(flow_direction_np):
try:
if flow == 32:
...
elif ...
...
except IndexError:
elevation_change[i, j] = 0
但是,这种方法实际上是不正确。负索引是有效的,并且将返回格的相对边缘。因此,这将基本上实现在右侧和电网的底部边缘的零的边界条件,并在左和顶部边缘的环绕式边界条件
However, this approach is actually incorrect. Negative indexing is valid, and will return the opposite edge of the grid. Therefore, this would basically implement a "zero" boundary condition on the right and bottom edges of the grid, and a "wrap-around" boundary condition on the left and top edges.
在零的边界条件的情况下,有一个很简单的方法来避免索引问题:垫在 sediment_transport
电网零。这样,如果我们的索引超出了原有电网的优势,我们会得到一个0(或任何恒定值要垫阵列)。
In the case of "zero" boundary conditions, there's a very simple way to avoid indexing problems: Pad the sediment_transport
grid with zeros. This way, if we index beyond the edge of the original grid, we'll get a 0. (Or whatever constant value you'd like to pad the array with.)
的附注:这是使用 numpy.pad
的理想场所。然而,它在V1.7溶液。我要在这里用它来跳过,因为OP提到的ArcGIS和圆弧不与先进的最新版本numpy的出货。的
Side note: This is the perfect place to use numpy.pad
. However, it was added in v1.7. I'm going to skip using it here, as the OP mentions ArcGIS, and Arc doesn't ship with an up-to-date version of numpy.
例如:
padded_transport = np.zeros((rows + 2, cols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport
# The two lines above could be replaced with:
#padded_transport = np.pad(sediment_transport, 1, mode='constant')
for [i, j], flow in np.ndenumerate(flow_direction):
# Need to take into account the offset in the "padded_transport"
r, c = i + 1, j + 1
if flow == 32:
elevation_change[i, j] = padded_transport[r - 1, c - 1]
elif flow == 16:
elevation_change[i, j] = padded_transport[r, c - 1]
elif flow == 8:
elevation_change[i, j] = padded_transport[r + 1, c - 1]
elif flow == 4:
elevation_change[i, j] = padded_transport[r + 1, c]
elif flow == 64:
elevation_change[i, j] = padded_transport[r - 1, c]
elif flow == 128:
elevation_change[i, j] = padded_transport[r - 1, c + 1]
elif flow == 1:
elevation_change[i, j] = padded_transport[r, c + 1]
elif flow == 2:
elevation_change[i, j] = padded_transport[r + 1, c + 1]
DRY(不要重复自己)
我们可以用更简洁写这篇code有点字典
:
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
padded_transport = np.zeros((nrows + 2, ncols + 2), float)
padded_transport[1:-1, 1:-1] = sediment_transport
for [i, j], flow in np.ndenumerate(flow_direction):
# Need to take into account the offset in the "padded_transport"
r, c = i + 1, j + 1
# This also allows for flow_direction values not listed above...
dr, dc = lookup.get(flow, (0,0))
elevation_change[i,j] = padded_transport[r + dr, c + dc]
此时,它的原始阵列的位多余的垫。实施不同的边界条件,如果您使用的填充是非常容易 numpy.pad
,但我们可以刚出直接写的逻辑:
At this point, it's a bit superfluous to pad the original array. Implement different boundary conditions by padding is very easy if you use numpy.pad
, but we could just write the logic out directly:
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
for [i, j], flow in np.ndenumerate(flow_direction):
dr, dc = lookup.get(flow, (0,0))
r, c = i + dr, j + dc
if not ((0 <= r < nrows) & (0 <= c < ncols)):
elevation_change[i,j] = 0
else:
elevation_change[i,j] = sediment_transport[r, c]
矢量化的计算
通过迭代在python numpy的阵列是原因,我不会深入到这里相当缓慢。因此,也有在numpy的实现该更有效的方法。诀窍是使用 numpy.roll
与布尔索引一起。
有关环绕式的边界条件,它是那样简单:
For "wrap-around" boundary conditions, it's as simple as:
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
for value, (row, col) in lookup.iteritems():
mask = flow_direction == value
shifted = np.roll(mask, row, 0)
shifted = np.roll(shifted, col, 1)
elevation_change[mask] = sediment_transport[shifted]
return elevation_change
如果你不熟悉numpy的,这可能看起来有点像希腊。有两个部分此。首先是使用布尔索引。由于这是什么做一个简单的例子:
If you're not familiar with numpy, this probably looks a bit like greek. There are two parts to this. The first is using boolean indexing. As a quick example of what this does:
In [1]: import numpy as np
In [2]: x = np.arange(9).reshape(3,3)
In [3]: x
Out[3]:
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
In [4]: mask = np.array([[False, False, True],
... [True, False, False],
... [True, False, False]])
In [5]: x[mask]
Out[5]: array([2, 3, 6])
正如你所看到的,如果我们的索引具有相同形状的布尔网格阵列,它是真正的价值,将被退回。同样,您可以设置的值这个样子。
As you can see, if we index an array with a boolean grid of the same shape, the values where it is True, will be returned. Similarly, you can set values this way.
接下来的诀窍是 numpy.roll
。这将值由给定的量在一个方向上移动。他们将环绕式的边缘。
The next trick is numpy.roll
. This will shift the values by a given amount in one direction. They'll "wrap-around" at the edges.
In [1]: import numpy as np
In [2]: np.array([[0,0,0],[0,1,0],[0,0,0]])
Out[2]:
array([[0, 0, 0],
[0, 1, 0],
[0, 0, 0]])
In [3]: x = _
In [4]: np.roll(x, 1, axis=0)
Out[4]:
array([[0, 0, 0],
[0, 0, 0],
[0, 1, 0]])
In [5]: np.roll(x, 1, axis=1)
Out[5]:
array([[0, 0, 0],
[0, 0, 1],
[0, 0, 0]])
希望这使得有点感觉,在任何速度。
Hopefully that makes a bit of sense, at any rate.
要实现(使用 numpy.pad
或任意边界条件),我们会做这样的事零的边界条件:
To implement "zero" boundary conditions (or arbitrary boundary conditions using numpy.pad
), we'd do something like this:
def vectorized(flow_direction, sediment_transport):
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
# Initialize an array for the "shifted" mask
shifted = np.zeros((nrows+2, ncols+2), dtype=bool)
# Pad "sediment_transport" with zeros
# Again, `np.pad` would be better and more flexible here, as it would
# easily allow lots of different boundary conditions...
tmp = np.zeros((nrows+2, ncols+2), sediment_transport.dtype)
tmp[1:-1, 1:-1] = sediment_transport
sediment_transport = tmp
for value, (row, col) in lookup.iteritems():
mask = flow_direction == value
# Reset the "shifted" mask
shifted.fill(False)
shifted[1:-1, 1:-1] = mask
# Shift the mask by the right amount for the given value
shifted = np.roll(shifted, row, 0)
shifted = np.roll(shifted, col, 1)
# Set the values in elevation change to the offset value in sed_trans
elevation_change[mask] = sediment_transport[shifted]
return elevation_change
优势向矢量
在矢量的版本快很多,但是会使用更多的内存。
Advantage to Vectorization
The "vectorized" version is much faster, but will use more RAM.
对于1000通过1000网:
For a 1000 by 1000 grid:
In [79]: %timeit vectorized(flow_direction, sediment_transport)
10 loops, best of 3: 170 ms per loop
In [80]: %timeit iterate(flow_direction, sediment_transport)
1 loops, best of 3: 5.36 s per loop
In [81]: %timeit lookup(flow_direction, sediment_transport)
1 loops, best of 3: 3.4 s per loop
这些结果是从比较用随机生成的数据如下实现:
These results are from comparing the following implementations with randomly generated data:
import numpy as np
def main():
# Generate some random flow_direction and sediment_transport data...
nrows, ncols = 1000, 1000
flow_direction = 2 ** np.random.randint(0, 8, (nrows, ncols))
sediment_transport = np.random.random((nrows, ncols))
# Make sure all of the results return the same thing...
test1 = vectorized(flow_direction, sediment_transport)
test2 = iterate(flow_direction, sediment_transport)
test3 = lookup(flow_direction, sediment_transport)
assert np.allclose(test1, test2)
assert np.allclose(test2, test3)
def vectorized(flow_direction, sediment_transport):
elevation_change = np.zeros_like(sediment_transport)
sediment_transport = np.pad(sediment_transport, 1, mode='constant')
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
for value, (row, col) in lookup.iteritems():
mask = flow_direction == value
shifted = np.pad(mask, 1, mode='constant')
shifted = np.roll(shifted, row, 0)
shifted = np.roll(shifted, col, 1)
elevation_change[mask] = sediment_transport[shifted]
return elevation_change
def iterate(flow_direction, sediment_transport):
elevation_change = np.zeros_like(sediment_transport)
padded_transport = np.pad(sediment_transport, 1, mode='constant')
for [i, j], flow in np.ndenumerate(flow_direction):
r, c = i + 1, j + 1
if flow == 32:
elevation_change[i, j] = padded_transport[r - 1, c - 1]
elif flow == 16:
elevation_change[i, j] = padded_transport[r, c - 1]
elif flow == 8:
elevation_change[i, j] = padded_transport[r + 1, c - 1]
elif flow == 4:
elevation_change[i, j] = padded_transport[r + 1, c]
elif flow == 64:
elevation_change[i, j] = padded_transport[r - 1, c]
elif flow == 128:
elevation_change[i, j] = padded_transport[r - 1, c + 1]
elif flow == 1:
elevation_change[i, j] = padded_transport[r, c + 1]
elif flow == 2:
elevation_change[i, j] = padded_transport[r + 1, c + 1]
return elevation_change
def lookup(flow_direction, sediment_transport):
elevation_change = np.zeros_like(sediment_transport)
nrows, ncols = flow_direction.shape
lookup = {32: (-1, -1),
16: (0, -1),
8: (1, -1),
4: (1, 0),
64: (-1, 0),
128:(-1, 1),
1: (0, 1),
2: (1, 1)}
for [i, j], flow in np.ndenumerate(flow_direction):
dr, dc = lookup.get(flow, (0,0))
r, c = i + dr, j + dc
if not ((0 <= r < nrows) & (0 <= c < ncols)):
elevation_change[i,j] = 0
else:
elevation_change[i,j] = sediment_transport[r, c]
return elevation_change
if __name__ == '__main__':
main()
这篇关于通过numpy的数组迭代,然后在另一个数组索引值的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!