Type casting error with numpy.take

Question

I have a look-up table (LUT) that stores 65536 uint8 values:

import numpy as np
lut = np.random.randint(256, size=(65536,)).astype('uint8')

I want to use this LUT to convert the values in an array of uint16s:

arr = np.random.randint(65536, size=(1000, 1000)).astype('uint16')

I want to do the conversion in place, because this array can get pretty big. When I try it, the following happens:

>>> np.take(lut, arr, out=arr)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\site-packages\numpy\core\fromnumeric.py", line 103, in take
    return take(indices, axis, out, mode)
TypeError: array cannot be safely cast to required type

I don't understand what is going on. I know that, without an out argument, the result has the same dtype as lut, so uint8. But why can't a uint8 be cast to a uint16? If you ask numpy:

>>> np.can_cast('uint8', 'uint16')
True

Obviously, the following works:

>>> lut = lut.astype('uint16')
>>> np.take(lut, arr, out=arr)
array([[173, 251, 218, ..., 110,  98, 235],
       [200, 231,  91, ..., 158, 100,  88],
       [ 13, 227, 223, ...,  94,  56,  36],
       ..., 
       [ 28, 198,  80, ...,  60,  87, 118],
       [156,  46, 118, ..., 212, 198, 218],
       [203,  97, 245, ...,   3, 191, 173]], dtype=uint16)

But this also works:

>>> lut = lut.astype('int32')
>>> np.take(lut, arr, out=arr)
array([[ 78, 249, 148, ...,  77,  12, 167],
       [138,   5, 206, ...,  31,  43, 244],
       [ 29, 134, 131, ..., 100, 107,   1],
       ..., 
       [109, 166,  14, ...,  64,  95, 102],
       [152, 169, 102, ..., 240, 166, 148],
       [ 47,  14, 129, ..., 237,  11,  78]], dtype=uint16)

This really makes no sense, since now int32s are being cast to uint16s, which is definitely not a safe thing to do:

>>> np.can_cast('int32', 'uint16')
False

My code works if I set the lut's dtype to any of uint16, uint32, uint64, int32 or int64, but fails for uint8, int8 and int16.

Am I missing something, or is this simply broken in numpy?

Workarounds are also welcome. Since the LUT is not that big, I guess it is not that bad to have its type match the array's, even if that takes twice the space, but it just doesn't feel right to do that.

Is there a way to tell numpy not to worry about casting safety?

Recommended answer

Interesting problem. numpy.take(lut, ...) gets transformed into lut.take(...), whose source can be viewed here:

https://github.com/numpy/numpy/blob/master/numpy/core/src/multiarray/item_selection.c#L28

I believe the exception is raised at line 105:

obj = (PyArrayObject *)PyArray_FromArray(out, dtype, flags);
if (obj == NULL) {
    goto fail;
}

where in your case out is arr but dtype is that of lut, i.e. uint8. So it tries to cast arr to uint8, which fails. I have to say that I'm not sure why it needs to do that; I'm just pointing out that it does. For some reason take seems to assume you want the output array to have the same dtype as lut.
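If the check really runs from the dtype of out (uint16) to the dtype of lut, as the explanation above suggests, then the pattern reported in the question should match numpy's "safe" casting rules in that direction. The following is a minimal sketch of that reading, not a trace of what take does internally; the names lut_dtypes and castable are illustrative only.

```python
import numpy as np

# The LUT dtypes tried in the question.
lut_dtypes = ["uint8", "int8", "int16", "uint16", "uint32", "uint64", "int32", "int64"]

# Assumption: the relevant check is "can out's dtype (uint16) be safely
# cast to lut's dtype?", i.e. the reverse of what the question tested.
castable = {name: bool(np.can_cast("uint16", name)) for name in lut_dtypes}

for name, ok in castable.items():
    print(f"uint16 -> {name}: {ok}")
```

Under this reading, the dtypes that cannot safely hold every uint16 value are exactly the ones the question reports as failing (uint8, int8 and int16), which is consistent with the cast going from arr to lut's dtype rather than the other way round.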

By the way, in many cases the call to PyArray_FromArray will actually create a new array, and the replacement will not happen in place. This is the case, for example, if you call take with mode='raise' (the default, and what happens in your examples), or whenever lut.dtype != arr.dtype. Well, at least it should be, and I can't explain why, when you cast lut to int32, the output array remains uint16! This is a mystery to me; maybe it has something to do with the NPY_ARRAY_UPDATEIFCOPY flag.

Bottom line:

  1. The behavior of numpy is indeed difficult to understand. Maybe someone else will provide some insight into why it does what it does.
  2. I would not try to process arr in place, since it seems that a new array is created under the hood in most cases anyway. I'd simply go with arr = lut.take(arr), which by the way will eventually free half of the memory previously used by arr.
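The suggested workaround can be sketched as follows, reusing the lut and arr definitions from the question. The extra names old_nbytes and matches_indexing are only there for illustration, and the comparison with lut[arr] is an added sanity check rather than part of the original answer.

```python
import numpy as np

lut = np.random.randint(256, size=(65536,)).astype("uint8")
arr = np.random.randint(65536, size=(1000, 1000)).astype("uint16")

old_nbytes = arr.nbytes

# Sanity check: take with an index array gives the same result as
# fancy indexing the LUT with that array.
matches_indexing = np.array_equal(lut.take(arr), lut[arr])

# Not in place: a new uint8 array is created and the name is rebound.
# The old uint16 array can be freed once nothing else references it.
arr = lut.take(arr)

print(arr.dtype, arr.shape, old_nbytes, arr.nbytes)
```

The result keeps the shape of the index array and the dtype of lut, so it occupies half as many bytes as the original uint16 array. Both arrays do exist at the same time while the call runs, so peak memory is higher than for a true in-place conversion.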

10-09 23:12