numpy: best way to remove the last element from a one-dimensional array?

Question

What is the most efficient way to remove the last element from a one-dimensional NumPy array (like pop for a list)?

Recommended answer

NumPy arrays have a fixed size, so you cannot remove an element in place. For example, using del doesn't work:

>>> import numpy as np
>>> arr = np.arange(5)
>>> del arr[-1]
ValueError: cannot delete array elements

Note that the index -1 refers to the last element. That's because negative indices in Python (and NumPy) are counted from the end, so -1 is the last element, -2 is the second to last, and -len(arr) is the first element. That's just for your information, in case you didn't know.
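
A quick check of the negative-index rule, as a minimal sketch (the array contents are arbitrary and only serve as an illustration):

```python
import numpy as np

arr = np.arange(5)            # array([0, 1, 2, 3, 4])

last = arr[-1]                # same element as arr[len(arr) - 1]
second_to_last = arr[-2]      # same element as arr[len(arr) - 2]
first = arr[-len(arr)]        # same element as arr[0]

print(last, second_to_last, first)   # 4 3 0
```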

Python lists are variable sized so it's easy to add or remove elements.
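
To make the contrast concrete, here is a small sketch comparing a list with an array. The variable names are illustrative only:

```python
import numpy as np

items = [0, 1, 2, 3, 4]
popped = items.pop()          # removes the last element in place and returns it
print(popped, items)          # 4 [0, 1, 2, 3]

arr = np.arange(5)
pop_error = None
try:
    arr.pop()                 # ndarray has no pop() method
except AttributeError as exc:
    pop_error = exc
print(arr.size)               # 5, the array is unchanged
```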

So if you want to remove an element you need to create a new array or view.

You can create a new view containing all elements except the last one using the slice notation:

>>> arr = np.arange(5)
>>> arr
array([0, 1, 2, 3, 4])

>>> arr[:-1]  # all but the last element
array([0, 1, 2, 3])
>>> arr[:-2]  # all but the last two elements
array([0, 1, 2])
>>> arr[1:]   # all but the first element
array([1, 2, 3, 4])
>>> arr[1:-1] # all but the first and last element
array([1, 2, 3])

However, a view shares its data with the original array, so modifying one also modifies the other:

>>> sub = arr[:-1]
>>> sub
array([0, 1, 2, 3])
>>> sub[0] = 100
>>> sub
array([100,   1,   2,   3])
>>> arr
array([100,   1,   2,   3,   4])
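
If you are unsure whether two arrays share data, one way to check (a sketch, not the only option) is np.shares_memory together with the base attribute of the view:

```python
import numpy as np

arr = np.arange(5)
sub = arr[:-1]                        # basic slicing returns a view

print(sub.base is arr)                # True: the view keeps a reference to arr
print(sub.flags.owndata)              # False: the view does not own its buffer
print(np.shares_memory(arr, sub))     # True
```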

Creating a new array

1. Copy the view

If you don't want this memory sharing, you have to create a new array. In this case it's probably simplest to create a view and then copy it (for example, using the copy() method of arrays):

>>> arr = np.arange(5)
>>> arr
array([0, 1, 2, 3, 4])
>>> sub_arr = arr[:-1].copy()
>>> sub_arr
array([0, 1, 2, 3])
>>> sub_arr[0] = 100
>>> sub_arr
array([100,   1,   2,   3])
>>> arr
array([0, 1, 2, 3, 4])

2. Using integer array indexing

However, you can also use integer array indexing to remove the last element and get a new array. Integer array indexing is a form of advanced indexing, which always creates a copy and never a view:

>>> arr = np.arange(5)
>>> arr
array([0, 1, 2, 3, 4])
>>> indices_to_keep = [0, 1, 2, 3]
>>> sub_arr = arr[indices_to_keep]
>>> sub_arr
array([0, 1, 2, 3])
>>> sub_arr[0] = 100
>>> sub_arr
array([100,   1,   2,   3])
>>> arr
array([0, 1, 2, 3, 4])
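
The copy-versus-view difference can also be checked directly. This is a small sketch using np.shares_memory; the variable names are only for illustration:

```python
import numpy as np

arr = np.arange(5)
by_slice = arr[:-1]                         # basic slicing
by_index = arr[np.arange(arr.size - 1)]     # integer array indexing

print(np.shares_memory(arr, by_slice))      # True: a view
print(np.shares_memory(arr, by_index))      # False: a copy
```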

This integer array indexing can be useful to remove arbitrary elements from an array (which can be tricky or impossible when you want a view):

>>> arr = np.arange(5, 10)
>>> arr
array([5, 6, 7, 8, 9])
>>> arr[[0, 1, 3, 4]]  # keep first, second, fourth and fifth element
array([5, 6, 8, 9])

If you want a generalized function that removes the last element using integer array indexing:

def remove_last_element(arr):
    return arr[np.arange(arr.size - 1)]

3. Using boolean array indexing

Boolean indexing can also be used, for example:

>>> arr = np.arange(5, 10)
>>> arr
array([5, 6, 7, 8, 9])
>>> keep = [True, True, True, True, False]
>>> arr[keep]
array([5, 6, 7, 8])

This also creates a copy! A generalized approach could look like this:

def remove_last_element(arr):
    if not arr.size:
        raise IndexError('cannot remove last element of empty array')
    keep = np.ones(arr.shape, dtype=bool)
    keep[-1] = False
    return arr[keep]
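
The same mask idea extends to removing an element at any position. The helper below, remove_at, is a hypothetical name and not part of NumPy; it is a sketch that assumes a one-dimensional input:

```python
import numpy as np

def remove_at(arr, index):
    """Return a copy of a 1-D array without the element at `index` (hypothetical helper)."""
    if arr.ndim != 1:
        raise ValueError('expected a one-dimensional array')
    if not -arr.size <= index < arr.size:
        raise IndexError('index out of range')
    keep = np.ones(arr.size, dtype=bool)   # start by keeping everything
    keep[index] = False                    # drop exactly one position
    return arr[keep]                       # boolean indexing returns a copy

arr = np.arange(5, 10)
print(remove_at(arr, -1))   # [5 6 7 8]
print(remove_at(arr, 2))    # [5 6 8 9]
```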

If you would like more information on NumPy's indexing, the documentation on "Indexing" is quite good and covers a lot of cases.

4. Using np.delete()

Normally I wouldn't recommend the NumPy functions that "seem" like they modify the array in place (like np.append and np.insert) but actually return copies, because they are generally needlessly slow and misleading. You should avoid them whenever possible, which is why this comes so late in my answer. However, in this case it's actually a perfect fit, so I have to mention it:

>>> arr = np.arange(10, 20)
>>> arr
array([10, 11, 12, 13, 14, 15, 16, 17, 18, 19])
>>> np.delete(arr, -1)
array([10, 11, 12, 13, 14, 15, 16, 17, 18])
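
A short sketch confirming that np.delete leaves the input untouched and returns a new array. The list-of-indices call at the end is an extra illustration that is not part of the example above:

```python
import numpy as np

arr = np.arange(10, 15)
shorter = np.delete(arr, -1)

print(shorter)                            # [10 11 12 13]
print(arr)                                # [10 11 12 13 14], unchanged
print(np.shares_memory(arr, shorter))     # False: the result is a copy

# np.delete also accepts several indices at once
trimmed = np.delete(arr, [0, -1])
print(trimmed)                            # [11 12 13]
```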

5. Using np.resize()

NumPy has another function that sounds like it operates in place but really returns a new array:

>>> arr = np.arange(5)
>>> arr
array([0, 1, 2, 3, 4])
>>> np.resize(arr, arr.size - 1)
array([0, 1, 2, 3])

To remove the last element, I simply passed a new size that is one smaller than before.
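
One thing to keep in mind with np.resize, sketched below: when the requested size is larger than the input, the function pads the result with repeated copies of the data, so it only acts as a "remove" operation when shrinking.

```python
import numpy as np

arr = np.arange(5)

smaller = np.resize(arr, arr.size - 1)
print(smaller)                            # [0 1 2 3]
print(np.shares_memory(arr, smaller))     # False: a new array

larger = np.resize(arr, arr.size + 2)
print(larger)                             # [0 1 2 3 4 0 1], filled with repeats
```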

Yes, I wrote earlier that you cannot modify an array in place. I said that because in most cases it's either not possible or only possible by disabling some (very useful) safety checks. I'm not sure about the internals, but depending on the old and new size this could involve an (internal-only) copy operation, so it might be slower than creating a view.

If the array doesn't share its memory with any other array, then it's possible to resize the array in place:

>>> arr = np.arange(5, 10)
>>> arr.resize(4)
>>> arr
array([5, 6, 7, 8])

However, that raises a ValueError if the array references or is referenced by another array:

>>> arr = np.arange(5)
>>> view = arr[1:]
>>> arr.resize(4)
ValueError: cannot resize an array that references or is referenced by another array in this way.  Use the resize function
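
Rather than reaching for refcheck=False, one defensive pattern is to attempt the in-place resize and fall back to a sliced copy when NumPy refuses. This is only a sketch of that idea, run at module level like the console session above:

```python
import numpy as np

arr = np.arange(5)
view = arr[1:]                 # a second array that references arr's memory

try:
    arr.resize(arr.size - 1)   # in-place attempt; refcheck=True is the default
except ValueError:
    # Safe fallback: a sliced copy that owns its own memory.
    arr = arr[:-1].copy()

print(arr)                     # [0 1 2 3]
```

With the fallback, the name arr is rebound to a new array, so the existing view keeps pointing at the old, untouched buffer.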

You can disable that safety check by setting refcheck=False, but that shouldn't be done lightly, because you make yourself vulnerable to segmentation faults and memory corruption if the other reference tries to access the removed elements! The refcheck argument should be treated as an expert-only option!

Creating a view is really fast and doesn't take much additional memory, so you should work with views whenever possible. However, depending on the use case, it's not always easy to remove arbitrary elements using basic slicing. It's easy to remove the first n elements and/or the last n elements, or to drop elements at a regular interval (the step argument of a slice), but that is all you can do with it.
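
The slicing patterns mentioned here can be sketched as follows. All of them are views, which the final check confirms:

```python
import numpy as np

arr = np.arange(10)

without_first_two = arr[2:]     # [2 3 4 5 6 7 8 9]
without_last_two = arr[:-2]     # [0 1 2 3 4 5 6 7]
every_second = arr[::2]         # [0 2 4 6 8], step argument of the slice
odd_positions = arr[1::2]       # [1 3 5 7 9]

views = [without_first_two, without_last_two, every_second, odd_positions]
print(all(np.shares_memory(arr, v) for v in views))   # True
```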

But in your case of removing the last element of a one-dimensional array I would recommend:

arr[:-1]          # if you want a view
arr[:-1].copy()   # if you want a new array

because these most clearly express the intent and everyone with Python/NumPy experience will recognize that.
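
If you want something that reads like list.pop, a thin wrapper around these two expressions is enough. The name pop_last is hypothetical, and the sketch returns a view, so add .copy() if you need independent data:

```python
import numpy as np

def pop_last(arr):
    """Hypothetical helper: return (last element, view without it) for a non-empty 1-D array."""
    if arr.ndim != 1 or arr.size == 0:
        raise ValueError('expected a non-empty one-dimensional array')
    return arr[-1], arr[:-1]

arr = np.arange(5)
last, arr = pop_last(arr)       # rebind the name, since the array itself cannot shrink
print(last, arr)                # 4 [0 1 2 3]
```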

Based on the timing framework from this answer:

# Setup
import numpy as np

def view(arr):
    return arr[:-1]

def array_copy_view(arr):
    return arr[:-1].copy()

def array_int_index(arr):
    return arr[np.arange(arr.size - 1)]

def array_bool_index(arr):
    if not arr.size:
        raise IndexError('cannot remove last element of empty array')
    keep = np.ones(arr.shape, dtype=bool)
    keep[-1] = False
    return arr[keep]

def array_delete(arr):
    return np.delete(arr, -1)

def array_resize(arr):
    return np.resize(arr, arr.size - 1)

# Timing setup
timings = {view: [],
           array_copy_view: [], array_int_index: [], array_bool_index: [],
           array_delete: [], array_resize: []}
sizes = [2**i for i in range(1, 20, 2)]

# Timing
for size in sizes:
    print(size)
    func_input = np.random.random(size=size)
    for func in timings:
        print(func.__name__.ljust(20), ' ', end='')
        res = %timeit -o func(func_input)   # if you use IPython, otherwise use the "timeit" module
        timings[func].append(res)

# Plotting
%matplotlib notebook

import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure(1)
ax = plt.subplot(111)

for func in timings:
    ax.plot(sizes,
            [time.best for time in timings[func]],
            label=func.__name__)
ax.set_xscale('log')
ax.set_yscale('log')
ax.set_xlabel('size')
ax.set_ylabel('time [seconds]')
ax.grid(which='both')
ax.legend()
plt.tight_layout()
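
Outside IPython the %timeit magic is not available. A rough equivalent with the standard timeit module might look like the sketch below; the best_time helper, the chosen input size, and the repeat counts are assumptions for illustration and do not reproduce the setup used for the timings reported here:

```python
import timeit
import numpy as np

def view(arr):
    return arr[:-1]

def array_copy_view(arr):
    return arr[:-1].copy()

def array_delete(arr):
    return np.delete(arr, -1)

def best_time(func, arr, number=1000, repeat=5):
    """Best per-call time in seconds over `repeat` runs of `number` calls each."""
    runs = timeit.repeat(lambda: func(arr), number=number, repeat=repeat)
    return min(runs) / number

func_input = np.random.random(size=2**10)
results = {func.__name__: best_time(func, func_input)
           for func in (view, array_copy_view, array_delete)}

for name, seconds in results.items():
    print(name.ljust(20), f'{seconds:.3e} s')
```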

I get the following timings, shown as a log-log plot to cover all the details. A lower time still means faster, but the distance between two ticks represents one order of magnitude instead of a fixed amount. In case you're interested in the specific values, I copied them into this gist:

According to these timings, those two approaches (the view and the copied view) are also the fastest. (Python 3.6 and NumPy 1.14.0)
