Combining two arrays and sorting

Problem description

Given two sorted arrays like the following:

a = array([1,2,4,5,6,8,9])

b = array([3,4,7,10])

I'd like the output to be:

c = array([1,2,3,4,5,6,7,8,9,10])

or:

c = array([1,2,3,4,4,5,6,7,8,9,10])

I'm aware that I can do the following:

c = unique(concatenate((a,b)))

I'm just wondering if there is a faster way to do it as the arrays I'm dealing with have millions of elements.

Any idea is welcomed. Thanks

Recommended answer

Since you use numpy, I doubt that bisect helps you at all... So instead I would suggest two smaller things:

1. Do not use np.sort; use the c.sort() method instead, which sorts the array in place and avoids the copy.
2. np.unique must use np.sort, which is not in place. So instead of using np.unique, do the logic by hand, i.e. first sort (in place), then do what np.unique does by hand (also check its Python code): flag = np.concatenate(([True], ar[1:] != ar[:-1])), with which unique = ar[flag] (with ar being sorted). To be a bit better, you should probably do the flag operation in place as well, i.e. flag = np.ones(len(ar), dtype=bool) and then np.not_equal(ar[1:], ar[:-1], out=flag[1:]), which basically avoids one full copy of flag.

I am not sure about this, but .sort has 3 different algorithms (chosen via the kind argument); since your arrays may be almost sorted already, changing the sorting method might make a speed difference.
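As a small worked illustration of the by-hand np.unique logic described above, the sketch below builds the flag mask both ways on an already sorted array. The array ar and its values are made up for the example; only the two flag expressions come from the answer.

```python
import numpy as np

# Hypothetical sorted input with duplicates (values chosen for illustration).
ar = np.array([1, 2, 2, 3, 3, 3, 7])

# Variant 1: build the mask with concatenate (allocates a temporary array).
flag_a = np.concatenate(([True], ar[1:] != ar[:-1]))

# Variant 2: preallocate the mask and write the comparison straight into it.
flag_b = np.ones(len(ar), dtype=bool)
np.not_equal(ar[1:], ar[:-1], out=flag_b[1:])

# An element is kept when it differs from its predecessor; the first is always kept.
unique_a = ar[flag_a]
unique_b = ar[flag_b]
print(unique_b)  # [1 2 3 7]
```

Both variants produce the same mask; the second one only saves the intermediate array that concatenate would create.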

Put together, this gives something close to what you have (without doing a unique beforehand):

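import numpy as np

def insort(a, b, kind='mergesort'):
    # mergesort seemed a tiny bit faster in my test on a large, already sorted array.
    c = np.concatenate((a, b))  # we still need this copy, unfortunately
    c.sort(kind=kind)           # in-place sort, no extra copy
    flag = np.ones(len(c), dtype=bool)
    # flag[i] is True where c[i] differs from its predecessor (the first element is always kept)
    np.not_equal(c[1:], c[:-1], out=flag[1:])
    return c[flag]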
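
For a quick sanity check, the sketch below runs a copy of insort on the arrays from the question and compares the result with the np.unique(np.concatenate(...)) baseline for each sort kind. The function is repeated only so the snippet runs on its own; timings are left out on purpose, since they depend on the machine and on how nearly sorted the data is.

```python
import numpy as np

def insort(a, b, kind='mergesort'):
    # Same logic as the function above, repeated so this snippet is self-contained.
    c = np.concatenate((a, b))
    c.sort(kind=kind)
    flag = np.ones(len(c), dtype=bool)
    np.not_equal(c[1:], c[:-1], out=flag[1:])
    return c[flag]

# Arrays from the question.
a = np.array([1, 2, 4, 5, 6, 8, 9])
b = np.array([3, 4, 7, 10])

# Baseline from the question, used here only as a reference result.
expected = np.unique(np.concatenate((a, b)))

results = {kind: insort(a, b, kind=kind)
           for kind in ('quicksort', 'mergesort', 'heapsort')}

for kind, c in results.items():
    # Every sort kind should give the same sorted, deduplicated array.
    print(kind, c, np.array_equal(c, expected))
```

To see whether the kind argument matters for your data, the same loop could be wrapped in timeit on arrays of realistic size.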
