python - 最近邻算法

好的，所以我对编程很陌生。我正在使用 Python 2.7，我的下一个目标是实现最近邻算法的一些轻型版本(请注意，我不是在谈论 k-最近邻)。我尝试了很多方法，其中一些方法很接近，但我似乎仍然无法确定。

首先， - 我使用一个数组来表示我的事件矩阵，这是一个好主意吗？我在这里考虑 numpys 数组和矩阵。

其次， - 我知道算法( Pseudocode )，它非常简单。但我绝对可以使用某种kickstart。我对完整的实现不感兴趣，但我现在有库存，我正在寻求一些帮助。

也许以上内容不足以澄清我的问题，但请随时提问。

先感谢您。

好的，所以现在我又试了一次。看来我已经明白了， - 无论如何，这就是我所做的。我对结果很满意，因为我是新手。但我相信你有一些提示或改进。

import numpy as np
import copy

'''
                        NEAREST NEIGHBOUR ALGORITHM
                        ---------------------------


The algorithm takes two arguments. The first one is an array, with elements
being lists/column-vectors from the given complete incidensmatrix. The second
argument is an integer which represents the startingnode where 1 is the
smallest. The program will only make sense, if the triangle inequality is satisfied.
Furthermore, diagonal elements needs to be inf. The pseudocode is listed below:


1. - stand on an arbitrary vertex as current vertex.
2. - find out the shortest edge connecting current vertex and an unvisited vertex V.
3. - set current vertex to V.
4. - mark V as visited.
5. - if all the vertices in domain are visited, then terminate.
6. - Go to step 2.

The sequence of the visited vertices is the output of the algorithm

Remark - infinity is entered as np.inf
'''



def NN(A, start):

    start = start-1 #To compensate for the python index starting at 0.
    n = len(A)
    path = [start]
    costList = []
    tmp = copy.deepcopy(start)
    B = copy.deepcopy(A)

    #This block eliminates the startingnode, by setting it equal to inf.
    for h in range(n):
        B[h][start] = np.inf

    for i in range(n):

        # This block appends the visited nodes to the path, and appends
        # the cost of the path.
        for j in range(n):
            if B[tmp][j] == min(B[tmp]):
                costList.append(B[tmp][j])
                path.append(j)
                tmp = j
                break

        # This block sets the current node to inf, so it can't be visited again.
        for k in range(n):
            B[k][tmp] = np.inf

    # The last term adds the weight of the edge connecting the start - and endnote.
    cost = sum([i for i in costList if i < np.inf]) + A[path[len(path)-2]][start]

    # The last element needs to be popped, because it is equal to inf.
    path.pop(n)

    # Because we want to return to start, we append this node as the last element.
    path.insert(n, start)

    # Prints the path with original indicies.
    path = [i+1 for i in path]

    print "The path is: ", path
    print "The cost is: ", cost
    print
    return ""

'''
If the desired result is to know the path and cost from every startnode,
then initialize the following method:
'''
def every_node(A):
    for i in range(1, len(A)):
        print NN(A, i)
    return ""

最佳答案

您的解决方案很好，但可以简化并同时提高效率。如果您使用的是 numpy 数组，通常情况下，这些小的内部 for 循环只需几行代码即可替换。结果应该更短并且运行得更快，因为 numpy 使用编译函数来完成它的工作。习惯这种编程风格可能需要一些时间——不是循环遍历数组的每个元素，而是一次对整个数组进行操作。阅读此过程的示例将有所帮助；寻找诸如“如何使 XX 在 numpy 中更高效？”之类的问题。

这是一个示例神经网络实现:

import numpy as np
def NN(A, start):
    """Nearest neighbor algorithm.
    A is an NxN array indicating distance between N locations
    start is the index of the starting location
    Returns the path and cost of the found solution
    """
    path = [start]
    cost = 0
    N = A.shape[0]
    mask = np.ones(N, dtype=bool)  # boolean values indicating which
                                   # locations have not been visited
    mask[start] = False

    for i in range(N-1):
        last = path[-1]
        next_ind = np.argmin(A[last][mask]) # find minimum of remaining locations
        next_loc = np.arange(N)[mask][next_ind] # convert to original location
        path.append(next_loc)
        mask[next_loc] = False
        cost += A[last, next_loc]

    return path, cost

..这里是使用这个函数的一个例子:

# Expected order is 0,2,3,1,4
A = np.array([
    [0, 2, 1, 2, 2],
    [2, 0, 2, 1, 1],
    [1, 2, 0, 1, 2],
    [2, 1, 1, 0, 2],
    [2, 1, 2, 2, 0]])
print NN(A,0)

关于python - 最近邻算法，我们在Stack Overflow上找到一个类似的问题：https://stackoverflow.com/questions/17493494/