问题描述
我正在学习麻省理工学院的本课程.在第一堂课中,教授提出了以下问题:-
I was doing this course on algorithms from MIT. In the very first lecture the professor presents the following problem:-
二维数组中的峰值是一个值,使得它的所有 4 个邻居都小于或等于它,即.对于
A peak in a 2D array is a value such that all it's 4 neighbours are less than or equal to it, ie. for
a[i][j]
为局部最大值,
a[i+1][j] <= a[i][j]
&& a[i-1][j] <= a[i][j]
&& a[i][j+1] <= a[i][j]
&& a[i+1][j-1] <= a[i][j]
现在给定一个 NxN 二维数组,在数组中找到一个峰值.
Now given an NxN 2D array, find a peak in the array.
通过迭代所有元素并返回一个峰值,这个问题可以在 O(N^2)
时间内轻松解决.
This question can be easily solved in O(N^2)
time by iterating over all the elements and returning a peak.
然而,它可以通过使用分而治之的解决方案优化在 O(NlogN)
时间内解决 这里.
However it can be optimized to be solved in O(NlogN)
time by using a divide and conquer solution as explained here.
但是他们说存在一个O(N)
时间算法来解决这个问题.请建议我们如何在 O(N)
时间内解决这个问题.
But they have said that there exists an O(N)
time algorithm that solves this problem. Please suggest how can we solve this problem in O(N)
time.
PS(对于那些懂python的人)课程人员已经解释了一个方法here(问题 1-5.寻峰证明)并且还在他们的问题集中提供了一些 python 代码.但是所解释的方法完全不明显并且很难破译.python 代码同样令人困惑.所以我复制了下面代码的主要部分给那些了解python并且可以从代码中知道使用的是什么算法的人.
PS(For those who know python) The course staff has explained an approach here (Problem 1-5. Peak-Finding Proof) and also provided some python code in their problem sets. But the approach explained is totally non-obvious and very hard to decipher. The python code is equally confusing. So I have copied the main part of the code below for those who know python and can tell what algorithm is being used from the code.
def algorithm4(problem, bestSeen = None, rowSplit = True, trace = None):
# if it's empty, we're done
if problem.numRow <= 0 or problem.numCol <= 0:
return None
subproblems = []
divider = []
if rowSplit:
# the recursive subproblem will involve half the number of rows
mid = problem.numRow // 2
# information about the two subproblems
(subStartR1, subNumR1) = (0, mid)
(subStartR2, subNumR2) = (mid + 1, problem.numRow - (mid + 1))
(subStartC, subNumC) = (0, problem.numCol)
subproblems.append((subStartR1, subStartC, subNumR1, subNumC))
subproblems.append((subStartR2, subStartC, subNumR2, subNumC))
# get a list of all locations in the dividing column
divider = crossProduct([mid], range(problem.numCol))
else:
# the recursive subproblem will involve half the number of columns
mid = problem.numCol // 2
# information about the two subproblems
(subStartR, subNumR) = (0, problem.numRow)
(subStartC1, subNumC1) = (0, mid)
(subStartC2, subNumC2) = (mid + 1, problem.numCol - (mid + 1))
subproblems.append((subStartR, subStartC1, subNumR, subNumC1))
subproblems.append((subStartR, subStartC2, subNumR, subNumC2))
# get a list of all locations in the dividing column
divider = crossProduct(range(problem.numRow), [mid])
# find the maximum in the dividing row or column
bestLoc = problem.getMaximum(divider, trace)
neighbor = problem.getBetterNeighbor(bestLoc, trace)
# update the best we've seen so far based on this new maximum
if bestSeen is None or problem.get(neighbor) > problem.get(bestSeen):
bestSeen = neighbor
if not trace is None: trace.setBestSeen(bestSeen)
# return when we know we've found a peak
if neighbor == bestLoc and problem.get(bestLoc) >= problem.get(bestSeen):
if not trace is None: trace.foundPeak(bestLoc)
return bestLoc
# figure out which subproblem contains the largest number we've seen so
# far, and recurse, alternating between splitting on rows and splitting
# on columns
sub = problem.getSubproblemContaining(subproblems, bestSeen)
newBest = sub.getLocationInSelf(problem, bestSeen)
if not trace is None: trace.setProblemDimensions(sub)
result = algorithm4(sub, newBest, not rowSplit, trace)
return problem.getLocationInSelf(sub, result)
#Helper Method
def crossProduct(list1, list2):
"""
Returns all pairs with one item from the first list and one item from
the second list. (Cartesian product of the two lists.)
The code is equivalent to the following list comprehension:
return [(a, b) for a in list1 for b in list2]
but for easier reading and analysis, we have included more explicit code.
"""
answer = []
for a in list1:
for b in list2:
answer.append ((a, b))
return answer
推荐答案
- 假设数组的宽度大于高度,否则我们将向另一个方向分裂.
- 将数组拆分为三部分:中央列、左侧和右侧.
- 遍历中央列和两个相邻列并寻找最大值.
- 如果它在中央列 - 这就是我们的峰值
- 如果在左边,则在子数组
left_side + central_column
上运行这个算法 - 如果在右侧,则在子数组
right_side + central_column
上运行此算法
- Let's assume that width of the array is bigger than height, otherwise we will split in another direction.
- Split the array into three parts: central column, left side and right side.
- Go through the central column and two neighbour columns and look for maximum.
- If it's in the central column - this is our peak
- If it's in the left side, run this algorithm on subarray
left_side + central_column
- If it's in the right side, run this algorithm on subarray
right_side + central_column
为什么会这样:
对于最大元素位于中央列的情况 - 显而易见.如果不是,我们可以从那个最大值逐步增加元素,并且肯定不会越过中心行,因此对应的一半肯定会存在一个峰值.
For cases where the maximum element is in the central column - obvious. If it's not, we can step from that maximum to increasing elements and will definitely not cross the central row, so a peak will definitely exist in the corresponding half.
为什么这是 O(n):
Why this is O(n):
step #3 需要少于或等于 max_dimension
迭代和 max_dimension
每两个算法步骤至少减半.这给出了 n+n/2+n/4+...
即 O(n)
.重要细节:我们按最大方向分割.对于方形阵列,这意味着拆分方向将是交替的.这与您链接到的 PDF 中的最后一次尝试不同.
step #3 takes less than or equal to max_dimension
iterations and max_dimension
at least halves on every two algorithm steps. This gives n+n/2+n/4+...
which is O(n)
. Important detail: we split by the maximum direction. For square arrays this means that split directions will be alternating. This is a difference from the last attempt in the PDF you linked to.
注意:我不确定它是否与您提供的代码中的算法完全匹配,它可能是也可能不是不同的方法.
A note: I'm not sure if it exactly matches the algorithm in the code you gave, it may or may not be a different approach.
这篇关于O(n) 最坏情况下的 2D 寻峰算法?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!