python - Benefits of Quichesort

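I wrote this program for an assignment in which we had to create an implementation of Quichesort. This is a hybrid sorting algorithm that uses Quicksort until it reaches a certain recursion depth (log2(N), where N is the length of the list), then switches to Heapsort, to avoid exceeding the maximum recursion depth.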

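While testing my implementation, I found that although it generally performed better than regular Quicksort, Heapsort consistently outperformed both. Can anyone explain why Heapsort performs better, and under what circumstances Quichesort would be better than both Quicksort and Heapsort?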
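For reference, a comparison like this can be timed with a small harness along the following lines. This is only a sketch: the names heap_sort and time_sort, the input size, and the choice of the built-in sorted as a baseline are assumptions for illustration, not part of the assignment.

```python
import heapq
import random
import timeit


def heap_sort(lst):
    """Heapsort built on heapq; returns a new sorted list."""
    heap = list(lst)
    heapq.heapify(heap)
    return [heapq.heappop(heap) for _ in range(len(heap))]


def time_sort(sort_fn, data, repeats=5):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    # Each run gets a fresh copy so an in-place sort never sees sorted input.
    return min(timeit.repeat(lambda: sort_fn(list(data)), number=1, repeat=repeats))


random.seed(0)  # fixed seed so every candidate sorts the same data
data = [random.randrange(10**6) for _ in range(10_000)]

for name, fn in [("heap_sort", heap_sort), ("sorted", sorted)]:
    print(f"{name:10s} {time_sort(fn, data):.4f}s")
```

Taking the minimum over several runs reduces noise from other processes; any other sort function with the same call shape can be added to the list of candidates.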

Note that, for some reason, the assignment refers to the algorithm as "Quipsort".

Edit: Apparently, "Quichesort" is actually identical to Introsort.

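I also noticed that a logic error in my medianOf3() function was causing it to return the wrong value for certain inputs. Here is an improved version of the function: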

def medianOf3(lst):
    """
    From a lst of unordered data, find and return the median value from
    the first, middle and last values.
    """

    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    middle = lst[(len(lst) - 1) // 2]
    return sorted((first, middle, last))[1]

Could this explain the algorithm's relatively poor performance?
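To see which inputs the logic error affects, the two versions can be compared on every ordering of three distinct values. The sketch below copies both function bodies under the hypothetical names median_of_3_old and median_of_3_new.

```python
from itertools import permutations


def median_of_3_old(lst):
    """Original version: max of two mins (wrong when the middle is smallest)."""
    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    median = lst[len(lst) // 2]
    return max(min(first, median), min(median, last))


def median_of_3_new(lst):
    """Improved version: sort the three samples and take the middle one."""
    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    middle = lst[(len(lst) - 1) // 2]
    return sorted((first, middle, last))[1]


# The true median of any ordering of (1, 2, 3) is 2.
failures_old = [p for p in permutations((1, 2, 3)) if median_of_3_old(list(p)) != 2]
failures_new = [p for p in permutations((1, 2, 3)) if median_of_3_new(list(p)) != 2]

print(failures_old)  # [(2, 1, 3), (3, 1, 2)]
print(failures_new)  # []
```

The old version fails only when the middle sample is the smallest of the three. It still returns an element of the list, so the sort stays correct; only the quality of the pivot suffers on those inputs.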

Code for Quichesort:

import heapSort             # heapSort
import math                 # log2 (for quicksort depth limit)

def medianOf3(lst):
    """
    From a lst of unordered data, find and return the median value from
    the first, middle and last values.
    """

    first, last = lst[0], lst[-1]
    if len(lst) <= 2:
        return min(first, last)
    median = lst[len(lst) // 2]
    return max(min(first, median), min(median, last))

def partition(pivot, lst):
   """
   partition: pivot (element in lst) * List(lst) ->
        tuple(List(less), List(same), List(more)).
   Where:
        List(less) has values less than the pivot
        List(same) has pivot value/s, and
        List(more) has values greater than the pivot

   e.g. partition(5, [11,4,7,2,5,9,3]) == [4,2,3], [5], [11,7,9]
   """

   less, same, more = [], [], []
   for val in lst:
      if val < pivot:
         less.append(val)
      elif val > pivot:
         more.append(val)
      else:
         same.append(val)
   return less, same, more

def quipSortRec(lst, limit):
    """
    A non in-place, depth limited quickSort, using median-of-3 pivot.
    Once the limit drops to 0, it uses heapSort instead.
    """

    if lst == []:
        return []

    if limit == 0:
        return heapSort.heapSort(lst)

    limit -= 1
    pivot = medianOf3(lst)
    less, same, more = partition(pivot, lst)
    return quipSortRec(less, limit) + same + quipSortRec(more, limit)

def quipSort(lst):
    """
    The main routine called to do the sort.  It should call the
    recursive routine with the correct values in order to perform
    the sort
    """

    if not lst:                 # math.log2(0) raises ValueError
        return []
    depthLim = int(math.log2(len(lst)))
    return quipSortRec(lst, depthLim)

Code for Heapsort:

import heapq    # heapify, heappop (for building and draining the heap)

def heapSort(lst):
    """
    heapSort(List(Orderable)) -> List(Ordered)
        performs a heapsort on 'lst' returning a new sorted list
    Postcondition: the argument lst is not modified
    """

    heap = list(lst)
    heapq.heapify(heap)
    result = []
    while len(heap) > 0:
        result.append(heapq.heappop(heap))
    return result

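Best answer

The basic facts are as follows:

Heapsort has worst-case O(n log(n)) performance, but tends to be slow in practice.

Quicksort has O(n log(n)) performance on average and O(n^2) in the worst case, but is fast in practice.

Introsort aims to harness the fast practical performance of quicksort while still guaranteeing the worst-case O(n log(n)) behavior of heapsort.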
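To illustrate why a depth limit catches the quadratic case, the sketch below measures how deep a deliberately naive quicksort recurses. The function quicksort_depth and its first-element pivot are assumptions chosen to provoke the worst case; this is not the median-of-3 code from the question.

```python
import math
import random


def quicksort_depth(lst, depth=0):
    """Return the maximum recursion depth of a first-element-pivot quicksort."""
    if len(lst) <= 1:
        return depth
    pivot = lst[0]
    less = [v for v in lst if v < pivot]
    more = [v for v in lst if v > pivot]
    return max(quicksort_depth(less, depth + 1),
               quicksort_depth(more, depth + 1))


n = 200
depth_limit = int(math.log2(n))          # the limit used in the question

random.seed(0)
shuffled_depth = quicksort_depth(random.sample(range(n), n))
sorted_depth = quicksort_depth(list(range(n)))   # worst case for this pivot

print("depth limit:   ", depth_limit)
print("shuffled input:", shuffled_depth)
print("sorted input:  ", sorted_depth)
```

On already sorted input every partition peels off a single element, so the depth grows linearly with n and the total work is quadratic. An introsort hands the remaining sublist to heapsort once the limit is reached, which bounds the damage.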

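One question to ask is why quicksort is faster "in practice" than heapsort. This is a hard question to answer, but most answers point to quicksort having better spatial locality, which leads to fewer cache misses. However, I'm not sure how well this applies to Python, since it runs in an interpreter and has far more going on under the hood than languages such as C, all of which could interfere with cache performance.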

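As for why your particular introsort implementation is slower than Python's heapsort: again, this is hard to pin down. First, note that although the heapq module has a pure-Python implementation, CPython normally replaces heapify, heappop, and related functions with C versions from the _heapq accelerator module, so your heapSort likely does most of its work in C while your quicksort code runs entirely in the interpreter. Creating and concatenating many small lists can also be expensive, so you could try rewriting your quicksort to work in place and see whether that helps. You can also try tweaking various aspects of the implementation to see how they affect performance, or run the code through a profiler to look for hot spots. In the end, though, I think you're unlikely to find a definitive answer. It may simply come down to which operations are particularly fast or slow in the Python interpreter.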
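As a starting point for the in-place rewrite suggested above, here is a minimal sketch. It assumes a three-way partition and the same int(log2(n)) depth limit as the question; the names intro_sort_in_place, _intro_sort, and _heap_sort_range are hypothetical, and the heapsort fallback still copies the slice it sorts, so it is not strictly in place.

```python
import heapq
import math


def _heap_sort_range(lst, lo, hi):
    """Sort lst[lo:hi] by heap-sorting a copy of the slice and writing it back."""
    heap = lst[lo:hi]
    heapq.heapify(heap)
    for i in range(lo, hi):
        lst[i] = heapq.heappop(heap)


def _intro_sort(lst, lo, hi, limit):
    """Sort lst[lo:hi] in place; switch to heapsort once limit reaches 0."""
    while hi - lo > 1:
        if limit == 0:
            _heap_sort_range(lst, lo, hi)
            return
        limit -= 1
        # Median of the first, middle and last elements of the range.
        pivot = sorted((lst[lo], lst[(lo + hi - 1) // 2], lst[hi - 1]))[1]
        # Three-way partition. Invariant:
        # lst[lo:lt] < pivot, lst[lt:i] == pivot, lst[gt:hi] > pivot.
        lt, i, gt = lo, lo, hi
        while i < gt:
            if lst[i] < pivot:
                lst[lt], lst[i] = lst[i], lst[lt]
                lt += 1
                i += 1
            elif lst[i] > pivot:
                gt -= 1
                lst[gt], lst[i] = lst[i], lst[gt]
            else:
                i += 1
        # Recurse into the "less" part, then loop on the "more" part.
        _intro_sort(lst, lo, lt, limit)
        lo = gt


def intro_sort_in_place(lst):
    """Sort lst in place and return it for convenience."""
    if len(lst) > 1:
        _intro_sort(lst, 0, len(lst), int(math.log2(len(lst))))
    return lst


print(intro_sort_in_place([11, 4, 7, 2, 5, 9, 3]))  # [2, 3, 4, 5, 7, 9, 11]
```

This version swaps elements within one list instead of building and concatenating less, same, and more at every level. Whether that is actually faster in CPython is something to measure rather than assume.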

A similar question about python - Benefits of Quichesort can be found on Stack Overflow: https://stackoverflow.com/questions/27592487/