How to get 2.x-like sorting behaviour in Python 3.x?

Problem description

I'm trying to replicate (and if possible improve on) Python 2.x's sorting behaviour in 3.x, so that mutually orderable types like int, float etc. are sorted as expected, and mutually unorderable types are grouped within the output.

Here's an example of what I'm talking about:

>>> sorted([0, 'one', 2.3, 'four', -5])  # Python 2.x
[-5, 0, 2.3, 'four', 'one']
>>> sorted([0, 'one', 2.3, 'four', -5])  # Python 3.x
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: str() < int()

My previous attempt at this, using a class for the key parameter to sorted() (see "Why does this key class for sorting heterogeneous sequences behave oddly?"), is fundamentally broken, because its approach of

  1. trying to compare values, and
  2. if that fails, falling back to comparing the string representation of their types

can lead to intransitive ordering, as explained by BrenBarn's excellent answer.
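
To make the intransitivity concrete, here is a minimal sketch of such a fallback key. The class name FallbackKey and the use of type(...).__name__ are assumptions made for illustration, not the code from the earlier question. Three values are enough to form a cycle:

```python
class FallbackKey:
    """Hypothetical key class: compare values, else compare type names."""

    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        try:
            # Step 1: try the natural comparison.
            return self.value < other.value
        except TypeError:
            # Step 2: fall back to comparing the type names as strings.
            return type(self.value).__name__ < type(other.value).__name__


a = FallbackKey(1)            # int
b = FallbackKey(2.0)          # float
c = FallbackKey(frozenset())  # frozenset

print(a < b)  # True: 1 < 2.0 by value
print(b < c)  # True: 'float' < 'frozenset' by type name
print(c < a)  # True: 'frozenset' < 'int' by type name, which closes the cycle
```

Since a < b < c < a, no consistent sorted order exists for these three values, so the result of sorted() depends on the input order.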

A naïve approach, which I initially rejected without even trying to code it, would be to use a key function that returns a (type, value) tuple:

def motley(value):
    return repr(type(value)), value

However, this doesn't do what I want. In the first place, it breaks the natural ordering of mutually orderable types:

>>> sorted([0, 123.4, 5, -6, 7.89])
[-6, 0, 5, 7.89, 123.4]
>>> sorted([0, 123.4, 5, -6, 7.89], key=motley)
[7.89, 123.4, -6, 0, 5]
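
The reason is visible in the keys themselves. The snippet below repeats the motley() function from above and prints the tuples it produces; because tuples compare element by element, the type string decides the order before the values are ever looked at:

```python
def motley(value):
    return repr(type(value)), value


print(motley(-6))    # ("<class 'int'>", -6)
print(motley(7.89))  # ("<class 'float'>", 7.89)

# "<class 'float'>" sorts before "<class 'int'>" as a string,
# so every float ends up ahead of every int regardless of value.
print(motley(7.89) < motley(-6))  # True
```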

Secondly, it raises an exception when the input contains two objects of the same intrinsically unorderable type:

>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()

... which admittedly is the standard behaviour in both Python 2.x and 3.x – but ideally I'd like such types to be grouped together (I don't especially care about their ordering, but it would seem in keeping with Python's guarantee of stable sorting that they retain their original order).

I can work around the first of these problems for numeric types by special-casing them:

from numbers import Real
from decimal import Decimal

def motley(value):
    numeric = Real, Decimal
    if isinstance(value, numeric):
        typeinfo = numeric
    else:
        typeinfo = type(value)
    return repr(typeinfo), value

... which works as far as it goes:

>>> sorted([0, 'one', 2.3, 'four', -5], key=motley)
[-5, 0, 2.3, 'four', 'one']

... but doesn't account for the fact that there may be other distinct (possibly user-defined) types which are mutually orderable, and of course still fails with intrinsically unorderable types:

>>> sorted([{1:2}, {3:4}], key=motley)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: unorderable types: dict() < dict()
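
A built-in pair that shows the first limitation is bytes and bytearray, which are distinct types but compare with each other. The sketch below uses a condensed but equivalent version of the numeric-aware motley() from above; the sample data is chosen only for illustration:

```python
from decimal import Decimal
from numbers import Real


def motley(value):
    numeric = Real, Decimal
    typeinfo = numeric if isinstance(value, numeric) else type(value)
    return repr(typeinfo), value


data = [b'a', bytearray(b'b'), b'c']

# bytes and bytearray are mutually orderable, so a plain sort works.
print(sorted(data))              # [b'a', bytearray(b'b'), b'c']

# With the key, items are split by type name first, losing that order.
print(sorted(data, key=motley))  # [bytearray(b'b'), b'a', b'c']
```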

Is there another approach which solves both the problem of arbitrary, distinct-but-mutually-orderable types and that of intrinsically unorderable types?

Recommended answer

Stupid idea: make a first pass to divide all the different items in groups that can be compared between each other, sort the individual groups and finally concatenate them. I assume that an item is comparable to all members of a group, if it is comparable with the first member of a group. Something like this (Python3):

import itertools

def python2sort(x):
    it = iter(x)
    groups = [[next(it)]]
    for item in it:
        for group in groups:
            try:
                item < group[0]  # exception if not comparable
                group.append(item)
                break
            except TypeError:
                continue
        else:  # did not break, make new group
            groups.append([item])
    print(groups)  # for debugging
    return itertools.chain.from_iterable(sorted(group) for group in groups)

This will have quadratic running time in the pathological case where none of the items are comparable, but I guess the only way to know that for sure is to check all possible combinations. See the quadratic behavior as a deserved punishment for anyone trying to sort a long list of unsortable items, like complex numbers. In the more common case of a mix of some strings and some integers, the speed should be similar to that of a normal sort. Quick test:

In [19]: x = [0, 'one', 2.3, 'four', -5, 1j, 2j,  -5.5, 13 , 15.3, 'aa', 'zz']

In [20]: list(python2sort(x))
[[0, 2.3, -5, -5.5, 13, 15.3], ['one', 'four', 'aa', 'zz'], [1j], [2j]]
Out[20]: [-5.5, -5, 0, 2.3, 13, 15.3, 'aa', 'four', 'one', 'zz', 1j, 2j]
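
The quadratic worst case can be checked by counting how many trial comparisons the grouping pass makes. The helper below is a hypothetical instrumented copy of that pass (the name count_probes is an assumption), shown only to illustrate the cost:

```python
def count_probes(items):
    """Count trial comparisons made while grouping items."""
    groups = []
    probes = 0
    for item in items:
        for group in groups:
            probes += 1
            try:
                item < group[0]  # raises TypeError if not comparable
            except TypeError:
                continue
            group.append(item)
            break
        else:  # no group accepted the item, so start a new one
            groups.append([item])
    return probes


# 100 mutually incomparable items: 0 + 1 + ... + 99 probes.
print(count_probes([complex(0, k) for k in range(100)]))  # 4950

# 100 mutually comparable items: one probe each after the first.
print(count_probes(range(100)))  # 99
```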

It seems to be a 'stable sort' as well, since the groups are formed in the order the incomparable items are encountered.
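
For everyday use, a slightly tidied variant might look like the sketch below. It is an adaptation rather than the answer's exact code: the name python2sorted is hypothetical, the debugging print is dropped, an empty input is accepted, and a list is returned. It relies on the same assumption that an item comparable with a group's first member is comparable with all of that group's members.

```python
def python2sorted(iterable):
    """Group mutually comparable items, sort each group, then concatenate."""
    groups = []
    for item in iterable:
        for group in groups:
            try:
                item < group[0]  # raises TypeError if not comparable
            except TypeError:
                continue
            group.append(item)
            break
        else:  # no existing group accepted the item
            groups.append([item])
    result = []
    for group in groups:
        result.extend(sorted(group))
    return result


print(python2sorted([0, 'one', 2.3, 'four', -5]))
# [-5, 0, 2.3, 'four', 'one']

# Each dict is incomparable with everything else, so each forms its own
# group and the groups stay in the order they were first encountered.
print(python2sorted([{1: 2}, 0, {3: 4}, -1]))
# [{1: 2}, -1, 0, {3: 4}]
```

Note that unorderable items of the same type, such as the two dicts here, are not gathered next to each other; each keeps the position of its own group.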
