python - 排序字符串以匹配第二个字符串的最快方法-仅允许相邻交换

我想得到转换一个字符串以匹配第二个字符串所需的最小字母交换数。只允许相邻互换。输入是：字符串长度，字符串1，字符串2一些例子：Length | String 1 | String 2 | Output-------+----------+----------+------- 3 | ABC | BCA | 2 7 | AABCDDD | DDDBCAA | 16 7 | ZZZAAAA | ZAAZAAZ | 6这是我的代码：def letters(number, word_1, word_2): result = 0 while word_1 != word_2: index_of_letter = word_1.find(word_2[0]) result += index_of_letter word_1 = word_1.replace(word_2[0], '', 1) word_2 = word_2[1:] return result它给出了正确的结果，但计算应保持在20秒以下。这里有两组输入数据（1000 000个字符长的字符串）：https://ufile.io/8hp46和https://ufile.io/athxu。在我的设置中，第一个在40秒内执行，第二个在4分钟内执行。如何在不到20秒的时间内计算出结果？最佳答案您的算法运行在O（n2）时间：find()调用将占用O（n）时间。调用将创建一个完整的新字符串，占用O（n）时间。外循环执行O（n）次。正如其他人所陈述的，这可以通过使用归并排序计数反转来解决，但是在这个答案中，我尽量靠近算法，保持外循环和replace()，但是改变了result += index_of_letter的方式。改进措施如下：预处理index_of_letter字符串，并注意这些字母中的每一个不同字母的第一个位置。把每一个字母和它的下一个字母联系起来。我认为为这个创建一个列表是最有效的，它具有word_1的大小，在每个索引中存储相同字母的下一个出现的索引。这样，每个不同的字母都有一个链接列表。这个预处理可以在O（n）时间内完成，用它可以用O（1）查找代替word_1调用。每当你这样做时，你就从链接列表中删除匹配的字母，即DICT中的索引移动到下一个事件的索引。前面的更改将给出绝对索引，而不考虑算法中的字母的删除，因此会给出错误的结果。为了解决这个问题，你可以构建一个二叉树（也就是预处理），其中每个节点代表一个索引在word_1中，它给出了给定索引之前的非删除字母的实际数目（包括自身，如果还没有删除）。二进制树中的节点永远不会被删除（这可能是变体解决方案的一个想法），但是计数被调整以反映字符的删除。至多O（Logn）节点需要在这样的删除上获得递减值。但除此之外，任何字符串都不会像find那样重建。这个二叉树可以表示为一个列表，对应于按顺序排列的节点。列表中的值将是该节点（包括自身）之前未删除的字母的数目。初始二叉树可以描述如下：节点中的数字反映了其左侧节点的数量，包括它们自身。它们存储在word_1列表中。另一个列表replace预计算父索引所在的位置。实际代码可能如下所示：def letters(word_1, word_2): size = len(word_1) # No need to pass size as argument # Create a binary tree for word_1, organised as a list # in in-order sequence, and with the values equal to the number of # non-matched letters in the range up to and including the current index: treesize = (1<<size.bit_length()) - 1 numLeft = [(i >> 1 ^ ((i + 1) >> 1)) + 1 for i in range(0, treesize)] # Keep track of parents in this tree (could probably be simpler, I welcome comments). parent = [(i & ~((i^(i+1)) + 1)) | (((i ^ (i+1))+1) >> 1) for i in range(0, treesize)] # Create a linked list for each distinct character next = [-1] * size head = {} for i in range(len(word_1)-1, -1, -1): # go backwards c = word_1[i] # Add index at front of the linked list for this character if c in head: next[i] = head[c] head[c] = i # Main loop counting number of swaps needed for each letter result = 0 for i, c in enumerate(word_2): # Extract next occurrence of this letter from linked list j = head[c] head[c] = next[j] # Get number of preceding characters with a binary tree lookup p = j index_of_letter = 0 while p < treesize: if p >= j: # On or at right? numLeft[p] -= 1 # Register that a letter has been removed at left side if p <= j: # On or at left? index_of_letter += numLeft[p] # Add the number of left-side letters p = parent[p] # Walk up the tree result += index_of_letter return result这在O（nLogn）中运行，其中Logn因子是由二叉树中的向上行走提供的。我测试了数以千计的随机输入，上面的代码在所有情况下都会产生与代码相同的结果。但是…它在较大的输入上运行得更快。