list - Haskell中的此列表置换实现到底有什么用？

我正在研究Data.List模块中的代码，无法完全理解这种置换的实现:

permutations            :: [a] -> [[a]]
permutations xs0        =  xs0 : perms xs0 []
  where
    perms []     _  = []
    perms (t:ts) is = foldr interleave (perms ts (t:is)) (permutations is)
      where interleave    xs     r = let (_,zs) = interleave' id xs r in zs
            interleave' _ []     r = (ts, r)
            interleave' f (y:ys) r = let (us,zs) = interleave' (f . (y:)) ys r
                                     in  (y:us, f (t:y:us) : zs)

有人可以详细解释这些嵌套函数如何相互连接/一起工作吗？

最佳答案

对较晚的答案感到抱歉，写下来的时间比预期的要长一点。

因此，首先要在这样的列表函数中最大程度地增加惰性，有两个目标:

在检查输入列表中的下一个元素之前，产生尽可能多的答案

答案本身必须是惰性的，因此必须保持不变。

现在考虑permutation函数。这里最大的惰性意味着:

在检查输入

的n!元素后，我们应该确定至少有n排列

对于这些n!排列中的每一个，第一个n元素应仅取决于输入的第一个n元素。

第一个条件可以形式为

length (take (factorial n) $ permutations ([1..n] ++ undefined))) `seq` () == ()

David Benbennick将第二个条件形式化为

map (take n) (take (factorial n) $ permutations [1..]) == permutations [1..n]

结合起来，我们有

map (take n) (take (factorial n) $ permutations ([1..n] ++ undefined)) == permutations [1..n]

让我们从一些简单的案例开始。第一个permutation [1..]。我们必须有

permutations [1..] = [1,???] : ???

我们必须有两个要素

permutations [1..] = [1,2,???] : [2,1,???] : ???

请注意，对于前两个元素的顺序没有选择，我们不能将[2,1,...]放在第一位，因为我们已经决定了第一个排列必须以1开头。现在应该很清楚，permutations xs的第一个元素必须等于xs本身。

现在继续执行。

首先，有两种不同的方法可以对列表进行所有排列:

选择样式:从列表中选择元素，直到没有元素为止

permutations []  = [[]]
permutations xxs = [(y:ys) | (y,xs) <- picks xxs, ys <- permutations xs]
  where
    picks (x:xs) = (x,xs) : [(y,x:ys) | (y,ys) <- picks xs]

插入样式:在所有可能的位置插入或交错每个元素

permutations []     = [[]]
permutations (x:xs) = [y | p <- permutations xs, y <- interleave p]
  where
    interleave []     = [[x]]
    interleave (y:ys) = (x:y:ys) : map (y:) (interleave ys)

请注意，这些都不是最大程度的懒惰。第一种情况，此功能要做的第一件事是从整个列表中选择第一个元素，这一点都不懒惰。在第二种情况下，我们需要尾部的排列，然后才能进行任何排列。

首先，请注意interleave可以变得更懒。 interleave yss列表的第一个元素是[x](如果是yss=[])或(x:y:ys)(如果是yss=y:ys)。但是这两个都与x:yss相同，因此我们可以编写

interleave yss = (x:yss) : interleave' yss
interleave' [] = []
interleave' (y:ys) = map (y:) (interleave ys)

Data.List中的实现继续了这个想法，但是使用了更多技巧。

遍历mailing list discussion也许是最容易的。我们从David Benbennick的版本开始，该版本与我在上面编写的版本相同(没有懒惰的交错)。我们已经知道permutations xs的第一个要素应该是xs本身。所以，我们把它放进去

permutations xxs     = xxs : permutations' xxs
permutations' []     = []
permutations' (x:xs) = tail $ concatMap interleave $ permutations xs
  where interleave = ..

对tail的调用当然不是很好。但是，如果我们内联permutations和interleave的定义，我们会得到

permutations' (x:xs)
  = tail $ concatMap interleave $ permutations xs
  = tail $ interleave xs ++ concatMap interleave (permutations' xs)
  = tail $ (x:xs) : interleave' xs ++ concatMap interleave (permutations' xs)
  = interleave' xs ++ concatMap interleave (permutations' xs)

现在我们有

permutations xxs     = xxs : permutations' xxs
permutations' []     = []
permutations' (x:xs) = interleave' xs ++ concatMap interleave (permutations' xs)
  where
   interleave yss = (x:yss) : interleave' yss
   interleave' [] = []
   interleave' (y:ys) = map (y:) (interleave ys)

下一步是优化。一个重要的目标是消除交错中的(++)调用。这不是那么容易，因为最后一行是map (y:) (interleave ys)。我们不能立即使用传递尾巴的foldr / ShowS技巧作为参数。出路是摆脱 map 。如果我们传递参数f作为必须在结果最后映射的函数，我们将得到

permutations' (x:xs) = interleave' id xs ++ concatMap (interleave id) (permutations' xs)
  where
   interleave f yss = f (x:yss) : interleave' f yss
   interleave' f [] = []
   interleave' f (y:ys) = interleave (f . (y:)) ys

现在我们可以通过尾巴了

permutations' (x:xs) = interleave' id xs $ foldr (interleave id) [] (permutations' xs)
  where
   interleave  f yss    r = f (x:yss) : interleave' f yss r
   interleave' f []     r = r
   interleave' f (y:ys) r = interleave (f . (y:)) ys r

它开始看起来像Data.List中的那个，但是还不一样。特别是，它并不像它可能的那样懒惰。
让我们尝试一下:

*Main> let n = 4
*Main> map (take n) (take (factorial n) $ permutations ([1..n] ++ undefined))
[[1,2,3,4],[2,1,3,4],[2,3,1,4],[2,3,4,1]*** Exception: Prelude.undefined

嗯，只有第一个n元素正确，而不是第一个factorial n。
原因是我们仍然尝试在所有可能的位置上放置第一个元素(在上面的示例中为1)，然后再尝试其他操作。

Yitzchak Gale提出了一个解决方案。考虑了将输入分为初始部分，中间元素和尾部的所有方法:

[1..n] == []    ++ 1 : [2..n]
       == [1]   ++ 2 : [3..n]
       == [1,2] ++ 3 : [4..n]

如果您之前没有看到生成这些代码的技巧，则可以使用zip (inits xs) (tails xs)进行此操作。
现在[1..n]的排列将是

[] ++ 1 : [2..n] aka。 [1..n]或

2插入(交错)到[1]的排列中，然后插入[3..n]。但不要在2的末尾插入[1]，因为我们已经在上一个项目符号点进行了运算。

3交错为[1,2]的排列(不在结尾处)，后跟[4..n]。

等

您可以看到这是最大程度的延迟，因为在我们甚至考虑使用3进行操作之前，我们已经给出了所有从[1,2]的排列开始的排列。 Yitzchak给出的代码是

permutations xs = xs : concat (zipWith newPerms (init $ tail $ tails xs)
                                                (init $ tail $ inits xs))
  where
    newPerms (t:ts) = map (++ts) . concatMap (interleave t) . permutations3
    interleave t [y]        = [[t, y]]
    interleave t ys@(y:ys') = (t:ys) : map (y:) (interleave t ys')

请注意对permutations3的递归调用，它可以是一个变体，不必完全懒惰。

如您所见，这比我们以前优化的要差一些。但是我们可以应用一些相同的技巧。

第一步是摆脱init和tail。让我们看看zip (init $ tail $ tails xs) (init $ tail $ inits xs)实际上是什么

*Main> let xs = [1..5] in zip (init $ tail $ tails xs) (init $ tail $ inits xs)
[([2,3,4,5],[1]),([3,4,5],[1,2]),([4,5],[1,2,3]),([5],[1,2,3,4])]

init摆脱了([],[1..n])组合，而tail摆脱了([1..n],[])组合。我们不希望使用前者，因为那样会使newPerms中的模式匹配失败。后者将失败interleave。两者都很容易修复:只需为newPerms []和interleave t []添加一个大小写即可。

permutations xs = xs : concat (zipWith newPerms (tails xs) (inits xs))
  where
    newPerms [] is = []
    newPerms (t:ts) is = map (++ts) (concatMap (interleave t) (permutations is))
    interleave t []         = []
    interleave t ys@(y:ys') = (t:ys) : map (y:) (interleave t ys')

现在我们可以尝试内联tails和inits。他们的定义是

tails xxs = xxs : case xxs of
  []     -> []
  (_:xs) -> tails xs

inits xxs = [] : case xxs of
  []     -> []
  (x:xs) -> map (x:) (inits xs)

问题在于inits不是尾部递归的。但是由于无论如何我们都要对init进行排列，所以我们不在乎元素的顺序。所以我们可以使用一个累加参数

inits' = inits'' []
  where
  inits'' is xxs = is : case xxs of
    []     -> []
    (x:xs) -> inits'' (x:is) xs

现在，我们使newPerms成为xxs和此累加参数的函数，而不是tails xxs和inits xxs。

permutations xs = xs : concat (newPerms' xs [])
  where
    newPerms' xxs is =
      newPerms xxs is :
      case xxs of
        []     -> []
        (x:xs) -> newPerms' xs (x:is)
    newPerms [] is = []
    newPerms (t:ts) is = map (++ts) (concatMap (interleave t) (permutations3 is))

将newPerms内联到newPerms'中，然后给出

permutations xs = xs : concat (newPerms' xs [])
  where
    newPerms' []     is = [] : []
    newPerms' (t:ts) is =
      map (++ts) (concatMap (interleave t) (permutations is)) :
      newPerms' ts (t:is)

内联和展开concat，然后将最终的map (++ts)移动到interleave，

permutations xs = xs : newPerms' xs []
  where
    newPerms' []     is = []
    newPerms' (t:ts) is =
        concatMap interleave (permutations is) ++
        newPerms' ts (t:is)
      where
      interleave []     = []
      interleave (y:ys) = (t:y:ys++ts) : map (y:) (interleave ys)

最后，我们可以重新应用foldr技巧来摆脱(++):

permutations xs = xs : newPerms' xs []
  where
    newPerms' []     is = []
    newPerms' (t:ts) is =
        foldr (interleave id) (newPerms' ts (t:is)) (permutations is)
      where
      interleave f []     r = r
      interleave f (y:ys) r = f (t:y:ys++ts) : interleave (f . (y:)) ys r

等等，我说摆脱(++)。我们摆脱了其中一个，但interleave中没有一个。
为此，我们可以看到我们总是将yys的某些尾部连接到ts。因此，我们可以将计算出的(ys++ts)连同interleave的递归一起展开，并使函数interleave' f ys r返回元组(ys++ts, interleave f ys r)。这给

permutations xs = xs : newPerms' xs []
  where
    newPerms' []     is = []
    newPerms' (t:ts) is =
        foldr interleave (newPerms' ts (t:is)) (permutations is)
      where
      interleave ys r = let (_,zs) = interleave' id ys r in zs
      interleave' f []     r = (ts,r)
      interleave' f (y:ys) r =
        let (us,zs) = interleave' (f . (y:)) ys r
        in  (y:us, f (t:y:us) : zs)

在最大懒惰的优化荣耀中，您就拥有了Data.List.permutations。

Twan的精彩文章!我(@Yitz)将添加一些引用:

Twan在上面开发了此算法的原始电子邮件线程，由Twan链接，非常有趣。

Knuth在Vol.1中对满足这些条件的所有可能算法进行了分类。 4 Fasc。 2秒7.2.1.2。

Twan的permutations3与Knuth的“算法P”相同。据Knuth所知，该算法最早是由英国教堂的钟声在1600年代发布的。