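I was reading this article about LINQ and can't understand how the query is executed under lazy evaluation.

So I simplified the example from the article to the following code: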

void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";

    data.Dump(); // I use LINQPad to output the data
}

static IEnumerable<string> GetFirstSequence()
{
    yield return "a";
    yield return "b";
    yield return "c";
}

static IEnumerable<string> GetSecondSequence()
{
    yield return "1";
    yield return "2";
}

public static class Extensions
{
    private const string path = @"C:\dist\debug.log";

    public static IEnumerable<string> LogQuery(this IEnumerable<string> sequence, string tag, string element = null)
    {
        using (var writer = File.AppendText(path))
        {
            writer.WriteLine($"Executing query {tag} {element}");
        }
        return sequence;
    }
}


After running this code, I have the following in the debug.log file:


Executing query GetFirstSequence
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c


That output can be explained logically.

Things get strange when I try to interleave the first three elements with the last three elements:

void Main()
{
    var data =
        from f in GetFirstSequence().LogQuery("GetFirstSequence")
        from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
        select $"{f} {s}";

    var shuffle = data;
    shuffle = shuffle.Take(3).LogQuery("Take")
        .Interleave(shuffle.Skip(3).LogQuery("Skip")).LogQuery("Interleave");

    shuffle.Dump();
}


Of course, I had to add an extension method that interleaves two sequences (taken from the article mentioned above):

public static IEnumerable<string> Interleave(this IEnumerable<string> first, IEnumerable<string> second)
{
    var firstIter = first.GetEnumerator();
    var secondIter = second.GetEnumerator();

    while (firstIter.MoveNext() && secondIter.MoveNext())
    {
        yield return firstIter.Current;
        yield return secondIter.Current;
    }
}


After running these lines of code, I get the following output in the text file:


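Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave
Executing query GetSecondSequence a
Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c
Executing query GetSecondSequence b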


This confuses me, because I don't understand the order in which the queries are executed.

Why were the queries executed in this order?

Best answer

var data =
    from f in GetFirstSequence().LogQuery("GetFirstSequence")
    from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
    select $"{f} {s}";


is just another way of writing

var data = GetFirstSequence()
    .LogQuery("GetFirstSequence")
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}");
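To check that equivalence yourself, here is a minimal sketch that runs both forms and compares the results. The logging calls are left out, and the class name QuerySyntaxDemo and its two helper methods are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QuerySyntaxDemo
{
    static IEnumerable<string> GetFirstSequence()
    {
        yield return "a";
        yield return "b";
        yield return "c";
    }

    static IEnumerable<string> GetSecondSequence()
    {
        yield return "1";
        yield return "2";
    }

    // Query-expression form, as in the question (LogQuery calls omitted).
    public static List<string> QueryForm() =>
        (from f in GetFirstSequence()
         from s in GetSecondSequence()
         select $"{f} {s}").ToList();

    // Method-call form with an explicit SelectMany.
    public static List<string> MethodForm() =>
        GetFirstSequence()
            .SelectMany(f => GetSecondSequence(), (f, s) => $"{f} {s}")
            .ToList();

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", QueryForm()));           // a 1, a 2, b 1, b 2, c 1, c 2
        Console.WriteLine(QueryForm().SequenceEqual(MethodForm()));  // True
    }
}
```

The important detail for the rest of the answer is that the second `from` clause ends up inside a lambda, so it runs only when SelectMany() asks for it.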


Let's look at the code step by step:

var data = GetFirstSequence() // returns an IEnumerable<string> without evaluating it
    .LogQuery("GetFirstSequence") // writes "GetFirstSequence" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f), (f, s) => $"{f} {s}"); // returns an IEnumerable<string> without evaluating it

var shuffle = data;
shuffle = shuffle
    .Take(3) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Take") // writes "Take" and returns the IEnumerable<string> from its this-parameter without evaluating it
    .Interleave(
        shuffle
            .Skip(3) // returns an IEnumerable<string> without evaluating it
            .LogQuery("Skip") // writes "Skip" and returns the IEnumerable<string> from its this-parameter without evaluating it
    ) // returns an IEnumerable<string> without evaluating it
    .LogQuery("Interleave"); // writes "Interleave" and returns the IEnumerable<string> from its this-parameter without evaluating it


The code up to this point is responsible for the first four lines of output:

Executing query GetFirstSequence
Executing query Take
Executing query Skip
Executing query Interleave

None of the IEnumerable<string> have been evaluated yet.
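A small sketch can make this visible. It assumes an in-memory list named Log as a stand-in for debug.log (my substitution, so the example needs no file access) and trims the trailing space the original format string leaves when element is null. The class name DeferredDemo and the BuildQuery helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Hypothetical in-memory stand-in for debug.log.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<string> LogQuery(
        this IEnumerable<string> sequence, string tag, string element = null)
    {
        // Not an iterator method: this body runs as soon as LogQuery is called.
        Log.Add($"Executing query {tag} {element}".TrimEnd());
        return sequence;
    }

    static IEnumerable<string> GetFirstSequence()
    {
        yield return "a";
        yield return "b";
        yield return "c";
    }

    static IEnumerable<string> GetSecondSequence()
    {
        yield return "1";
        yield return "2";
    }

    public static IEnumerable<string> BuildQuery()
    {
        Log.Clear();
        return GetFirstSequence()
            .LogQuery("GetFirstSequence")
            .SelectMany(f => GetSecondSequence().LogQuery("GetSecondSequence", f),
                        (f, s) => $"{f} {s}");
    }

    public static void Main()
    {
        var data = BuildQuery();
        Console.WriteLine(Log.Count);               // 1: only the outer LogQuery has run

        var first = data.First();                   // pulls a single element
        Console.WriteLine(first);                   // a 1
        Console.WriteLine(string.Join(" | ", Log)); // the lambda has now run once, for "a"
    }
}
```

Building the query logs only the call that sits outside the lambda. The LogQuery call inside the SelectMany() lambda does not run until something asks the query for an element.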

Finally, shuffle.Dump() iterates over shuffle and thus evaluates the IEnumerables.

Iterating over data prints the following, because SelectMany() calls GetSecondSequence() and LogQuery() for each element in GetFirstSequence():

Executing query GetSecondSequence a
Executing query GetSecondSequence b
Executing query GetSecondSequence c

Iterating over shuffle is the same as iterating over

Interleave(data.Take(3), data.Skip(3))


Interleave() interleaves the elements of two separate iterations over data, and therefore also interleaves the log output that those iterations produce.
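The key point is that every call to GetEnumerator() on data starts the query over. Here is a stripped-down sketch of that, with arrays instead of the iterator methods and a local list instead of the log file (both are my simplifications, and TwoEnumeratorsDemo is a made-up name):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TwoEnumeratorsDemo
{
    public static List<string> Trace()
    {
        var log = new List<string>();

        // Same shape as data in the question: the selector lambda logs each call.
        IEnumerable<string> data = new[] { "a", "b" }.SelectMany(
            f => { log.Add($"selector {f}"); return new[] { "1", "2" }; },
            (f, s) => $"{f} {s}");

        using (var first = data.GetEnumerator())
        using (var second = data.GetEnumerator())
        {
            first.MoveNext();   // runs the selector for "a"
            second.MoveNext();  // runs the selector for "a" again, independently
            second.MoveNext();  // advances to "a 2", no selector call
            second.MoveNext();  // advances to "b 1", selector runs for "b"
            log.Add($"first={first.Current}, second={second.Current}");
        }

        return log;
    }

    public static void Main()
    {
        // selector a | selector a | selector b | first=a 1, second=b 1
        Console.WriteLine(string.Join(" | ", Trace()));
    }
}
```

Each enumerator keeps its own position, so the selector runs once per outer element per enumerator, in whatever order the two enumerators are advanced.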

firstIter.MoveNext();
// writes "Executing query GetSecondSequence a"
secondIter.MoveNext();
// writes "Executing query GetSecondSequence a"
// Skip(3) discards "a 1" from its own pass over data
// Skip(3) discards "a 2"
// writes "Executing query GetSecondSequence b"
// Skip(3) discards "b 1"
yield return firstIter.Current; // "a 1"
yield return secondIter.Current; // "b 2"
firstIter.MoveNext();
secondIter.MoveNext();
// writes "Executing query GetSecondSequence c"
yield return firstIter.Current; // "a 2"
yield return secondIter.Current; // "c 1"
firstIter.MoveNext();
// writes "Executing query GetSecondSequence b"
secondIter.MoveNext();
yield return firstIter.Current; // "b 1"
yield return secondIter.Current; // "c 2"
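If you want to reproduce the whole trace without LINQPad, the following sketch should do it. It again assumes an in-memory Log list in place of debug.log and wraps the enumerators in using blocks, which the original Interleave() does not do. The materialize flag is my addition: when it is true, data is cached with ToList() before Take() and Skip() are applied, which is one common way to have the inner queries run only once, if that is what you are after.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InterleaveDemo
{
    // Hypothetical in-memory stand-in for debug.log.
    public static readonly List<string> Log = new List<string>();

    public static IEnumerable<string> LogQuery(
        this IEnumerable<string> sequence, string tag, string element = null)
    {
        Log.Add($"Executing query {tag} {element}".TrimEnd());
        return sequence;
    }

    public static IEnumerable<string> Interleave(
        this IEnumerable<string> first, IEnumerable<string> second)
    {
        using (var firstIter = first.GetEnumerator())
        using (var secondIter = second.GetEnumerator())
        {
            while (firstIter.MoveNext() && secondIter.MoveNext())
            {
                yield return firstIter.Current;
                yield return secondIter.Current;
            }
        }
    }

    static IEnumerable<string> GetFirstSequence()
    {
        yield return "a";
        yield return "b";
        yield return "c";
    }

    static IEnumerable<string> GetSecondSequence()
    {
        yield return "1";
        yield return "2";
    }

    // materialize == false: the query from the question.
    // materialize == true: data is evaluated once and cached before Take/Skip.
    public static List<string> Run(bool materialize)
    {
        Log.Clear();

        IEnumerable<string> data =
            from f in GetFirstSequence().LogQuery("GetFirstSequence")
            from s in GetSecondSequence().LogQuery("GetSecondSequence", f)
            select $"{f} {s}";

        if (materialize)
            data = data.ToList(); // the only place the query itself is enumerated

        return data.Take(3).LogQuery("Take")
            .Interleave(data.Skip(3).LogQuery("Skip")).LogQuery("Interleave")
            .ToList();
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Run(false))); // a 1, b 2, a 2, c 1, b 1, c 2
        Console.WriteLine(string.Join("\n", Log));        // the nine lines from the question

        Console.WriteLine(string.Join(", ", Run(true)));  // same elements
        Console.WriteLine(string.Join("\n", Log));        // seven lines, each GetSecondSequence once
    }
}
```

With materialize set to true the elements are identical, but the three GetSecondSequence lines are logged once each, and they appear before Take, Skip, and Interleave because ToList() enumerates the query on the spot.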
