问题描述
什么是最有效的方式来寻找在序列中的的IEnumerable< T>
使用LINQ
What is the most efficient way to find a sequence within a IEnumerable<T>
using LINQ
我希望能够创建一个扩展方法,它允许下面的调用:
I want to be able to create an extension method which allows the following call:
int startIndex = largeSequence.FindSequence(subSequence)
本场比赛必须相邻而为了。
The match must be adjacent and in order.
推荐答案
下面是一个算法,发现一个子序列中的一个实现。我所谓的方法 IndexOfSequence
,因为它使意图更加明确,类似于现有的的IndexOf
方法:
Here's an implementation of an algorithm that finds a subsequence in a sequence. I called the method IndexOfSequence
, because it makes the intent more explicit and is similar to the existing IndexOf
method:
public static class ExtensionMethods
{
public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence)
{
return source.IndexOfSequence(sequence, EqualityComparer<T>.Default);
}
public static int IndexOfSequence<T>(this IEnumerable<T> source, IEnumerable<T> sequence, IEqualityComparer<T> comparer)
{
var seq = sequence.ToArray();
int p = 0; // current position in source sequence
int i = 0; // current position in searched sequence
var prospects = new List<int>(); // list of prospective matches
foreach (var item in source)
{
// Remove bad prospective matches
prospects.RemoveAll(k => !comparer.Equals(item, seq[p - k]));
// Is it the start of a prospective match ?
if (comparer.Equals(item, seq[0]))
{
prospects.Add(p);
}
// Does current character continues partial match ?
if (comparer.Equals(item, seq[i]))
{
i++;
// Do we have a complete match ?
if (i == seq.Length)
{
// Bingo !
return p - seq.Length + 1;
}
}
else // Mismatch
{
// Do we have prospective matches to fall back to ?
if (prospects.Count > 0)
{
// Yes, use the first one
int k = prospects[0];
i = p - k + 1;
}
else
{
// No, start from beginning of searched sequence
i = 0;
}
}
p++;
}
// No match
return -1;
}
}
我没有完全测试,所以可能仍包含错误。我只是做了著名的极端案例了一些测试,以确保我没有陷入明显的陷阱。似乎做工精细迄今为止...
I didn't fully test it, so it might still contain bugs. I just did a few tests on well-known corner cases to make sure I wasn't falling into obvious traps. Seems to work fine so far...
我觉得复杂度接近O(n)的,但我不是大O符号的专家,所以我可能是错的......至少它只枚举源序列一次,永远内部消除回去,所以它应该是相当有效。
I think the complexity is close to O(n), but I'm not an expert of Big O notation so I could be wrong... at least it only enumerates the source sequence once, whithout ever going back, so it should be reasonably efficient.
这篇关于查找序列中的IEnumerable&LT; T&GT;使用LINQ的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!