
Question

What is the best algorithm for removing duplicate values from a list? I've tried this:

for (int i = 0; i < AuthorCounter-1; i++)
{
    for (int j = 0; j < AuthorCounter-1; j++)
    {
        if (i != j)
        {
            if (AuthorGroupNode.Nodes[i].Text == AuthorGroupNode.Nodes[j].Text)
            {
                AuthorGroupNode.Nodes[j].Remove();
                AuthorCounter--;
            }

        }
    }
}

Here, AuthorGroupNode.Nodes is a list of nodes. This works to some extent, but not perfectly. Does anyone have a better solution?

Recommended answer

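Your current algorithm is O(N²), which will perform quite poorly for a large list. It also removes nodes while looping over the same collection by index, so the remaining nodes shift down and some comparisons are skipped, which likely explains the imperfect results.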

If space is not an issue, you could keep a HashSet of the values already seen (here, a HashSet<string> of node text). Keeping only a HashSet<int> of hash codes is tempting, but two different values can share a hash code, so that would occasionally discard a node that is not a duplicate. Traverse the list once. If the node's value is already in the HashSet, you know this is a duplicate node. Skip it. If the value is not in the HashSet, add this node to a new list, and add the value to the HashSet.
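The one-pass approach described above can be sketched roughly as follows. This version stores the values themselves in the HashSet rather than their int hash codes, an assumption made here so that two different values with the same hash code are never mistaken for duplicates. The names DedupSketch and RemoveDuplicates are hypothetical, and plain strings stand in for the node text.

```csharp
using System;
using System.Collections.Generic;

public static class DedupSketch
{
    // Returns a new list holding the first occurrence of each value,
    // in the original order. The input is not modified.
    public static List<string> RemoveDuplicates(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (string item in items)
        {
            // HashSet<T>.Add returns false when the value is already present.
            if (seen.Add(item))
            {
                result.Add(item);
            }
        }
        return result;
    }

    public static void Main()
    {
        var authors = new List<string> { "Knuth", "Skeet", "Knuth", "Hejlsberg", "Skeet" };
        List<string> unique = RemoveDuplicates(authors);
        Console.WriteLine(string.Join(", ", unique)); // Knuth, Skeet, Hejlsberg
    }
}
```

Because Add both tests membership and inserts, a separate Contains call is not needed.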

This runs in O(N) expected time, and requires memory for the original list, for a copy of the list less any duplicates, and for the HashSet. The algorithm is non-destructive.
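If the extra copy of the list is not wanted, the same HashSet check can probably be applied in place. The sketch below is one way to do that for a List<T>, assuming the default equality of T is the right notion of "duplicate". The names InPlaceDedupSketch and RemoveDuplicatesInPlace are hypothetical. It compacts the kept items toward the front and trims the tail once, which avoids the shifting-index problem of removing items inside the loop.

```csharp
using System;
using System.Collections.Generic;

public static class InPlaceDedupSketch
{
    // Removes later duplicates from the list itself, keeping the first
    // occurrence of each value and preserving the original order.
    public static void RemoveDuplicatesInPlace<T>(List<T> list)
    {
        var seen = new HashSet<T>();
        int write = 0; // next slot for an item that is being kept
        for (int read = 0; read < list.Count; read++)
        {
            if (seen.Add(list[read]))
            {
                list[write] = list[read];
                write++;
            }
        }
        // Everything from 'write' onward is leftover; drop it in one call.
        list.RemoveRange(write, list.Count - write);
    }

    public static void Main()
    {
        var numbers = new List<int> { 3, 1, 3, 2, 1 };
        RemoveDuplicatesInPlace(numbers);
        Console.WriteLine(string.Join(", ", numbers)); // 3, 1, 2
    }
}
```

This variant still needs memory for the HashSet, but not for a second list.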

If you can use LINQ, simply do

var distinctList = originalList.Distinct().ToList();
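One caveat worth illustrating: Distinct() uses the default equality comparer, which for a class that does not override Equals compares references, so two separate node objects with the same text would both be kept. A minimal sketch of de-duplicating by a key instead is shown below. AuthorNode is a hypothetical stand-in for the asker's tree nodes, and DistinctByText is likewise an illustrative name. On .NET 6 and later, Enumerable.DistinctBy offers the same thing directly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a tree node that exposes a Text property.
public sealed class AuthorNode
{
    public string Text { get; }
    public AuthorNode(string text) { Text = text; }
}

public static class LinqDedupSketch
{
    // Keeps the first node for each distinct Text value.
    // GroupBy yields groups in order of each key's first appearance.
    public static List<AuthorNode> DistinctByText(IEnumerable<AuthorNode> nodes)
    {
        return nodes
            .GroupBy(node => node.Text)
            .Select(group => group.First())
            .ToList();
    }

    public static void Main()
    {
        var nodes = new List<AuthorNode>
        {
            new AuthorNode("Knuth"),
            new AuthorNode("Skeet"),
            new AuthorNode("Knuth"),
        };
        List<AuthorNode> unique = DistinctByText(nodes);
        Console.WriteLine(string.Join(", ", unique.Select(n => n.Text))); // Knuth, Skeet
    }
}
```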

Update

I discovered that this is pretty much exactly how Jon Skeet re-implemented Distinct:

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source)
{
    return source.Distinct(EqualityComparer<TSource>.Default);
}

public static IEnumerable<TSource> Distinct<TSource>(
    this IEnumerable<TSource> source,
    IEqualityComparer<TSource> comparer)
{
    if (source == null)
    {
        throw new ArgumentNullException("source");
    }
    return DistinctImpl(source, comparer ?? EqualityComparer<TSource>.Default);
}

private static IEnumerable<TSource> DistinctImpl<TSource>(
    IEnumerable<TSource> source,
    IEqualityComparer<TSource> comparer)
{
    HashSet<TSource> seenElements = new HashSet<TSource>(comparer);
    foreach (TSource item in source)
    {
        if (seenElements.Add(item))
        {
            yield return item;
        }
    }
}

