How do I remove duplicates from a List<T>?

Question

I am following a previous post on Stack Overflow about removing duplicates from a List<T> in C#.

If T is some user-defined type like:

class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

The suggested HashSet<T> doesn't remove the duplicates. I think I have to redefine some method for comparing two objects, don't I?

Recommended answer

A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.

I suspect that by "duplicate" you mean "an object whose field values equal those of another object". For that to work you need to override Equals/GetHashCode, and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.
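As a rough sketch of the comparer option, assuming the question's Contact class is left exactly as posted: the comparer below treats two contacts as equal when all three fields match. The names ContactFieldComparer and ContactDeduplication are made up for this illustration; only HashSet<T>, IEqualityComparer<T> and EqualityComparer<T> come from the framework.

```csharp
using System;
using System.Collections.Generic;

// The class from the question, unchanged.
class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

// Hypothetical comparer: field-by-field equality for Contact.
sealed class ContactFieldComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        if (ReferenceEquals(x, y)) return true;      // same instance, or both null
        if (x is null || y is null) return false;    // exactly one is null
        return x.firstname == y.firstname
            && x.lastname == y.lastname
            && x.phonenum == y.phonenum;
    }

    public int GetHashCode(Contact contact)
    {
        if (contact is null) return 0;
        // EqualityComparer<string>.Default copes with null fields.
        var strings = EqualityComparer<string>.Default;
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + strings.GetHashCode(contact.firstname);
            hash = hash * 31 + strings.GetHashCode(contact.lastname);
            hash = hash * 31 + strings.GetHashCode(contact.phonenum);
            return hash;
        }
    }
}

static class ContactDeduplication
{
    // Hypothetical helper: copies the list through a HashSet<T> built with
    // the comparer. A set does not promise to keep the original order.
    public static List<Contact> RemoveDuplicates(List<Contact> contacts)
    {
        var unique = new HashSet<Contact>(contacts, new ContactFieldComparer());
        return new List<Contact>(unique);
    }
}
```

Calling ContactDeduplication.RemoveDuplicates(list) should then leave one contact per distinct combination of field values, without touching the Contact class itself.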

Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:

list = list.Distinct().ToList();

But again, you'll need to provide an appropriate definition of equality, somehow or other.
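One way to supply that definition without changing the class at all, sketched under the assumption that C# 7 tuples are available: group the contacts by a tuple of their fields and keep the first contact from each group. Tuples compare their elements with the default equality comparers, so null fields are handled. DistinctByFields is a hypothetical helper name, not a framework method.

```csharp
using System.Collections.Generic;
using System.Linq;

// The class from the question, unchanged.
class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

static class ContactListExtensions
{
    // Hypothetical helper: the tuple key acts as the equality definition.
    // GroupBy yields groups in the order their keys first appear, so the
    // first occurrence of each contact keeps its relative position.
    public static List<Contact> DistinctByFields(this List<Contact> contacts)
    {
        return contacts
            .GroupBy(c => (c.firstname, c.lastname, c.phonenum))
            .Select(group => group.First())
            .ToList();
    }
}
```

With this in scope, list = list.DistinctByFields(); plays the same role as the Distinct() call. It is a shortcut rather than a replacement for proper equality: anything else that compares contacts (Contains, dictionary keys and so on) still sees reference equality.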

Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and made the fields private, with public properties. Finally, I've sealed the class - immutable types should generally be sealed, and it makes equality easier to talk about.

using System;
using System.Collections.Generic;

public sealed class Contact : IEquatable<Contact>
{
    private readonly string firstName;
    public string FirstName { get { return firstName; } }

    private readonly string lastName;
    public string LastName { get { return lastName; } }

    private readonly string phoneNumber;
    public string PhoneNumber { get { return phoneNumber; } }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Contact);
    }

    public bool Equals(Contact other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return false;
        }
        if (object.ReferenceEquals(other, this))
        {
            return true;
        }
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // Note: *not* StringComparer; EqualityComparer<T>
        // copes with null; StringComparer doesn't.
        var comparer = EqualityComparer<string>.Default;

        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + comparer.GetHashCode(FirstName);
            hash = hash * 31 + comparer.GetHashCode(LastName);
            hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
            return hash;
        }
    }
}

Okay, in response to requests for an explanation of the GetHashCode() implementation:

We want to combine the hash codes of the properties of this object.
We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.
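To make the last point concrete, here is a small sketch of the same multiply-and-add combination run in both contexts. HashOverflowDemo and its method names are invented for this illustration, and the two int parameters stand in for the per-field hash codes.

```csharp
using System;

static class HashOverflowDemo
{
    // Same combination as above; overflow silently wraps around.
    public static int CombineUnchecked(int first, int second)
    {
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + first;
            hash = hash * 31 + second;
            return hash;
        }
    }

    // Identical arithmetic, but overflow raises OverflowException.
    public static int CombineChecked(int first, int second)
    {
        checked
        {
            int hash = 17;
            hash = hash * 31 + first;
            hash = hash * 31 + second;
            return hash;
        }
    }

    // Reports whether the checked version would throw for these inputs.
    public static bool OverflowsWhenChecked(int first, int second)
    {
        try
        {
            CombineChecked(first, second);
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }
}
```

Field hash codes can be any int, so a large input such as int.MaxValue makes the checked version throw, while the unchecked version still returns the same wrapped value every time it is given the same inputs, which is all a hash code needs.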

Two alternative ways of handling nullity, by the way:

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName ?? "").GetHashCode();
        hash = hash * 31 + (LastName ?? "").GetHashCode();
        hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
        return hash;
    }
}

or

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
        hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
        hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
        return hash;
    }
}
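On newer runtimes there is also a built-in helper that does the combining for you. The sketch below swaps the hand-written multiply-and-add for System.HashCode.Combine, which is a different technique from the one in the answer. It assumes a runtime that ships System.HashCode (.NET Core 2.1 or later, as far as I know) and uses a trimmed-down Contact with get-only auto-properties.

```csharp
using System;

public sealed class Contact : IEquatable<Contact>
{
    public string FirstName { get; }
    public string LastName { get; }
    public string PhoneNumber { get; }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        FirstName = firstName;
        LastName = lastName;
        PhoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Contact);
    }

    public bool Equals(Contact other)
    {
        if (ReferenceEquals(other, null))
        {
            return false;
        }
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // HashCode.Combine accepts null arguments, so no null checks
        // or unchecked block are needed here.
        return HashCode.Combine(FirstName, LastName, PhoneNumber);
    }
}
```

The values this produces are only meant to be consistent within a single run of the program, so they should not be persisted or compared across processes.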
