How do I remove duplicates from a List<T>?

Question

I am following a previous post on Stack Overflow about removing duplicates from a List<T> in C#.

If T is some user-defined type like:

class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

The suggested approach (HashMap) doesn't remove duplicates. I think I have to redefine some method for comparing two objects, don't I?

Recommended answer

A HashSet<T> does remove duplicates, because it's a set... but only when your type defines equality appropriately.

I suspect that by "duplicate" you mean "an object with equal field values to another object". For that to work, you need to override Equals/GetHashCode and/or implement IEquatable<Contact>... or you could provide an IEqualityComparer<Contact> to the HashSet<T> constructor.
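As a rough sketch of the last option, the comparer below leaves the question's Contact class untouched and supplies equality from outside. The names ContactFieldComparer and ContactDedupe are hypothetical, chosen only for illustration, and the sketch assumes "duplicate" means all three fields match.

```csharp
using System;
using System.Collections.Generic;

// The type from the question, unchanged (public mutable fields).
class Contact
{
    public string firstname;
    public string lastname;
    public string phonenum;
}

// Hypothetical comparer: two contacts are equal when all three fields match.
sealed class ContactFieldComparer : IEqualityComparer<Contact>
{
    public bool Equals(Contact x, Contact y)
    {
        if (ReferenceEquals(x, y)) return true;      // same instance, or both null
        if (x == null || y == null) return false;    // exactly one is null
        return x.firstname == y.firstname
            && x.lastname == y.lastname
            && x.phonenum == y.phonenum;
    }

    public int GetHashCode(Contact c)
    {
        if (c == null) return 0;
        // EqualityComparer<string>.Default returns 0 for a null string.
        var strings = EqualityComparer<string>.Default;
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + strings.GetHashCode(c.firstname);
            hash = hash * 31 + strings.GetHashCode(c.lastname);
            hash = hash * 31 + strings.GetHashCode(c.phonenum);
            return hash;
        }
    }
}

static class ContactDedupe
{
    // Hypothetical helper: copies the list through a HashSet that uses the comparer.
    public static List<Contact> Dedupe(IEnumerable<Contact> contacts)
    {
        var set = new HashSet<Contact>(contacts, new ContactFieldComparer());
        return new List<Contact>(set);
    }
}
```

Because the fields are mutable, changing a contact after it has been added to the set would leave it filed under a stale hash code, so this approach is safest when the objects are not modified while they are in the set.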

Instead of using a HashSet<T> you could just call the Distinct LINQ extension method. For example:

list = list.Distinct().ToList();

But again, you'll need to provide an appropriate definition of equality, somehow or other.
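To see why that definition matters, here is a small sketch using a hypothetical PlainContact class that overrides nothing. Distinct then falls back to reference equality, so two separately constructed objects with the same contents both survive, whereas strings (which define value equality) are deduplicated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type with no Equals/GetHashCode overrides: compared by reference.
class PlainContact
{
    public string Name;
    public PlainContact(string name) { Name = name; }
}

static class DistinctDemo
{
    // Counts what Distinct keeps for two separately constructed but identical objects.
    public static int CountDistinctObjects()
    {
        var list = new List<PlainContact> { new PlainContact("Ann"), new PlainContact("Ann") };
        return list.Distinct().Count();   // 2: neither object is considered a duplicate
    }

    // Counts what Distinct keeps for two equal strings.
    public static int CountDistinctStrings()
    {
        var names = new List<string> { "Ann", "Ann" };
        return names.Distinct().Count();  // 1: string defines value equality
    }
}
```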

Here's a sample implementation. Note how I've made it immutable (equality is odd with mutable types, because two objects can be equal one minute and non-equal the next) and made the fields private, with public properties. Finally, I've sealed the class: immutable types should generally be sealed, and it makes equality easier to talk about.

using System;
using System.Collections.Generic;

public sealed class Contact : IEquatable<Contact>
{
    private readonly string firstName;
    public string FirstName { get { return firstName; } }

    private readonly string lastName;
    public string LastName { get { return lastName; } }

    private readonly string phoneNumber;
    public string PhoneNumber { get { return phoneNumber; } }

    public Contact(string firstName, string lastName, string phoneNumber)
    {
        this.firstName = firstName;
        this.lastName = lastName;
        this.phoneNumber = phoneNumber;
    }

    public override bool Equals(object other)
    {
        return Equals(other as Contact);
    }

    public bool Equals(Contact other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return false;
        }
        if (object.ReferenceEquals(other, this))
        {
            return true;
        }
        return FirstName == other.FirstName &&
               LastName == other.LastName &&
               PhoneNumber == other.PhoneNumber;
    }

    public override int GetHashCode()
    {
        // Note: *not* StringComparer; EqualityComparer<T>
        // copes with null; StringComparer doesn't.
        var comparer = EqualityComparer<string>.Default;

        // Unchecked to allow overflow, which is fine
        unchecked
        {
            int hash = 17;
            hash = hash * 31 + comparer.GetHashCode(FirstName);
            hash = hash * 31 + comparer.GetHashCode(LastName);
            hash = hash * 31 + comparer.GetHashCode(PhoneNumber);
            return hash;
        }
    }
}

Okay, in response to requests for an explanation of the GetHashCode() implementation:

  • We want to combine the hash codes of the properties of this object.
  • We're not checking for nullity anywhere, so we should assume that some of them may be null. EqualityComparer<T>.Default always handles this, which is nice... so I'm using that to get a hash code of each field.
  • The "add and multiply" approach to combining several hash codes into one is the standard one recommended by Josh Bloch. There are plenty of other general-purpose hashing algorithms, but this one works fine for most applications.
  • I don't know whether you're compiling in a checked context by default, so I've put the computation in an unchecked context. We really don't care if the repeated multiply/add leads to an overflow, because we're not looking for a "magnitude" as such... just a number that we can reach repeatedly for equal objects.
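The last two points can be illustrated in isolation. The sketch below pulls the combining arithmetic into a hypothetical helper, HashCombineDemo (not part of the answer), and runs the same arithmetic in a checked context to show what the unchecked block avoids.

```csharp
using System;

static class HashCombineDemo
{
    // The "add and multiply" combination, applied to hash codes that are already computed.
    public static int Combine(params int[] hashes)
    {
        unchecked
        {
            int hash = 17;
            foreach (int h in hashes)
            {
                hash = hash * 31 + h;   // overflow silently wraps around
            }
            return hash;
        }
    }

    // Identical arithmetic in a checked context: throws OverflowException on overflow.
    public static int CombineChecked(params int[] hashes)
    {
        checked
        {
            int hash = 17;
            foreach (int h in hashes)
            {
                hash = hash * 31 + h;
            }
            return hash;
        }
    }
}
```

For small inputs the two methods agree: Combine(1, 2, 3) works out as ((17 * 31 + 1) * 31 + 2) * 31 + 3 = 507473. The result depends on order, so Combine(1, 2) and Combine(2, 1) differ. With large inputs such as int.MaxValue, Combine wraps and still returns the same value every time, while CombineChecked throws.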

Two alternative ways of handling nullity, by the way:

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName ?? "").GetHashCode();
        hash = hash * 31 + (LastName ?? "").GetHashCode();
        hash = hash * 31 + (PhoneNumber ?? "").GetHashCode();
        return hash;
    }
}

public override int GetHashCode()
{
    // Unchecked to allow overflow, which is fine
    unchecked
    {
        int hash = 17;
        hash = hash * 31 + (FirstName == null ? 0 : FirstName.GetHashCode());
        hash = hash * 31 + (LastName == null ? 0 : LastName.GetHashCode());
        hash = hash * 31 + (PhoneNumber == null ? 0 : PhoneNumber.GetHashCode());
        return hash;
    }
}
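The three null-handling styles are interchangeable for non-null strings but differ slightly for null. A minimal sketch, using a hypothetical NullHashDemo helper, puts them side by side:

```csharp
using System;
using System.Collections.Generic;

static class NullHashDemo
{
    // Style used in the main implementation: the default comparer maps null to 0.
    public static int ViaComparer(string s)
    {
        return EqualityComparer<string>.Default.GetHashCode(s);
    }

    // First alternative: null is hashed as if it were the empty string.
    public static int ViaCoalesce(string s)
    {
        return (s ?? "").GetHashCode();
    }

    // Second alternative: null is mapped to 0 explicitly.
    public static int ViaConditional(string s)
    {
        return s == null ? 0 : s.GetHashCode();
    }
}
```

With the coalescing style, null and "" produce the same hash code. That is only a collision, not a correctness problem, because Equals still distinguishes the two values.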


08-20 17:50