Problem description
I have 3 columns in my DataTable:
1 James 4345
2 Kristen 89231
3 James 599
4 Suneel 317113
Rows 1 and 3 share the same name, and I need both of them gone, with the new DataTable returning only rows 2 and 4. I found a really good related question in the suggestions on SO, but its solution uses hashtables and only eliminates row 3, not both rows 1 and 3. Help!
Recommended answer
Okay, so I looked at the blog that Pandiya pointed me to. In the comments section, a chap called Kevin Morris posted a solution using a C# Dictionary, which worked for me.
In my main block, I wrote:
string keyColumn = "Website";
RemoveDuplicates(table1, keyColumn);
And my RemoveDuplicates function was defined as:
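// Requires: using System.Collections.Generic; using System.Data;

// Flag described below; declared here (as an assumption) so the snippet compiles.
// When true, the earlier row with the same key is removed as well.
private bool RemoveAllDupes = true;

private void RemoveDuplicates(DataTable table1, string keyColumn)
{
    // Keys seen so far; only the keys matter, the values are unused.
    Dictionary<string, string> uniquenessDict = new Dictionary<string, string>(table1.Rows.Count);
    int rowIndex = 0;
    DataRowCollection rows = table1.Rows;
    // Examine every row, including the last one.
    while (rowIndex < rows.Count)
    {
        DataRow row = rows[rowIndex];
        string key = (string)row[keyColumn];
        if (uniquenessDict.ContainsKey(key))
        {
            // Duplicate key: remove this row. The next row shifts into rowIndex.
            rows.Remove(row);
            if (RemoveAllDupes && rowIndex > 0)
            {
                // Rows are assumed to be sorted by keyColumn, so the first
                // occurrence sits directly before the duplicate.
                // Remove it too and step back so no row is skipped.
                rows.RemoveAt(rowIndex - 1);
                rowIndex--;
            }
        }
        else
        {
            uniquenessDict.Add(key, string.Empty);
            rowIndex++;
        }
    }
}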
If you go to the blog, you will find a more generic function that can detect duplicates across multiple columns. I've added a flag, RemoveAllDupes, for when I want to remove all duplicate rows, but this still assumes that the rows are ordered by name and that each value appears at most twice (duplicates only, not triplicates, quadruplicates and so on). If anyone can, please update this code to handle those cases.
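One possible way to lift both restrictions (sorted rows, at most two occurrences) is a two-pass approach: count how often each key value occurs, then remove every row whose key occurs more than once. The sketch below is not from the blog or from Kevin Morris's comment; the class name DataTableDedup and the method name RemoveAllRowsWithDuplicateKeys are hypothetical, and it assumes the key column's values can be compared by their string form.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper, not part of the original answer.
public static class DataTableDedup
{
    // Removes every row whose keyColumn value occurs more than once in the table.
    // Works for unsorted rows and for triplicates, quadruplicates and so on.
    public static void RemoveAllRowsWithDuplicateKeys(DataTable table, string keyColumn)
    {
        // Pass 1: count the occurrences of each key value.
        // Assumption: keys are compared by their string form, so null and
        // DBNull values all map to the empty string.
        var counts = new Dictionary<string, int>();
        foreach (DataRow row in table.Rows)
        {
            string key = Convert.ToString(row[keyColumn]);
            int seen;
            counts.TryGetValue(key, out seen); // seen stays 0 for a new key
            counts[key] = seen + 1;
        }

        // Pass 2: walk backwards so removals do not shift rows not yet visited.
        for (int i = table.Rows.Count - 1; i >= 0; i--)
        {
            string key = Convert.ToString(table.Rows[i][keyColumn]);
            if (counts[key] > 1)
            {
                table.Rows.RemoveAt(i);
            }
        }
    }
}
```

With the four rows from the question and the name column passed as keyColumn, both James rows are removed and only the Kristen and Suneel rows remain, without sorting the table first.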