Question
When processing large amounts of data, I often find myself doing the following:
HashSet<String> set = new HashSet<String>();
// Adding elements to the set
ArrayList<String> list = new ArrayList<String>(set);
In other words, I "dump" the contents of the set into a list. I usually do this because the elements I add often contain duplicates that I want to remove, and this seems like an easy way to remove them.
With only that objective in mind (avoiding duplicates), I could also write:
ArrayList<String> list = new ArrayList<String>();
// Processing here
if (!list.contains(element)) list.add(element);
// More processing here
That way there is no need to "dump" the set into the list. However, I'd be doing a small check before inserting each element (which I assume HashSet does as well).
Is either of the two possibilities clearly more efficient?
Recommended answer
The set will give much better performance (O(n) vs O(n^2) for the list when adding n elements), and that's to be expected, because set membership (the contains operation) is the very purpose of a set.
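To make the comparison concrete, here is a minimal sketch of both approaches from the question side by side. The class name DedupDemo and the two helper methods are made up for illustration; they are not part of the original question or of any library. Note that a HashSet does not keep insertion order, so the two results can differ in ordering even though they hold the same elements.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DedupDemo {

    // Approach 1: collect into a HashSet, then copy into a list.
    // Each insertion is O(1) on average, so the whole pass is O(n).
    // HashSet does not preserve insertion order.
    static List<String> dedupWithSet(List<String> input) {
        Set<String> set = new HashSet<>(input);
        return new ArrayList<>(set);
    }

    // Approach 2: check the list before every insert.
    // Each contains() scans the list, so the whole pass is O(n^2) in the worst case.
    static List<String> dedupWithList(List<String> input) {
        List<String> list = new ArrayList<>();
        for (String element : input) {
            if (!list.contains(element)) {
                list.add(element);
            }
        }
        return list;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("a", "b", "a", "c", "b");
        List<String> viaSet = dedupWithSet(input);
        List<String> viaList = dedupWithList(input);
        // Same elements either way; only the order may differ.
        System.out.println(new HashSet<>(viaSet).equals(new HashSet<>(viaList))); // prints true
    }
}
```

For a handful of elements the difference is negligible; it only starts to matter as the number of distinct elements grows.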
contains for a HashSet is O(1) on average, compared to O(n) for a list, so you should not use a list if you often need to run contains.
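One rough way to see the difference is to count how many equals() calls a single contains() lookup triggers. The sketch below uses a hypothetical CountingKey type invented for this illustration, and it assumes the keys hash well (here each key has a distinct hash code); with many hash collisions a HashSet lookup would do more work.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ContainsCostDemo {

    // Hypothetical key type that counts how often equals() is invoked.
    static final class CountingKey {
        static long equalsCalls = 0;
        final int id;

        CountingKey(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object other) {
            equalsCalls++;
            return other instanceof CountingKey && ((CountingKey) other).id == id;
        }

        @Override
        public int hashCode() {
            return Integer.hashCode(id);
        }
    }

    // equals() calls made by one contains() on an ArrayList of n keys,
    // looking up a key equal to the last element (the worst case for a hit).
    static long listLookupCost(int n) {
        List<CountingKey> list = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            list.add(new CountingKey(i));
        }
        CountingKey.equalsCalls = 0; // ignore anything counted while filling
        list.contains(new CountingKey(n - 1));
        return CountingKey.equalsCalls;
    }

    // equals() calls made by one contains() on a HashSet of the same n keys.
    static long setLookupCost(int n) {
        Set<CountingKey> set = new HashSet<>();
        for (int i = 0; i < n; i++) {
            set.add(new CountingKey(i));
        }
        CountingKey.equalsCalls = 0; // ignore anything counted while filling
        set.contains(new CountingKey(n - 1));
        return CountingKey.equalsCalls;
    }

    public static void main(String[] args) {
        int n = 1000;
        System.out.println("list equals() calls: " + listLookupCost(n));
        System.out.println("set equals() calls:  " + setLookupCost(n));
    }
}
```

The list count grows with n, while the set count should stay small regardless of n, which is what the O(n) vs O(1) claim above describes.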