1. Build the list:
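// Imports used by the snippets in this post
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Build 1,600,000 HyperLink objects, each with a unique name and url
List<HyperLink> list = new ArrayList<>();
for (int i = 0; i < 1600000; i++) {
    HyperLink hyperLink = new HyperLink();
    hyperLink.setName("name" + i);
    hyperLink.setUrl("url" + i);
    list.add(hyperLink);
}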
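The post does not show the HyperLink class itself. A minimal sketch that would make the snippet above compile might look like the following; the two String fields and their accessors are assumptions inferred from the setName/setUrl and getName/getUrl calls, not the author's actual class.

```java
// Hypothetical stand-in for HyperLink: the original post does not include its definition.
// Only the accessors that the snippets call are provided.
class HyperLink {
    private String name;
    private String url;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public String getUrl() {
        return url;
    }

    public void setUrl(String url) {
        this.url = url;
    }
}
```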
2. Timing comparison of the different approaches:
Approach 1:
// Approach 1: Java 8 stream() + distinct()
long time1 = System.currentTimeMillis();
List<String> dorgCodes = list.stream()
        .map(HyperLink::getName)
        .distinct()
        .collect(Collectors.toList());
List<String> dgoodsCodes = list.stream()
        .map(HyperLink::getUrl)
        .distinct()
        .collect(Collectors.toList());
long time2 = System.currentTimeMillis();
System.out.println("time2 - time1:" + (time2 - time1));
Approach 2:
// Approach 2: deduplicate with a Set
long time3 = System.currentTimeMillis();
Set<String> orgSets = new HashSet<>();
Set<String> goodsSets = new HashSet<>();
list.forEach(o -> {
    orgSets.add(o.getName());
    goodsSets.add(o.getUrl());
});
// Copy the sets back into lists (a HashSet does not keep insertion order)
List<String> orgCodes = new ArrayList<>(orgSets);
List<String> goodsCodes = new ArrayList<>(goodsSets);
System.out.println("time3:" + (System.currentTimeMillis() - time3));
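One difference worth keeping in mind when comparing the approaches: distinct() on an ordered stream and the List.contains() loop both keep elements in first-occurrence order, while a HashSet makes no ordering guarantee. If order matters, a LinkedHashSet is one possible drop-in. The sketch below is an illustration only; the class and method names (OrderedDedup, dedupKeepOrder) are made up for this example and are not part of the original benchmark.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not from the original post.
class OrderedDedup {
    // Returns the distinct values in the order they first appear in the input.
    static List<String> dedupKeepOrder(List<String> values) {
        // LinkedHashSet keeps insertion order and ignores repeated inserts.
        Set<String> seen = new LinkedHashSet<>(values);
        return new ArrayList<>(seen);
    }
}
```

Whether this costs measurably more than a plain HashSet was not measured in this post.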
Approach 3:
// Approach 3: deduplicate with List.contains()
// (variables renamed so this snippet can sit in the same scope as approach 2)
long time4 = System.currentTimeMillis();
List<String> orgCodeList = new ArrayList<>();
List<String> goodsCodeList = new ArrayList<>();
list.forEach(o -> {
    if (!orgCodeList.contains(o.getName())) {
        orgCodeList.add(o.getName());
    }
    if (!goodsCodeList.contains(o.getUrl())) {
        goodsCodeList.add(o.getUrl());
    }
});
System.out.println("time4:" + (System.currentTimeMillis() - time4));
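Conclusion: when the list holds fewer than about a thousand elements, approaches 2 and 3 are the faster ones;
at a hundred thousand elements and above, the List.contains() call in approach 3 degrades sharply, because each call scans the result list linearly and the total work grows quadratically;
at a million elements and above, approaches 1 and 2 take about the same time;
so approach 2 is currently the best option.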
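To re-check these observations on another machine, the three approaches can be wrapped in small methods and timed side by side. The following is a rough sketch, not a rigorous benchmark: it ignores JIT warm-up and GC noise, it works on plain strings rather than HyperLink objects, and the class name, method names, and input size are assumptions chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical harness, not from the original post.
class DedupComparison {
    // Approach 1: Stream.distinct(), keeps first-occurrence order.
    static List<String> viaStream(List<String> values) {
        return values.stream().distinct().collect(Collectors.toList());
    }

    // Approach 2: HashSet, constant-time lookups on average, no order guarantee.
    static List<String> viaHashSet(List<String> values) {
        return new ArrayList<>(new HashSet<>(values));
    }

    // Approach 3: List.contains(), a linear scan per element, so quadratic overall.
    static List<String> viaContains(List<String> values) {
        List<String> result = new ArrayList<>();
        for (String v : values) {
            if (!result.contains(v)) {
                result.add(v);
            }
        }
        return result;
    }

    // Runs the task once and returns the elapsed wall-clock time in milliseconds.
    static long timeMillis(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) {
        int n = 10_000; // assumed size; raise it to watch approach 3 fall behind
        List<String> values = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            values.add("name" + i);
        }
        System.out.println("stream:   " + timeMillis(() -> viaStream(values)) + " ms");
        System.out.println("hashSet:  " + timeMillis(() -> viaHashSet(values)) + " ms");
        System.out.println("contains: " + timeMillis(() -> viaContains(values)) + " ms");
    }
}
```

For numbers that can be trusted beyond a quick sanity check, a dedicated benchmarking harness with warm-up iterations would be a better fit than single wall-clock measurements.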