Performance issue with generating random unique numbers

Question

I need to create tens of thousands of unique numbers. Each number must be 9 digits long and must not contain any zeros. My current approach is to generate 9 random digits (1-9), concatenate them, and add the result to a list if it is not already there. For example:

public void generateIdentifiers(int quantity)
{
    uniqueIdentifiers = new List<string>(quantity);
    while (this.uniqueIdentifiers.Count < quantity)
    {
        string id = string.Empty;
        id += random.Next(1,10);
        id += random.Next(1,10);
        id += random.Next(1,10);
        id += " ";
        id += random.Next(1,10);
        id += random.Next(1,10);
        id += random.Next(1,10);
        id += " ";
        id += random.Next(1,10);
        id += random.Next(1,10);
        id += random.Next(1,10);
        if (!this.uniqueIdentifiers.Contains(id))
        {
            this.uniqueIdentifiers.Add(id);
        }
    }
}

However, at about 400,000 numbers the process slows down considerably as more and more of the generated numbers are duplicates. I am looking for a more efficient way to do this; any help would be much appreciated.

I'm generating these: http://www.nhs.uk/NHSEngland/thenhs/records/Pages/thenhsnumber.aspx

Recommended answer

As others have mentioned, use a HashSet<T> instead of a List<T>.
Furthermore, using StringBuilder instead of plain string concatenation gains you another 25%. If you can use numbers instead of strings, you win, because that takes only a third or a quarter of the time.

var random = new Random();
var quantity = 400000;
var uniqueIdentifiers = new HashSet<int>();
while (uniqueIdentifiers.Count < quantity)
{
    // Build a 9-digit number one digit at a time.
    // Next(1, 10) returns 1..9, so the number never contains a 0.
    int i = 0;
    for (int digit = 0; digit < 9; digit++)
    {
        i = i * 10 + random.Next(1, 10);
    }
    // Add does nothing if the value is already in the set, so duplicates are skipped.
    uniqueIdentifiers.Add(i);
}
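
If the identifiers still need to be displayed in the grouped `123 456 789` form used in the question, the integers can be formatted only at output time. The following is a minimal sketch, assuming every value consists solely of the digits 1-9 (so no group ever needs zero padding). `IdentifierFormatter` and `FormatIdentifier` are hypothetical names, not part of the original answer.

```csharp
using System;

public static class IdentifierFormatter
{
    // Splits a 9-digit value into three space-separated groups of three digits.
    // Assumption: all digits are 1-9, so each group is exactly three characters
    // long and no zero padding is required.
    public static string FormatIdentifier(int id)
    {
        int first = id / 1000000;       // leading three digits
        int middle = id / 1000 % 1000;  // middle three digits
        int last = id % 1000;           // trailing three digits
        return first + " " + middle + " " + last;
    }
}
```

For example, `IdentifierFormatter.FormatIdentifier(123456789)` returns `"123 456 789"`. Keeping the set numeric and formatting late preserves the speed advantage of `HashSet<int>` during generation.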

On my machine it takes about 270 ms for 400,000 numbers and about 700 ms for 1,000,000, and that is without any parallelism. Because it uses a HashSet<T> instead of a List<T>, this algorithm runs in O(n), i.e. the duration grows linearly. 10,000,000 values therefore take about 7 seconds.
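
To put the duplicate rate in perspective, it helps to estimate how often a draw is actually rejected. There are 9^9 = 387,420,489 nine-digit numbers without zeros. The sketch below uses the birthday-style approximation n²/(2N) for the expected number of duplicate draws, which is only a rough estimate and assumes n is much smaller than N. `DuplicateEstimate` and its members are hypothetical names introduced for illustration.

```csharp
using System;

public static class DuplicateEstimate
{
    // Number of 9-digit values whose digits are all in 1..9, i.e. 9^9.
    public const long SearchSpace = 387420489L;

    // Approximate expected number of duplicate draws while collecting
    // `quantity` unique values, using n^2 / (2N). This is a rough estimate
    // that assumes quantity is much smaller than the search space.
    public static double ExpectedDuplicates(long quantity)
    {
        return (double)quantity * quantity / (2.0 * SearchSpace);
    }
}
```

For 400,000 identifiers this gives roughly 200 expected duplicates, so almost every draw is accepted. Under that estimate, the slowdown in the question's code comes mainly from `List<T>.Contains`, which scans the whole list on every call, rather than from duplicates.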
