



I want to access a web server using HttpWebRequest and fetch thousands of records from a given range of pages. Each hit to a webpage fetches 15 records, and there are roughly 8,000 to 10,000 pages on the web server. That means 8,000 to 10,000 hits to the server for a total of 120,000 to 150,000 records! Done naively on a single thread, the task can be very time-consuming. Hence, multithreading is the immediate solution that comes to mind.

Currently, I have created a worker class for searching, which spawns 5 subworkers (threads) to search in a given range. But since I am a novice at threading, I am unable to make it work: I am having trouble synchronizing the threads and making them all work together. I know about delegates, actions, and events in .NET, but making them work with threads is getting confusing. This is the code that I am using:

public void Start()
{
    this.totalRangePerThread = ((this.endRange - this.startRange) / this.subWorkerThreads.Length);
    for (int i = 0; i < this.subWorkerThreads.Length; ++i)
    {
        //theThreads[counter] = new Thread(new ThreadStart(MethodName));
        this.subWorkerThreads[i] = new Thread(() => searchItem(this.startRange, this.totalRangePerThread));
        this.startRange = this.startRange + this.totalRangePerThread;
    }

    for (int threadIndex = 0; threadIndex < this.subWorkerThreads.Length; ++threadIndex)
    {
        this.subWorkerThreads[threadIndex].Start();
    }
}

The searchItem method:

public void searchItem(int start, int pagesToSearchPerThread)
{
    for (int count = 0; count < pagesToSearchPerThread; ++count)
    {
        //searching routine here
    }
}


The problem lies in the variables shared between the threads. Can anyone show me how to make this procedure thread-safe?


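The real problem you're facing is that the lambda expression in the Thread constructor captures the outer variable (startRange) itself, not the value it had when the thread was created. The lambda reads startRange only when the thread actually runs, and by then the loop has already advanced it, so several threads can end up searching the same range. One way to fix it is to make a copy of the variable inside the loop, like this: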

for (int i = 0; i < this.subWorkerThreads.Length; ++i)
{
    var copy = startRange;
    this.subWorkerThreads[i] = new Thread(() => searchItem(copy, this.totalRangePerThread));
    this.startRange = this.startRange + this.totalRangePerThread;
}
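
If you want to see the effect in isolation, here is a minimal, self-contained sketch of the same copy-per-iteration idea. It is not your worker class: the names CaptureDemo and ComputeStarts and the value of 100 pages per thread are made up for illustration, and each thread just records the start value it received instead of searching anything.

```csharp
using System;
using System.Threading;

public static class CaptureDemo
{
    // Hypothetical helper: starts threadCount threads and returns the
    // start value each thread actually saw.
    public static int[] ComputeStarts(int threadCount, int pagesPerThread)
    {
        int startRange = 0;
        var seen = new int[threadCount];      // each thread writes only its own slot
        var threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; ++i)
        {
            int index = i;          // fresh local each iteration (the for-loop's i is shared)
            int copy = startRange;  // fresh local each iteration, as in the fix above
            threads[i] = new Thread(() => seen[index] = copy);
            startRange += pagesPerThread;
        }

        foreach (var t in threads) t.Start();
        foreach (var t in threads) t.Join();  // wait until every thread has finished

        return seen;
    }

    public static void Main()
    {
        // 5 threads as in the question; 100 pages per thread is an assumed value.
        Console.WriteLine(string.Join(", ", ComputeStarts(5, 100)));
        // prints "0, 100, 200, 300, 400"
    }
}
```

Because copy and index are declared inside the loop body, every lambda gets its own pair of variables, so it no longer matters when each thread gets scheduled. The Join calls are only there so the results are read after all threads have finished.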

For more information on creating and starting threads, see Joe Albahari's excellent ebook (there's also a section on captured variables a bit further down). If you want to learn about closures, see this question.


09-05 11:16