Problem description
I think I "get" the basics of multi-threading with Java. If I'm not mistaken, you take some big job and figure out how to chunk it up into multiple (concurrent) tasks. Then you implement those tasks as either Runnables or Callables and submit them all to an ExecutorService. (So, to begin with, if I am mistaken on this much, please start by correcting me!)
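For reference, the workflow described above can be sketched roughly as follows. This is only a minimal illustration, not a recommended design: the class name SubmitSketch, the method runBoth, and the trivial task bodies are made up for the example. A Runnable returns nothing, while a Callable returns a value through its Future.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical example class; the task bodies are placeholders for real work.
class SubmitSketch {

    static int runBoth() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            AtomicInteger counter = new AtomicInteger();

            // A Runnable performs a side effect and returns no result.
            Runnable sideEffect = () -> counter.incrementAndGet();
            // A Callable computes and returns a value.
            Callable<Integer> compute = () -> 6 * 7;

            Future<?> done = pool.submit(sideEffect);
            Future<Integer> answer = pool.submit(compute);

            done.get(); // blocks until the Runnable has finished
            return counter.get() + answer.get(); // 1 + 42
        } finally {
            pool.shutdown(); // stop accepting tasks and let the workers exit
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBoth());
    }
}
```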
Second, I have to imagine that the code you implement inside run() or call() has to be as "parallelized" as possible, using non-blocking algorithms, etc., and that this is where the hard part is (writing parallel code). Correct? Not correct?
But the real problem I'm still having with Java concurrency (and, I guess, concurrency in general), and the true subject of this question, is this:
I saw an example in another question on Stack Overflow where the poster proposed creating multiple threads to read and process a huge text file (the book Moby Dick), and one answerer commented that multi-threading for the purpose of reading from disk was a terrible idea. Their reasoning was that you'd have multiple threads introducing the overhead of context switching on top of an already slow process (disk access).
So that got me thinking: what classes of problems are appropriate for multi-threading, and what classes of problems should always be serialized? Thanks in advance!
Recommended answer
Multi-threading has two main advantages, IMO:
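- Being able to distribute intensive work among several CPUs/cores: instead of letting 3 of your 4 CPUs sit idle while a single CPU does everything, you split the problem into 4 parts and let each CPU work on its own part. This reduces the time it takes to execute a CPU-intensive task, and justifies the money you spent on multi-CPU hardware.
- Reducing the latency of many tasks. Suppose 4 users make a request to a web server, and all the requests are handled by a single thread. Suppose the first request makes a very long database query. The thread sits idle while waiting for the query to complete, and the other 3 users wait until that request is finished before getting their tiny web pages. If you had 4 threads, even with a single CPU, the second, third and fourth requests could be handled while the database server executes the long query, and all the users would be happy. So multi-threading is particularly important when you have blocking IO calls, since those calls leave the CPU idle instead of letting it execute some other waiting task.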
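The second point can be illustrated with a small sketch. Everything here is assumed for the sake of the example: the class LatencySketch, the method serve, the request names, and the delays are hypothetical, and Thread.sleep merely stands in for a blocking database query. With one thread, the short requests queue up behind the slow one; with four threads, they finish while the slow one is still waiting.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class; Thread.sleep simulates a blocking IO call.
class LatencySketch {

    // Handles four simulated requests and returns their names in completion order.
    static List<String> serve(int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> finished = new CopyOnWriteArrayList<>();
        long[] delaysMs = {500, 10, 10, 10}; // request-1 is the "long query"

        for (int i = 0; i < delaysMs.length; i++) {
            final String name = "request-" + (i + 1);
            final long delay = delaysMs[i];
            pool.submit(() -> {
                try {
                    Thread.sleep(delay); // the thread is blocked, the CPU is free
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                finished.add(name);
            });
        }

        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return finished;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("1 thread:  " + serve(1));
        System.out.println("4 threads: " + serve(4));
    }
}
```

In the single-threaded run, request-1 completes first and everyone else waits roughly half a second behind it. In the four-thread run, request-1 completes last, and the three short requests are not held up by it.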
Note: the problem with reading from the same disk from multiple threads is that, instead of reading the whole long file sequentially, you force the disk to jump between different physical locations at each context switch. Since all the threads are waiting for the disk read to finish (they're IO-bound), this makes the reading slower than if a single thread read everything. But once the data is in memory, it makes sense to split the work between threads.
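One way to picture that last point is the sketch below: a single thread reads the file sequentially, and only the in-memory processing is split between threads. The class ReadThenSplitSketch, the method countWords, and the word-counting task are assumptions chosen for illustration, and the sketch assumes the whole file fits comfortably in memory.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical example class; word counting stands in for the real processing.
class ReadThenSplitSketch {

    static long countWords(Path file, int workers)
            throws IOException, InterruptedException, ExecutionException {
        // Step 1: one thread reads the file sequentially into memory.
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);

        // Step 2: the in-memory lines are split into chunks and processed in parallel.
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunkSize = Math.max(1, (lines.size() + workers - 1) / workers);
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int start = 0; start < lines.size(); start += chunkSize) {
                List<String> chunk =
                        lines.subList(start, Math.min(lines.size(), start + chunkSize));
                tasks.add(() -> {
                    long words = 0;
                    for (String line : chunk) {
                        String trimmed = line.trim();
                        if (!trimmed.isEmpty()) {
                            words += trimmed.split("\\s+").length;
                        }
                    }
                    return words;
                });
            }
            long total = 0;
            for (Future<Long> partial : pool.invokeAll(tasks)) {
                total += partial.get(); // combine the per-chunk counts
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempFile("sketch", ".txt");
        Files.write(tmp, Arrays.asList("Call me Ishmael.", "", "Some years ago"),
                StandardCharsets.UTF_8);
        System.out.println(countWords(tmp, 4));
        Files.delete(tmp);
    }
}
```

Whether the parallel step pays off depends on how expensive the per-line work is; for something as cheap as counting words, the thread overhead may well outweigh the gain.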