Problem description
I ran an experiment in which I built a simple Producer/Consumer program. The two run in separate threads: the producer generates some data and the consumer picks it up in another thread. The messaging latency I achieved is approximately 100 nanoseconds. Can anybody tell me whether this is reasonable, or are there significantly faster implementations out there?
I'm not using locks ... just simple memory counters. My experiment is described here:
http://tradexoft.wordpress.com/2012/10/22/how-to-move-data-between-threads-in-100-nanoseconds/
Basically, the consumer waits for a counter to be incremented and then calls the handler function. So not much code, really. Still, I was surprised it took 100 ns.
The consumer looks like this:
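    void operator()()
    {
        while (true)
        {
            // Busy-wait until the producer increments w_cnt.
            while (w_cnt == r_cnt) {}
            auto rc = process_data(data);
            // Mark the item as consumed.
            r_cnt++;
            // Stop when the handler returns a false value.
            if (!rc)
                break;
        }
    }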
The producer simply increments w_cnt when it has data available.
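For readers who want to try the counter handoff described above, here is a minimal sketch. It is not the code from the linked post: the `Mailbox` struct and the `run_handoff` function are hypothetical names, and the sketch assumes the counters are declared as `std::atomic` with acquire/release ordering, since plain integer counters shared between threads would be a data race in standard C++.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical single-slot mailbox: one producer, one consumer.
struct Mailbox {
    std::atomic<std::uint64_t> w_cnt{0};  // incremented by the producer
    std::atomic<std::uint64_t> r_cnt{0};  // incremented by the consumer
    int data = 0;                         // payload, guarded by the counters
};

// Sends 1..n through the mailbox and returns the sum the consumer saw.
std::uint64_t run_handoff(int n) {
    assert(n >= 0);
    Mailbox box;
    std::uint64_t sum = 0;

    std::thread consumer([&] {
        for (int i = 0; i < n; ++i) {
            // Spin until the producer publishes a new value.
            while (box.w_cnt.load(std::memory_order_acquire) ==
                   box.r_cnt.load(std::memory_order_relaxed)) {}
            sum += static_cast<std::uint64_t>(box.data);
            // Release: tells the producer the slot is free again.
            box.r_cnt.fetch_add(1, std::memory_order_release);
        }
    });

    std::thread producer([&] {
        for (int i = 1; i <= n; ++i) {
            // Wait until the consumer has taken the previous value.
            while (box.w_cnt.load(std::memory_order_relaxed) !=
                   box.r_cnt.load(std::memory_order_acquire)) {}
            box.data = i;
            // Release: the write to data becomes visible before the counter.
            box.w_cnt.fetch_add(1, std::memory_order_release);
        }
    });

    producer.join();
    consumer.join();
    return sum;
}
```

Calling `run_handoff(1000)` passes the values 1 through 1000 across the two threads one at a time. Wrapping such a call in a `std::chrono::steady_clock` measurement would give a rough per-message figure to compare against the 100 ns quoted above, though the result will depend heavily on the machine and on where the threads are scheduled.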
Is there a faster way?
Recommended answer
I imagine your latency is a product of how the operating system schedules context-switching, rather than the spin lock itself, and I doubt you can do much about it.
You can, however, move more data at once by using a ring buffer. If one thread writes and one thread reads, you can implement a ring buffer without locks. Essentially it would be the same spin-lock approach (waiting until tailidx != headidx), but the producer could pump more than a single value into the buffer before it is switched out to the consumer. That ought to improve your overall latency (but not your single-value latency).
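The ring buffer idea can be sketched roughly as follows. This is one possible single-producer/single-consumer layout, not a reference implementation: `SpscRing`, `try_push`, `try_pop`, and `run_ring` are hypothetical names, the capacity of 1024 is arbitrary, and the sketch assumes exactly one writer thread and one reader thread.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>

// Hypothetical single-producer/single-consumer ring; N must be a power of two.
template <typename T, std::size_t N>
class SpscRing {
    static_assert(N > 0 && (N & (N - 1)) == 0, "N must be a power of two");
    T buf_[N];
    std::atomic<std::size_t> head_{0};  // next slot to read (consumer-owned)
    std::atomic<std::size_t> tail_{0};  // next slot to write (producer-owned)
public:
    // Producer side: returns false when the ring is full.
    bool try_push(const T& v) {
        std::size_t t = tail_.load(std::memory_order_relaxed);
        if (t - head_.load(std::memory_order_acquire) == N) return false;
        buf_[t & (N - 1)] = v;
        tail_.store(t + 1, std::memory_order_release);  // publish the slot
        return true;
    }
    // Consumer side: returns false when the ring is empty (tail == head).
    bool try_pop(T& out) {
        std::size_t h = head_.load(std::memory_order_relaxed);
        if (h == tail_.load(std::memory_order_acquire)) return false;
        out = buf_[h & (N - 1)];
        head_.store(h + 1, std::memory_order_release);  // free the slot
        return true;
    }
};

// Pushes 1..n through the ring and returns the sum the consumer saw.
std::uint64_t run_ring(int n) {
    assert(n >= 0);
    SpscRing<int, 1024> ring;
    std::uint64_t sum = 0;
    std::thread consumer([&] {
        int v = 0;
        for (int i = 0; i < n; ++i) {
            while (!ring.try_pop(v)) {}  // spin while the ring is empty
            sum += static_cast<std::uint64_t>(v);
        }
    });
    for (int i = 1; i <= n; ++i) {
        while (!ring.try_push(i)) {}  // spin while the ring is full
    }
    consumer.join();
    return sum;
}
```

Because the producer only stalls when all 1024 slots are occupied, it can queue many values while the consumer is descheduled, which is the batching effect described above. The indices are free-running and masked on access, so the full and empty states are distinguishable without wasting a slot.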