


I have been using select to handle connections. Recently our socket library changed, and select was replaced by epoll on the Linux platform.


My application's architecture is such that I make only one, or at most two, socket connections and call epoll/select on them in a single thread.


With the recent switch to epoll, I noticed that the application's performance has diminished. I was surprised, as I expected performance to go up or remain the same. I looked at various other parts, and this is the only piece of code that has changed.


Does epoll have a speed penalty when it is used with a very small number of sockets (like 1 or 2)?


Another thing to note: I run around 125 such processes on the same box (8 CPU cores). Could the problem be that too many processes are calling epoll_wait on the same machine? The setup was the same when I was using select.


I noticed that the load average on the box is much higher while CPU usage is about the same, which makes me think that more time is being spent in I/O, probably because of the epoll-related changes.


Any ideas or pointers on what I should look at to identify the problem?


Although the absolute increase in latency is quite small (about 1 millisecond on average), this is a real-time system, and latencies of this kind are generally unacceptable.



Updating this question with my latest findings: apart from the switch from select to epoll, I found another related change. The timeout with select used to be 10 milliseconds, but with epoll the timeout is much smaller than before (like 1 micro..). Can setting too low a timeout in select or epoll decrease performance in any way?
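For reference, the two calls take their timeouts in different units: select() accepts a struct timeval with microsecond resolution, while epoll_wait() accepts an int number of whole milliseconds, and a value of 0 makes it return immediately. The sketch below shows one way a sub-millisecond timeout could collapse to 0 during conversion. It assumes that "1 micro" means 1 microsecond and that the library converts by integer division; neither is confirmed by the post, and all helper names are hypothetical.

```c
#include <assert.h>
#include <sys/time.h>

/* Hypothetical helper: microseconds -> struct timeval, as select() expects. */
struct timeval usec_to_timeval(long usec)
{
    struct timeval tv;
    assert(usec >= 0);
    tv.tv_sec = usec / 1000000;
    tv.tv_usec = usec % 1000000;
    return tv;
}

/* Hypothetical helper: microseconds -> epoll_wait() timeout by truncation.
 * Anything below 1000 us becomes 0, i.e. "do not block at all". */
int usec_to_epoll_ms_truncate(long usec)
{
    assert(usec >= 0);
    return (int)(usec / 1000);
}

/* Hypothetical helper: round up instead, so that a non-zero request
 * never turns into a non-blocking poll. */
int usec_to_epoll_ms_ceil(long usec)
{
    assert(usec >= 0);
    return (int)((usec + 999) / 1000);
}
```

If the timeout does end up as 0, every call returns at once and the loop spins instead of sleeping. With around 125 processes on 8 cores, that could plausibly raise the load average and add scheduling delay, so the effective timeout value seems worth checking before blaming epoll itself.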




From the sounds of it, throughput may be unaffected with epoll() vs select(), but you're finding extra latency in individual requests that seems to be related to the use of epoll().


I think that when watching only one or two sockets, epoll() should perform much like select(). epoll() is supposed to scale well as you watch more descriptors, whereas select() scales badly (and has a hard limit on the number of descriptors, FD_SETSIZE). So it's not that epoll() has a penalty for a small number of descriptors; it just loses its performance advantage over select() in this case.


Can you change the code so you can easily switch back and forth between the two event notification mechanisms? That would let you gather more data about the performance difference. If you conclusively find that select() has lower latency and the same throughput in your situation, then I'd just switch back to the "old & deprecated" API without hesitation :) To me it's fairly conclusive if you measure a performance difference from this specific code change. Perhaps previous testing of epoll() versus select() has focused on throughput rather than on the latency of individual requests?
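As a rough illustration of switching between the two mechanisms, here is a minimal sketch that puts select() and epoll behind the same function signature so the backend can be chosen at run time. The function names and the per-call epoll instance are assumptions made to keep the example short; they are not how your socket library is structured.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/epoll.h>
#include <sys/select.h>
#include <unistd.h>

/* Both backends share one contract:
 * return 1 if fd is readable, 0 on timeout, -1 on error. */
typedef int (*wait_fn)(int fd, int timeout_ms);

int wait_readable_select(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    /* FD_SET is only defined for descriptors below FD_SETSIZE. */
    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}

int wait_readable_epoll(int fd, int timeout_ms)
{
    /* Created per call only for brevity; a real event loop would
     * create the epoll instance once and reuse it. */
    int ep = epoll_create1(0);
    if (ep < 0)
        return -1;

    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    if (epoll_ctl(ep, EPOLL_CTL_ADD, fd, &ev) < 0) {
        close(ep);
        return -1;
    }

    struct epoll_event out;
    int n = epoll_wait(ep, &out, 1, timeout_ms);
    close(ep);
    if (n < 0)
        return -1;
    return n > 0 ? 1 : 0;
}

/* Hypothetical switch: pick the backend from a flag (for example,
 * one read from a command-line option or environment variable). */
wait_fn choose_backend(int use_epoll)
{
    return use_epoll ? wait_readable_epoll : wait_readable_select;
}
```

Wrapping the chosen function's call in two clock_gettime(CLOCK_MONOTONIC, ...) readings and recording the difference per request would give latency numbers for both backends under the same load and the same timeout.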


09-27 10:53