Question
I am currently looking into the malloc() implementation under Windows. But in my research I have stumbled upon things that puzzled me:
First, I know that at the API level, Windows mostly uses the HeapAlloc() and VirtualAlloc() calls to allocate memory. I gather from here that the Microsoft implementation of malloc() (the one included in the CRT, the C runtime) basically calls HeapAlloc() for blocks > 480 bytes and otherwise manages a special area allocated with VirtualAlloc() for small allocations, in order to prevent fragmentation.
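The size-based dispatch described above can be pictured with a rough, portable sketch. Everything in it is an assumption made for illustration: `sketch_malloc`, `sketch_free`, `in_arena` and `ARENA_SIZE` are hypothetical names, a static buffer stands in for the region that would be reserved with VirtualAlloc(), and the standard `malloc`/`free` stand in for HeapAlloc()/HeapFree(). It is not the CRT's actual code, and the small-block path here is a simple bump allocator that never recycles memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SMALL_THRESHOLD 480          /* threshold quoted in the question */
#define ARENA_SIZE      (64 * 1024)  /* hypothetical size of the small-block area */

/* Stand-in for a region reserved once with VirtualAlloc(). */
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* True if p points into the small-block area. */
static int in_arena(const void *p)
{
    uintptr_t a  = (uintptr_t)p;
    uintptr_t lo = (uintptr_t)arena;
    return a >= lo && a < lo + ARENA_SIZE;
}

static void *sketch_malloc(size_t n)
{
    if (n > SMALL_THRESHOLD)
        return malloc(n);            /* stand-in for HeapAlloc() */

    /* Small request: round up to 16 bytes and bump the arena offset. */
    size_t rounded = (n + 15u) & ~(size_t)15u;
    if (rounded == 0)
        rounded = 16;
    if (rounded > ARENA_SIZE - arena_used)
        return malloc(n);            /* arena exhausted: fall back */

    void *p = arena + arena_used;
    arena_used += rounded;
    return p;
}

static void sketch_free(void *p)
{
    /* Arena blocks are never recycled in this sketch; a real
       small-block heap would keep per-size free lists. */
    if (p != NULL && !in_arena(p))
        free(p);
}
```

With this sketch, `sketch_malloc(32)` returns a pointer inside `arena`, while `sketch_malloc(1024)` goes to the general-purpose allocator.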
Well, that is all well and good. But then there are other implementations of malloc(), for instance nedmalloc, which claim to be up to 125% faster than Microsoft's malloc.
All this makes me wonder about a few things:
1. Why can't we just call HeapAlloc() for small blocks? Does it perform poorly with regard to fragmentation (for example by doing "first-fit" instead of "best-fit")?
   - Actually, is there any way to know what goes on behind the various API allocation calls? That would be quite helpful.
2. What makes nedmalloc so much faster than Microsoft's malloc?
3. From the above, I got the impression that HeapAlloc()/VirtualAlloc() are so slow that it is much faster for malloc() to call them only once in a while and then manage the allocated memory itself. Is that assumption true? Or is the malloc() "wrapper" needed only because of fragmentation? One would think that system calls like these would be quick, or at least that some thought would have been put into making them efficient.
   - If it is true, why is that so?
4. On average, how many memory reads/writes (to an order of magnitude) are performed by a typical malloc call (probably a function of the number of already allocated segments)? I would intuitively say it is in the tens for an average program. Am I right?
Recommended answer
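1. Calling HeapAlloc directly does not sound cross-platform. MS is free to change its implementation whenever it wishes; I would advise staying away. :)
2. It is probably using memory pools more effectively, much like the Loki library does with its "small object allocator".
3. Heap allocations, which are general purpose by nature, are always slow in any implementation. The more "specialized" the allocator, the faster it will be. This brings us back to point 2, which deals with memory pools (and with allocation sizes that are specific to your application).
4. Don't know.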
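
To make the memory-pool idea mentioned in the answer concrete, here is a minimal sketch of a fixed-size block pool in C. It is not how nedmalloc or Loki's small object allocator is actually implemented; `Pool`, `pool_init`, `pool_alloc`, `pool_free`, `BLOCK_SIZE` and `BLOCK_COUNT` are hypothetical names chosen for illustration. The point it shows is that when every block has the same size, allocation and release reduce to popping and pushing a free list, with no searching or splitting.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pool parameters: every block is BLOCK_SIZE bytes. */
#define BLOCK_SIZE  64
#define BLOCK_COUNT 128

typedef union Block {
    union Block  *next;                  /* link, used only while the block is free */
    unsigned char payload[BLOCK_SIZE];   /* user data while the block is allocated  */
} Block;

typedef struct Pool {
    Block  storage[BLOCK_COUNT];         /* backing memory, obtained once */
    Block *free_list;                    /* head of the singly linked free list */
} Pool;

/* Thread every block onto the free list. */
static void pool_init(Pool *p)
{
    assert(p != NULL);
    for (size_t i = 0; i + 1 < BLOCK_COUNT; ++i)
        p->storage[i].next = &p->storage[i + 1];
    p->storage[BLOCK_COUNT - 1].next = NULL;
    p->free_list = &p->storage[0];
}

/* O(1): pop the head of the free list; returns NULL when the pool is empty. */
static void *pool_alloc(Pool *p)
{
    Block *b = p->free_list;
    if (b != NULL)
        p->free_list = b->next;
    return b;
}

/* O(1): push the block back onto the free list. */
static void pool_free(Pool *p, void *ptr)
{
    Block *b = ptr;
    if (b == NULL)
        return;
    b->next = p->free_list;
    p->free_list = b;
}
```

A caller would declare a `Pool`, call `pool_init` once, and then use `pool_alloc`/`pool_free` for objects of up to `BLOCK_SIZE` bytes. The trade-off is the one the answer describes: the pool is fast because it is specialized, and it only helps if the application's allocation sizes match it.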