Problem Description
There are a lot of questions about memory allocation on this site, but I couldn't find one that specifically addresses my concern. This question seems closest, and it led me to this article, so... I compared the behavior of the three demo programs it contains on a (virtual) desktop x86 Linux system and an ARM-based system.
My findings are detailed here, but the quick summary is: on my desktop system, the demo3 program from the article seems to show that malloc() always lies about the amount of memory allocated, even with swap disabled. For example, it cheerfully 'allocates' 3 GB of RAM, and then invokes the OOM killer once the program actually starts writing to all that memory. With swap disabled, the OOM killer gets invoked after writing to only 610 MB of the 3 GB that malloc() has supposedly made available.
The purpose of the demo program is to demonstrate this well-documented 'feature' of Linux, so none of this is too surprising. But the behavior is different on our i.MX6-based embedded target at work, where malloc() appears to be telling the truth about how much RAM it allocates(?). The program below (reproduced verbatim from the article) always gets OOM-killed in the second loop when i == n:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

#define N 10000

int main (void) {
    int i, n = 0;
    char *pp[N];

    for (n = 0; n < N; n++) {
        pp[n] = malloc(1<<20);
        if (pp[n] == NULL)
            break;
    }
    printf("malloc failure after %d MiB\n", n);

    for (i = 0; i < n; i++) {
        memset (pp[i], 0, (1<<20));
        printf("%d\n", i+1);
    }

    return 0;
}
So my question, in a nutshell, is: why does the demo3 program (or some other unlucky OOM killer victim) always get killed long before i == n on my desktop system (implying that malloc() is a liar), but only get killed when i == n on our i.MX6 ARM target (implying that malloc() may be telling the truth)? Is this difference a function of the libc and/or kernel version, or something else? Can I conclude that malloc() will always return NULL if allocation fails on this target?
NOTE: Some details on each system (please note that overcommit_memory and overcommit_ratio have the same values on both):
# Desktop system
% uname -a
Linux ubuntu 3.8.0-33-generic #48-Ubuntu SMP Wed Oct 23 17:26:34 UTC 2013 i686 i686 i686 GNU/Linux
% /lib/i386-linux-gnu/libc.so.6
GNU C Library (Ubuntu EGLIBC 2.17-0ubuntu5.1) stable release version 2.17, by Roland McGrath et al.
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.7.3.
Compiled on a Linux 3.8.13 system on 2013-09-30.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
<https://bugs.launchpad.net/ubuntu/+source/eglibc/+bugs>.
% cat /proc/sys/vm/overcommit_memory
0
% cat /proc/sys/vm/overcommit_ratio
50
# i.MX6 ARM system
# uname -a
Linux acmewidgets 3.0.35-ts-armv7l #2 SMP PREEMPT Mon Aug 12 19:27:25 CST 2013 armv7l GNU/Linux
# /lib/libc.so.6
GNU C Library (GNU libc) stable release version 2.17, by Roland McGrath et al.
Copyright (C) 2012 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Compiled by GNU CC version 4.7.3.
Compiled on a Linux 3.0.35 system on 2013-08-14.
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE
For bug reporting instructions, please see:
<http://www.gnu.org/software/libc/bugs.html>.
# cat /proc/sys/vm/overcommit_memory
0
# cat /proc/sys/vm/overcommit_ratio
50
BACKGROUND: We're trying to decide how to handle low-memory conditions in our media-oriented embedded application, and want to know whether we can, for this specific target, trust malloc() to alert us when allocation fails. My experience with desktop Linux apps made me think the answer was certainly not, but now I'm not so sure.
Recommended Answer
A little background

malloc() doesn't lie; your kernel's virtual memory subsystem does, and this is common practice on most modern operating systems. When you use malloc(), what really happens is something like this:
1. The libc implementation of malloc() checks its internal state, and will try to optimize your request using a variety of strategies (such as trying to use a preallocated chunk, or allocating more memory than requested in advance...). This means the implementation will affect performance and slightly change the amount of memory requested from the kernel, but this is not really relevant when checking the "big numbers", as you're doing in your tests.
2. If there's no space in a preallocated chunk of memory (remember, chunks of memory are usually pretty small, on the order of 128 KB to 1 MB), it will ask the kernel for more memory. The actual syscall varies from one kernel to another (mmap(), vm_allocate()...), but its purpose is mostly the same. (A minimal sketch of this step follows the list.)
3. The kernel's VM subsystem will process the request, and if it finds it "acceptable" (more on this later), it will create a new entry in the memory map of the requesting task (I'm using UNIX terminology, where a task is a process with all its state and threads), and return the starting address of that map entry to malloc().
4. malloc() will take note of the newly allocated chunk of memory and return the appropriate answer to your program.
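As a rough illustration of step 2, here is a minimal sketch of my own (not from the article) of what asking the kernel for memory looks like on Linux; glibc's malloc() uses mmap() in a similar way for large requests, though the exact threshold and flags are implementation details:

#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 1 << 20;  /* 1 MiB, the same size the demo program allocates per chunk */
    /* An anonymous private mapping: roughly what malloc() asks the kernel
     * for when its preallocated space runs out. No physical pages are
     * committed yet; they appear only when the memory is first touched. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("kernel handed back a mapping at %p\n", p);
    munmap(p, len);
    return 0;
}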
OK, so now your program has successfully malloc'ed some memory, but the truth is that not a single page (4 KB on x86) of physical memory has actually been allocated to your request yet (well, this is an oversimplification, since a few pages may have been used to store information about the state of the memory pool, but it makes the point easier to illustrate).
So, what happens when you try to access this recently malloc'ed memory? A page fault. Surprisingly, this is a relatively little-known fact, but your system is generating page faults all the time. Your program is interrupted, the kernel takes control, checks whether the faulting address corresponds to a valid map entry, grabs one or more physical pages and links them into the task's map.
If your program tries to access an address that is not inside any map entry of your task, the kernel will not be able to resolve the fault, and will send a signal (SIGSEGV on UNIX, or the equivalent mechanism on other systems) to point out the problem. If the program doesn't handle that signal itself, it will be killed with the infamous segmentation fault error.
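For completeness, a program can install its own handler for that signal. A minimal sketch of my own (not from the article) that catches SIGSEGV from a deliberately invalid access:

#include <signal.h>
#include <string.h>
#include <unistd.h>

static void on_segv(int sig) {
    /* Only async-signal-safe functions may be used here; write() is one. */
    (void)sig;
    static const char msg[] = "caught SIGSEGV: no valid map entry for that address\n";
    (void)write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(1);
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigaction(SIGSEGV, &sa, NULL);

    int *bad = (int *)16;  /* an address with no map entry in this task */
    *bad = 42;             /* unresolvable fault -> kernel delivers SIGSEGV */
    return 0;
}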
So physical memory is not allocated when you call malloc(), but when you actually access that memory. This allows the OS to do some nifty tricks like disk paging, ballooning, and overcommitting.
This way, when you ask how much memory a specific process is using, you need to look at two different numbers:
Virtual size: the amount of memory that has been requested, even if it's not actually used.
Resident size: the memory that is really being used, backed by physical pages.
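A minimal sketch of my own (not from the article) that makes the distinction visible on Linux by printing the VmSize and VmRSS lines from /proc/self/status before and after touching a large allocation:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print the VmSize (virtual) and VmRSS (resident) lines for this process. */
static void show_mem(const char *label) {
    char line[256];
    FILE *f = fopen("/proc/self/status", "r");
    if (f == NULL)
        return;
    printf("--- %s ---\n", label);
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "VmSize:", 7) == 0 || strncmp(line, "VmRSS:", 6) == 0)
            fputs(line, stdout);
    fclose(f);
}

int main(void) {
    show_mem("before malloc");
    size_t len = 100 << 20;          /* 100 MiB */
    char *p = malloc(len);
    if (p == NULL)
        return 1;
    show_mem("after malloc");        /* VmSize grows, VmRSS barely moves */
    memset(p, 0, len);
    show_mem("after memset");        /* now VmRSS grows too */
    free(p);
    return 0;
}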
In computing, resource management is a complex issue. There is a wide range of strategies, from the most strict capability-based systems to the much more relaxed behavior of kernels like Linux (with overcommit_memory == 0), which basically allows you to request memory up to the maximum map size allowed for a task (a limit that depends on the architecture).
In the middle, you have OSes like Solaris (mentioned in your article), which limit the amount of virtual memory for a task to the sum of (physical pages + swap disk pages). But don't be fooled by the article you referenced; this is not always a good idea. If you're running a Samba or Apache server with hundreds to thousands of independent processes running at the same time (which leads to a lot of virtual memory being wasted due to fragmentation), you'll have to configure a ridiculous amount of swap disk, or your system will run out of virtual memory while still having plenty of free RAM.
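Linux's own strict mode (vm.overcommit_memory == 2) behaves much like this: the kernel tracks the total committed address space (Committed_AS) against CommitLimit, which is roughly swap + RAM * overcommit_ratio / 100, and refuses requests that would push past it. A minimal sketch of my own (not from the article) to inspect those counters:

#include <stdio.h>
#include <string.h>

int main(void) {
    char line[256];
    FILE *f = fopen("/proc/meminfo", "r");
    if (f == NULL) {
        perror("fopen");
        return 1;
    }
    /* CommitLimit is only enforced when vm.overcommit_memory == 2. */
    while (fgets(line, sizeof line, f))
        if (strncmp(line, "CommitLimit:", 12) == 0 ||
            strncmp(line, "Committed_AS:", 13) == 0)
            fputs(line, stdout);
    fclose(f);
    return 0;
}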
So, does malloc() behave differently on ARM and x86? It doesn't. At least it shouldn't, but ARM vendors have an insane tendency to introduce arbitrary changes to the kernels they ship with their systems.
In your test case, the x86 machine is working as expected. Since you're allocating memory in small chunks and you have vm.overcommit_memory set to 0, you're allowed to fill all of your virtual address space, which is somewhere around the 3 GB line because you're running on a 32-bit machine (if you tried this on 64 bits, the loop would run until n == N). Obviously, when you then try to use that memory, the kernel detects that physical memory is getting scarce and invokes the OOM killer as a countermeasure.
On ARM it should be the same. Since it isn't, two possibilities come to mind:
1. overcommit_memory is set to the NEVER (2) policy, perhaps because someone forced it that way in the kernel.
2. You're reaching the maximum allowed map size for the task.
Since you get different values for the malloc phase on each run on ARM, I would discard the second option. Make sure overcommit_memory is enabled (value 0) and rerun your test. If you have access to the kernel sources, take a look at them to make sure the kernel honors this sysctl (as I said, some ARM vendors like to do nasty things to their kernels).
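If it helps, the policy can also be checked from within the application itself, so you don't need a shell session on the target; a minimal sketch of my own (not from the article):

#include <stdio.h>

int main(void) {
    int mode = -1;  /* 0 = heuristic overcommit, 1 = always, 2 = never */
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (f != NULL) {
        if (fscanf(f, "%d", &mode) != 1)
            mode = -1;
        fclose(f);
    }
    printf("vm.overcommit_memory = %d\n", mode);
    return 0;
}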
For reference, I ran demo3 under QEMU emulating a versatilepb board and on an Efika MX (i.MX515). The first one stopped malloc'ing at the 3 GB mark, as expected on a 32-bit machine, while the other did so earlier, at 2 GB. This may come as a surprise, but if you look at its kernel config (https://github.com/genesi/linux-legacy/blob/master/arch/arm/configs/mx51_efikamx_defconfig), you'll see this:
CONFIG_VMSPLIT_2G=y
# CONFIG_VMSPLIT_1G is not set
CONFIG_PAGE_OFFSET=0x80000000
The kernel is configured with a 2 GB/2 GB split, so the system is behaving as expected.