Problem description
Today I was helping a friend of mine with some C code, and I ran into some strange behavior that I couldn't explain to him. We had a TSV file containing a list of integers, one int per line. The first line was the number of lines in the list.
We also had a C file with a very simple "readfile" routine. The first line was read into n, the number of lines, followed by this declaration:
int list[n]
and finally a for loop over n with an fscanf.
For small n (up to ~100,000), everything was fine. However, we found that when n was large (10^6), a segfault would occur.
Finally, we changed the list initialization to
int *list = malloc(n*sizeof(int))
and everything went well, even with very large n.
Can someone explain why this occurred? What was causing the segfault with int list[n], and why did it stop once we started using list = malloc(n*sizeof(int))?
Recommended answer
There are several different pieces at play here.
The first is the difference between declaring an array as
int array[n];
and
int* array = malloc(n * sizeof(int));
In the first version, you are declaring an object with automatic storage duration. This means that the array lives only until execution leaves the block in which it is declared, typically when the enclosing function returns. In the second version, you are getting memory with dynamic storage duration, which means that it will exist until it is explicitly deallocated with free.
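The difference in lifetime can be sketched as follows. This is only an illustration: the function names sum_with_vla and make_sequence are made up for this example and do not come from the original code.

```c
#include <assert.h>
#include <stdlib.h>

/* Automatic storage duration: scratch exists only while this
   function is running. (Hypothetical helper for illustration.) */
long sum_with_vla(size_t n)
{
    assert(n > 0);              /* a zero-length VLA is undefined behavior */
    int scratch[n];             /* lives on the stack in typical implementations */
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        scratch[i] = (int)i;
        total += scratch[i];
    }
    return total;               /* returning scratch itself would dangle */
}

/* Dynamic storage duration: the buffer outlives the call and stays
   valid until the caller passes it to free. */
int *make_sequence(size_t n)
{
    int *seq = malloc(n * sizeof *seq);
    if (seq == NULL)
        return NULL;            /* allocation failed */
    for (size_t i = 0; i < n; i++)
        seq[i] = (int)i;
    return seq;
}
```

A caller of make_sequence owns the returned pointer and is responsible for calling free on it; nothing comparable is possible with the array inside sum_with_vla.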
The reason that the second version works here is an implementation detail of how C is usually compiled. Typically, C memory is split into several regions, including the stack (for function calls and local variables) and the heap (for malloc'ed objects). The stack typically has a much smaller size than the heap; a default limit of around 8 MB is common on Linux, and some platforms use less. As a result, if you try to allocate a huge array with
int array[n];
then you might exceed the stack's storage space, causing the segfault. On the other hand, the heap usually has a huge size (say, as much space as is free on the system), so malloc'ing a large object is much less likely to run out of memory, and if it does, malloc reports the failure by returning NULL, which the program can check.
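A heap-based version of the reader described in the question might look like the sketch below. The original "readfile" code was not posted, so the function name read_int_list, its signature, and its error handling are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads a count n from the first line, then n ints.
   Returns a malloc'ed array (the caller must free it) or NULL on failure. */
int *read_int_list(FILE *fp, size_t *out_count)
{
    assert(fp != NULL && out_count != NULL);

    long n;
    if (fscanf(fp, "%ld", &n) != 1 || n <= 0)
        return NULL;                        /* missing or invalid count */
    if ((uintmax_t)n > SIZE_MAX / sizeof(int))
        return NULL;                        /* n * sizeof(int) would overflow */

    int *list = malloc((size_t)n * sizeof *list);
    if (list == NULL)
        return NULL;                        /* heap allocation failed */

    for (long i = 0; i < n; i++) {
        if (fscanf(fp, "%d", &list[i]) != 1) {
            free(list);                     /* fewer values than promised */
            return NULL;
        }
    }
    *out_count = (size_t)n;
    return list;
}
```

Unlike int list[n], an allocation that is too large shows up here as a NULL return that the caller can handle, instead of a crash when the stack overflows.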
In general, be careful with variable-length arrays in C. They can easily exceed the stack size. Prefer malloc unless you know the size is small or you really only want the array for a short period of time.
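One common way to follow this advice is to use a fixed-size stack buffer when the size is known to be small and fall back to malloc otherwise. The sketch below assumes an arbitrary threshold (SMALL_LIMIT) and a made-up function name (sum_first_n); neither comes from the original question.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define SMALL_LIMIT 256     /* hypothetical threshold; tune for your platform */

/* Fills a buffer with 1..n and returns the sum, or -1 if allocation fails.
   Small inputs use a fixed stack array; large inputs use the heap. */
long long sum_first_n(size_t n)
{
    assert(n <= INT_MAX);   /* values are stored as int */

    int small[SMALL_LIMIT]; /* bounded size, so stack use is predictable */
    int *buf = small;
    if (n > SMALL_LIMIT) {
        buf = malloc(n * sizeof *buf);
        if (buf == NULL)
            return -1;
    }

    long long total = 0;
    for (size_t i = 0; i < n; i++) {
        buf[i] = (int)(i + 1);
        total += buf[i];
    }

    if (buf != small)
        free(buf);          /* only release what malloc handed out */
    return total;
}
```

Because the stack array has a compile-time bound, its stack usage does not depend on the input, which is the property a variable-length array lacks.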
Hope this helps!