This article covers how to mmap() an entire large file.

Problem description

I am trying to mmap a binary file (~8 GB) using the following code (test.c).

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>

#define handle_error(msg) \
  do { perror(msg); exit(EXIT_FAILURE); } while (0)

int main(int argc, char *argv[])
{
   const char *memblock;
   int fd;
   struct stat sb;

   fd = open(argv[1], O_RDONLY);
   fstat(fd, &sb);
   printf("Size: %lu\n", (uint64_t)sb.st_size);

   memblock = mmap(NULL, sb.st_size, PROT_WRITE, MAP_PRIVATE, fd, 0);
   if (memblock == MAP_FAILED) handle_error("mmap");

   for(uint64_t i = 0; i < 10; i++)
   {
     printf("[%lu]=%X ", i, memblock[i]);
   }
   printf("\n");
   return 0;
}

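test.c is compiled with gcc -std=c99 test.c -o test, and running file on test returns: test: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.15, not stripped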

Although this works fine for small files, I get a segmentation fault when I try to load a big one. The program actually returns:

Size: 8274324021
mmap: Cannot allocate memory

I managed to map the whole file using boost::iostreams::mapped_file, but I want to do it using C and system calls. What is wrong with my code?

Recommended answer

Writable MAP_PRIVATE mappings require a memory reservation, because writing to these pages may result in copy-on-write allocations. This means that you can't map something much larger than your physical RAM + swap. Try using a MAP_SHARED mapping instead. Writes to a shared mapping are reflected on disk, so the kernel knows it can always free up memory by doing writeback, and it won't limit you.

I also note that you're mapping with PROT_WRITE, but you then go on to read from the memory mapping. You also opened the file with O_RDONLY, which may itself be another problem for you: you must specify O_RDWR if you want to use PROT_WRITE with MAP_SHARED.
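To see how the open mode and the mapping flags interact, a minimal sketch along these lines can help. The helper name `try_mmap` is hypothetical (it is not part of the question's code); it only attempts a mapping and reports the resulting errno value.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: try to map the first `len` bytes of `fd`.
 * Returns 0 if mmap() succeeds (the mapping is released again),
 * otherwise the errno value that mmap() reported. */
int try_mmap(int fd, size_t len, int prot, int flags)
{
    assert(len > 0);

    void *p = mmap(NULL, len, prot, flags, fd, 0);
    if (p == MAP_FAILED)
        return errno;

    munmap(p, len);
    return 0;
}
```

With a descriptor opened O_RDONLY, `try_mmap(fd, len, PROT_WRITE, MAP_SHARED)` should report a permission error, while `PROT_READ` with `MAP_SHARED` should succeed.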

As for PROT_WRITE alone, this happens to work on x86, because x86 doesn't support write-only mappings (a writable page is always readable too), but it may cause segfaults on other platforms. Request PROT_READ|PROT_WRITE or, if you only need to read, PROT_READ.

On my system (a VPS with 676 MB RAM and 256 MB swap), I reproduced your problem; changing to MAP_SHARED results in a permission error (EACCES, since I'm not allowed to write to the backing file opened with O_RDONLY). Changing to PROT_READ and MAP_SHARED allows the mapping to succeed.
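A read-only version of the mapping could look roughly like the sketch below. The function name `map_file_readonly` and its signature are assumptions made for illustration; the size check against SIZE_MAX is an extra precaution for 32-bit builds and is not mentioned in the answer.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: map a whole file with PROT_READ and MAP_SHARED.
 * On success returns the mapping and stores its length in *len_out.
 * On failure returns MAP_FAILED with errno set. */
const unsigned char *map_file_readonly(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return MAP_FAILED;

    struct stat sb;
    void *p = MAP_FAILED;
    if (fstat(fd, &sb) == 0) {
        if ((uintmax_t)sb.st_size > SIZE_MAX)
            errno = EOVERFLOW;  /* file does not fit in the address space */
        else
            p = mmap(NULL, (size_t)sb.st_size, PROT_READ, MAP_SHARED, fd, 0);
    }

    /* The mapping stays valid after close(); keep errno from mmap/fstat. */
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;

    if (p != MAP_FAILED)
        *len_out = (size_t)sb.st_size;
    return p;
}
```

The caller would compare the result against MAP_FAILED, read bytes through the returned pointer, and call munmap() with the stored length when done.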

If you need to modify bytes in the file, one option would be to make private just the ranges of the file you're going to write to. That is, munmap the areas you intend to write to and remap them with MAP_PRIVATE. Of course, if you intend to write to the entire file, then you need 8 GB of memory to do so.
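One way to sketch the "private range" idea is shown below. It is a variation on what the answer describes: instead of a separate munmap followed by mmap, it uses MAP_FIXED, which replaces the existing pages in that address range in a single call. The helper name `make_range_private` is hypothetical, and `base` is assumed to be the start of a mapping of the whole file from offset 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: turn [offset, offset + len) of an existing whole-file
 * mapping into a writable copy-on-write (MAP_PRIVATE) region.
 * `offset` must be page-aligned. Returns 0 on success, -1 with errno set. */
int make_range_private(unsigned char *base, int fd, off_t offset, size_t len)
{
    assert(base != NULL && len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || offset < 0 || offset % page != 0) {
        errno = EINVAL;
        return -1;
    }

    /* MAP_FIXED replaces whatever was mapped at base + offset. */
    void *p = mmap(base + offset, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_FIXED, fd, offset);
    return p == MAP_FAILED ? -1 : 0;
}
```

Writes to the private range stay in memory and are not written back to the file, so only the pages actually modified count against RAM + swap.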

Alternatively, you can write 1 to /proc/sys/vm/overcommit_memory. This will allow the mapping request to succeed; however, keep in mind that if you actually try to use the full 8 GB of COW memory, your program (or some other program!) will be killed by the OOM killer.
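Before changing the setting, it can be useful to check the current policy from a program. The sketch below assumes a small hypothetical helper, `read_int_setting`, that reads a single integer from a file such as /proc/sys/vm/overcommit_memory (on Linux, 0 is the heuristic default, 1 always overcommits, 2 is strict accounting).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read one integer from a procfs-style file.
 * Returns the value, or -1 if the file cannot be opened or parsed. */
int read_int_setting(const char *path)
{
    assert(path != NULL);

    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int value = -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;

    fclose(f);
    return value;
}
```

Calling `read_int_setting("/proc/sys/vm/overcommit_memory")` returns the current mode on a Linux system; writing a new value requires root privileges.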

