Problem description
I am trying to create a 1 MB (1048576-byte) file by writing in various chunk sizes and with a different number of threads. When int NUM_THREADS = 2 or int NUM_THREADS = 1, the created file has the expected size, i.e. 1 MB.
However, when I increase the thread count to 4, the created file is around 400 MB. Why this anomaly?
#include <pthread.h>
#include <algorithm>  // fill_n
#include <cstdio>     // FILE, fopen, fprintf, fflush, fclose
#include <cstdlib>    // exit
#include <string>
#include <iostream>

#define TenGBtoByte 1048576
#define fileToWrite "/tmp/schatterjee.txt"

using namespace std;

pthread_mutex_t mutexsum;

struct workDetails {
    int threadcount;
    int chunkSize;
    char *data;
};

void *SPWork(void *threadarg) {
    struct workDetails *thisWork;
    thisWork = (struct workDetails *) threadarg;
    int threadcount = thisWork->threadcount;
    int chunkSize = thisWork->chunkSize;
    char *data = thisWork->data;
    long noOfWrites = (TenGBtoByte / (threadcount * chunkSize));
    FILE *f = fopen(fileToWrite, "a+");
    for (long i = 0; i < noOfWrites; ++i) {
        pthread_mutex_lock(&mutexsum);
        fprintf(f, "%s", data);
        fflush (f);
        pthread_mutex_unlock(&mutexsum);
    }
    fclose(f);
    pthread_exit((void *) NULL);
}

int main(int argc, char *argv[]) {
    int blocksize[] = {1024};
    int NUM_THREADS = 2;
    for (int BLOCKSIZE: blocksize) {
        char *data = new char[BLOCKSIZE];
        fill_n(data, BLOCKSIZE, 'x');
        pthread_t thread[NUM_THREADS];
        workDetails detail[NUM_THREADS];
        pthread_attr_t attr;
        int rc;
        long threadNo;
        void *status;
        /* Initialize and set thread joinable attribute */
        pthread_mutex_init(&mutexsum, NULL);
        pthread_attr_init(&attr);
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            detail[threadNo].threadcount = NUM_THREADS;
            detail[threadNo].chunkSize = BLOCKSIZE;
            detail[threadNo].data = data;
            rc = pthread_create(&thread[threadNo], &attr, SPWork, (void *) &detail[threadNo]);
            if (rc) exit(-1);
        }
        pthread_attr_destroy(&attr);
        for (threadNo = 0; threadNo < NUM_THREADS; threadNo++) {
            rc = pthread_join(thread[threadNo], &status);
            if (rc) exit(-1);
        }
        pthread_mutex_destroy(&mutexsum);
        delete[] data;
    }
    pthread_exit(NULL);
}
N.B. 1) It's a benchmarking task, so I am doing what the requirements ask. 2) long noOfWrites = (TenGBtoByte / (threadcount * chunkSize)); basically computes how many times each thread should write so that the combined size is 1 MB. 4) I tried putting the mutex lock at various positions. All yield the same result.
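As a quick check of that arithmetic, here is a minimal sketch of the per-thread write count. The helper names writesPerThread and totalBytesWritten are hypothetical and only mirror the noOfWrites expression above; the sketch assumes every write emits exactly chunkSize bytes, which is precisely what the posted fprintf call does not guarantee.

```cpp
#include <cassert>

// Hypothetical helper mirroring the noOfWrites expression from the question.
long writesPerThread(long totalBytes, int threadCount, int chunkSize) {
    assert(threadCount > 0 && chunkSize > 0);
    return totalBytes / (static_cast<long>(threadCount) * chunkSize);
}

// Total bytes written by all threads, assuming each write emits exactly
// chunkSize bytes. Integer division may drop a remainder.
long totalBytesWritten(long totalBytes, int threadCount, int chunkSize) {
    return writesPerThread(totalBytes, threadCount, chunkSize)
           * threadCount * chunkSize;
}
```

With 1048576 bytes, 4 threads and 1024-byte chunks, each thread should perform 256 writes and the combined size should still be 1048576 bytes, so the thread count alone does not explain a larger file. With a thread count that does not divide evenly (for example 3), the total comes out slightly smaller, never larger.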
Suggestions for other changes to the program are also welcome.
Recommended answer
You are allocating and initializing your data array like this:
char *data = new char[BLOCKSIZE];
fill_n(data, BLOCKSIZE, 'x');
Then you write it to the file using fprintf:
fprintf(f, "%s", data);
With the %s format, fprintf expects data to be a null-terminated string, but the buffer holds BLOCKSIZE 'x' characters and no terminator, so fprintf reads past the end of the allocation. This is already undefined behavior. If it worked with a low number of threads, that is only because the memory right after the chunk happened to contain a zero byte.
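One way to avoid the problem is to pass the length explicitly instead of relying on a terminator. The following is a minimal sketch, not necessarily the fix the asker ended up using; writeChunk is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: write exactly `size` bytes from `data` to `f`.
// Unlike fprintf(f, "%s", data), fwrite never scans for a terminating '\0',
// so the buffer does not need to be null-terminated.
bool writeChunk(std::FILE *f, const char *data, std::size_t size) {
    assert(f != nullptr && data != nullptr);
    return std::fwrite(data, 1, size, f) == size;
}
```

In the question's loop this would replace the fprintf call with something like writeChunk(f, data, chunkSize). Alternatives are to allocate BLOCKSIZE + 1 bytes and set the last one to '\0', or to bound the read with a precision, as in fprintf(f, "%.*s", chunkSize, data).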
Other than that, the mutex in your program serves no purpose and can be removed. File locking is also redundant, so you can use fwrite_unlocked and fflush_unlocked to write your data, since every thread uses a separate FILE object. Essentially, all synchronization in your program is performed in the kernel, not in userspace.
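The point about separate FILE objects can be sketched roughly as follows. This is an illustration under stated assumptions, not the asker's final program: AppendJob, appendWorker, appendFromThreads and fileSize are hypothetical names, a POSIX system with pthreads is assumed, and the sketch uses the standard fwrite and fflush rather than the _unlocked variants, which are nonstandard extensions (available in glibc).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <pthread.h>
#include <vector>

// Hypothetical per-thread job description (names are illustrative).
struct AppendJob {
    const char *path;   // file every thread appends to
    const char *data;   // shared, read-only chunk (not null-terminated)
    std::size_t size;   // bytes per write
    long writes;        // number of writes this thread performs
    bool ok;            // set by the worker on success
};

void *appendWorker(void *arg) {
    AppendJob *job = static_cast<AppendJob *>(arg);
    job->ok = false;
    // Each thread opens its own FILE in append mode; no user-space mutex.
    std::FILE *f = std::fopen(job->path, "a");
    if (f == nullptr) return nullptr;
    bool ok = true;
    for (long i = 0; i < job->writes && ok; ++i) {
        ok = std::fwrite(job->data, 1, job->size, f) == job->size
             && std::fflush(f) == 0;
    }
    ok = (std::fclose(f) == 0) && ok;
    job->ok = ok;
    return nullptr;
}

// Runs threadCount workers; returns true if all of them succeeded.
bool appendFromThreads(const char *path, int threadCount,
                       std::size_t chunkSize, long totalBytes) {
    assert(threadCount > 0 && chunkSize > 0);
    std::vector<char> data(chunkSize, 'x');
    long writes = totalBytes
        / (static_cast<long>(threadCount) * static_cast<long>(chunkSize));
    std::vector<AppendJob> jobs(
        threadCount, AppendJob{path, data.data(), chunkSize, writes, false});
    std::vector<pthread_t> threads(threadCount);
    int started = 0;
    for (; started < threadCount; ++started) {
        if (pthread_create(&threads[started], nullptr, appendWorker,
                           &jobs[started]) != 0) break;
    }
    bool ok = (started == threadCount);
    for (int i = 0; i < started; ++i) {
        pthread_join(threads[i], nullptr);
        ok = ok && jobs[i].ok;
    }
    return ok;
}

// Size of the file in bytes, or -1 on error.
long fileSize(const char *path) {
    std::FILE *f = std::fopen(path, "rb");
    if (f == nullptr) return -1;
    long size = (std::fseek(f, 0, SEEK_END) == 0) ? std::ftell(f) : -1;
    std::fclose(f);
    return size;
}
```

Calling appendFromThreads on a fresh file with 1024-byte chunks and a 1048576-byte target, then checking fileSize, is one way to confirm that the result does not depend on the thread count once the write length is explicit. Because the file is opened in append mode, start from an empty or nonexistent file when measuring.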
Even after removing the mutex and using the _unlocked functions, your program reliably creates 1 MB files regardless of the number of threads. So the invalid file write seems to be the only issue you have.