Question
I have a very large data file, and each record in this data file has 4 lines. I have written a very simple C program to analyze files of this type and print out some useful information. The basic idea of the program is this:
#define BUFFER_SIZE 1024

int main()
{
    char buffer[BUFFER_SIZE];
    while (fgets(buffer, BUFFER_SIZE, stdin))   /* line 1 of the record */
    {
        fgets(buffer, BUFFER_SIZE, stdin);      /* line 2 */
        do_some_simple_processing_on_the_second_line_of_the_record(buffer);
        fgets(buffer, BUFFER_SIZE, stdin);      /* line 3 */
        fgets(buffer, BUFFER_SIZE, stdin);      /* line 4 */
    }
    print_out_result();
    return 0;
}
This of course leaves out some details (sanity/error checking, etc.), but they are not relevant to the question.
The program works fine, but the data files I'm working with are huge. I figured I would try to speed up the program by parallelizing the loop with OpenMP. After a bit of searching, though, it appears that OpenMP can only handle for loops where the number of iterations is known beforehand. Since I don't know the size of the files beforehand, and even simple commands like wc -l take a long time to run, how can I parallelize this program?
Answer
Have you checked that your process is actually CPU-bound and not I/O-bound? Your code looks very much like I/O-bound code, which would gain nothing from parallelization.