Problem description
I wrote four different programs to count the total number of words in two files. The four versions look mostly the same. The first three use two threads to count and differ only in the order of three statements. The last version uses a single thread. I will first list the part that differs in each version and the common part, then the output of each version and my questions.
Different part:
// version 1
count_words(&file1);
pthread_create(&new_thread, NULL, count_words, &file2);
pthread_join(new_thread, NULL);
// version 2
pthread_create(&new_thread, NULL, count_words, &file2);
count_words(&file1);
pthread_join(new_thread, NULL);
// version 3
pthread_create(&new_thread, NULL, count_words, &file2);
pthread_join(new_thread, NULL);
count_words(&file1);
// version 4
count_words(&file1);
count_words(&file2);
Common part: (insert the different part into this common part to make a complete version)
#include <stdio.h>
#include <pthread.h>
#include <ctype.h>
#include <stdlib.h>
#include <time.h>

#define N 2000

typedef struct file_t {
    char *name;
    int words;
} file_t;

double time_diff(struct timespec *, struct timespec *);
void *count_words(void *);

// Usage: progname file1 file2
int main(int argc, char *argv[]) {
    pthread_t new_thread;
    file_t file1, file2;

    if (argc != 3) {
        fprintf(stderr, "Usage: %s file1 file2\n", argv[0]);
        return 1;
    }
    file1.name = argv[1];
    file1.words = 0;
    file2.name = argv[2];
    file2.words = 0;

    // Insert different part here

    printf("Total words: %d\n", file1.words + file2.words);
    return 0;
}

// Counts the words in one file N times and prints the CPU time used.
void *count_words(void *arg) {
    FILE *fp;
    file_t *file = (file_t *)arg;
    int i, c, prevc = '\0';
    struct timespec process_beg, process_end;
    struct timespec thread_beg, thread_end;
    double process_diff, thread_diff;

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &process_beg);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &thread_beg);

    fp = fopen(file->name, "r");
    if (fp == NULL) {
        perror(file->name);
        return NULL;
    }
    for (i = 0; i < N; i++) {
        // A word ends where an alphanumeric character is followed by a
        // non-alphanumeric one.
        while ((c = getc(fp)) != EOF) {
            if (!isalnum(c) && isalnum(prevc))
                file->words++;
            prevc = c;
        }
        fseek(fp, 0, SEEK_SET);
    }
    fclose(fp);

    clock_gettime(CLOCK_PROCESS_CPUTIME_ID, &process_end);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &thread_end);
    process_diff = time_diff(&process_beg, &process_end);
    thread_diff = time_diff(&thread_beg, &thread_end);
    printf("count_words() in %s takes %.3fs process time and "
           "%.3fs thread time\n", file->name, process_diff, thread_diff);
    return NULL;
}

// Returns end - beg in seconds.
double time_diff(struct timespec *beg, struct timespec *end) {
    return ((double)end->tv_sec + (double)end->tv_nsec * 1.0e-9)
         - ((double)beg->tv_sec + (double)beg->tv_nsec * 1.0e-9);
}
Notes
- file1 is a file containing the word "word" 10000 times. file2 is a copy of file1, created with the cp command.
- To make the execution time long enough, the program counts the words repeatedly. N is the number of passes, so the result is not the actual number of words but that number multiplied by N.
- Please don't put too much emphasis on the counting algorithm. I am only concerned with the execution time in this example.
- Important: the machine has an Intel® Celeron(R) CPU 420 @ 1.60GHz, with one core. The OS is Linux 3.2.0. As others have said, the single core may be the cause of this strange phenomenon, but I still want to figure it out.
The program counts words and uses clock_gettime() to measure the process CPU time and the thread CPU time of the routine count_words(), then prints the times and the word count. Below are the outputs and my comments with questions. I would really appreciate it if someone could explain what the extra time is spent on.
// version 1
count_words() in file1 takes 2.563s process time and 2.563s thread time
count_words() in file2 takes 8.374s process time and 8.374s thread time
Total words: 40000000
Comment: The original thread finishes count_words() and waits for the new thread to terminate. While count_words() runs in the new thread, no context switch happens (because process time == thread time). Why does it take so much time? What happens in count_words() in the new thread?
// version 2
count_words() in file1 takes 16.755s process time and 8.377s thread time
count_words() in file2 takes 16.753s process time and 8.380s thread time
Total words: 40000000
Comment: The two threads run concurrently here. Context switches happen, so the process time > thread time.
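One possible reading of the version 2 numbers: CLOCK_PROCESS_CPUTIME_ID accumulates the CPU time of every thread in the process, while CLOCK_THREAD_CPUTIME_ID covers only the calling thread. Two threads that each burn about 8.4s would then show roughly 16.8s of process time, however often they are switched. Below is a minimal sketch to check this on your own machine. The names clock_seconds, spin, cpu_times and measure_two_spinners are hypothetical and not part of the original program, and the feature-test macro is an assumption for strict compiler modes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper: current reading of the given clock, in seconds. */
static double clock_seconds(clockid_t id) {
    struct timespec ts;
    clock_gettime(id, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1.0e-9;
}

/* Burns CPU for *arg iterations; volatile keeps the loop from being removed. */
static void *spin(void *arg) {
    volatile unsigned long sink = 0;
    unsigned long i, n = *(unsigned long *)arg;
    for (i = 0; i < n; i++)
        sink += i;
    return NULL;
}

typedef struct {
    double process; /* CPU seconds used by all threads of the process */
    double thread;  /* CPU seconds used by the calling thread only */
} cpu_times;

/* Runs spin() in the calling thread and in one extra thread at the same
 * time, and reports the CPU time seen on both clocks. */
static cpu_times measure_two_spinners(unsigned long iterations) {
    pthread_t t;
    cpu_times r;
    double p0, t0;
    int started;

    assert(iterations > 0);
    p0 = clock_seconds(CLOCK_PROCESS_CPUTIME_ID);
    t0 = clock_seconds(CLOCK_THREAD_CPUTIME_ID);

    started = (pthread_create(&t, NULL, spin, &iterations) == 0);
    spin(&iterations);
    r.thread = clock_seconds(CLOCK_THREAD_CPUTIME_ID) - t0;

    if (started)
        pthread_join(t, NULL);
    r.process = clock_seconds(CLOCK_PROCESS_CPUTIME_ID) - p0;
    return r;
}
```

Calling measure_two_spinners(200000000UL) and printing both fields should show the process figure well above the thread figure, since the process clock also counts the second thread's work.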
// version 3
count_words() in file2 takes 8.374s process time and 8.374s thread time
count_words() in file1 takes 8.365s process time and 8.365s thread time
Total words: 40000000
Comment: The new thread counts first while the original thread waits for it. After the new thread is joined, the original thread begins to count. Neither of them is context-switched, so why does it take so much time, especially the count after the new thread has been joined?
// version 4
count_words() in file1 takes 2.555s process time and 2.555s thread time
count_words() in file2 takes 2.556s process time and 2.556s thread time
Total words: 40000000
Comment: Fastest version. No new thread is created. Both count_words() calls run in a single thread.
Recommended answer
It's probably because creating any thread forces libc to use synchronization in getc. This makes the function significantly slower. The following example is as slow as version 3 for me:
void *skip(void *p) { return NULL; }
pthread_create(&new_thread, NULL, skip, NULL);
count_words(&file1);
count_words(&file2);
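The claim can be checked in isolation with a sketch like the one below, which times the same getc loop before and after a do-nothing thread has been created and joined. The helper names count_with_getc, make_input and compare_getc_timings are hypothetical, the input is a temporary file filled with copies of "word", and the feature-test macro is an assumption for strict compiler modes. Whether a slowdown shows up, and how large it is, depends on the libc version and the hardware.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <pthread.h>
#include <stdio.h>
#include <time.h>

/* Thread body that does nothing, as in the answer. */
static void *skip(void *p) { (void)p; return NULL; }

/* Counts words with getc() and stores the thread CPU seconds used. */
static long count_with_getc(FILE *fp, double *cpu_seconds) {
    struct timespec beg, end;
    long words = 0;
    int c, prevc = '\0';

    assert(fp != NULL);
    rewind(fp);
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &beg);
    while ((c = getc(fp)) != EOF) {
        if (!isalnum(c) && isalnum(prevc))
            words++;
        prevc = c;
    }
    clock_gettime(CLOCK_THREAD_CPUTIME_ID, &end);
    *cpu_seconds = (double)(end.tv_sec - beg.tv_sec)
                 + (double)(end.tv_nsec - beg.tv_nsec) * 1.0e-9;
    return words;
}

/* Hypothetical test input: a temporary file holding n lines of "word". */
static FILE *make_input(long n) {
    FILE *fp = tmpfile();
    long i;
    if (fp == NULL)
        return NULL;
    for (i = 0; i < n; i++)
        fputs("word\n", fp);
    return fp;
}

/* Times one pass before and one pass after a thread has existed.
 * Returns 0 on success, -1 if the file or the thread could not be created. */
static int compare_getc_timings(long n, long *words_before, long *words_after,
                                double *before, double *after) {
    pthread_t t;
    FILE *fp = make_input(n);
    if (fp == NULL)
        return -1;
    *words_before = count_with_getc(fp, before);
    if (pthread_create(&t, NULL, skip, NULL) != 0) {
        fclose(fp);
        return -1;
    }
    pthread_join(t, NULL);
    *words_after = count_with_getc(fp, after);
    fclose(fp);
    return 0;
}
```

Printing *before and *after for a large n (several million lines) gives the two timings to compare. The word counts should be identical in both passes.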
To fix this problem you can use a buffer:
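for (i = 0; i < N; i++) {
    char buffer[BUFSIZ];
    size_t nread, j;
    do {
        /* One fread() call fetches up to BUFSIZ bytes at a time. */
        nread = fread(buffer, 1, BUFSIZ, fp);
        for (j = 0; j < nread; j++) {
            /* isalnum() needs a value in the unsigned char range. */
            unsigned char ch = (unsigned char)buffer[j];
            if (!isalnum(ch) && isalnum(prevc))
                file->words++;
            prevc = ch;
        }
    } while (nread == BUFSIZ);
    fseek(fp, 0, SEEK_SET);
}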
In this solution, the I/O functions are called rarely enough to make the synchronization overhead insignificant. This not only solves the problem of the weird timings, but also makes the program several times faster. For me the time drops from 0.54s (without threads) or 0.85s (with threads) to 0.15s (in both cases).
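A different way to cut the per-character synchronization, not part of the answer above, is to take the stream lock once with POSIX flockfile() and read with getc_unlocked() inside it. The sketch below assumes a POSIX system. count_words_unlocked and count_words_in_text are hypothetical names, and no timing is claimed for this variant.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Counts words on fp, locking the stream once for the whole pass
 * instead of once per character. */
static long count_words_unlocked(FILE *fp) {
    long words = 0;
    int c, prevc = '\0';

    assert(fp != NULL);
    flockfile(fp);
    while ((c = getc_unlocked(fp)) != EOF) {
        if (!isalnum(c) && isalnum(prevc))
            words++;
        prevc = c;
    }
    funlockfile(fp);
    return words;
}

/* Hypothetical helper for trying it out: writes text to a temporary
 * file and counts its words. Returns -1 if the file cannot be created. */
static long count_words_in_text(const char *text) {
    FILE *fp = tmpfile();
    long words;
    if (fp == NULL)
        return -1;
    fputs(text, fp);
    rewind(fp);
    words = count_words_unlocked(fp);
    fclose(fp);
    return words;
}
```

As in the original algorithm, a word is counted only when a non-alphanumeric character follows it, so a final word with no trailing delimiter is not counted.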