Pthreads matrix multiplication error

Problem description

I want to use pthreads on my existing serial matrix multiplication code. My goal is simply to get a speed-up, that is, a shorter execution time than the serial version. But I'm stuck. My original serial code works just fine and finishes a 1000x1000 square matrix multiplication in about 15 seconds. When I run my current pthreads program, however, I get a segmentation fault. Here is my code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <assert.h>
#include <pthread.h>

int SIZE, NTHREADS;
int **A, **B, **C;

void init()
{
    int i, j;

    A = (int**)malloc(SIZE * sizeof(int *));
    for(i = 0; i < SIZE; i++)
        A[i] = malloc(SIZE * sizeof(int));

    B = (int**)malloc(SIZE * sizeof(int *));
    for(i = 0; i < SIZE; i++)
        B[i] = malloc(SIZE * sizeof(int));

    C = (int**)malloc(SIZE * sizeof(int *));
    for(i = 0; i < SIZE; i++)
        C[i] = malloc(SIZE * sizeof(int));

    srand(time(NULL));

    for(i = 0; i < SIZE; i++) {
        for(j = 0; j < SIZE; j++) {
            A[i][j] = rand()%100;
            B[i][j] = rand()%100;
        }
    }
}

void mm(int tid)
{
    int i, j, k;
    int start = tid * SIZE/NTHREADS;
    int end = (tid+1) * (SIZE/NTHREADS) - 1;

    for(i = start; i <= end; i++) {
        for(j = 0; j < SIZE; j++) {
            C[i][j] = 0;
            for(k = 0; k < SIZE; k++) {
                C[i][j] += A[i][k] * B[k][j];
            }
        }
    }
}

void *worker(void *arg)
{
    int tid = *((int *) arg);
    mm(tid);
    return NULL;
}

int main(int argc, char* argv[])
{
    pthread_t* threads;
    int rc, i;

    if(argc != 3)
    {
        printf("Usage: %s <size_of_square_matrix> <number_of_threads>\n", argv[0]);
        exit(1);
    }

    SIZE = atoi(argv[1]);
    NTHREADS = atoi(argv[2]);
    init();
    threads = (pthread_t*)malloc(NTHREADS * sizeof(pthread_t));

    clock_t begin, end;
    double time_spent;


    begin = clock();

    for(i = 0; i < NTHREADS; i++) {
        rc = pthread_create(&threads[i], NULL, worker, (void *)i);
        assert(rc == 0);
    }

    for(i = 0; i < NTHREADS; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }

    end = clock();

    time_spent = (double)(end - begin) / CLOCKS_PER_SEC;
    printf("Elapsed time: %.2lf seconds.\n", time_spent);

    for(i = 0; i < SIZE; i++)
        free((void *)A[i]);
    free((void *)A);

    for(i = 0; i < SIZE; i++)
        free((void *)B[i]);
    free((void *)B);

    for(i = 0; i < SIZE; i++)
        free((void *)C[i]);
    free((void *)C);

    free(threads);

    return 0;
}

If someone could help me get my pthreads program running and achieve some speed-up, I would be glad.

Recommended answer

With your current code, you should retrieve the index in worker using:

int tid = (int)arg;

(Your code passes the loop counter itself as the pointer value, so worker ends up treating the counter as an address and dereferencing addresses at or around 0. These addresses may not be readable by your process and/or won't be suitably aligned, hence the seg fault.)
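The cast-based approach can be sketched as below. This is only an illustration, not the asker's program: it assumes `intptr_t` from `<stdint.h>` is available (it is on POSIX systems), and `MAX_THREADS`, `seen`, and `run_workers` are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define MAX_THREADS 64          /* hypothetical upper bound for this sketch */

int seen[MAX_THREADS];          /* thread t stores t + 1 in seen[t] */

/* The index travels inside the pointer value itself, so it is recovered
   with a cast, never with a dereference. */
void *worker(void *arg)
{
    int tid = (int)(intptr_t)arg;
    seen[tid] = tid + 1;        /* stand-in for real work such as mm(tid) */
    return NULL;
}

/* Starts nthreads workers, joins them all, and returns nthreads. */
int run_workers(int nthreads)
{
    pthread_t threads[MAX_THREADS];
    int i, rc;

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    for (i = 0; i < nthreads; i++) {
        /* Going through intptr_t keeps the int-to-pointer conversion explicit. */
        rc = pthread_create(&threads[i], NULL, worker, (void *)(intptr_t)i);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
    }
    return nthreads;
}
```

Calling `run_workers(4)` from `main` and building with a flag such as `-pthread` (gcc or clang) should leave `seen[0]` through `seen[3]` set to 1 through 4.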

The above change might get things working for you, but note that passing an int as a void* isn't completely correct. It relies on sizeof(int) <= sizeof(void*), which is likely but not guaranteed to be true. If you care about this, you could either allocate memory for the data you pass to each thread, or pass the address of i and add synchronisation so that, after each pthread_create call, you wait until the new thread has been scheduled and has read its arg.
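The first alternative, giving each thread its own argument storage, might look roughly like the sketch below. The `struct task` layout and the names `task_worker` and `run_tasks` are hypothetical, and the multiplication by 10 is only a stand-in for real work such as `mm(tid)`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread argument record; the field names are illustrative. */
struct task {
    int tid;      /* thread index */
    int result;   /* filled in by the worker */
};

void *task_worker(void *arg)
{
    struct task *t = arg;       /* points at this thread's own slot */
    t->result = t->tid * 10;    /* stand-in for real work such as mm(t->tid) */
    return NULL;
}

/* Runs nthreads workers (nthreads > 0), each with its own heap-allocated
   slot, and returns the sum of their results, or -1 if allocation fails. */
int run_tasks(int nthreads)
{
    pthread_t *threads = malloc(nthreads * sizeof *threads);
    struct task *tasks = malloc(nthreads * sizeof *tasks);
    int i, rc, sum = 0;

    if (threads == NULL || tasks == NULL) {
        free(threads);
        free(tasks);
        return -1;
    }
    for (i = 0; i < nthreads; i++) {
        tasks[i].tid = i;
        tasks[i].result = 0;
        /* Each thread receives the address of a distinct element, so no
           thread reads a value that the creating loop may still change. */
        rc = pthread_create(&threads[i], NULL, task_worker, &tasks[i]);
        assert(rc == 0);
    }
    for (i = 0; i < nthreads; i++) {
        rc = pthread_join(threads[i], NULL);
        assert(rc == 0);
        sum += tasks[i].result;   /* safe to read after the join */
    }
    free(tasks);
    free(threads);
    return sum;
}
```

Because every thread gets a pointer to its own slot, no extra synchronisation is needed around `pthread_create`, and the slots can only be freed after all threads have been joined.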

08-19 23:41