c - How to test the WKdm algorithm against buffers of various sizes in C

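I am trying to test the WKdm algorithm to see how it performs on 100 KB, 1 MB, and 10 MB buffers. However, in the test program below, any buffer larger than 1 KB throws EXC_BAD_ACCESS: could not access memory.
I started from the main() in WKdm.c, which is a simple test, and tried to adapt it so that the size of the input buffer being compressed can be changed.
I am using the standard Scott Kaplan implementation of the WKdm algorithm, which consists of one source file and one header file, found here. I have tried this on both Linux and 32-bit OS X.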

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <math.h>
#include <strings.h>
#include <sys/time.h>
#include "WKdm.h"

#define PAGE_SIZE_IN_WORDS 1024
#define PAGE_SIZE_IN_BYTES 4096

int main() {

    WK_word i;
    //int testSize = 1024; //1KB Works
    int testSize = 102400; //100KB causes EXC_BAD_ACCESS

    int testNumWords = testSize / sizeof(WK_word);
    printf("testSize = %d bytes or %d words\n", testSize, testNumWords);

    WK_word* source_buf = (WK_word*) malloc(testSize * 2);
    WK_word* dest_buf = (WK_word*) malloc(testSize * 2);
    WK_word* udest_buf = (WK_word*) malloc(testSize * 2);

    for (i = 0; i < testNumWords; i++) {
        source_buf[i] = rand() % 1000; //Semi-random: 0-999 stored in each 4-byte word
    }

    source_buf[testNumWords + 1] = 99999;
    udest_buf[testNumWords + 1] = 55555;

    printf("first 50 words of source_buf are:\n");
    for (i = 0; i < 50; i++)
        printf(" %d", source_buf[i]);
    fflush(stdout);

    struct timeval t0;  struct timeval t1;
    gettimeofday(&t0, 0);

    // Compress the source_buf into the dest_buf
    i = WKdm_compress(source_buf, dest_buf, testNumWords);

    gettimeofday(&t1, 0);
    long elapsed = (t1.tv_sec - t0.tv_sec) * 1000000 + t1.tv_usec - t0.tv_usec;

    printf("\nWKdm_compress size in bytes: %u\n", i);
    printf("Time to compress: %lu microseconds\n\n", elapsed);

    printf("redzone value at end of source buf (should be 99999) is %u\n",
            source_buf[testNumWords + 1]); fflush(stdout);

    gettimeofday(&t0, 0);

    WKdm_decompress(dest_buf, udest_buf, testNumWords);

    gettimeofday(&t1, 0);
    elapsed = (t1.tv_sec - t0.tv_sec) * 1000000 + t1.tv_usec - t0.tv_usec;
    printf("Time to decompress: %lu microseconds\n\n", elapsed);

    printf("redzone value at end of udest buf (should be 55555) is %u\n", udest_buf[testNumWords + 1]);

    printf("first 50 words of udest_buf are:\n");
    for (i = 0; i < 50; i++)
        printf(" %d", udest_buf[i]);

    i = bcmp(source_buf, udest_buf, 100);

    printf("\nbcmp of orig. and compr'd/decompr'd copy (should be 0) is %u\n", i);
    fflush(stdout);
    return 0;
}

Best answer

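Scott Kaplan's implementation of the WKdm algorithm is designed for a 4 KB page size. To compress anything larger than 4 KB, you need to enlarge the arrays that hold the output data in intermediate form during modeling. These three arrays are declared at the top of the WKdm_compress and WKdm_decompress functions. You can increase their sizes to hold more intermediate data, but compression and decompression times appear to increase significantly.
In addition, compressing buffers larger than 1 MB causes further out-of-bounds errors. So unless you want to do a substantial rewrite, you probably want to use WKdm only for buffers no larger than 4 KB.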
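Incidentally, Kaplan's WKdm implementation is optimized for compressing 4 KB at a time, which is probably a major reason Apple uses it for memory compression in OS X 10.9 Mavericks (where the page size is 4 KB).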
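
If the goal is still to benchmark 100 KB, 1 MB, and 10 MB inputs, one option that avoids touching the internal arrays is to feed the large buffer to the compressor one 4 KB page at a time and add up the per-page results. The following is only a rough sketch of that idea, not part of Kaplan's code. The names compress_in_pages, page_compress_fn, copy_page, and word_t are hypothetical, the callback signature is assumed from the way the question calls WKdm_compress (source, destination, word count, returning a size in bytes), and copy_page is a placeholder that merely copies so the sketch builds without WKdm.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_WORDS 1024u   /* one 4 KB page of 32-bit words */

typedef uint32_t word_t;   /* hypothetical stand-in for WK_word */

/* Assumed shape of a per-page compressor, modeled on the call in the
 * question: (source, destination, word count) -> compressed size in bytes. */
typedef unsigned int (*page_compress_fn)(word_t *src, word_t *dst,
                                         unsigned int num_words);

/* Placeholder compressor: copies the page unchanged. Swap in the real
 * WKdm_compress (if its signature matches) to measure actual compression. */
static unsigned int copy_page(word_t *src, word_t *dst, unsigned int num_words)
{
    memcpy(dst, src, num_words * sizeof(word_t));
    return num_words * (unsigned int)sizeof(word_t);
}

/* Compress total_words words, one page per call, reusing a single scratch
 * buffer for the output. scratch is assumed to hold 2 * PAGE_WORDS words,
 * matching the 2x allocation used in the question.
 * Returns the summed per-page sizes in bytes, or 0 when total_words is not
 * a whole number of pages (an empty input also yields 0). */
static size_t compress_in_pages(word_t *src, size_t total_words,
                                word_t *scratch, page_compress_fn compress)
{
    assert(compress != NULL);
    if (total_words % PAGE_WORDS != 0)
        return 0;

    size_t total_bytes = 0;
    for (size_t off = 0; off < total_words; off += PAGE_WORDS)
        total_bytes += compress(src + off, scratch, PAGE_WORDS);
    return total_bytes;
}
```

With this layout a 100 KB buffer becomes 25 calls of 1024 words each, so no single call exceeds the page size the implementation was written for. Timing the whole loop with gettimeofday, as the question already does, would then give a per-buffer figure. Decompression would need the same page-by-page treatment, with each compressed page kept separately rather than overwritten in the scratch buffer.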