Problem description

I need to copy the contents of a byte array representing an image in RGB byte order into another RGBA(4 bytes per pixel) buffer. The alpha channel will get filled later. What would be the fastest way of achieving this?

Recommended answer

How tricky do you want it? You could set it up to copy a 4-byte word at a time, which might be a bit faster on some 32-bit systems:

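#include <string.h>

/* Copies count RGB pixels into an RGBA buffer, 4 bytes per pixel at a time.
   The 4th byte of each copy lands in the alpha slot (it is the next pixel's
   red value), which is harmless because alpha gets filled in later. */
void fast_unpack(char* rgba, const char* rgb, const int count) {
    if(count<=0)
        return;
    /* All pixels but the last. memcpy avoids the unaligned, aliasing-unsafe
       uint32_t cast; compilers typically reduce it to one load and one store. */
    for(int i=count; --i; rgba+=4, rgb+=3) {
        memcpy(rgba, rgb, 4);
    }
    /* Last pixel: copy only 3 bytes so we never read past the end of rgb. */
    for(int j=0; j<3; ++j) {
        rgba[j] = rgb[j];
    }
}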

The extra case at the end deals with the fact that the rgb array is one byte short: a 4-byte read of the last pixel would run past the end of the buffer. You could also make it a bit faster using aligned moves and SSE instructions, working on multiples of 4 pixels at a time. If you're feeling really ambitious, you can try even more horribly obfuscated things, for example prefetching a cache line into the FP registers and then blitting it across to the other image all at once. Of course, the mileage you get out of these optimizations is going to be highly dependent on the specific system configuration you are targeting, and I would be really skeptical that there is much benefit at all to doing any of this instead of the simple thing.
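To make the "multiples of 4 pixels" idea concrete without committing to SSE, here is a rough portable sketch using plain 32-bit words: 12 input bytes become 16 output bytes per iteration. The function name `unpack4` is made up for illustration, and the shifts assume a little-endian host and non-overlapping buffers. Unlike fast_unpack, it writes 0 into each alpha byte. Whether it beats the simpler versions on any given machine is something you would have to measure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not from the original answer): unpack `count` RGB
 * pixels into RGBA, four pixels per iteration.
 * Assumptions: little-endian host, non-overlapping buffers.
 * Every alpha byte is written as 0. */
void unpack4(unsigned char *rgba, const unsigned char *rgb, size_t count) {
    assert(count == 0 || (rgba != NULL && rgb != NULL));
    size_t i = 0;
    for (; i + 4 <= count; i += 4, rgb += 12, rgba += 16) {
        uint32_t w[3], out[4];
        memcpy(w, rgb, 12);  /* w[0]=R0 G0 B0 R1, w[1]=G1 B1 R2 G2, w[2]=B2 R3 G3 B3 */
        out[0] =   w[0]                        & 0x00FFFFFFu;  /* R0 G0 B0 0 */
        out[1] = ((w[0] >> 24) | (w[1] <<  8)) & 0x00FFFFFFu;  /* R1 G1 B1 0 */
        out[2] = ((w[1] >> 16) | (w[2] << 16)) & 0x00FFFFFFu;  /* R2 G2 B2 0 */
        out[3] =   w[2] >>  8;                                 /* R3 G3 B3 0 */
        memcpy(rgba, out, 16);
    }
    /* Leftover 0..3 pixels: plain per-byte copy, so nothing is read past
     * the end of the rgb buffer. */
    for (; i < count; ++i, rgb += 3, rgba += 4) {
        rgba[0] = rgb[0];
        rgba[1] = rgb[1];
        rgba[2] = rgb[2];
        rgba[3] = 0;
    }
}
```

Because the main loop only ever reads whole 12-byte groups, the tail loop plays the same role as the extra case in fast_unpack.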

And my simple experiments confirm that fast_unpack is indeed a little bit faster than the plain byte-by-byte copy, at least on my x86 machine. Here is a benchmark:

#include <stdlib.h>
#include <stdio.h>
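#include <string.h>
#include <time.h>

/* Word-at-a-time copy: 4 bytes per pixel, except for the last pixel. */
void fast_unpack(char* rgba, const char* rgb, const int count) {
    if(count<=0)
        return;
    /* memcpy avoids the unaligned, aliasing-unsafe uint32_t cast. */
    for(int i=count; --i; rgba+=4, rgb+=3) {
        memcpy(rgba, rgb, 4);
    }
    /* Last pixel: 3 bytes only, so we never read past the end of rgb. */
    for(int j=0; j<3; ++j) {
        rgba[j] = rgb[j];
    }
}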

void simple_unpack(char* rgba, const char* rgb, const int count) {
    for(int i=0; i<count; ++i) {
        for(int j=0; j<3; ++j) {
            rgba[j] = rgb[j];
        }
        rgba += 4;
        rgb  += 3;
    }
}

int main() {
    const int count = 512*512;
    const int N = 10000;

    char* src = (char*)malloc(count * 3);
    char* dst = (char*)malloc(count * 4);

    clock_t c0, c1;
    double t;
    printf("Image size = %d bytes\n", count);
    printf("Number of iterations = %d\n", N);

    printf("Testing simple unpack....");
    c0 = clock();
    for(int i=0; i<N; ++i) {
        simple_unpack(dst, src, count);
    }
    c1 = clock();
    printf("Done\n");
    t = (double)(c1 - c0) / (double)CLOCKS_PER_SEC;
    printf("Elapsed time: %lf\nAverage time: %lf\n", t, t/N);


    printf("Testing tricky unpack....");
    c0 = clock();
    for(int i=0; i<N; ++i) {
        fast_unpack(dst, src, count);
    }
    c1 = clock();
    printf("Done\n");
    t = (double)(c1 - c0) / (double)CLOCKS_PER_SEC;
    printf("Elapsed time: %lf\nAverage time: %lf\n", t, t/N);

    return 0;
}

And here are the results (compiled with g++ -O3):

Image size = 262144 bytes
Number of iterations = 10000
Testing simple unpack....Done
Elapsed time: 3.830000
Average time: 0.000383
Testing tricky unpack....Done
Elapsed time: 2.390000
Average time: 0.000239

So the tricky version takes roughly 40% less time on a good day (2.39 s versus 3.83 s for 10,000 iterations).
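The benchmark above only measures time, so it may be worth checking that a faster routine still produces the same pixels. Here is a minimal sketch of such a check. The names `unpack_fn`, `reference_unpack` and `unpack_agrees` are invented for illustration. Alpha bytes are deliberately ignored, since fast_unpack leaves whatever the 4-byte copy put there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same signature as simple_unpack and fast_unpack above. */
typedef void (*unpack_fn)(char *rgba, const char *rgb, const int count);

/* Hypothetical reference: the plain per-byte copy, alpha left untouched. */
static void reference_unpack(char *rgba, const char *rgb, const int count) {
    for (int i = 0; i < count; ++i, rgba += 4, rgb += 3) {
        rgba[0] = rgb[0];
        rgba[1] = rgb[1];
        rgba[2] = rgb[2];
    }
}

/* Returns 1 if `candidate` writes the same R, G, B bytes as the reference
 * for `count` pixels, 0 on a mismatch, -1 if an allocation fails.
 * Alpha bytes are not compared. */
int unpack_agrees(unpack_fn candidate, int count) {
    assert(count > 0);
    char *src  = (char *)malloc((size_t)count * 3);
    char *want = (char *)calloc((size_t)count, 4);
    char *got  = (char *)calloc((size_t)count, 4);
    int ok = (src && want && got) ? 1 : -1;
    if (ok == 1) {
        /* Fill the source with a varying, deterministic pattern. */
        for (int i = 0; i < count * 3; ++i)
            src[i] = (char)(i * 31 + 7);
        reference_unpack(want, src, count);
        candidate(got, src, count);
        /* Compare only the first three bytes of each output pixel. */
        for (int p = 0; p < count && ok; ++p)
            if (memcmp(want + 4 * p, got + 4 * p, 3) != 0)
                ok = 0;
    }
    free(src);
    free(want);
    free(got);
    return ok;
}
```

Calling `unpack_agrees(fast_unpack, 512*512)` before timing anything would be one way to use it. It does not detect reads past the end of the source buffer; a tool such as a memory sanitizer is better suited to that.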
