Problem description
My goal is to read the contents of the default OpenGL framebuffer and store the pixel data in a cv::Mat. Apparently there are two different ways of achieving this:
1) Synchronous: use FBO and glReadPixels
cv::Mat a = cv::Mat::zeros(cv::Size(1920, 1080), CV_8UC3);
glReadPixels(0, 0, 1920, 1080, GL_BGR, GL_UNSIGNED_BYTE, a.data);
2) Asynchronous: use PBO and glReadPixels
cv::Mat b = cv::Mat::zeros(cv::Size(1920, 1080), CV_8UC3);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo_userImage);
glReadPixels(0, 0, 1920, 1080, GL_BGR, GL_UNSIGNED_BYTE, 0);
unsigned char* ptr = static_cast<unsigned char*>(glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY));
std::copy(ptr, ptr + 1920 * 1080 * 3 * sizeof(unsigned char), b.data);
glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
From all the information I collected on this topic, the asynchronous version 2) should be much faster. However, comparing the elapsed time for both versions shows that the differences are often minimal, and sometimes version 1) even outperforms the PBO variant.
For performance checks, I've inserted the following code (based on this answer):
std::chrono::steady_clock::time_point begin = std::chrono::steady_clock::now();
....
std::chrono::steady_clock::time_point end = std::chrono::steady_clock::now();
std::cout << "Time difference = " << std::chrono::duration_cast<std::chrono::microseconds>(end - begin).count() << std::endl;
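A single sample taken this way can be noisy, and a CPU timer only captures how long the calling thread was held up, not how long the GPU spent on the transfer. As a rough sketch (the helper name averageMicroseconds is hypothetical and not part of the original post), the same std::chrono pattern can be wrapped so that the read path is averaged over many runs:

```cpp
#include <chrono>

// Hypothetical helper: average wall-clock microseconds per call of fn.
// Assumes runs > 0. Measures CPU-side blocking time only.
template <typename Fn>
double averageMicroseconds(Fn&& fn, int runs) {
    using clock = std::chrono::steady_clock;
    const auto begin = clock::now();
    for (int i = 0; i < runs; ++i) {
        fn();  // e.g. a lambda wrapping version 1) or version 2)
    }
    const auto end = clock::now();
    return std::chrono::duration<double, std::micro>(end - begin).count() / runs;
}
```

Passing each read variant as a lambda and comparing the averages over a few hundred frames should give a steadier comparison than one-off samples.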
I've also experimented with the usage hint when creating the PBO: I didn't find much of a difference between GL_DYNAMIC_COPY and GL_STREAM_READ here.
I'd be happy for suggestions on how to increase the speed of this pixel read operation from the framebuffer even further.
Recommended answer
Your second version is not asynchronous at all, since you're mapping the buffer immediately after triggering the copy. The map call will then block until the contents of the buffer are available, effectively becoming synchronous.
Or: depending on the driver, it will block when the mapped memory is actually read. In other words, the driver may implement the mapping in such a way that the first access causes a page fault and a subsequent synchronization. It doesn't really matter in your case, since you are still accessing that data straight away due to the std::copy.
The proper way to do this is to use sync objects and fences.
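A minimal sketch of what that could look like, assuming an OpenGL 3.2+ context (where glFenceSync and glClientWaitSync are core), GLEW as the function loader, and a PBO that was already allocated with glBufferData. The struct PendingReadback and the two function names are hypothetical helpers for illustration, not part of the original question or answer:

```cpp
#include <GL/glew.h>  // assumption: GLEW is the loader and is already initialised
#include <cstddef>
#include <cstring>

// Hypothetical helper state: one PBO plus the fence guarding its contents.
struct PendingReadback {
    GLuint pbo   = 0;        // allocated elsewhere, e.g. with GL_STREAM_READ
    GLsync fence = nullptr;  // non-null while a transfer is in flight
};

// Frame N: queue the copy into the PBO and place a fence behind it.
// Nothing here waits for the GPU.
void startReadback(PendingReadback& r, int width, int height) {
    glBindBuffer(GL_PIXEL_PACK_BUFFER, r.pbo);
    glReadPixels(0, 0, width, height, GL_BGR, GL_UNSIGNED_BYTE, nullptr);
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    r.fence = glFenceSync(GL_SYNC_GPU_COMMANDS_COMPLETE, 0);
    glFlush();  // make sure the fence is actually submitted to the GPU
}

// Later frames: poll the fence with a zero timeout instead of blocking.
// Returns true once dst has been filled with `bytes` bytes of pixel data.
bool tryFinishReadback(PendingReadback& r, unsigned char* dst, std::size_t bytes) {
    if (!r.fence) return false;  // nothing in flight
    const GLenum status = glClientWaitSync(r.fence, 0, 0);
    if (status != GL_ALREADY_SIGNALED && status != GL_CONDITION_SATISFIED) {
        return false;  // GPU not done yet (or the wait failed); try again later
    }
    glDeleteSync(r.fence);
    r.fence = nullptr;

    bool copied = false;
    glBindBuffer(GL_PIXEL_PACK_BUFFER, r.pbo);
    if (void* src = glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY)) {
        std::memcpy(dst, src, bytes);  // data is ready, so the map should not stall
        glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
        copied = true;
    }
    glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
    return copied;
}
```

With the sizes from the question, startReadback(r, 1920, 1080) would be called right after rendering, and tryFinishReadback(r, b.data, 1920 * 1080 * 3) would be polled on subsequent frames while the CPU does other work. Using two or more PBOs in rotation would let a new transfer start before the previous one has been fetched.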