Problem description

To improve rendering quality I'm writing a versatile separable downscaler in OpenCL 1.1.

The basic image (covering only a small part of the final image) is rendered into a very large framebuffer. Then its color attachment texture is downsampled and written into another texture via OpenCL. Finally a screen-aligned quad is rendered to show the result.

So much for the idea. What we have:

  • 2 instances of the downscaler kernel (each stores its results with coordinates exchanged, i.e. as (y,x))
  • inputTexture (the color attachment of the rtt-framebuffer)
  • tempTexture, size: inputHeight x outputWidth, created with CL_MEM_READ_WRITE
  • outputTexture
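The setup above can be sketched on the CPU to show why tempTexture ends up inputHeight wide and outputWidth high. This is only a reference model, not the kernel from the question: it assumes a plain box filter, one float channel, and an integer scale factor, and the names downscale_rows_transposed and downscale_2d are made up for illustration.

```c
#include <assert.h>

/* One separable pass: box-filter each row of src (src_w x src_h) down to
 * dst_w samples and store the result transposed, so dst is src_h wide and
 * dst_w high. Assumes src_w is an integer multiple of dst_w. */
static void downscale_rows_transposed(const float *src, int src_w, int src_h,
                                      float *dst, int dst_w)
{
    assert(dst_w > 0 && src_w % dst_w == 0);
    int factor = src_w / dst_w;              /* source samples per output sample */
    for (int y = 0; y < src_h; ++y) {
        for (int x = 0; x < dst_w; ++x) {
            float sum = 0.0f;
            for (int k = 0; k < factor; ++k)
                sum += src[y * src_w + x * factor + k];
            /* coordinates exchanged: the sample for (x, y) is stored at (y, x) */
            dst[x * src_h + y] = sum / (float)factor;
        }
    }
}

/* Full downscale: the same routine applied twice. The second transpose
 * undoes the first, so out is upright again (out_w x out_h). */
static void downscale_2d(const float *in, int in_w, int in_h,
                         float *temp, float *out, int out_w, int out_h)
{
    downscale_rows_transposed(in, in_w, in_h, temp, out_w);   /* temp: in_h x out_w */
    downscale_rows_transposed(temp, in_h, out_w, out, out_h); /* out:  out_w x out_h */
}
```

Because each pass only ever filters along rows, the same function body can serve both directions, which is presumably the point of storing the results transposed.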

Running kernel_instance_1( <otherParams>, inputTexture, tempTexture ) produces the desired result, but only in the very first frame. The changes happening in the animation don't show up at all. As I get no errors (see below), I assume the kernel runs every frame but keeps seeing the same source texture content, even though the texture itself does change (I also have a live output of that texture).

Question: Do I have to call clCreateFromGLTexture2D() every time the contents of the framebuffer change?

EDIT I just realized: the inputTexture is still attached to the framebuffer object's GL_COLOR_ATTACHMENT0. Could this be a problem? ENDEDIT

Running kernel_instance_2( <otherParams>, tempTexture, outputTexture ) yields no visible result, even with a barrier enqueued between both kernel calls. I.e. the outputTexture stays empty.

Question: Do I need to release and re-acquire the texture object tempTexture in between both kernel calls, so OpenCL sees the changes?

To see which OpenCL calls are made, I logged them, producing the following output:

clCreateKernel( separable_X )
clRetainMemObject( separable_X::convolution )
clCreateKernel( separable_Y )
clRetainMemObject( separable_Y::convolution )
clCreateFromGLTexture2D( separable_X::dst + separable_y::src, texID=24, usage=temporary (source and target) )
clCreateFromGLTexture2D( separable_Y::dst, texID=18, usage=target )
clCreateFromGLTexture2D( separable_X::src, texID=22, usage=source )
clRetainMemObject( separable_X::dst )
clRetainMemObject( separable_Y::src )
clRetainMemObject( separable_Y::dst )
clRetainMemObject( clearEmpty::dst )
clEnqueueAcquireGLObjects( count=3 )
clEnqueueBarrier()
clSetKernelArg( separable_X::convert )
clSetKernelArg( separable_X::offset )
clSetKernelArg( separable_X::convolution )
clSetKernelArg( separable_X::dst )
clSetKernelArg( separable_X::src )
clEnqueueNDRangeKernel( separable_X, (1440, 1080, 0), waiting4 0 events )
clSetKernelArg( separable_Y::convert )
clSetKernelArg( separable_Y::offset )
clEnqueueBarrier()
clSetKernelArg( separable_Y::convolution )
clSetKernelArg( separable_Y::dst )
clSetKernelArg( separable_Y::src )
clEnqueueNDRangeKernel( separable_Y, (540, 1440, 0), waiting4 0 events )
clEnqueueBarrier()
clEnqueueReleaseGLObjects( count=3 )

If any call had produced an error, it would have appeared in that output.

Another thing I run into frequently is that clEnqueueReleaseGLObjects() returns error code -9999, which somebody reported as "NVidia: Illegal read or write to a buffer".

Question: could it be that write_imagef() does not clamp the color value if any component exceeds 1.0f and the storage format is RGBA8? That would mean one must write write_imagef( texture, (int2)coord, clamp( color, 0.f, 1.f ) );...

Thanks a lot in advance. I've been banging my head against this for nearly a week...

EDIT Some more info that might be worth mentioning:

How do I distinguish the two instances?
There are 2 distinct __kernel functions with different names (separable_X and separable_Y) inside the program source, which both have the same body calling the separable()-function.

How do I sync between GL and CL?
- the function taking care of acquiring GL objects issues a glFinish() before calling clEnqueueAcquireGLObjects()
- I wait for completion of clEnqueueReleaseGLObjects() by using cl_events (likely to change in the future)

Solution

You're using glFinish before clEnqueueAcquireGLObjects which is correct, but you should also call clFinish AFTER clEnqueueReleaseGLObjects. Read section 9.8.6.2 of the OpenCL 1.1 specification carefully.
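A minimal sketch of that per-frame ordering follows. It is not the asker's code: the helper name downscale_frame and its parameters are hypothetical, the three cl_mem images are assumed to have been created at startup and already bound to the kernels with clSetKernelArg(), and the queue is assumed to be in-order (created without CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE), so the second kernel sees the first kernel's writes without an explicit barrier.

```c
#include <CL/cl.h>
#include <CL/cl_gl.h>
#include <GL/gl.h>

/* Hypothetical per-frame helper: GL finishes, CL acquires, both kernels run,
 * CL releases, CL finishes. Only then may GL touch the textures again. */
static cl_int downscale_frame(cl_command_queue queue,
                              cl_kernel separable_x, cl_kernel separable_y,
                              const cl_mem *shared,        /* 3 GL-backed images */
                              const size_t global_x[2],
                              const size_t global_y[2])
{
    cl_int err, rel, fin;

    glFinish();  /* all GL rendering into the input texture must be complete */

    err = clEnqueueAcquireGLObjects(queue, 3, shared, 0, NULL, NULL);
    if (err != CL_SUCCESS)
        return err;

    /* First pass: input -> temp. Second pass: temp -> output. */
    err = clEnqueueNDRangeKernel(queue, separable_x, 2, NULL,
                                 global_x, NULL, 0, NULL, NULL);
    if (err == CL_SUCCESS)
        err = clEnqueueNDRangeKernel(queue, separable_y, 2, NULL,
                                     global_y, NULL, 0, NULL, NULL);

    /* Release even if a kernel failed to enqueue, then wait for CL to finish
     * before returning control to the GL side. */
    rel = clEnqueueReleaseGLObjects(queue, 3, shared, 0, NULL, NULL);
    fin = clFinish(queue);

    if (err != CL_SUCCESS)
        return err;
    if (rel != CL_SUCCESS)
        return rel;
    return fin;
}
```

The textures stay acquired across both kernel launches; the only synchronization points with GL are the glFinish() at the top and the clFinish() at the bottom.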

Also, to your other questions:

No, you only call clCreateFromGLTexture2D() once to create an OpenCL image from an OpenGL texture. That should happen before the loop where the image is used.

No. Once the texture is acquired for OpenCL, you can use it there as much as you need, without releasing and re-acquiring it between kernel calls.

No, write_imagef() works perfectly. We use it all the time.
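To make the last point concrete, here is a small CPU model of what happens to one float channel when it is stored into an 8-bit normalized (RGBA8) image. It is a sketch based on my reading of the OpenCL 1.1 specification's rules for converting floating-point values to normalized integers, not a statement about any particular driver. The function name float_to_unorm8 is made up, and the rounding here is round-half-up, whereas the specification's preferred conversion rounds to nearest even, so exact ties may differ.

```c
#include <assert.h>
#include <stdint.h>

/* CPU model of storing one float channel into a CL_UNORM_INT8 image:
 * scale by 255, round to nearest, saturate to [0, 255]. */
static uint8_t float_to_unorm8(float f)
{
    float scaled = f * 255.0f;
    if (!(scaled > 0.0f))      /* negatives, zero and NaN map to 0 */
        return 0;
    if (scaled >= 255.0f)      /* anything at or above 1.0f saturates to 255 */
        return 255;
    return (uint8_t)(scaled + 0.5f);  /* round half up (assumption, see above) */
}
```

Under this model a component such as 1.7f simply stores as 255, so an explicit clamp() before write_imagef() would be redundant for this format.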
