OpenCL file fails to compile on OS X

Question

I have a fairly large OpenCL file that compiles fine on both Windows and Ubuntu Linux but fails on Mac OS X: the cvmcompiler process uses 100% of the CPU and never completes. The full code of the project is here:

https://github.com/favreau/Sol-R

and the relevant file is:

https://github.com/favreau/Sol-R/blob/master/solr/engines/opencl/RayTracer.cl

The problem should be fairly easy to reproduce by cloning the project and running the cmake/make process. Note that since OpenCL kernels are compiled at runtime, the application needs to be started to reproduce the issue.

Recommended answer

Was this with an Iris Pro GPU on OS X 10.11 (while working on 10.12)? We ran into a similar issue last week under exactly those conditions. The compile call would hang for a few minutes, use a huge amount of memory, and then return an unhelpful error code. It worked fine with other GPUs, and on 10.12, so it felt like an Intel/Apple compiler bug. The kernel was relatively simple: a chain of if/else conditions, each checking a few conditions combined with logical AND operators (&&). Based on a tip from Intel years ago, we recalled that those operators in C are "short-circuit", which means you are semantically creating many possible branches (the compiler really should realize there are no side effects, but apparently it does not always).
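To make the short-circuit rule concrete, here is a minimal sketch in plain C (not OpenCL C, and not code from the project). The `cond` helper and the `evaluations` counter are hypothetical names introduced only to make evaluation visible. It illustrates the language rule, not the compiler bug itself: each && may skip the operands to its right, which is why a chain of them implies several possible control-flow paths.

```c
#include <stdbool.h>

/* Hypothetical counter, used only to make operand evaluation visible. */
static int evaluations = 0;

/* Hypothetical condition helper: records that it ran, then returns v. */
static bool cond(bool v)
{
    evaluations++;
    return v;
}

/* Returns how many of the three operands were evaluated for a && b && c. */
static int count_evaluations(bool a, bool b, bool c)
{
    evaluations = 0;
    bool result = cond(a) && cond(b) && cond(c);
    (void)result; /* only the evaluation count matters here */
    return evaluations;
}
```

With a false first operand only one operand is evaluated; the later ones are reached only if everything before them was true.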

Our solution was to pull those out into a series of boolean assignments, so that each if tests a single boolean and there is no branching around the if and else blocks. That is, changing code along these lines:

   if (cond1 && cond2 && cond3)
      ...
   else if (cond4 && cond5 && cond6)
      ...

to:

   bool b1 = cond1 && cond2 && cond3;
   bool b2 = cond4 && cond5 && cond6;
   if (b1)
      ...
   else if (b2)
      ...

That allowed the kernel to compile.
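As a sanity check on the rewrite, the following plain C sketch compares the two shapes over every combination of six boolean inputs. The function names and the return codes 0, 1 and 2 are made up for illustration; the real kernel's branch bodies are not reproduced here. It assumes the conditions have no side effects, which is the case the answer describes.

```c
#include <stdbool.h>

/* Original shape: chained && directly inside the if / else-if tests. */
static int classify_chained(bool c1, bool c2, bool c3,
                            bool c4, bool c5, bool c6)
{
    if (c1 && c2 && c3)
        return 1;
    else if (c4 && c5 && c6)
        return 2;
    return 0;
}

/* Rewritten shape: each test is reduced to a single boolean first. */
static int classify_flattened(bool c1, bool c2, bool c3,
                              bool c4, bool c5, bool c6)
{
    bool b1 = c1 && c2 && c3;
    bool b2 = c4 && c5 && c6;
    if (b1)
        return 1;
    else if (b2)
        return 2;
    return 0;
}

/* Counts the input combinations (out of 64) where the two forms disagree. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (int bits = 0; bits < 64; bits++) {
        bool c[6];
        for (int i = 0; i < 6; i++)
            c[i] = (bits >> i) & 1;
        if (classify_chained(c[0], c[1], c[2], c[3], c[4], c[5]) !=
            classify_flattened(c[0], c[1], c[2], c[3], c[4], c[5]))
            mismatches++;
    }
    return mismatches;
}
```

One difference to keep in mind: in the flattened form the second group of conditions is computed even when the first branch is taken, so the two forms are interchangeable only when the conditions are free of side effects and safe to evaluate unconditionally.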

I see your kernel has some if statements with upwards of three && operators, so maybe you are hitting the same problem.

I have also seen this solved by using & instead of &&, but I never felt comfortable using a bitwise AND in place of a logical AND, in case some of the conditions did not share the same bit pattern.
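The bit-pattern concern can be shown with a small plain C sketch (the helper names are hypothetical). Two nonzero values such as 1 and 2 are both "true" for &&, yet they share no set bit, so & yields 0. Normalizing each operand to 0 or 1 first, for example with != 0, removes the discrepancy.

```c
#include <stdbool.h>

/* Logical AND: any nonzero operand counts as true. */
static bool logical_and(int a, int b)
{
    return a && b;
}

/* Bitwise AND used as a truth test: true only if a and b share a set bit. */
static bool bitwise_and(int a, int b)
{
    return (a & b) != 0;
}

/* Bitwise AND after normalizing each operand to exactly 0 or 1. */
static bool normalized_bitwise_and(int a, int b)
{
    return ((a != 0) & (b != 0)) != 0;
}
```

When every operand is itself the result of a comparison, C already yields 0 or 1 and the plain & is safe; the risk arises only when a condition is an arbitrary integer used as a flag.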

The same logic applies to ||, which also short-circuits.

Good luck! I hope this helps you, or at least someone.

To give credit where it is due: while Intel mentioned short-circuiting to us as an issue (and suggested using &), AMD also mentions it in their OpenCL Optimization Guide and suggests using boolean variables to avoid it (section 2.8.7.2, Bypass Short-Circuiting), which is the fix we used.

