Problem description
As is so common with Fortran, I'm writing a massively parallel scientific code. At the beginning of my code I read my configuration file, which tells me which type of solver to use. That means that in a subroutine (during the main run) I have
    if (solver .eq. 1) then
        call solver1()
    else if (solver .eq. 2) then
        call solver2()
    else
        call solver3()
    end if
Edit to avoid some confusion: this if is inside my time integration loop, and I also have one that is inside three nested loops.
Now my question is: wouldn't it be more efficient to use function pointers instead, since the solver variable does not change during execution, except during the initialisation procedure?
Obviously function pointers are a Fortran 2003 feature. That shouldn't be a problem as long as I use gfortran 4.6. But I mainly use a Blue Gene/P; there is an F2003 compiler there, so I suppose it will work as well, although I couldn't find any conclusive evidence on the web.
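For reference, here is a minimal sketch of what the procedure-pointer variant could look like in Fortran 2003. The module name, the solver_iface interface, the solve pointer and select_solver are hypothetical names chosen for illustration, and the solvers are assumed to take no arguments, as in the calls above.

```fortran
module solvers
  implicit none

  ! Common interface that every solver must match (assumed: no arguments).
  abstract interface
    subroutine solver_iface()
    end subroutine solver_iface
  end interface

  ! Set once during initialisation, afterwards only called.
  procedure(solver_iface), pointer :: solve => null()

contains

  subroutine solver1()
    print *, 'solver1'
  end subroutine solver1

  subroutine solver2()
    print *, 'solver2'
  end subroutine solver2

  subroutine solver3()
    print *, 'solver3'
  end subroutine solver3

  ! Same decision as the if-chain, but taken only once.
  subroutine select_solver(solver)
    integer, intent(in) :: solver
    if (solver == 1) then
      solve => solver1
    else if (solver == 2) then
      solve => solver2
    else
      solve => solver3
    end if
  end subroutine select_solver

end module solvers

program demo
  use solvers
  implicit none
  integer :: step

  call select_solver(2)   ! in the real code this value comes from the config file
  do step = 1, 3
    call solve()          ! indirect call, no if-chain inside the loop
  end do
end program demo
```

With the hard-coded value 2, the loop ends up calling solver2 on every iteration. Whether this is actually faster than the if-chain is exactly the open question.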
Knowing nothing about Fortran, this is my answer: the main problem with branching is that a CPU potentially cannot speculatively execute code across a branch. To mitigate this problem, branch prediction was introduced (and it is very sophisticated in modern CPUs).
Indirect calls through a function pointer can be a problem for the prediction unit of the CPU. If it can't predict where the call will actually go, this will stall the pipeline.
I am quite sure that the CPU will predict your branch correctly: it always goes the same way after initialisation, which is a trivial case for the predictor.
Maybe the CPU can speculate across the indirect call, maybe it can't. This is why you need to test which is better.
If it cannot, you will certainly notice in your benchmark.
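A rough way to run that test is sketched below. This is only an illustration under several assumptions: the solvers are trivial stand-ins that add to an accumulator, the names (bench_solvers, solver_iface, acc, solve) are made up, and the solver number is taken from an optional command-line argument instead of a config file. An optimising compiler may inline such tiny routines, so a meaningful measurement would need the real solvers, ideally in a separate compilation unit.

```fortran
module bench_solvers
  implicit none
  integer, parameter :: dp = kind(1.0d0)

  abstract interface
    subroutine solver_iface()
    end subroutine solver_iface
  end interface

  ! Accumulator so the calls have a visible side effect.
  real(dp) :: acc = 0.0_dp

contains

  subroutine solver1()
    acc = acc + 1.0_dp
  end subroutine solver1

  subroutine solver2()
    acc = acc + 2.0_dp
  end subroutine solver2

  subroutine solver3()
    acc = acc + 3.0_dp
  end subroutine solver3

end module bench_solvers

program bench
  use bench_solvers
  implicit none
  integer, parameter :: i8 = selected_int_kind(18)
  integer, parameter :: n = 100000000
  integer :: solver, i
  integer(i8) :: t0, t1, rate
  character(len=16) :: arg
  procedure(solver_iface), pointer :: solve => null()

  ! Solver number from the first command-line argument, default 2.
  solver = 2
  if (command_argument_count() >= 1) then
    call get_command_argument(1, arg)
    read (arg, *) solver
  end if

  ! Variant 1: if-chain evaluated on every iteration.
  call system_clock(t0, rate)
  do i = 1, n
    if (solver == 1) then
      call solver1()
    else if (solver == 2) then
      call solver2()
    else
      call solver3()
    end if
  end do
  call system_clock(t1)
  print '(a, f8.3, a)', 'if-chain: ', real(t1 - t0, dp) / real(rate, dp), ' s'

  ! Variant 2: pointer chosen once, indirect call in the loop.
  select case (solver)
  case (1)
    solve => solver1
  case (2)
    solve => solver2
  case default
    solve => solver3
  end select

  call system_clock(t0)
  do i = 1, n
    call solve()
  end do
  call system_clock(t1)
  print '(a, f8.3, a)', 'pointer:  ', real(t1 - t0, dp) / real(rate, dp), ' s'

  ! Print the accumulator so the loops cannot be discarded as dead code.
  print *, 'acc =', acc
end program bench
```

Running it a few times at the optimisation level used in production, on the target machine, gives a first impression. The numbers from a desktop CPU will not necessarily carry over to a Blue Gene/P.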
In addition, maybe you can hoist the if test out of your inner loop so that it is not evaluated as often. This will make the actual performance of the branch irrelevant.
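Hoisting could look roughly like the sketch below, assuming the loop body consists of nothing but the solver call. The program name, the calls counter and the trivial solver bodies are placeholders for illustration only.

```fortran
program hoist
  implicit none
  integer, parameter :: nsteps = 4
  integer :: solver, step
  integer :: calls = 0

  solver = 2   ! in the real code this value comes from the config file

  ! The test is evaluated once; each branch contains its own loop.
  if (solver == 1) then
    do step = 1, nsteps
      call solver1()
    end do
  else if (solver == 2) then
    do step = 1, nsteps
      call solver2()
    end do
  else
    do step = 1, nsteps
      call solver3()
    end do
  end if

  print *, 'solver', solver, 'was called', calls, 'times'

contains

  ! Placeholder solvers that only count how often they run.
  subroutine solver1()
    calls = calls + 1
  end subroutine solver1

  subroutine solver2()
    calls = calls + 1
  end subroutine solver2

  subroutine solver3()
    calls = calls + 1
  end subroutine solver3

end program hoist
```

The price is a duplicated loop in every branch. If the loop body contains much more than the solver call, it may be cleaner to move the loop into the solvers themselves.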