Question

This question and its answer, which was recently tagged as an Epic Answer, prompted me to wonder: can I measure the performance of a running Windows application in terms of its CPU branch prediction failures? I know that some static analysis tools exist that might help with optimizing code for good branch-prediction behavior, and that manual techniques could help by simply making changes and re-testing. What I'm looking for, though, is an automatic mechanism that can report the total number of branch prediction failures over a period of time as a Windows application runs, and I'm hoping that some profiler tool for Visual C++ could help me.

For the sake of this question, the application is built either with Visual C++ for Windows or with some other native compiler, such as GCC, FreePascal, Delphi, or TurboAssembler. The executable may not have any debug information at all. I want to know if I can detect and count branch prediction failures, perhaps by reading internal CPU information through some Windows service like WMI, or perhaps by running my test application in a fully virtualized Windows environment, such as VirtualBox, and doing runtime analysis of the virtual CPU. Or there may be some other technique that I don't know of, hence this question.

Yes, I googled. The only thing that looks promising is this PDF from AMD. Page 18 mentions something very close to what I'd like to do, but it seems to be written for people working without any operating system, on raw evaluation hardware platforms:

Conditional branches may be mispredicted when the likelihood of choosing the true or false path is random or near a 50-50 split. The branch prediction hardware cannot "learn" a pattern and branches are not predicted correctly.

Collection. Collect the events in this table to measure branch prediction performance:

Branches. Compute the rate at which branches are taken and the ratio of the number of instructions per branch using these formulas:

Branch taken rate = Taken_branches / Ret_instructions
Branch taken ratio = Taken_branches / Branches
Instructions per branch = Ret_instructions / Branches
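As a rough illustration of the three quoted formulas, here is a minimal C++ sketch that turns raw event counts into those ratios. The struct and function names (BranchCounters, BranchMetrics, compute_branch_metrics) are hypothetical and only mirror the event names in the quote; the sketch assumes the counts have already been obtained from some profiler report and does not read the PMU itself.

```cpp
#include <cstdint>
#include <stdexcept>

// Raw event counts, assumed to come from a profiler report.
// Field names mirror the events in the quoted formulas (hypothetical struct).
struct BranchCounters {
    std::uint64_t ret_instructions; // Ret_instructions: retired instructions
    std::uint64_t branches;         // Branches: retired branches
    std::uint64_t taken_branches;   // Taken_branches: retired taken branches
};

// Derived ratios, one per quoted formula.
struct BranchMetrics {
    double branch_taken_rate;       // Taken_branches / Ret_instructions
    double branch_taken_ratio;      // Taken_branches / Branches
    double instructions_per_branch; // Ret_instructions / Branches
};

// Compute the three ratios; rejects zero denominators instead of dividing.
inline BranchMetrics compute_branch_metrics(const BranchCounters& c) {
    if (c.ret_instructions == 0 || c.branches == 0) {
        throw std::invalid_argument("instruction and branch counts must be non-zero");
    }
    const double instr = static_cast<double>(c.ret_instructions);
    const double br    = static_cast<double>(c.branches);
    const double taken = static_cast<double>(c.taken_branches);
    return BranchMetrics{taken / instr, taken / br, instr / br};
}
```

For example, made-up counts of 1,000,000 retired instructions, 200,000 branches and 120,000 taken branches give a taken rate of 0.12, a taken ratio of 0.6, and 5 instructions per branch.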

Update: I guess I could say I'm looking for a way to read the PMU (performance monitoring unit) of an Intel Core i7, or the equivalent facility on other CPUs. It looks like Intel VTune (from the comments by Adrian) is very close to what I asked for.

Recommended answer

VTune Performance Analyzer can do it! By the way, if you are studying these topics, take a look at the "Optimization Cookbook" from Intel Press.

Note: The comments give the same answer, but with some uncertainty. I have used VTune myself and measured the branch prediction rate on an Intel CPU, so I'm 100% sure.

Here is the link to VTune.

Here is the link to the book.
