perf: strange relation between software events

Problem description

Okay, so this really bugs me.

I'm using perf to record the cpu-clock event (a software event):

$ perf record -e cpu-clock srun -n 1 ./stream

... and the table produced by perf report is empty.

I'm using perf to record all available software events listed in perf list:

$ perf record -e alignment-faults,context-switches,cpu-clock,cpu-migrations,\
dummy,emulation-faults,major-faults,minor-faults,page-faults,task-clock \
srun -n 1 ./stream

... the table gives me a list of available samples:

0 alignment-faults
125 context-switches
255 cpu-clock
21 cpu-migrations
0 dummy
0 emulation-faults
0 major-faults
128 minor-faults
132 page-faults
254 task-clock

I can look at the samples collected for cpu-clock, and they give me information. Why?! Why doesn't it work if I only measure cpu-clock? And why were no samples collected for four of the events?

This is a follow-up to this question: error: perf.data file has no samples

Recommended answer

srun probably doesn't start the target process with a direct fork. It may use some variant of a remote shell, such as ssh, or a daemon to start processes.

perf record (without the -a option) tracks only sub-processes forked from the command it starts, not processes started (forked) by sshd or another daemon. And if srun hands the job to a remote machine, perf record ... srun will never profile that machine; that command profiles the srun application itself and everything it forks.
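One way to check this reasoning on a given system is to test whether the running target process is actually a descendant of the perf process. The following is only a minimal sketch, assuming a Linux /proc filesystem; parent_of and is_descendant are hypothetical helper names made up for this example, not part of perf or Slurm:

```shell
#!/bin/sh
# Hypothetical helpers (not part of perf or Slurm); assumes Linux /proc.

# Print the parent PID of process "$1", read from /proc/<pid>/status.
parent_of() {
    [ -r "/proc/$1/status" ] || return 1
    while read -r key value rest; do
        if [ "$key" = "PPid:" ]; then
            echo "$value"
            return 0
        fi
    done < "/proc/$1/status"
    return 1
}

# Succeed if process "$1" has process "$2" somewhere among its ancestors.
is_descendant() {
    pid=$1
    ancestor=$2
    while [ -n "$pid" ] && [ "$pid" -gt 1 ]; do
        pid=$(parent_of "$pid") || return 1
        [ "$pid" = "$ancestor" ] && return 0
    done
    return 1
}

# Example usage, with placeholder variables holding the two PIDs:
#   is_descendant "$STREAM_PID" "$PERF_PID" && echo "perf can follow it"
```

If is_descendant fails for the PID of ./stream and the PID of perf record, the target was started outside perf's process tree, which would be consistent with the empty report.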

Try perf stat first to get the total (raw) performance counters, and pass perf as an argument to srun; this is the correct usage with tools that use a remote shell or daemons (you may need the full path to perf):

 srun -n 1 perf stat ./stream
 srun -n 1 /usr/bin/perf stat ./stream

perf stat will print the running time of the target task. Then select an event with a high raw count (perf record usually tunes the sample rate to around several kHz, so thousands of samples will be generated if there are enough raw event counts):

 srun -n 1 perf record -e cpu-clock ./stream
 srun -n 1 /usr/bin/perf record -e cpu-clock ./stream
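The two forms above differ only in the path to perf, so a small wrapper can make that choice explicit and let the command be inspected before it runs. This is a minimal sketch, assuming srun and perf are available where the task runs; PERF_BIN, EVENT, NTASKS, DRY_RUN and profile_under_srun are hypothetical names chosen for this example:

```shell
#!/bin/sh
# Hypothetical wrapper (not part of perf or Slurm): puts perf *inside* srun.

PERF_BIN="${PERF_BIN:-perf}"   # set to e.g. /usr/bin/perf if perf is not on PATH
EVENT="${EVENT:-cpu-clock}"    # event to sample
NTASKS="${NTASKS:-1}"          # value passed to srun -n

profile_under_srun() {
    # With DRY_RUN=1, only print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "srun -n $NTASKS $PERF_BIN record -e $EVENT $*"
        return 0
    fi
    srun -n "$NTASKS" "$PERF_BIN" record -e "$EVENT" "$@"
}

# Example usage:
#   DRY_RUN=1 profile_under_srun ./stream
#   PERF_BIN=/usr/bin/perf profile_under_srun ./stream
```

Afterwards, perf report can be run in the directory where perf.data was written to confirm that samples were collected.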
