Problem description
For some reason, I can't sample (perf record) hardware cache events:
# perf record -e L1-dcache-stores -a -c 100 -- sleep 5
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.607 MB perf.data (~26517 samples) ]
# perf script
but I can count them (perf stat):
# perf stat -e L1-dcache-stores -a -- sleep 5
Performance counter stats for 'sleep 5':
711,781 L1-dcache-stores
5.000842990 seconds time elapsed
I tried different CPUs, OS versions (and kernel versions), and perf versions, but the result is the same. Is this expected behaviour? What is the reason? Can't perf warn about this?
Recommended answer
There is a difference in the perf evlist -vvv output of three perf.data files: one recorded for a cache event, the second for a software event, and the last for the hardware cycles event:
echo '2^234567 %2' | perf record -e L1-dcache-stores -c 100 -o cache bc
echo '2^234567 %2' | perf record -e cycles -c 100 -o cycles bc
echo '2^234567 %2' | perf record -e cs -c 100 -o cs bc
perf evlist -vvv -i cache
L1-dcache-stores: sample_freq=100, type: 3, config: 256, size: 96, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
perf evlist -vvv -i cycles
cycles: sample_freq=100, size: 96, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
perf evlist -vvv -i cs
cs: sample_freq=100, type: 1, config: 3, size: 96, sample_type: IP|TID|TIME, disabled: 1, inherit: 1, mmap: 1, mmap2: 1, comm: 1, enable_on_exec: 1, sample_id_all: 1, exclude_guest: 1
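The type differs between the three events. The cycles line shows no type field at all, which is consistent with its type being 0 (PERF_TYPE_HARDWARE) and zero-valued fields being left out of the listing. The types are defined as: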
enum perf_type_id {
        PERF_TYPE_HARDWARE      = 0,
        PERF_TYPE_SOFTWARE      = 1,
        PERF_TYPE_TRACEPOINT    = 2,
        PERF_TYPE_HW_CACHE      = 3,
        PERF_TYPE_RAW           = 4,
        PERF_TYPE_BREAKPOINT    = 5,

        PERF_TYPE_MAX,          /* non-ABI */
};
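To make the type and config numbers in the evlist output easier to read, here is a minimal sketch of how they can be decoded by hand. It assumes the config layout documented for PERF_TYPE_HW_CACHE in perf_event_open(2): cache id in bits 0-7, operation id in bits 8-15, result id in bits 16-23. The struct hw_cache_config and the functions decode_hw_cache_config and perf_type_name are hypothetical helpers written for this illustration; they are not part of perf.

```c
#include <stdint.h>

/* Hypothetical container for the three fields packed into the config
 * value of a PERF_TYPE_HW_CACHE event. */
struct hw_cache_config {
    unsigned int cache_id;   /* perf_hw_cache_id: 0 is L1D */
    unsigned int op_id;      /* 0 = read, 1 = write, 2 = prefetch */
    unsigned int result_id;  /* 0 = access, 1 = miss */
};

/* Split a config value into its three 8-bit fields. */
static struct hw_cache_config decode_hw_cache_config(uint64_t config)
{
    struct hw_cache_config c;

    c.cache_id  = (unsigned int)(config & 0xff);
    c.op_id     = (unsigned int)((config >> 8) & 0xff);
    c.result_id = (unsigned int)((config >> 16) & 0xff);
    return c;
}

/* Map a numeric type, as printed by perf evlist, to its enum name. */
static const char *perf_type_name(unsigned int type)
{
    static const char *const names[] = {
        "PERF_TYPE_HARDWARE",
        "PERF_TYPE_SOFTWARE",
        "PERF_TYPE_TRACEPOINT",
        "PERF_TYPE_HW_CACHE",
        "PERF_TYPE_RAW",
        "PERF_TYPE_BREAKPOINT",
    };

    if (type < sizeof(names) / sizeof(names[0]))
        return names[type];
    return "unknown";
}
```

Calling decode_hw_cache_config(256) on the value from the cache recording gives cache id 0, operation 1 and result 0, that is L1D write accesses, which matches the event name L1-dcache-stores. perf_type_name(3) gives PERF_TYPE_HW_CACHE, while the cs recording (type 1) maps to PERF_TYPE_SOFTWARE.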
perf script has an output table which defines how to print events of each type: http://lxr.free-electrons.com/source/tools/perf/builtin-script.c?v=3.16#L68
/* default set to maintain compatibility with current format */
static struct {
        bool user_set;
        bool wildcard_set;
        unsigned int print_ip_opts;
        u64 fields;
        u64 invalid_fields;
} output[PERF_TYPE_MAX] = {

        [PERF_TYPE_HARDWARE] = {
                .user_set = false,

                .fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
                              PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
                              PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
                              PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,

                .invalid_fields = PERF_OUTPUT_TRACE,
        },

        [PERF_TYPE_SOFTWARE] = {
                .user_set = false,

                .fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
                              PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
                              PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
                              PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,

                .invalid_fields = PERF_OUTPUT_TRACE,
        },

        [PERF_TYPE_TRACEPOINT] = {
                .user_set = false,

                .fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
                              PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
                              PERF_OUTPUT_EVNAME | PERF_OUTPUT_TRACE,
        },

        [PERF_TYPE_RAW] = {
                .user_set = false,

                .fields = PERF_OUTPUT_COMM | PERF_OUTPUT_TID |
                              PERF_OUTPUT_CPU | PERF_OUTPUT_TIME |
                              PERF_OUTPUT_EVNAME | PERF_OUTPUT_IP |
                              PERF_OUTPUT_SYM | PERF_OUTPUT_DSO,

                .invalid_fields = PERF_OUTPUT_TRACE,
        },
};
So there are no instructions for printing any field from samples of type 3 (PERF_TYPE_HW_CACHE), and perf script does not print them. We can try to register this type in the output array and even push the patch to the kernel.
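The effect of the missing entry can be reproduced outside perf. The sketch below is a simplified stand-alone model, not the perf source: the OUT_* flags, the struct output_opts, the tables output_before and output_after, and the function would_print are all stand-ins invented for this illustration. It also assumes that perf script skips a sample when the fields mask for its type is zero. What it shows is plain C behaviour: array elements with no designated initializer are zero-filled, so the PERF_TYPE_HW_CACHE slot ends up with no fields to print.

```c
#include <stdbool.h>
#include <stdint.h>

enum perf_type_id {
    PERF_TYPE_HARDWARE   = 0,
    PERF_TYPE_SOFTWARE   = 1,
    PERF_TYPE_TRACEPOINT = 2,
    PERF_TYPE_HW_CACHE   = 3,
    PERF_TYPE_RAW        = 4,
    PERF_TYPE_BREAKPOINT = 5,
    PERF_TYPE_MAX,
};

/* Stand-in flags; the real PERF_OUTPUT_* values live in builtin-script.c. */
enum {
    OUT_COMM = 1 << 0, OUT_TID    = 1 << 1, OUT_CPU = 1 << 2,
    OUT_TIME = 1 << 3, OUT_EVNAME = 1 << 4, OUT_IP  = 1 << 5,
    OUT_SYM  = 1 << 6, OUT_DSO    = 1 << 7, OUT_TRACE = 1 << 8,
};

#define COMMON_FIELDS  (OUT_COMM | OUT_TID | OUT_CPU | OUT_TIME | OUT_EVNAME)
#define DEFAULT_FIELDS (COMMON_FIELDS | OUT_IP | OUT_SYM | OUT_DSO)

/* Reduced version of the per-type settings: only the fields mask. */
struct output_opts {
    uint64_t fields;
};

/* Same shape as the table above: no initializer for PERF_TYPE_HW_CACHE,
 * so that slot is zero-initialized. */
static const struct output_opts output_before[PERF_TYPE_MAX] = {
    [PERF_TYPE_HARDWARE]   = { .fields = DEFAULT_FIELDS },
    [PERF_TYPE_SOFTWARE]   = { .fields = DEFAULT_FIELDS },
    [PERF_TYPE_TRACEPOINT] = { .fields = COMMON_FIELDS | OUT_TRACE },
    [PERF_TYPE_RAW]        = { .fields = DEFAULT_FIELDS },
};

/* Hypothetical fix: register PERF_TYPE_HW_CACHE with the same fields
 * as the other hardware-style events. */
static const struct output_opts output_after[PERF_TYPE_MAX] = {
    [PERF_TYPE_HARDWARE]   = { .fields = DEFAULT_FIELDS },
    [PERF_TYPE_SOFTWARE]   = { .fields = DEFAULT_FIELDS },
    [PERF_TYPE_TRACEPOINT] = { .fields = COMMON_FIELDS | OUT_TRACE },
    [PERF_TYPE_HW_CACHE]   = { .fields = DEFAULT_FIELDS },
    [PERF_TYPE_RAW]        = { .fields = DEFAULT_FIELDS },
};

/* Assumed rule: a sample is printed only if its type has any fields set. */
static bool would_print(const struct output_opts *table, enum perf_type_id type)
{
    return table[type].fields != 0;
}
```

With these definitions, would_print(output_before, PERF_TYPE_HW_CACHE) is false while would_print(output_after, PERF_TYPE_HW_CACHE) is true. A real patch would add the corresponding entry to the output array in builtin-script.c using the actual PERF_OUTPUT_* flags.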