Age | Commit message (Collapse) | Author |
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-02-09
We've added 126 non-merge commits during the last 16 day(s) which contain
a total of 201 files changed, 4049 insertions(+), 2215 deletions(-).
The main changes are:
1) Add custom BPF allocator for JITs that pack multiple programs into a huge
page to reduce iTLB pressure, from Song Liu.
2) Add __user tagging support in vmlinux BTF and utilize it from BPF
verifier when generating loads, from Yonghong Song.
3) Add per-socket fast path check guarding from cgroup/BPF overhead when
used by only some sockets, from Pavel Begunkov.
4) Continued libbpf deprecation work of APIs/features and removal of their
usage from samples, selftests, libbpf & bpftool, from Andrii Nakryiko
and various others.
5) Improve BPF instruction set documentation by adding byte swap
instructions and cleaning up load/store section, from Christoph Hellwig.
6) Switch BPF preload infra to light skeleton and remove libbpf dependency
from it, from Alexei Starovoitov.
7) Fix architecture-agnostic macros in libbpf for accessing syscall
arguments from BPF progs for non-x86 architectures,
from Ilya Leoshkevich.
8) Rework port members in struct bpf_sk_lookup and struct bpf_sock to be
of 16-bit field with anonymous zero padding, from Jakub Sitnicki.
9) Add new bpf_copy_from_user_task() helper to read memory from a different
task than current. Add ability to create sleepable BPF iterator progs,
from Kenny Yu.
10) Implement XSK batching for ice's zero-copy driver used by AF_XDP and
utilize TX batching API from XSK buffer pool, from Maciej Fijalkowski.
11) Generate temporary netns names for BPF selftests to avoid naming
collisions, from Hangbin Liu.
12) Implement bpf_core_types_are_compat() with limited recursion for
in-kernel usage, from Matteo Croce.
13) Simplify pahole version detection and finally enable CONFIG_DEBUG_INFO_DWARF5
to be selected with CONFIG_DEBUG_INFO_BTF, from Nathan Chancellor.
14) Misc minor fixes to libbpf and selftests from various folks.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (126 commits)
selftests/bpf: Cover 4-byte load from remote_port in bpf_sk_lookup
bpf: Make remote_port field in struct bpf_sk_lookup 16-bit wide
libbpf: Fix compilation warning due to mismatched printf format
selftests/bpf: Test BPF_KPROBE_SYSCALL macro
libbpf: Add BPF_KPROBE_SYSCALL macro
libbpf: Fix accessing the first syscall argument on s390
libbpf: Fix accessing the first syscall argument on arm64
libbpf: Allow overriding PT_REGS_PARM1{_CORE}_SYSCALL
selftests/bpf: Skip test_bpf_syscall_macro's syscall_arg1 on arm64 and s390
libbpf: Fix accessing syscall arguments on riscv
libbpf: Fix riscv register names
libbpf: Fix accessing syscall arguments on powerpc
selftests/bpf: Use PT_REGS_SYSCALL_REGS in bpf_syscall_macro
libbpf: Add PT_REGS_SYSCALL_REGS macro
selftests/bpf: Fix an endianness issue in bpf_syscall_macro test
bpf: Fix bpf_prog_pack build HPAGE_PMD_SIZE
bpf: Fix leftover header->pages in sparc and powerpc code.
libbpf: Fix signedness bug in btf_dump_array_data()
selftests/bpf: Do not export subtest as standalone test
bpf, x86_64: Fail gracefully on bpf_jit_binary_pack_finalize failures
...
====================
Link: https://lore.kernel.org/r/20220209210050.8425-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
The ftrace.target.system_wide must be set before invoking
evlist__create_maps(), otherwise it has no effect.
Fixes: 53be50282269b46c ("perf ftrace: Add 'latency' subcommand")
Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Namhyung Kim <namhyung@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220127132010.4836-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
An event may have a number of uncore aliases that when added to the
evlist are consecutive.
If there are multiple uncore events in a group then
parse_events__set_leader_for_uncore_aliase will reorder the evlist so
that events on the same PMU are adjacent.
The collect_all_aliases function assumes that aliases are in blocks so
that only the first counter is printed and all others are marked merged.
The reordering for groups breaks the assumption and so all counts are
printed.
This change removes the assumption from collect_all_aliases
that the events are in blocks and instead processes the entire evlist.
Before:
```
$ perf stat -e '{UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE,UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE},duration_time' -a -A -- sleep 1
Performance counter stats for 'system wide':
CPU0 256,866 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 494,413 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 967 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,738 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 285,161 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 429,920 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,443 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 310,753 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 416,657 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,231 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,573 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 416,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 405,966 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,481 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,447 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 312,911 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 408,154 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,086 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,380 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 333,994 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 370,349 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,287 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,335 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 188,107 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 302,423 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 701 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,070 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 307,221 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 383,642 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,036 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,158 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 318,479 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 821,545 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,028 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,550 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 227,618 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 372,272 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 903 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 376,783 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 419,827 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,406 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,453 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 286,583 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 429,956 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 999 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,436 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 313,867 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 370,159 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,114 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 409,111 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,399 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,684 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 365,828 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 376,037 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,378 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,411 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 382,456 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 621,743 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,232 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,955 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,316 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 385,067 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,176 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,268 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 373,588 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 386,163 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,394 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,464 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 381,206 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 546,891 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,266 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,712 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 221,176 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 392,069 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 831 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,456 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 355,401 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 705,595 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,235 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,216 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 371,436 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 428,103 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,306 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,442 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 384,352 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 504,200 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,468 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,860 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 228,856 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 287,976 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 832 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,060 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 215,121 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 334,162 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 681 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,026 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 296,179 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 436,083 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,084 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,525 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 262,296 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 416,573 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 986 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,533 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 285,852 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 359,842 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,073 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,326 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 303,379 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 367,222 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,008 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,156 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 273,487 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 425,449 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 932 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,367 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 297,596 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 414,793 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,140 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,601 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 342,365 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 360,422 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,291 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,342 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 327,196 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 580,858 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,122 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,014 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 296,564 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 452,817 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,087 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,694 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 375,002 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 389,393 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,478 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 1,540 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 365,213 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 594,685 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 1,401 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 2,222 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 1,000,749,060 ns duration_time
1.000749060 seconds time elapsed
```
After:
```
Performance counter stats for 'system wide':
CPU0 20,547,434 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU36 45,202,862 UNC_CHA_TOR_OCCUPANCY.IA_MISS_DRD_REMOTE
CPU0 82,001 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU36 159,688 UNC_CHA_TOR_INSERTS.IA_MISS_DRD_REMOTE
CPU0 1,000,464,828 ns duration_time
1.000464828 seconds time elapsed
```
Fixes: 3cdc5c2cb924acb4 ("perf parse-events: Handle uncore event aliases in small groups properly")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Torgue <alexandre.torgue@foss.st.com>
Cc: Asaf Yaffe <asaf.yaffe@intel.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kshipra Bopardikar <kshipra.bopardikar@intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Maxime Coquelin <mcoquelin.stm32@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: Zhengjun Xing <zhengjun.xing@linux.intel.com>
Link: https://lore.kernel.org/r/20220205010941.1065469-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In branch mode, the branch symbols were being displayed with incorrect
cpumode labels. So fix this.
For example, before:
# perf record -b -a -- sleep 1
# perf report -b
Overhead Command Source Shared Object Source Symbol Target Symbol
0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state
==> 0.08% cmd0 [kernel.kallsyms] [.] psi_group_change [.] psi_group_change
0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change
After:
# perf report -b
Overhead Command Source Shared Object Source Symbol Target Symbol
0.08% swapper [kernel.kallsyms] [k] rcu_idle_enter [k] cpuidle_enter_state
0.08% cmd0 [kernel.kallsyms] [k] psi_group_change [k] pei_group_change
0.08% cmd1 [kernel.kallsyms] [k] psi_group_change [k] psi_group_change
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220126105927.3411216-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This patch fixes a spelling typo in error message.
Signed-off-by: Masanari Iida <standby24x7@gmail.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20211225005558.503935-1-standby24x7@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
For perf recording, it retrieves process info by iterating nodes in proc
fs. If we run perf in a non-root PID namespace with command:
# unshare --fork --pid perf record -e cycles -a -- test_program
... in this case, unshare command creates a child PID namespace and
launches perf tool in it, but the issue is the proc fs is not mounted
for the non-root PID namespace, this leads to the perf tool gathering
process info from its parent PID namespace.
We can use below command to observe the process nodes under proc fs:
# unshare --pid --fork ls /proc
1 137 1968 2128 3 342 48 62 78 crypto kcore net uptime
10 138 2 2142 30 35 49 63 8 devices keys pagetypeinfo version
11 139 20 2143 304 36 50 64 82 device-tree key-users partitions vmallocinfo
12 14 2011 22 305 37 51 65 83 diskstats kmsg self vmstat
128 140 2038 23 307 39 52 656 84 driver kpagecgroup slabinfo zoneinfo
129 15 2074 24 309 4 53 67 9 execdomains kpagecount softirqs
13 16 2094 241 31 40 54 68 asound fb kpageflags stat
130 164 2096 242 310 41 55 69 buddyinfo filesystems loadavg swaps
131 17 2098 25 317 42 56 70 bus fs locks sys
132 175 21 26 32 43 57 71 cgroups interrupts meminfo sysrq-trigger
133 179 2102 263 329 44 58 75 cmdline iomem misc sysvipc
134 1875 2103 27 330 45 59 76 config.gz ioports modules thread-self
135 19 2117 29 333 46 6 77 consoles irq mounts timer_list
136 1941 2121 298 34 47 60 773 cpuinfo kallsyms mtd tty
So it shows many existed tasks, since unshared command has not mounted
the proc fs for the new created PID namespace, it still accesses the
proc fs of the root PID namespace. This leads to two prominent issues:
- Firstly, PID values are mismatched between thread info and samples.
The gathered thread info are coming from the proc fs of the root PID
namespace, but samples record its PID from the child PID namespace.
- The second issue is profiled program 'test_program' returns its forked
PID number from the child PID namespace, perf tool wrongly uses this
PID number to retrieve the process info via the proc fs of the root
PID namespace.
To avoid issues, we need to mount proc fs for the child PID namespace
with the option '--mount-proc' when use unshare command:
# unshare --fork --pid --mount-proc perf record -e cycles -a -- test_program
Conversely, when the proc fs of the root PID namespace is used by child
namespace, perf tool can detect the multiple PID levels and
nsinfo__is_in_root_namespace() returns false, this patch reports error
for this case:
# unshare --fork --pid perf record -e cycles -a -- test_program
Couldn't synthesize bpf events.
Perf runs in non-root PID namespace but it tries to gather process info from its parent PID namespace.
Please mount the proc file system properly, e.g. add the option '--mount-proc' for unshare command.
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Link: https://lore.kernel.org/r/20211224124014.2492751-1-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Move NULL pointer check before dereferencing the variable.
Addresses-Coverity: 1497622 ("Derereference before null check")
Reviewed-by: James Clark <james.clark@arm.com>
Signed-off-by: Ameer Hamza <amhamza.mgc@gmail.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/r/20220125121141.18347-1-amhamza.mgc@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The stderr should be set to a pipe when using TUI. Otherwise it'd
print to stdout and break TUI windows with an error message.
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220202070828.143303-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This updates branch sample type with missing PERF_SAMPLE_BRANCH_TYPE_SAVE.
Suggested-by: James Clark <james.clark@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lore.kernel.org/lkml/1643799443-15109-1-git-send-email-anshuman.khandual@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This new PR_SET_VMA value isn't in sequence with all the other prctl
arguments and instead uses a big, 0x prefixed hex number: 0x53564d41 (S V M A).
This makes it harder to generate a string table as it would be rather
sparse, so make the regexp more stricter to avoid catching those.
A followup patch for 'perf trace' to cope with such oddities will be
needed, but then its a matter for the next merge window.
The next patch will update the prctl.h file to cope with this perf build
warning:
Warning: Kernel ABI header at 'tools/include/uapi/linux/prctl.h' differs from latest version at 'include/uapi/linux/prctl.h'
diff -u tools/include/uapi/linux/prctl.h include/uapi/linux/prctl.h
Here is the output of this script:
$ tools/perf/trace/beauty/prctl_option.sh
static const char *prctl_options[] = {
[1] = "SET_PDEATHSIG",
[2] = "GET_PDEATHSIG",
[3] = "GET_DUMPABLE",
[4] = "SET_DUMPABLE",
[5] = "GET_UNALIGN",
[6] = "SET_UNALIGN",
[7] = "GET_KEEPCAPS",
[8] = "SET_KEEPCAPS",
[9] = "GET_FPEMU",
[10] = "SET_FPEMU",
[11] = "GET_FPEXC",
[12] = "SET_FPEXC",
[13] = "GET_TIMING",
[14] = "SET_TIMING",
[15] = "SET_NAME",
[16] = "GET_NAME",
[19] = "GET_ENDIAN",
[20] = "SET_ENDIAN",
[21] = "GET_SECCOMP",
[22] = "SET_SECCOMP",
[23] = "CAPBSET_READ",
[24] = "CAPBSET_DROP",
[25] = "GET_TSC",
[26] = "SET_TSC",
[27] = "GET_SECUREBITS",
[28] = "SET_SECUREBITS",
[29] = "SET_TIMERSLACK",
[30] = "GET_TIMERSLACK",
[31] = "TASK_PERF_EVENTS_DISABLE",
[32] = "TASK_PERF_EVENTS_ENABLE",
[33] = "MCE_KILL",
[34] = "MCE_KILL_GET",
[35] = "SET_MM",
[36] = "SET_CHILD_SUBREAPER",
[37] = "GET_CHILD_SUBREAPER",
[38] = "SET_NO_NEW_PRIVS",
[39] = "GET_NO_NEW_PRIVS",
[40] = "GET_TID_ADDRESS",
[41] = "SET_THP_DISABLE",
[42] = "GET_THP_DISABLE",
[43] = "MPX_ENABLE_MANAGEMENT",
[44] = "MPX_DISABLE_MANAGEMENT",
[45] = "SET_FP_MODE",
[46] = "GET_FP_MODE",
[47] = "CAP_AMBIENT",
[50] = "SVE_SET_VL",
[51] = "SVE_GET_VL",
[52] = "GET_SPECULATION_CTRL",
[53] = "SET_SPECULATION_CTRL",
[54] = "PAC_RESET_KEYS",
[55] = "SET_TAGGED_ADDR_CTRL",
[56] = "GET_TAGGED_ADDR_CTRL",
[57] = "SET_IO_FLUSHER",
[58] = "GET_IO_FLUSHER",
[59] = "SET_SYSCALL_USER_DISPATCH",
[60] = "PAC_SET_ENABLED_KEYS",
[61] = "PAC_GET_ENABLED_KEYS",
[62] = "SCHED_CORE",
};
static const char *prctl_set_mm_options[] = {
[1] = "START_CODE",
[2] = "END_CODE",
[3] = "START_DATA",
[4] = "END_DATA",
[5] = "START_STACK",
[6] = "START_BRK",
[7] = "BRK",
[8] = "ARG_START",
[9] = "ARG_END",
[10] = "ENV_START",
[11] = "ENV_END",
[12] = "AUXV",
[13] = "EXE_FILE",
[14] = "MAP",
[15] = "MAP_SIZE",
};
$
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Colin Cross <ccross@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Link: https://lore.kernel.org/lkml/YflZqY0rYQ3d1bKt@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
No conflicts.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
bpf_program__set_<type>() APIs are deprecated, use generic
bpf_program__set_type() APIs instead.
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/r/20220124194254.2051434-8-andrii@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
|
|
bpf_object__open_buffer() API is deprecated, use the unified opts
bpf_object__open_mem() API in perf instead. This requires at least
libbpf 0.0.6.
Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220125005923.418339-3-christylee@fb.com
|
|
Daniel Borkmann says:
====================
pull-request: bpf-next 2022-01-24
We've added 80 non-merge commits during the last 14 day(s) which contain
a total of 128 files changed, 4990 insertions(+), 895 deletions(-).
The main changes are:
1) Add XDP multi-buffer support and implement it for the mvneta driver,
from Lorenzo Bianconi, Eelco Chaudron and Toke Høiland-Jørgensen.
2) Add unstable conntrack lookup helpers for BPF by using the BPF kfunc
infra, from Kumar Kartikeya Dwivedi.
3) Extend BPF cgroup programs to export custom ret value to userspace via
two helpers bpf_get_retval() and bpf_set_retval(), from YiFei Zhu.
4) Add support for AF_UNIX iterator batching, from Kuniyuki Iwashima.
5) Complete missing UAPI BPF helper description and change bpf_doc.py script
to enforce consistent & complete helper documentation, from Usama Arif.
6) Deprecate libbpf's legacy BPF map definitions and streamline XDP APIs to
follow tc-based APIs, from Andrii Nakryiko.
7) Support BPF_PROG_QUERY for BPF programs attached to sockmap, from Di Zhu.
8) Deprecate libbpf's bpf_map__def() API and replace users with proper getters
and setters, from Christy Lee.
9) Extend libbpf's btf__add_btf() with an additional hashmap for strings to
reduce overhead, from Kui-Feng Lee.
10) Fix bpftool and libbpf error handling related to libbpf's hashmap__new()
utility function, from Mauricio Vásquez.
11) Add support to BTF program names in bpftool's program dump, from Raman Shukhau.
12) Fix resolve_btfids build to pick up host flags, from Connor O'Brien.
* https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (80 commits)
selftests, bpf: Do not yet switch to new libbpf XDP APIs
selftests, xsk: Fix rx_full stats test
bpf: Fix flexible_array.cocci warnings
xdp: disable XDP_REDIRECT for xdp frags
bpf: selftests: add CPUMAP/DEVMAP selftests for xdp frags
bpf: selftests: introduce bpf_xdp_{load,store}_bytes selftest
net: xdp: introduce bpf_xdp_pointer utility routine
bpf: generalise tail call map compatibility check
libbpf: Add SEC name for xdp frags programs
bpf: selftests: update xdp_adjust_tail selftest to include xdp frags
bpf: test_run: add xdp_shared_info pointer in bpf_test_finish signature
bpf: introduce frags support to bpf_prog_test_run_xdp()
bpf: move user_size out of bpf_test_init
bpf: add frags support to xdp copy helpers
bpf: add frags support to the bpf_xdp_adjust_tail() API
bpf: introduce bpf_xdp_get_buff_len helper
net: mvneta: enable jumbo frames if the loaded XDP program support frags
bpf: introduce BPF_F_XDP_HAS_FRAGS flag in prog_flags loading the ebpf program
net: mvneta: add frags support to XDP_TX
xdp: add frags support to xdp_return_{buff/frame}
...
====================
Link: https://lore.kernel.org/r/20220124221235.18993-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
|
|
Return value from perf_event__process_tracing_data() directly instead
of taking this in another redundant variable.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Minghao Chi <chi.minghao@zte.com.cn>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220112080109.666800-1-chi.minghao@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a test which allows us to test parsing an event alias with hyphens.
Since these events typically do not exist on most host systems, add the
alias to the fake pmu.
Function perf_pmu__test_parse_init() has terms added to match known test
aliases.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-4-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add a test for aliases with hyphens in the name to ensure that the
pmu-events tables are as expects. There should be no reason why these sort
of aliases would be treated differently, but no harm in checking.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-3-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Event aliasing for events whose name in the form foo-bar-baz is not
supported, while foo-bar, foo_bar_baz, and other combinations are, i.e.
two hyphens are not supported.
The HiSilicon D06 platform has events in such form:
$ ./perf list sdir-home-migrate
List of pre-defined events (to be used in -e):
uncore hha:
sdir-home-migrate
[Unit: hisi_sccl,hha]
$ sudo ./perf stat -e sdir-home-migrate
event syntax error: 'sdir-home-migrate'
\___ parser error
Run 'perf list' for a list of valid events
Usage: perf stat [<options>] [<command>]
-e, --event <event>event selector. use 'perf list' to list available events
To support, add an extra PMU event symbol type for "baz", and add a new
rule in the bison file.
Signed-off-by: John Garry <john.garry@huawei.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Qi Liu <liuqi115@huawei.com>
Cc: Shaokun Zhang <zhangshaokun@hisilicon.com>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/1642432215-234089-2-git-send-email-john.garry@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A previous patch preventing "attr->sample_period" values from being
overridden in pfm events changed a related behaviour in arm-spe.
Before said patch:
perf record -c 10000 -e arm_spe_0// -- sleep 1
Would yield an SPE event with period=10000. After the patch, the period
in "-c 10000" was being ignored because the arm-spe code initializes
sample_period to a non-zero value.
This patch restores the previous behaviour for non-libpfm4 events.
Fixes: ae5dcc8abe31 (“perf record: Prevent override of attr->sample_period for libpfm4 events”)
Reported-by: Chase Conklin <chase.conklin@arm.com>
Signed-off-by: German Gomez <german.gomez@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: bpf@vger.kernel.org
Cc: netdev@vger.kernel.org
Link: http://lore.kernel.org/lkml/20220118144054.2541-1-german.gomez@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Remove all but the first include of stdbool.h from cpumap.h.
Reported-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117083730.863200-1-lv.ruyi@zte.com.cn
Signed-off-by: CGEL ZTE <cgel.zte@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Switch from directly accessing the perf_cpu_map to using the appropriate
libperf API when possible. Using the API simplifies the job of
refactoring use of perf_cpu_map.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Value should be built as an integer.
Switch some uses of perf_cpu_map to use the library API.
Fixes: 6d18804b963b78dc ("perf cpumap: Give CPUs their own type")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: André Almeida <andrealmeid@collabora.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: http://lore.kernel.org/lkml/20220122045811.3402706-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Perf script was failed to print the phys_addr for SPE profiling.
One 'dummy' event is added by SPE profiling but it doesn't have PHYS_ADDR
attribute set, perf script then exits with error.
Now referring to 'addr', use evsel__do_check_stype() to check the type.
Before:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
Samples for 'dummy:u' event do not have PHYS_ADDR attribute set. Cannot print 'phys_addr' field.
After:
# perf record -e arm_spe_0/branch_filter=0,ts_enable=1,pa_enable=1,load_filter=1,jitter=0,\
store_filter=0,min_latency=0,event_filter=2/ -p 4064384 -- sleep 3
# perf script -F pid,tid,addr,phys_addr
4064384/4064384 ffff802f921be0d0 2f921be0d0
4064384/4064384 ffff802f921be0d0 2f921be0d0
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Yao Jin <jinyao5@huawei.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220121065954.2121900-1-liwei391@huawei.com
Signed-off-by: Wei Li <liwei391@huawei.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick the changes in these csets:
21b084fdf2a49ca1 ("mm/mempolicy: wire up syscall set_mempolicy_home_node")
That add support for this new syscall in tools such as 'perf trace'.
For instance, this is now possible:
[root@five ~]# perf trace -e set_mempolicy_home_node
^C[root@five ~]#
[root@five ~]# perf trace -v -e set_mempolicy_home_node
Using CPUID AuthenticAMD-25-21-0
event qualifier tracepoint filter: (common_pid != 253729 && common_pid != 3585) && (id == 450)
mmap size 528384B
^C[root@five ~]
[root@five ~]# perf trace -v -e set* --max-events 5
Using CPUID AuthenticAMD-25-21-0
event qualifier tracepoint filter: (common_pid != 253734 && common_pid != 3585) && (id == 38 || id == 54 || id == 105 || id == 106 || id == 109 || id == 112 || id == 113 || id == 114 || id == 116 || id == 117 || id == 119 || id == 122 || id == 123 || id == 141 || id == 160 || id == 164 || id == 170 || id == 171 || id == 188 || id == 205 || id == 218 || id == 238 || id == 273 || id == 308 || id == 450)
mmap size 528384B
0.000 ( 0.008 ms): bash/253735 setpgid(pid: 253735 (bash), pgid: 253735 (bash)) = 0
6849.011 ( 0.008 ms): bash/16046 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0
6849.080 ( 0.005 ms): bash/253736 setpgid(pid: 253736 (bash), pgid: 253736 (bash)) = 0
7437.718 ( 0.009 ms): gnome-shell/253737 set_robust_list(head: 0x7f34b527e920, len: 24) = 0
13445.986 ( 0.010 ms): bash/16046 setpgid(pid: 253738 (bash), pgid: 253738 (bash)) = 0
[root@five ~]#
That is the filter expression attached to the raw_syscalls:sys_{enter,exit}
tracepoints.
$ find tools/perf/arch/ -name "syscall*tbl" | xargs grep -w set_mempolicy_home_node
tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/powerpc/entry/syscalls/syscall.tbl:450 nospu set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/s390/entry/syscalls/syscall.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node sys_set_mempolicy_home_node
tools/perf/arch/x86/entry/syscalls/syscall_64.tbl:450 common set_mempolicy_home_node sys_set_mempolicy_home_node
$
$ grep -w set_mempolicy_home_node /tmp/build/perf/arch/x86/include/generated/asm/syscalls_64.c
[450] = "set_mempolicy_home_node",
$
This addresses these perf build warnings:
Warning: Kernel ABI header at 'tools/include/uapi/asm-generic/unistd.h' differs from latest version at 'include/uapi/asm-generic/unistd.h'
diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h
Warning: Kernel ABI header at 'tools/perf/arch/x86/entry/syscalls/syscall_64.tbl' differs from latest version at 'arch/x86/entry/syscalls/syscall_64.tbl'
diff -u tools/perf/arch/x86/entry/syscalls/syscall_64.tbl arch/x86/entry/syscalls/syscall_64.tbl
Warning: Kernel ABI header at 'tools/perf/arch/powerpc/entry/syscalls/syscall.tbl' differs from latest version at 'arch/powerpc/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/powerpc/entry/syscalls/syscall.tbl arch/powerpc/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/s390/entry/syscalls/syscall.tbl' differs from latest version at 'arch/s390/kernel/syscalls/syscall.tbl'
diff -u tools/perf/arch/s390/entry/syscalls/syscall.tbl arch/s390/kernel/syscalls/syscall.tbl
Warning: Kernel ABI header at 'tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl' differs from latest version at 'arch/mips/kernel/syscalls/syscall_n64.tbl'
diff -u tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl arch/mips/kernel/syscalls/syscall_n64.tbl
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
'/', filename)
Its more intention revealing, and if we're interested in the odd cases
where this may end up truncating we can do debug checks at one
centralized place.
Motivation, of all the container builds, fedora rawhide started
complaining of:
util/machine.c: In function ‘machine__create_modules’:
util/machine.c:1419:50: error: ‘%s’ directive output may be truncated writing up to 255 bytes into a region of size between 0 and 4095 [-Werror=format-truncation=]
1419 | snprintf(path, sizeof(path), "%s/%s", dir_name, dent->d_name);
| ^~
In file included from /usr/include/stdio.h:894,
from util/branch.h:9,
from util/callchain.h:8,
from util/machine.c:7:
In function ‘snprintf’,
inlined from ‘maps__set_modules_path_dir’ at util/machine.c:1419:3,
inlined from ‘machine__set_modules_path’ at util/machine.c:1473:9,
inlined from ‘machine__create_modules’ at util/machine.c:1519:7:
/usr/include/bits/stdio2.h:71:10: note: ‘__builtin___snprintf_chk’ output between 2 and 4352 bytes into a destination of size 4096
There are other places where we should use path__join(), but lets get rid of
this one first.
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Acked-by: Ian Rogers <irogers@google.com>
Link: Link: https://lore.kernel.org/r/YebZKjwgfdOz0lAs@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the target is a pid, not started by 'perf stat' we need to disable
the events, and in that case there is no need to setup affinities as we
use a dummy CPU map, with just one entry set to -1.
So stop doing it to avoid this needless call to sched_getaffinity():
# strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
<SNIP>
sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
> /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
> /var/home/acme/bin/perf(affinity__setup+0x6a) [0x532a2a]
> /var/home/acme/bin/perf(__evlist__disable.constprop.0+0x27) [0x4b9827]
> /var/home/acme/bin/perf(cmd_stat+0x29b5) [0x431725]
> /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
> /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
> /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
> /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
<SNIP>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When the target is a pid, not started by 'perf stat' we need to enable
the events, and in that case there is no need to setup affinities as we
use a dummy CPU map, with just one entry set to -1.
So stop doing it to avoid this needless call to sched_getaffinity():
# strace -ke sched_getaffinity perf stat -e cycles -p 241957 sleep 1
<SNIP>
sched_getaffinity(0, 512, [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31]) = 8
> /usr/lib64/libc-2.33.so(sched_getaffinity@@GLIBC_2.3.4+0x1a) [0xe6eea]
> /var/home/acme/bin/perf(affinity__setup+0x6a) [0x5329ca]
> /var/home/acme/bin/perf(__evlist__enable.constprop.0+0x23) [0x4b9693]
> /var/home/acme/bin/perf(enable_counters+0x14d) [0x42de5d]
> /var/home/acme/bin/perf(cmd_stat+0x2358) [0x4310c8]
> /var/home/acme/bin/perf(run_builtin+0x6a) [0x4a2cfa]
> /var/home/acme/bin/perf(main+0x612) [0x40f8c2]
> /usr/lib64/libc-2.33.so(__libc_start_main+0xd4) [0x27b74]
> /var/home/acme/bin/perf(_start+0x2d) [0x40fadd]
<SNIP>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
I.e. the simple:
$ perf stat sleep 1
Uses a dummy CPU map and thus there is no need to setup/cleanup
affinities to avoid IPIs, etc.
With this we're down to a sched_getaffinity() call, in the libnuma
initialization, that probably can be removed in a followup patch.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Just like with free(), NULL is checked to avoid having all callers do
it.
Its convenient for when not using affinity setup/cleanup for dummy CPU
maps, i.e. CPU maps for pid targets.
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20220117160931.1191712-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Because of commit bf794bf52a80c627 ("powerpc/kprobes: Fix kallsyms
lookup across powerpc ABIv1 and ABIv2"), in ppc64 ABIv1, our perf
command eliminates the need to use the prefix "." at the symbol name.
But when the command "perf probe -a schedule" is executed on ppc64
ABIv1, it obtains two symbol address information through /proc/kallsyms,
for example:
cat /proc/kallsyms | grep -w schedule
c000000000657020 T .schedule
c000000000d4fdb8 D schedule
The symbol "D schedule" is not a function symbol, and perf will print:
"p:probe/schedule _text+13958584"Failed to write event: Invalid argument
Therefore, when searching symbols from map and adding probe point for
them, a symbol type check is added. If the type of symbol is not a
function, skip it.
Fixes: bf794bf52a80c627 ("powerpc/kprobes: Fix kallsyms lookup across powerpc ABIv1 and ABIv2")
Signed-off-by: Zechuan Chen <chenzechuan1@huawei.com>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jianlin Lv <Jianlin.Lv@arm.com>
Cc: Jin Yao <yao.jin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@linux.ibm.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20211228111338.218602-1-chenzechuan1@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux
Pull perf tool updates from Arnaldo Carvalho de Melo:
"New features:
- Add 'trace' subcommand for 'perf ftrace', setting the stage for
more 'perf ftrace' subcommands. Not using a subcommand yields the
previous behaviour of 'perf ftrace'.
- Add 'latency' subcommand to 'perf ftrace', that can use the
function graph tracer or a BPF optimized one, via the -b/--use-bpf
option.
E.g.:
$ sudo perf ftrace latency -a -T mutex_lock sleep 1
# DURATION | COUNT | GRAPH |
0 - 1 us | 4596 | ######################## |
1 - 2 us | 1680 | ######### |
2 - 4 us | 1106 | ##### |
4 - 8 us | 546 | ## |
8 - 16 us | 562 | ### |
16 - 32 us | 1 | |
32 - 64 us | 0 | |
64 - 128 us | 0 | |
128 - 256 us | 0 | |
256 - 512 us | 0 | |
512 - 1024 us | 0 | |
1 - 2 ms | 0 | |
2 - 4 ms | 0 | |
4 - 8 ms | 0 | |
8 - 16 ms | 0 | |
16 - 32 ms | 0 | |
32 - 64 ms | 0 | |
64 - 128 ms | 0 | |
128 - 256 ms | 0 | |
256 - 512 ms | 0 | |
512 - 1024 ms | 0 | |
1 - ... s | 0 | |
The original implementation of this command was in the bcc tool.
- Support --cputype option for hybrid events in 'perf stat'.
Improvements:
- Call chain improvements for ARM64.
- No need to do any affinity setup when profiling pids.
- Reduce multiplexing with duration_time in 'perf stat' metrics.
- Improve error message for uncore events, stating that some event
groups are can only be used in system wide (-a) mode.
- perf stat metric group leader fixes/improvements, including arch
specific changes to better support Intel topdown events.
- Probe non-deprecated sysfs path first, i.e. try the path
/sys/devices/system/cpu/cpuN/topology/thread_siblings first, then
the old /sys/devices/system/cpu/cpuN/topology/core_cpus.
- Disable debuginfod by default in 'perf record', to avoid stalls on
distros such as Fedora 35.
- Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.
- Enable ignore_missing_thread in 'perf trace'
Fixes:
- Avoid TUI crash when navigating in the annotation of recursive
functions.
- Fix hex dump character output in 'perf script'.
- Fix JSON indentation to 4 spaces standard in the ARM vendor event
files.
- Fix use after free in metric__new().
- Fix IS_ERR_OR_NULL() usage in the perf BPF loader.
- Fix up cross-arch register support, i.e. when printing register
names take into account the architecture where the perf.data file
was collected.
- Fix SMT fallback with large core counts.
- Don't lower case MetricExpr when parsing JSON files so as not to
lose info such as the ":G" event modifier in metrics.
perf test:
- Add basic stress test for sigtrap handling to 'perf test'.
- Fix 'perf test' failures on s/390
- Enable system wide for metricgroups test in 'perf test´.
- Use 3 digits for test numbering now we can have more tests.
Arch specific:
- Add events for Arm Neoverse N2 in the ARM JSON vendor event files
- Support PERF_MEM_LVLNUM encodings in powerpc, that came from a
single patch series, where I incorrectly merged the kernel bits,
that were then reverted after coordination with Michael Ellerman
and Stephen Rothwell.
- Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.
- Update AMD documentation, with info on raw event encoding.
- Add support for global and local variants of the "p_stage_cyc" sort
key, applicable to perf.data files collected on powerpc.
- Remove duplicate and incorrect aux size checks in the ARM CoreSight
ETM code.
Refactorings:
- Add a perf_cpu abstraction to disambiguate CPUs and CPU map
indexes, fixing problems along the way.
- Document CPU map methods.
UAPI sync:
- Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench
mem memcpy'
- Sync UAPI files with the kernel sources: drm, msr-index,
cpufeatures.
Build system
- Enable warnings through HOSTCFLAGS.
- Drop requirement for libstdc++.so for libopencsd check
libperf:
- Make libperf adopt perf_counts_values__scale() from tools/perf/util/.
- Add a stat multiplexing test to libperf"
* tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (115 commits)
perf record: Disable debuginfod by default
perf evlist: No need to do any affinity setup when profiling pids
perf cpumap: Add is_dummy() method
perf metric: Fix metric_leader
perf cputopo: Fix CPU topology reading on s/390
perf metricgroup: Fix use after free in metric__new()
libperf tests: Update a use of the new cpumap API
perf arm: Fix off-by-one directory path
tools arch x86: Sync the msr-index.h copy with the kernel sources
tools headers cpufeatures: Sync with the kernel sources
tools headers UAPI: Update tools's copy of drm.h header
tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
perf pmu-events: Don't lower case MetricExpr
perf expr: Add debug logging for literals
perf tools: Probe non-deprecated sysfs path 1st
perf tools: Fix SMT fallback with large core counts
perf cpumap: Give CPUs their own type
perf stat: Correct first_shadow_cpu to return index
perf script: Fix flipped index and cpu
perf c2c: Use more intention revealing iterator
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace
Pull tracing updates from Steven Rostedt:
"New:
- The Real Time Linux Analysis (RTLA) tool is added to the tools
directory.
- Can safely filter on user space pointers with: field.ustring ~
"match-string"
- eprobes can now be filtered like any other event.
- trace_marker(_raw) now uses stream_open() to allow multiple threads
to safely write to it. Note, this could possibly break existing
user space, but we will not know until we hear about it, and then
can revert the change if need be.
- New field in events to display when bottom halfs are disabled.
- Sorting of the ftrace functions are now done at compile time
instead of at bootup.
Infrastructure changes to support future efforts:
- Added __rel_loc type for trace events. Similar to __data_loc but
the offset to the dynamic data is based off of the location of the
descriptor and not the beginning of the event. Needed for user
defined events.
- Some simplification of event trigger code.
- Make synthetic events process its callback better to not hinder
other event callbacks that are registered. Needed for user defined
events.
And other small fixes and cleanups"
* tag 'trace-v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (50 commits)
tracing: Add ustring operation to filtering string pointers
rtla: Add rtla timerlat hist documentation
rtla: Add rtla timerlat top documentation
rtla: Add rtla timerlat documentation
rtla: Add rtla osnoise hist documentation
rtla: Add rtla osnoise top documentation
rtla: Add rtla osnoise man page
rtla: Add Documentation
rtla/timerlat: Add timerlat hist mode
rtla: Add timerlat tool and timelart top mode
rtla/osnoise: Add the hist mode
rtla/osnoise: Add osnoise top mode
rtla: Add osnoise tool
rtla: Helper functions for rtla
rtla: Real-Time Linux Analysis tool
tracing/osnoise: Properly unhook events if start_per_cpu_kthreads() fails
tracing: Remove duplicate warnings when calling trace_create_file()
tracing/kprobes: 'nmissed' not showed correctly for kretprobe
tracing: Add test for user space strings when filtering on string pointers
tracing: Have syscall trace events use trace_event_buffer_lock_reserve()
...
|
|
Fedora 35 sets DEBUGINFOD_URLS by default, which might lead to
unexpected stalls in perf record exit path, when we try to cache
profiled binaries.
# DEBUGINFOD_PROGRESS=1 ./perf record -a
^C[ perf record: Woken up 1 times to write data ]
Downloading from https://debuginfod.fedoraproject.org/ 447069
Downloading from https://debuginfod.fedoraproject.org/ 1502175
Downloading \^Z
Disabling DEBUGINFOD_URLS by default in perf record and adding
debuginfod option and .perfconfig variable support to enable id.
Default without debuginfo processing:
# perf record -a
Using system debuginfod setup:
# perf record -a --debuginfod
Using custom debuginfd url:
# perf record -a --debuginfod='https://evenbetterdebuginfodserver.krava'
Adding single perf_debuginfod_setup function and using
it also in perf buildid-cache command.
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Frank Ch. Eigler <fche@redhat.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Petlan <mpetlan@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211209200425.303561-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The cpumap is dummy, so no need to go on figuring out affinity.o
This way we reduce the setup time for simple scenarios like:
$ perf stat sleep 1
Acked-by: Andi Kleen <andi@firstfloor.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Needed to check if a cpu_map is dummy, i.e. not a cpu map at all, for
pid monitoring scenarios.
This probably needs to move to libperf, but since perf itself is the
first and so far only user, leave it at tools/perf/util/.
Acked-by: Andi Kleen <andi@firstfloor.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Multiple events may have a metric_leader to aggregate into.
This happens for uncore events where, for example, uncore_imc is
expanded into uncore_imc_0, uncore_imc_1, etc.
Such events all have the same metric_id and should aggregate into the
first event.
The change introducing metric_ids had a bug where the metric_id was
compared to itself, creating an always true condition.
Correct this by comparing the event in the metric_evlist and the
metric_leader.
Fixes: ec5c5b3d2c21b3f3 ("perf metric: Encode and use metric-id as qualifier")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20220115062852.1959424-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Commit fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
fails on s390:
# ./perf test -Fv 7
...
# FAILED tests/expr.c:173 #num_dies >= #num_packages
---- end ----
Simple expression parser: FAILED!
#
Investigating this issue leads to these functions:
build_cpu_topology()
+--> has_die_topology(void)
{
struct utsname uts;
if (uname(&uts) < 0)
return false;
if (strncmp(uts.machine, "x86_64", 6))
return false;
....
}
which always returns false on s390. The caller build_cpu_topology()
checks has_die_topology() return value. On false the
the struct cpu_topology::die_cpu_list is not contructed and has zero
entries. This leads to the failing comparison: #num_dies >= #num_packages.
s390 of course has a positive number of packages.
Fix this by adding s390 architecture to support CPU die list.
Output after:
# ./perf test -Fv 7
7: Simple expression parser :
--- start ---
division by zero
syntax error
---- end ----
Simple expression parser: Ok
#
Fixes: fdf1e29b6118c18f ("perf expr: Add metric literals for topology.")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20211124090343.9436-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
We shouldn't free() something that will be used in the next line, fix
it.
Fixes: b85a4d61d3022608 ("perf metric: Allow modifiers on metrics")
Addresses-Coverity-ID: 1494000
Signed-off-by: José Expósito <jose.exposito89@gmail.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20211208171113.22089-1-jose.exposito89@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Relative path include works in the regular build due to -I paths but may
fail in other situations.
Fixes: 83869019c74cc2d0 ("perf arch: Support register names from all archs")
Reviewed-by: German Gomez <german.gomez@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Truong <alexandre.truong@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220114064822.1806019-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To pick up fixes and get in line with other trees, powerpc kernel
mostly this time, but BPF as well.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
libbpf bpf_map__def() API is being deprecated, replace perf's
usage with the appropriate getters and setters.
Signed-off-by: Christy Lee <christylee@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
Link: https://lore.kernel.org/bpf/20220108004218.355761-4-christylee@fb.com
|
|
This patch changes MetricExpr to be written out in the same case. This
enables events in metrics to use modifiers like 'G' which currently
yield parse errors when made lower case. To keep tests passing the
literal #smt_on is compared in a non-case sensitive way - #SMT_on is
present in at least SkylakeX metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211126071305.3733878-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Useful for diagnosing problems with metrics.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20211124001231.3277836-1-irogers@google.com
[ Fixed up perf_cpu conflict, i.e. we need to append ".cpu" to cpu__max_present_cpu() result ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Following Documentation/ABI/stable/sysfs-devices-system-cpu the
/sys/devices/system/cpu/cpuX/topology/core_cpus is deprecated in favor
of thread_siblings, so probe thread_siblings before falling back on
core_cpus.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211124001231.3277836-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
strtoull can only read a 64-bit bitmap. On an AMD EPYC core_cpus may look
like:
00000000,00000000,00000000,00000001,00000000,00000000,00000000,00000001
and so the sibling wasn't spotted. Fix by writing a simple hweight string
parser.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Konstantin Khlebnikov <koct9i@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20211124001231.3277836-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A common problem is confusing CPU map indices with the CPU, by wrapping
the CPU with a struct then this is avoided. This approach is similar to
atomic_t.
Committer notes:
To make it build with BUILD_BPF_SKEL=1 these files needed the
conversions to 'struct perf_cpu' usage:
tools/perf/util/bpf_counter.c
tools/perf/util/bpf_counter_cgroup.c
tools/perf/util/bpf_ftrace.c
Also perf_env__get_cpu() was removed back in "perf cpumap: Switch
cpu_map__build_map to cpu function".
Additionally these needed to be fixed for the ARM builds to complete:
tools/perf/arch/arm/util/cs-etm.c
tools/perf/arch/arm64/util/pmu.c
Suggested-by: John Garry <john.garry@huawei.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-49-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_stat__update_shadow_stats() and perf_stat__print_shadow_stats() use
a cpu map index rather than a CPU, but first_shadow_cpu is returning the
wrong value for this. Change first_shadow_cpu to
first_shadow_cpu_map_idx to make things agree.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-48-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_counts are accessed by the densely packed index.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-47-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use perf_cpu_map__for_each_cpu() in setup_nodes.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: John Garry <john.garry@huawei.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Clarke <pc@us.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Vineet Singh <vineet.singh@intel.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: zhengjun.xing@intel.com
Link: https://lore.kernel.org/r/20220105061351.120843-46-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|