From 3f4ac23a990853ab5012037767281dfd4beb4b15 Mon Sep 17 00:00:00 2001 From: Ian Rogers Date: Sat, 4 May 2024 14:37:57 -0700 Subject: perf dsos: Switch backing storage to array from rbtree/list MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit DSOs were held on a list for fast iteration and in an rbtree for fast finds. Switch to using a lazily sorted array where iteration is just iterating through the array and binary searches are the same complexity as searching the rbtree. The find may need to sort the array first which does increase the complexity, but add operations have lower complexity and overall the complexity should remain about the same. The set name operations on the dso just records that the array is no longer sorted, avoiding complexity in rebalancing the rbtree. Tighter locking discipline is enforced to avoid the array being resorted while long and short names or ids are changed. The array is smaller in size, replacing 6 pointers with 2, and so even with extra allocated space in the array, the array may be 50% unoccupied, the memory saving should be at least 2x. Committer testing: On a previous version of this patchset we were getting a lot of warnings about deleting a DSO still on a list, now it is ok: root@x1:~# perf probe -l root@x1:~# perf probe finish_task_switch Added new event: probe:finish_task_switch (on finish_task_switch) You can now use it in all perf tools, such as: perf record -e probe:finish_task_switch -aR sleep 1 root@x1:~# perf probe -l probe:finish_task_switch (on finish_task_switch@kernel/sched/core.c) root@x1:~# perf trace -e probe:finish_task_switch/max-stack=8/ --max-events=1 0.000 migration/0/19 probe:finish_task_switch(__probe_ip: -1894408688) finish_task_switch.isra.0 ([kernel.kallsyms]) __schedule ([kernel.kallsyms]) schedule ([kernel.kallsyms]) smpboot_thread_fn ([kernel.kallsyms]) kthread ([kernel.kallsyms]) ret_from_fork ([kernel.kallsyms]) ret_from_fork_asm ([kernel.kallsyms]) root@x1:~# root@x1:~# perf probe -d probe:* Removed event: probe:finish_task_switch root@x1:~# perf probe -l root@x1:~# I also ran the full 'perf test' suite after applying this one, no regressions. Signed-off-by: Ian Rogers Tested-by: Arnaldo Carvalho de Melo Cc: Adrian Hunter Cc: Ahelenia ZiemiaƄska Cc: Alexander Shishkin Cc: Andi Kleen Cc: Athira Rajeev Cc: Ben Gainey Cc: Changbin Du Cc: Chengen Du Cc: Colin Ian King Cc: Dima Kogan Cc: Ilkka Koskinen Cc: Ingo Molnar Cc: James Clark Cc: Jiri Olsa Cc: K Prateek Nayak Cc: Kan Liang Cc: Leo Yan Cc: Li Dong Cc: Mark Rutland Cc: Masami Hiramatsu Cc: Namhyung Kim Cc: Paran Lee Cc: Peter Zijlstra Cc: Song Liu Cc: Sun Haiyong Cc: Thomas Richter Cc: Tiezhu Yang Cc: Yanteng Si Cc: zhaimingbing Link: https://lore.kernel.org/r/20240504213803.218974-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo --- tools/perf/util/dso.h | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) (limited to 'tools/perf/util/dso.h') diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h index 2c295438226d..b22dec8b3f3a 100644 --- a/tools/perf/util/dso.h +++ b/tools/perf/util/dso.h @@ -146,9 +146,7 @@ struct auxtrace_cache; struct dso { struct mutex lock; - struct list_head node; - struct rb_node rb_node; /* rbtree node sorted by long name */ - struct rb_root *root; /* root of rbtree that rb_node is in */ + struct dsos *dsos; struct rb_root_cached symbols; struct symbol **symbol_names; size_t symbol_names_len; @@ -238,8 +236,8 @@ static inline void dso__set_loaded(struct dso *dso) dso->loaded = true; } -int dso_id__cmp(struct dso_id *a, struct dso_id *b); -bool dso_id__empty(struct dso_id *id); +int dso_id__cmp(const struct dso_id *a, const struct dso_id *b); +bool dso_id__empty(const struct dso_id *id); struct dso *dso__new_id(const char *name, struct dso_id *id); struct dso *dso__new(const char *name); @@ -248,7 +246,7 @@ void dso__delete(struct dso *dso); int dso__cmp_id(struct dso *a, struct dso *b); void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated); void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated); -void dso__inject_id(struct dso *dso, struct dso_id *id); +void __dso__inject_id(struct dso *dso, struct dso_id *id); int dso__name_len(const struct dso *dso); -- cgit v1.2.3