path: root/tools/perf/util/dso.h
author: Ian Rogers <irogers@google.com> 2024-05-04 23:37:57 +0200
committer: Arnaldo Carvalho de Melo <acme@redhat.com> 2024-05-06 14:13:11 +0200
commit: 3f4ac23a990853ab5012037767281dfd4beb4b15
tree: 04b24f1ce0a7b60a7c8f667d7aed418c8d78874c /tools/perf/util/dso.h
parent: perf vendor events amd: Add Zen 5 mapping
perf dsos: Switch backing storage to array from rbtree/list
DSOs were held on a list for fast iteration and in an rbtree for fast
finds. Switch to using a lazily sorted array, where iteration is just
iterating through the array and binary searches have the same
complexity as searching the rbtree. The find may need to sort the
array first, which does increase the complexity, but add operations
have lower complexity and overall the complexity should remain about
the same.

The set name operations on the dso just record that the array is no
longer sorted, avoiding the complexity of rebalancing the rbtree.
Tighter locking discipline is enforced to avoid the array being
resorted while long and short names or ids are changed.

The array is smaller in size, replacing 6 pointers per DSO with 2, so
even with extra allocated space in the array (the array may be 50%
unoccupied) the memory saving should be at least 2x.

Committer testing:

On a previous version of this patchset we were getting a lot of
warnings about deleting a DSO still on a list, now it is ok:

  root@x1:~# perf probe -l
  root@x1:~# perf probe finish_task_switch
  Added new event:
    probe:finish_task_switch (on finish_task_switch)

  You can now use it in all perf tools, such as:

          perf record -e probe:finish_task_switch -aR sleep 1

  root@x1:~# perf probe -l
    probe:finish_task_switch (on finish_task_switch@kernel/sched/core.c)
  root@x1:~# perf trace -e probe:finish_task_switch/max-stack=8/ --max-events=1
       0.000 migration/0/19 probe:finish_task_switch(__probe_ip: -1894408688)
                                         finish_task_switch.isra.0 ([kernel.kallsyms])
                                         __schedule ([kernel.kallsyms])
                                         schedule ([kernel.kallsyms])
                                         smpboot_thread_fn ([kernel.kallsyms])
                                         kthread ([kernel.kallsyms])
                                         ret_from_fork ([kernel.kallsyms])
                                         ret_from_fork_asm ([kernel.kallsyms])
  root@x1:~#
  root@x1:~# perf probe -d probe:*
  Removed event: probe:finish_task_switch
  root@x1:~# perf probe -l
  root@x1:~#

I also ran the full 'perf test' suite after applying this one, no
regressions.
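The lazily sorted array described in the message can be illustrated with a minimal, single-threaded sketch. This is not the perf implementation: `struct item`, `struct item_array` and the `item_array__*` helpers are hypothetical stand-ins for `struct dso` and `struct dsos`, only the long name is used as the sort key, and the locking the commit mentions is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct dso: only the sort key is kept. */
struct item {
	const char *long_name;
};

/* Hypothetical stand-in for struct dsos: a growable array plus a flag. */
struct item_array {
	struct item **items;
	size_t cnt;
	size_t allocated;
	bool sorted;
};

/* Comparator over array slots (pointers to items), usable by qsort/bsearch. */
static int item_cmp(const void *a, const void *b)
{
	const struct item *ia = *(const struct item *const *)a;
	const struct item *ib = *(const struct item *const *)b;

	return strcmp(ia->long_name, ib->long_name);
}

/* Add: amortized O(1) append; the array is only marked as unsorted. */
static int item_array__add(struct item_array *arr, struct item *it)
{
	assert(arr && it);
	if (arr->cnt == arr->allocated) {
		size_t n = arr->allocated ? arr->allocated * 2 : 4;
		struct item **tmp = realloc(arr->items, n * sizeof(*tmp));

		if (!tmp)
			return -1;
		arr->items = tmp;
		arr->allocated = n;
	}
	arr->items[arr->cnt++] = it;
	arr->sorted = false;
	return 0;
}

/* Rename: no rebalancing, just invalidate the sort order. */
static void item_array__set_long_name(struct item_array *arr,
				      struct item *it, const char *name)
{
	it->long_name = name;
	arr->sorted = false;
}

/* Find: sort lazily if needed, then do an O(log n) binary search. */
static struct item *item_array__find(struct item_array *arr, const char *name)
{
	struct item key = { .long_name = name };
	struct item *keyp = &key;
	struct item **res;

	if (arr->cnt == 0)
		return NULL;
	if (!arr->sorted) {
		qsort(arr->items, arr->cnt, sizeof(*arr->items), item_cmp);
		arr->sorted = true;
	}
	res = bsearch(&keyp, arr->items, arr->cnt, sizeof(*arr->items), item_cmp);
	return res ? *res : NULL;
}
```

In this sketch a burst of adds or renames followed by many finds pays for one sort, which is the trade-off the message describes. One reading of the "6 pointers with 2" figure: a list node (2 words), an rbtree node (3 words) and the root pointer (1 word) are replaced by one back-pointer plus one array slot; with the array half empty that becomes 3 words per DSO, which is still a 2x saving.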
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Diffstat (limited to 'tools/perf/util/dso.h')
-rw-r--r-- tools/perf/util/dso.h | 10
1 files changed, 4 insertions, 6 deletions
diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
index 2c295438226d..b22dec8b3f3a 100644
--- a/tools/perf/util/dso.h
+++ b/tools/perf/util/dso.h
@@ -146,9 +146,7 @@ struct auxtrace_cache;
 struct dso {
 	struct mutex	 lock;
-	struct list_head node;
-	struct rb_node	 rb_node;	/* rbtree node sorted by long name */
-	struct rb_root	 *root;		/* root of rbtree that rb_node is in */
+	struct dsos	 *dsos;
 	struct rb_root_cached symbols;
 	struct symbol	 **symbol_names;
 	size_t		 symbol_names_len;
@@ -238,8 +236,8 @@ static inline void dso__set_loaded(struct dso *dso)
 	dso->loaded = true;
 }
-int dso_id__cmp(struct dso_id *a, struct dso_id *b);
-bool dso_id__empty(struct dso_id *id);
+int dso_id__cmp(const struct dso_id *a, const struct dso_id *b);
+bool dso_id__empty(const struct dso_id *id);
 struct dso *dso__new_id(const char *name, struct dso_id *id);
 struct dso *dso__new(const char *name);
@@ -248,7 +246,7 @@ void dso__delete(struct dso *dso);
 int dso__cmp_id(struct dso *a, struct dso *b);
 void dso__set_short_name(struct dso *dso, const char *name, bool name_allocated);
 void dso__set_long_name(struct dso *dso, const char *name, bool name_allocated);
-void dso__inject_id(struct dso *dso, struct dso_id *id);
+void __dso__inject_id(struct dso *dso, struct dso_id *id);
 int dso__name_len(const struct dso *dso);