author | Colin Stolley <cstolley@runbox.com> | 2019-11-27 23:24:53 +0100 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2019-12-03 16:59:45 +0100 |
commit | ec48540fe8c387cf7424d5387ddbd53e89bb9d51 | |
tree | 1b2111770e24c35a2591d4458c6c6e3fba124e8b /packfile.c | |
parent | The first batch post 2.24 cycle | |
packfile.c: speed up loading lots of packfiles

When loading packfiles on start-up, we traverse the internal packfile
list once per file to avoid reloading packfiles that have already
been loaded. This check runs in quadratic time, so for poorly
maintained repos with a large number of packfiles, it can be pretty
slow.

Add a hashmap containing the packfile names as we load them so that
the average runtime cost of checking for already-loaded packs becomes
constant.
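
The idea in the paragraph above can be illustrated with a minimal, standalone sketch. It is not git's hashmap API: `name_set`, `name_hash()`, `name_set_contains()` and `name_set_add()` are hypothetical names, the bucket count is fixed, and the hash function merely stands in for `strhash()`. It only shows why a name-keyed hash lookup makes the "already loaded?" check constant time on average instead of a scan over every pack loaded so far.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 1024 /* fixed size for the sketch; a real map would grow */

struct name_entry {
	struct name_entry *next; /* chain of entries sharing a bucket */
	char *name;
};

struct name_set {
	struct name_entry *buckets[NBUCKETS];
};

/* FNV-1a string hash; stands in for git's strhash() (assumption). */
static unsigned int name_hash(const char *s)
{
	unsigned int h = 2166136261u;

	while (*s) {
		h ^= (unsigned char)*s++;
		h *= 16777619u;
	}
	return h;
}

/* Average O(1): only the chain of a single bucket is scanned. */
static bool name_set_contains(const struct name_set *set, const char *name)
{
	const struct name_entry *e = set->buckets[name_hash(name) % NBUCKETS];

	for (; e; e = e->next)
		if (!strcmp(e->name, name))
			return true;
	return false;
}

/* Returns true if the name was newly added, false if already present. */
static bool name_set_add(struct name_set *set, const char *name)
{
	unsigned int b;
	struct name_entry *e;

	assert(name != NULL);
	if (name_set_contains(set, name))
		return false;

	b = name_hash(name) % NBUCKETS;
	e = malloc(sizeof(*e));
	assert(e != NULL);
	e->name = malloc(strlen(name) + 1);
	assert(e->name != NULL);
	strcpy(e->name, name);

	/* Push onto the front of the bucket's chain. */
	e->next = set->buckets[b];
	set->buckets[b] = e;
	return true;
}
```

With a zero-initialized `struct name_set`, calling `name_set_add()` once per pack name costs one hash computation and a short chain walk per call, so loading N packs is roughly O(N) overall, whereas rescanning a linked list for every pack is O(N^2).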

Add a perf test to p5303 to show speed-up.

The existing p5303 test runtimes are dominated by other factors and do
not show an appreciable speed-up. The new test in p5303 clearly exposes
a speed-up in bad cases. In this test we create 10,000 packfiles and
measure the start-up time of git rev-parse, which does little else
besides load in the packs.

Here are the numbers for the new p5303 test:

Test                         HEAD^             HEAD
---------------------------------------------------------------------
5303.12: load 10,000 packs   1.03(0.92+0.10)   0.12(0.02+0.09) -88.3%

Signed-off-by: Colin Stolley <cstolley@runbox.com>
Helped-by: Jeff King <peff@peff.net>
[jc: squashed the change to call hashmap in install_packed_git() by peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'packfile.c')
-rw-r--r-- | packfile.c | 19 |
1 files changed, 10 insertions, 9 deletions
diff --git a/packfile.c b/packfile.c
index 355066de17..f0dc63e92f 100644
--- a/packfile.c
+++ b/packfile.c
@@ -757,6 +757,9 @@ void install_packed_git(struct repository *r, struct packed_git *pack)
 
 	pack->next = r->objects->packed_git;
 	r->objects->packed_git = pack;
+
+	hashmap_entry_init(&pack->packmap_ent, strhash(pack->pack_name));
+	hashmap_add(&r->objects->pack_map, &pack->packmap_ent);
 }
 
 void (*report_garbage)(unsigned seen_bits, const char *path);
@@ -856,20 +859,18 @@ static void prepare_pack(const char *full_name, size_t full_name_len,
 
 	if (strip_suffix_mem(full_name, &base_len, ".idx") &&
 	    !(data->m && midx_contains_pack(data->m, file_name))) {
-		/* Don't reopen a pack we already have. */
-		for (p = data->r->objects->packed_git; p; p = p->next) {
-			size_t len;
-			if (strip_suffix(p->pack_name, ".pack", &len) &&
-			    len == base_len &&
-			    !memcmp(p->pack_name, full_name, len))
-				break;
-		}
+		struct hashmap_entry hent;
+		char *pack_name = xstrfmt("%.*s.pack", (int)base_len, full_name);
+		unsigned int hash = strhash(pack_name);
+		hashmap_entry_init(&hent, hash);
 
-		if (!p) {
+		/* Don't reopen a pack we already have. */
+		if (!hashmap_get(&data->r->objects->pack_map, &hent, pack_name)) {
 			p = add_packed_git(full_name, full_name_len, data->local);
 			if (p)
 				install_packed_git(data->r, p);
 		}
+		free(pack_name);
 	}
 
 	if (!report_garbage)
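
In the second hunk, the lookup key has to match what `install_packed_git()` hashes, namely `pack->pack_name`, which ends in `.pack`, while `prepare_pack()` is handed a path ending in `.idx`. The sketch below shows that suffix swap in standalone C. `idx_to_pack_name()` is a hypothetical helper, and plain `malloc()`/`snprintf()` are used in place of git's `strip_suffix_mem()` and `xstrfmt()`.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Given a path ending in ".idx", return a newly allocated string with the
 * suffix replaced by ".pack", or NULL if the suffix is absent or the
 * allocation fails. The caller frees the result.
 * idx_to_pack_name() is a hypothetical helper, not a git function.
 */
static char *idx_to_pack_name(const char *full_name)
{
	static const char idx[] = ".idx", pack[] = ".pack";
	size_t len = strlen(full_name);
	size_t base_len, out_len;
	char *out;

	if (len < strlen(idx) || strcmp(full_name + len - strlen(idx), idx))
		return NULL;

	base_len = len - strlen(idx);
	out_len = base_len + strlen(pack) + 1; /* +1 for the trailing NUL */
	out = malloc(out_len);
	if (!out)
		return NULL;

	/* "%.*s" copies only the first base_len bytes of full_name. */
	snprintf(out, out_len, "%.*s%s", (int)base_len, full_name, pack);
	return out;
}
```

For example, `idx_to_pack_name("objects/pack/pack-abc.idx")` yields `"objects/pack/pack-abc.pack"`. Because the key is a temporary heap string used only for the lookup, the patch frees `pack_name` once the check is done.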