author | Jeff King <peff@peff.net> | 2022-06-16 08:54:41 +0200 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2022-06-16 19:03:40 +0200 |
commit | 18c08abc824c6cdb7bc17e1b5f85f8b118507aa5 (patch) | |
tree | d89b7135d5e91f80705172f8cc44bf004a853158 /packfile.c | |
parent | Git 2.35.3 (diff) | |
is_promisor_object(): walk promisor packs in pack-order
When we generate the list of promisor objects, we walk every pack with a
.promisor file and examine its objects for any links to other objects.
By default, for_each_packed_object() will go in pack .idx order.
This is the worst case with respect to our delta base cache. If we have
a delta chain of A->B->C->D, then visiting A may require reconstructing
both B and C, unless we also visited B recently, in which case we may
have cached its value. Because .idx order is based on sha1, it's random
with respect to the actual object contents and deltas, and thus we're
unlikely to get many cache hits.
If we instead traverse in pack order, then we get the optimal case:
packs are written to keep delta families together, and to place bases
before their children.
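
The effect of traversal order on the cache can be sketched with a toy model. This is not git's delta base cache: the two-slot round-robin cache, the two four-object delta chains, and every `toy_` name are assumptions made purely for illustration. It counts how many objects must be unpacked when the same objects are visited in pack order versus a scattered order standing in for hash-sorted .idx order.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CACHE_SLOTS 2 /* arbitrary toy size, not git's real cache limit */
#define TOY_NR_OBJECTS 8

/*
 * toy_base[i] is the pack position of object i's delta base, or -1 for a
 * full object. Two chains, each stored with bases before their children.
 */
static const int toy_base[TOY_NR_OBJECTS] = { -1, 0, 1, 2, -1, 4, 5, 6 };
static const int toy_pack_order[TOY_NR_OBJECTS] = { 0, 1, 2, 3, 4, 5, 6, 7 };
/* Hypothetical stand-in for .idx order: unrelated to the delta layout. */
static const int toy_scattered_order[TOY_NR_OBJECTS] = { 3, 7, 1, 5, 2, 6, 0, 4 };

struct toy_cache {
	int slot[TOY_CACHE_SLOTS];
	size_t next; /* slot to overwrite next (round-robin eviction) */
};

static int toy_cache_has(const struct toy_cache *c, int obj)
{
	size_t i;
	for (i = 0; i < TOY_CACHE_SLOTS; i++)
		if (c->slot[i] == obj)
			return 1;
	return 0;
}

static void toy_cache_add(struct toy_cache *c, int obj)
{
	c->slot[c->next] = obj;
	c->next = (c->next + 1) % TOY_CACHE_SLOTS;
}

/* Returns how many objects had to be unpacked to produce obj. */
static int toy_reconstruct(const int *base, int obj, struct toy_cache *c)
{
	int cost;

	if (toy_cache_has(c, obj))
		return 0; /* cache hit: nothing to unpack */
	cost = 1; /* unpack obj itself */
	if (base[obj] >= 0)
		cost += toy_reconstruct(base, base[obj], c);
	toy_cache_add(c, obj);
	return cost;
}

/* Total unpack count for visiting the n objects listed in order[]. */
int toy_walk_cost(const int *base, const int *order, size_t n)
{
	struct toy_cache c;
	size_t i;
	int total = 0;

	assert(base && order);
	for (i = 0; i < TOY_CACHE_SLOTS; i++)
		c.slot[i] = -1;
	c.next = 0;
	for (i = 0; i < n; i++)
		total += toy_reconstruct(base, order[i], &c);
	return total;
}
```

In this model, `toy_walk_cost(toy_base, toy_pack_order, TOY_NR_OBJECTS)` unpacks each object exactly once, because every base is still cached when its child is visited. The scattered order repeatedly rebuilds chains that have already been evicted, so it unpacks more than twice as many objects.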
Even on a modest repository like git.git, this has a noticeable speedup
on p5600.4, which runs "fsck" on a partial clone with blob:none (so lots
of trees which need to be walked, and which delta well):
    Test       HEAD^               HEAD
    -------------------------------------------------------
    5600.4:    17.87(17.83+0.04)   15.42(15.35+0.06) -13.7%
On a larger repository like linux.git, the speedup is even more
pronounced:
    Test       HEAD^                 HEAD
    -----------------------------------------------------------
    5600.4:    322.47(322.01+0.42)   186.41(185.76+0.63) -42.2%
Any other operations that call is_promisor_object(), like "rev-list
--exclude-promisor-objects", would similarly benefit, but the
invocations in p5600 don't actually trigger any such cases.
Note that we may pay a small price to build a rev-index in-memory to do
the pack-order traversal. But it's still a big net win, and even that
small cost goes away if you are using pack.writeReverseIndex.
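
To give a sense of what building a rev-index in memory involves, here is a minimal sketch. It assumes we already hold each object's pack offset, listed in .idx order, and it sorts by offset to recover pack order. The `toy_` names and the use of `qsort()` are illustrative assumptions, not git's actual code or data layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* One entry per object: where it lives in the pack, and its .idx position. */
struct toy_revindex_entry {
	uint64_t offset;
	uint32_t idx_pos;
};

static int toy_cmp_by_offset(const void *a, const void *b)
{
	const struct toy_revindex_entry *x = a;
	const struct toy_revindex_entry *y = b;

	/* Compare without subtracting, to avoid unsigned wraparound. */
	return (x->offset > y->offset) - (x->offset < y->offset);
}

/*
 * Fill rev[0..n) so that rev[i].idx_pos is the .idx position of the i-th
 * object in pack order. offsets[] holds pack offsets in .idx order.
 */
void toy_build_revindex(const uint64_t *offsets,
			struct toy_revindex_entry *rev, size_t n)
{
	size_t i;

	assert(offsets && rev);
	for (i = 0; i < n; i++) {
		rev[i].offset = offsets[i];
		rev[i].idx_pos = (uint32_t)i;
	}
	qsort(rev, n, sizeof(*rev), toy_cmp_by_offset);
}
```

Walking `rev[]` from start to end then visits objects in the order they appear in the pack. The sort is the one-time cost mentioned above. A reverse index already written to disk makes that step unnecessary.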
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'packfile.c')
-rw-r--r-- | packfile.c | 3 |
1 files changed, 2 insertions, 1 deletions
diff --git a/packfile.c b/packfile.c
index 835b2d2716..47a23bb4a5 100644
--- a/packfile.c
+++ b/packfile.c
@@ -2260,7 +2260,8 @@ int is_promisor_object(const struct object_id *oid)
 		if (has_promisor_remote()) {
 			for_each_packed_object(add_promisor_object,
 					       &promisor_objects,
-					       FOR_EACH_OBJECT_PROMISOR_ONLY);
+					       FOR_EACH_OBJECT_PROMISOR_ONLY |
+					       FOR_EACH_OBJECT_PACK_ORDER);
 		}
 		promisor_objects_prepared = 1;
 	}