summaryrefslogtreecommitdiffstats
path: root/pack-bitmap.h
diff options
context:
space:
mode:
authorJeff King <peff@peff.net>2020-12-08 23:04:34 +0100
committerJunio C Hamano <gitster@pobox.com>2020-12-08 23:48:17 +0100
commit449fa5ee06906ca6d109e06b14cb4f8ea60a6c88 (patch)
tree0c8635baf66e570791698635e6f7bcf6e46ca677 /pack-bitmap.h
parentpack-bitmap-write: build fewer intermediate bitmaps (diff)
downloadgit-449fa5ee06906ca6d109e06b14cb4f8ea60a6c88.tar.xz
git-449fa5ee06906ca6d109e06b14cb4f8ea60a6c88.zip
pack-bitmap-write: ignore BITMAP_FLAG_REUSE
The on-disk bitmap format has a flag to mark a bitmap to be "reused". This is a rather curious feature, and works like this: - a run of pack-objects would decide to mark the last 80% of the bitmaps it generates with the reuse flag - the next time we generate bitmaps, we'd see those reuse flags from the last run, and mark those commits as special: - we'd be more likely to select those commits to get bitmaps in the new output - when generating the bitmap for a selected commit, we'd reuse the old bitmap as-is (rearranging the bits to match the new pack, of course) However, neither of these behaviors particularly makes sense. Just because a commit happened to be bitmapped last time does not make it a good candidate for having a bitmap this time. In particular, we may choose bitmaps based on how recent they are in history, or whether a ref tip points to them, and those things will change. We're better off re-considering fresh which commits are good candidates. Reusing the existing bitmap _is_ a reasonable thing to do to save computation. But only reusing exact bitmaps is a weak form of this. If we have an old bitmap for A and now want a new bitmap for its child, we should be able to compute that only by looking at trees and that are new to the child. But this code would consider only exact reuse (which is perhaps why it was eager to select those commits in the first place). Furthermore, the recent switch to the reverse-edge algorithm for generating bitmaps dropped this optimization entirely (and yet still performs better). So let's do a few cleanups: - drop the whole "reusing bitmaps" phase of generating bitmaps. It's not helping anything, and is mostly unused code (or worse, code that is using CPU but not doing anything useful) - drop the use of the on-disk reuse flag to select commits to bitmap - stop setting the on-disk reuse flag in bitmaps we generate (since nothing respects it anymore) We will keep a few innards of the reuse code, which will help us implement a more capable version of the "reuse" optimization: - simplify rebuild_existing_bitmaps() into a function that only builds the mapping of bits between the old and new orders, but doesn't actually convert any bitmaps - make rebuild_bitmap() public; we'll call it lazily to convert bitmaps as we traverse (using the mapping created above) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'pack-bitmap.h')
-rw-r--r--pack-bitmap.h6
1 files changed, 5 insertions, 1 deletions
diff --git a/pack-bitmap.h b/pack-bitmap.h
index 1203120c43..afa4115136 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -73,7 +73,11 @@ void bitmap_writer_set_checksum(unsigned char *sha1);
void bitmap_writer_build_type_index(struct packing_data *to_pack,
struct pack_idx_entry **index,
uint32_t index_nr);
-void bitmap_writer_reuse_bitmaps(struct packing_data *to_pack);
+uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git,
+ struct packing_data *mapping);
+int rebuild_bitmap(const uint32_t *reposition,
+ struct ewah_bitmap *source,
+ struct bitmap *dest);
void bitmap_writer_select_commits(struct commit **indexed_commits,
unsigned int indexed_commits_nr, int max_bitmaps);
void bitmap_writer_build(struct packing_data *to_pack);