author | Jeff King <peff@peff.net> | 2024-02-28 23:39:00 +0100 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2024-02-28 23:42:01 +0100 |
commit | 5f64279443520922949ea89babe9bd3712c11fec (patch) | |
tree | c0ace764c7fe2e60633f5b2ada8b3db557d6103c /upload-pack.c | |
parent | upload-pack: disallow object-info capability by default (diff) | |
upload-pack: always turn off save_commit_buffer
When the client sends us "want $oid" lines, we call parse_object($oid)
to get an object struct. It's important to parse the commits because we
need to traverse them in the negotiation phase. But of course we don't
need to hold on to the commit messages for each one.
We've turned off the save_commit_buffer flag in get_common_commits() for
a long time, since f0243f26f6 (git-upload-pack: More efficient usage of
the has_sha1 array, 2005-10-28). That helps with the commits we see
while actually traversing. But:

  1. That function is only used by the v0 protocol. I think the v2
     protocol's code path leaves the flag on (and thus pays the extra
     memory penalty), though I didn't measure it specifically.

  2. If the client sends us a bunch of "want" lines, that happens before
     the negotiation phase. So we'll hold on to all of those commit
     messages. Generally the number of "want" lines scales with the
     refs, not with the number of objects in the repo. But a malicious
     client could send a lot in order to waste memory.
As an example of (2), if I generate a request to fetch all commits in
git.git like this:

    pktline() {
        local msg="$*"
        printf "%04x%s\n" $((1+4+${#msg})) "$msg"
    }

    want_commits() {
        pktline command=fetch
        printf 0001
        git cat-file --batch-all-objects --batch-check='%(objectname) %(objecttype)' |
        while read oid type; do
            test "$type" = "commit" || continue
            pktline want $oid
        done
        pktline done
        printf 0000
    }

    want_commits | GIT_PROTOCOL=version=2 valgrind --tool=massif git-upload-pack . >/dev/null

before this patch upload-pack peaks at ~125MB, and after at ~35MB. The
difference is not coincidentally about the same as the sum of all commit
object sizes as computed by:

    git cat-file --batch-all-objects --batch-check='%(objecttype) %(objectsize)' |
    perl -alne '$v += $F[1] if $F[0] eq "commit"; END { print $v }'
In a larger repository like linux.git, that number is ~1GB.
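
To make the framing used by the pktline helper above concrete, here is a small sketch of what it emits for a few payloads. It is only an illustration: the helper is the one from this message (minus the bash-only `local`), and the comments on `0001` and `0000` reflect my reading of them as the v2 delim-pkt and flush-pkt.

```sh
# pkt-line framing: 4 hex digits giving the total length, which counts
# the 4-byte length prefix itself plus the payload and trailing newline.
pktline() {
	msg="$*"
	printf "%04x%s\n" $((1+4+${#msg})) "$msg"
}

pktline command=fetch   # 13 payload bytes + 4 + 1 = 18, so prefix is 0012
printf '0001'           # delim-pkt: separates the command from its arguments
pktline done            # 4 payload bytes + 4 + 1 = 9, so prefix is 0009
printf '0000\n'         # flush-pkt ends the request (newline only for display)
```

Each "want" line for a 40-hex-digit object id is therefore a 50-byte packet (prefix 0032), which is why a request naming every commit is cheap for a client to produce.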
In a repository with a full commit-graph file this will have no impact
(and the commit graph would save us from parsing at all, so is a much
better solution!). But it's easy to do, might help a little in
real-world cases (where even if you have a commit graph it might not be
fully up to date), and helps a lot for a worst-case malicious request.
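
For the commit-graph case mentioned above, a server operator could refresh the file along these lines. This is a minimal sketch, not part of the patch: the function name `write_commit_graph` is hypothetical, and it assumes a git new enough to have `git commit-graph write --reachable`.

```sh
# Hypothetical helper (not part of this patch): write a commit-graph file
# covering every commit reachable from the refs of the repository at "$1".
write_commit_graph() {
	git -C "$1" commit-graph write --reachable
}
```

Running `write_commit_graph /path/to/repo` after new commits arrive keeps the graph current. Commits newer than the last write still have to be parsed from their objects, which is the partially stale case described above.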
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to '')
-rw-r--r-- | upload-pack.c | 2 |
1 files changed, 0 insertions, 2 deletions
diff --git a/upload-pack.c b/upload-pack.c
index 2a5c52666e..3970bb9b30 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -526,8 +526,6 @@ static int get_common_commits(struct upload_pack_data *data,
 	int got_other = 0;
 	int sent_ready = 0;
 
-	save_commit_buffer = 0;
-
 	for (;;) {
 		const char *arg;
 