git - git

	Commit message (Collapse)	Author	Files	Lines
2024-11-20	The tenth batch	Junio C Hamano	1	-5/+10
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-20	Prepare for 2.47.1	Junio C Hamano	3	-2/+28
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-15	Clean up RelNotes for 2.48	Junio C Hamano	1	-15/+0
	There somehow ended up too many bogus "merge X later to maint" comments for topics that cannot be merged ever down to 'maint' because they were forked from more recent integration branches in the draft release notes. Remove them, as they are inviting for mistakes later. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-13	The ninth batch	Junio C Hamano	1	-0/+15
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12	index-pack: repack local links into promisor packs	Jonathan Tan	4	-2/+172
	Teach index-pack to, when processing the objects in a pack with --promisor specified on the CLI, repack local objects (and the local objects that they refer to, recursively) referenced by these objects into promisor packs. This prevents the situation in which, when fetching from a promisor remote, we end up with promisor objects (newly fetched) referring to non-promisor objects (locally created prior to the fetch). This situation may arise if the client had previously pushed objects to the remote, for example. One issue that arises in this situation is that, if the non-promisor objects become inaccessible except through promisor objects (for example, if the branch pointing to them has moved to point to the promisor object that refers to them), then GC will garbage collect them. There are other ways to solve this, but the simplest seems to be to enforce the invariant that we don't have promisor objects referring to non-promisor objects. This repacking is done from index-pack to minimize the performance impact. During a fetch, the only time most objects are fully inflated in memory is when their object ID is computed, so we also scan the objects (to see which objects they refer to) during this time. Also to minimize the performance impact, an object is calculated to be local if it's a loose object or present in a non-promisor pack. (If it's also in a promisor pack or referred to by an object in a promisor pack, it is technically already a promisor object. But a misidentification of a promisor object as a non-promisor object is relatively benign here - we will thus repack that promisor object into a promisor pack, duplicating it in the object store, but there is no correctness issue, just an issue of inefficiency.) Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-12	doc: git-add.txt: convert to new style convention	Jean-Noël Avila	2	-62/+68
	Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-08	The eighth batch	Junio C Hamano	1	-0/+15
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-06	compat/mingw: support POSIX semantics for atomic renames	Patrick Steinhardt	2	-7/+88
	By default, Windows restricts access to files when those files have been opened by another process. As explained in the preceding commits, these restrictions can be loosened such that reads, writes and/or deletes of files with open handles _are_ allowed. While we set up those sharing flags in most relevant code paths now, we still don't properly handle POSIX-style atomic renames in case the target path is open. This is failure demonstrated by t0610, where one of our tests spawns concurrent writes in a reftable-enabled repository and expects all of them to succeed. This test fails most of the time because the process that has acquired the "tables.list" lock is unable to rename it into place while other processes are busy reading that file. Windows 10 has introduced the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag that allows us to fix this usecase [1]. When set, it is possible to rename a file over a preexisting file even when the target file still has handles open. Those handles must have been opened with the `FILE_SHARE_DELETE` flag, which we have ensured in the preceding commits. Careful readers might have noticed that [1] does not mention the above flag, but instead mentions `FILE_RENAME_POSIX_SEMANTICS`. This flag is not for use with `SetFileInformationByHandle()` though, which is what we use. And while the `FILE_RENAME_FLAG_POSIX_SEMANTICS` flag exists, it is not documented on [2] or anywhere else as far as I can tell. Unfortunately, we still support Windows systems older than Windows 10 that do not yet have this new flag. Our `_WIN32_WINNT` SDK version still targets 0x0600, which is Windows Vista and later. And even though that Windows version is out-of-support, bumping the SDK version all the way to 0x0A00, which is Windows 10 and later, is not an option as it would make it impossible to compile on Windows 8.1, which is still supported. Instead, we have to manually declare the relevant infrastructure to make this feature available and have fallback logic in place in case we run on a Windows version that does not yet have this flag. On another note: `mingw_rename()` has a retry loop that is used in case deleting a file failed because it's still open in another process. One might be pressed to not use this loop anymore when we can use POSIX semantics. But unfortunately, we have to keep it around due to our dependence on the `FILE_SHARE_DELETE` flag. While we know to set that sharing flag now, other applications may not do so and may thus still cause sharing violations when we try to rename a file. This fixes concurrent writes in the reftable backend as demonstrated in t0610, but may also end up fixing other usecases where Git wants to perform renames. [1]: https://learn.microsoft.com/en-us/windows-hardware/drivers/ddi/ntifs/ns-ntifs-_file_rename_information [2]: https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-file_rename_info Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-06	fetch-pack: die if in commit graph but not obj db	Jonathan Tan	2	-5/+18
	When fetching, there is a step in which sought objects are first checked against the local repository; only objects that are not in the local repository are then fetched. This check first looks up the commit graph file, and returns "present" if the object is in there. However, the action of first looking up the commit graph file is not done everywhere in Git, especially if the type of the object at the time of lookup is not known. This means that in a repo corruption situation, a user may encounter an "object missing" error, attempt to fetch it, and still encounter the same error later when they reattempt their original action, because the object is present in the commit graph file but not in the object DB. Therefore, make it a fatal error when this occurs. (Note that we cannot proceed to include this object in the list of objects to be fetched without changing at least the fetch negotiation code: what would happen is that the client will send "want X" and "have X" and when I tested at $DAYJOB with a work server that uses JGit, the server reasonably returned an empty packfile. And changing the fetch negotiation code to only use the object DB when deciding what to report as "have" would be an unnecessary slowdown, I think.) This was discovered when a lazy fetch of a missing commit completed with nothing actually fetched, and the writing of the commit graph file after every fetch then attempted to read said missing commit, triggering a lazy fetch of said missing commit, resulting in an infinite loop with no user-visible indication (until they check the list of processes running on their computer). With this fix, there is no infinite loop. Note that although the repo corruption we discovered was caused by a bug in GC in a partial clone, the behavior that this patch teaches Git to warn about applies to any repo with commit graph enabled and with a missing commit, whether it is a partial clone or not. t5330, introduced in 3a1ea94a49 (commit-graph.c: no lazy fetch in lookup_commit_in_graph(), 2022-07-01), tests that an interaction between fetch and the commit graph does not cause an infinite loop. This patch changes the exit code in that situation, so that test had to be changed. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-06	Revert "fetch-pack: add a deref_without_lazy_fetch_extended()"	Jonathan Tan	1	-18/+7
	This reverts commit a6e65fb39caf18259c660c1c7910d5bf80bc15cb. This revert simplifies the next patch in this patch set. The commit message of that commit mentions that the new function "will be used for the bundle-uri client in a subsequent commit", but it seems that eventually it wasn't used. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	doc: correct misleading descriptions for --shallow-exclude	Elijah Newren	6	-9/+9
	The documentation for the --shallow-exclude option to clone/fetch/etc. claims that the option takes a revision, but it does not. As per upload-pack.c's process_deepen_not(), it passes the option to expand_ref() and dies if it does not find exactly one ref matching the name passed. Further, this has always been the case ever since these options were introduced by the commits merged in a460ea4a3cb1 (Merge branch 'nd/shallow-deepen', 2016-10-10). Fix the documentation to match the implementation. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	list-objects-filter-options: work around reported leak on error	Patrick Steinhardt	2	-10/+8
	This one is a little bit more curious. In t6112, we have a test that exercises the `git rev-list --filter` option with invalid filters. We execute git-rev-list(1) via `test_must_fail`, which means that we check for leaks even though Git exits with an error code. This causes the following leak: Direct leak of 27 byte(s) in 1 object(s) allocated from: #0 0x5555555e6946 in realloc.part.0 lsan_interceptors.cpp.o #1 0x5555558fb4b6 in xrealloc wrapper.c:137:8 #2 0x5555558b6e06 in strbuf_grow strbuf.c:112:2 #3 0x5555558b7550 in strbuf_add strbuf.c:311:2 #4 0x5555557c1a88 in strbuf_addstr strbuf.h:310:2 #5 0x5555557c1d4c in parse_list_objects_filter list-objects-filter-options.c:261:3 #6 0x555555885ead in handle_revision_pseudo_opt revision.c:2899:3 #7 0x555555884e20 in setup_revisions revision.c:3014:11 #8 0x5555556c4b42 in cmd_rev_list builtin/rev-list.c:588:9 #9 0x5555555ec5e3 in run_builtin git.c:483:11 #10 0x5555555eb1e4 in handle_builtin git.c:749:13 #11 0x5555555ec001 in run_argv git.c:819:4 #12 0x5555555eaf94 in cmd_main git.c:954:19 #13 0x5555556fd569 in main common-main.c:64:11 #14 0x7ffff7ca714d in __libc_start_call_main (.../lib/libc.so.6+0x2a14d) #15 0x7ffff7ca7208 in __libc_start_main@GLIBC_2.2.5 (.../libc.so.6+0x2a208) #16 0x5555555ad064 in _start (git+0x59064) This leak is valid, as we call `die()` and do not clean up the memory at all. But what's curious is that this is the only leak reported, because we don't clean up any other allocated memory, either, and I have no idea why the leak sanitizer treats this buffer specially. In any case, we can work around the leak by shuffling things around a bit. Instead of calling `gently_parse_list_objects_filter()` and dying after we have modified the filter spec, we simply do so beforehand. Like this we don't allocate the buffer in the error case, which makes the reported leak go away. It's not pretty, but it manages to make t6112 leak free. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/merge: release output buffer after performing merge	Patrick Steinhardt	2	-0/+2
	The `obuf` member of `struct merge_options` is used to buffer output in some cases. In order to not discard its allocated memory we only release its contents in `merge_finalize()` when we're not currently recursing into a subtree. This results in some situations where we seemingly do not release the buffer reliably. We thus have calls to `strbuf_release()` for this buffer scattered across the codebase. But we're missing one callsite in git-merge(1), which causes a memory leak. We should ideally refactor this interface so that callers don't have to know about any such internals. But for now, paper over the issue by adding one more `strbuf_release()` call. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	dir: fix leak when parsing "status.showUntrackedFiles"	Patrick Steinhardt	2	-2/+3
	We use `repo_config_get_string()` to read "status.showUntrackedFiles" from the config subsystem. This function allocates the result, but we never free the result after parsing it. The value never leaves the scope of the calling function, so refactor it to instead use `repo_config_get_string_tmp()`, which does not hand over ownership to the caller. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	t/helper: fix leaking buffer in "dump-untracked-cache"	Patrick Steinhardt	1	-0/+2
	We never release the local `struct strbuf base` buffer, thus leaking memory. Fix this leak. This leak is exposed by t7063, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	t/helper: stop re-initialization of `the_repository`	Patrick Steinhardt	2	-2/+1
	While "common-main.c" already initializes `the_repository` for us, we do so a second time in the "read-cache" test helper. This causes a memory leak because the old repository's contents isn't released. Stop calling `initialize_repository()` to plug this leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	sparse-index: correctly free EWAH contents	Patrick Steinhardt	1	-2/+5
	While we free the `fsmonitor_dirty` member of `struct index_state`, we do not free the contents of that EWAH. Do so by using `ewah_free()` instead of `FREE_AND_NULL()`. This leak is exposed by t7519, but plugging it alone does not make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	dir: release untracked cache data	Patrick Steinhardt	1	-0/+8
	There are several cases where we invalidate untracked cache directory entries where we do not free the underlying data, but reset the number of entries. This causes us to leak memory because `free_untracked()` will not iterate over any potential entries which we still had in the array. Fix this issue by freeing old entries. The leak is exposed by t7519, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	combine-diff: fix leaking lost lines	Patrick Steinhardt	2	-2/+4
	The `cnt` variable tracks the number of lines in a patch diff. It can happen though that there are no newlines, in which case we'd still end up allocating our array of `sline`s. In fact, we always allocate it with `cnt + 2` entries: one extra entry for the deletion hunk at the end, and another entry that we don't seem to ever populate at all but acts as a kind of sentinel value. When we loop through the array to clear it at the end of this function we only loop until `lno < cnt`, and thus we may not end up releasing whatever the two extra `sline`s contain. While that shouldn't matter for the sentinel value, it does matter for the extra deletion hunk sline. Regardless of that, plug this memory leak by releasing both extra entries, which makes the logic a bit easier to reason about. While at it, fix the formatting of a local comment, which incidentally also provides the necessary context for why we overallocate the `sline` array. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/tag: fix leaking key ID on failure to sign	Patrick Steinhardt	2	-1/+2
	We do not free the key ID when signing a tag fails. Do so by using the common exit path. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	transport-helper: fix leaking import/export marks	Patrick Steinhardt	2	-0/+3
	Fix leaking import and export marks for transport helpers. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/commit: fix leaking cleanup config	Patrick Steinhardt	2	-5/+13
	The cleanup string set by the config is leaking when it is being overridden by an option. Fix this by tracking these via two separate variables such that we can free the old value. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	trailer: fix leaking strbufs when formatting trailers	Patrick Steinhardt	2	-5/+8
	When formatting trailer lines we iterate through each of the trailers and munge their respective token/value pairs according to the trailer options. When formatting a trailer that has its `item->token` pointer set we perform the munging in two local buffers. In the case where we figure out that the value is empty and `trim_empty` is set we just skip over the trailer item. But the buffers are local to the loop and we don't release their contents, leading to a memory leak. Plug this leak by lifting the buffers outside of the loop and releasing them on function return. This fixes the memory leaks, but also optimizes the loop as we don't have to reallocate the buffers on every single iteration. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	trailer: fix leaking trailer values	Patrick Steinhardt	1	-2/+8
	Fix leaking trailer values when replacing the value with a command or when the token value is empty. This leak is exposed by t7513, but plugging it does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/commit: fix leaking change data contents	Patrick Steinhardt	2	-1/+9
	While we free the worktree change data, we never free its contents. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	upload-pack: fix leaking URI protocols	Patrick Steinhardt	2	-0/+2
	We don't clear `struct upload_pack::uri_protocols`, which causes a memory leak. Fix this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	pretty: clear signature check	Patrick Steinhardt	5	-0/+5
	The signature check in the formatting context is never getting released. Fix this to plug the resulting memory leak. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	diff-lib: fix leaking diffopts in `do_diff_cache()`	Patrick Steinhardt	2	-0/+2
	In `do_diff_cache()` we initialize a new `rev_info` and then overwrite its `diffopt` with a user-provided set of options. This can leak memory because `repo_init_revisions()` may end up allocating memory for the `diffopt` itself depending on the configuration. And since that field is overwritten we won't ever free it. Plug the memory leak by releasing the diffopts before we overwrite them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	revision: fix leaking bloom filters	Patrick Steinhardt	2	-0/+6
	The memory allocated by `prepare_to_use_bloom_filter()` is not released by `release_revisions()`, causing a memory leak. Plug it. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/grep: fix leak with `--max-count=0`	Patrick Steinhardt	2	-3/+11
	When executing with `--max-count=0` we'll return early from git-grep(1) without performing any cleanup, which causes memory leaks. Plug these. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	grep: fix leak in `grep_splice_or()`	Patrick Steinhardt	1	-0/+1
	In `grep_splice_or()` we search for the next `TRUE` node in our tree of grep expressions and replace it with the given new expression. But we don't free the old node, which causes a memory leak. Plug it. This leak is exposed by t7810, but plugging it alone isn't sufficient to make the test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	t/helper: fix leaks in "reach" test tool	Patrick Steinhardt	2	-0/+11
	The "reach" test tool doesn't bother to clean up any of its allocated resources, causing various leaks. Plug them. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	builtin/ls-remote: plug leaking server options	Patrick Steinhardt	1	-0/+1
	The list of server options populated via `OPT_STRING_LIST()` is never cleared, causing a memory leak. Plug it. This leak is exposed by t5702, but plugging it alone does not make the whole test suite pass. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-05	upload-pack: fix ambiguous error message	Elijah Newren	2	-1/+12
	upload-pack.c takes any --shallow-exclude argument(s) from clone/fetch/etc. and passes them through expand_ref(). If it does not get back exactly one ref from the call to expand_ref(), it will die with the following error: fatal: git upload-pack: ambiguous deepen-not: %s Given that the documentation suggests to users that --shallow-exclude accepts a revision rather than a ref (which will be corrected in a subsequent commit), users may try to pass a revision. In such a case, expand_ref() will return 0 matches, but the error message we print will be misleading since "ambiguous" suggests there are multiple matches. Provide a clearer error message for such a case. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-04	t1016: clean up style	Andrew Kreimer	1	-132/+130
	Adhere to Documentation/CodingGuidelines: - Whitespace and redirect operator. - Case arms indentation. - Tabs for indentation. Signed-off-by: Andrew Kreimer <algonell@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02	t5300: move --window clamp test next to unclamped	Jonathan Tan	1	-5/+5
	A subsequent commit will change the behavior of "git index-pack --promisor", which is exercised in "build pack index for an existing pack", causing the unclamped and clamped versions of the --window test to exhibit different behavior. Move the clamp test closer to the unclamped test that it references. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02	t0410: use from-scratch server	Jonathan Tan	1	-1/+1
	A subsequent commit will add functionality: when fetching from a promisor remote, existing non-promisor objects that are ancestors of any fetched object will be repacked into promisor packs (since if a promisor remote has an object, it also has all its ancestors). This means that sometimes, a fetch from a promisor remote results in 2 new promisor packs (instead of the 1 that you would expect). There is a test that fetches a descendant of a local object from a promisor remote, but also specifically tests that there is exactly 1 promisor pack as a result of the fetch. This means that this test will fail when the subsequent commit is added. Since the ancestry of the fetched object is not the concern of this test, make the fetched objects have no ancestry in common with the objets in the client repo. This is done by making the server from scratch, instead of using an existing repo that has objects in common with the client. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-02	t0410: make test description clearer	Jonathan Tan	1	-2/+2
	Commit 9a4c507886 (t0410: test fetching from many promisor remotes, 2019-06-25) adds some tests that demonstrate not the automatic fetching of missing objects, but the direct fetching from another promisor remote (configured explicitly in one test and implicitly via --filter on the "git fetch" CLI invocation in the other test) - thus demonstrating support for multiple promisor remotes, as described in the commit message. Change the test descriptions accordingly to make this clearer. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-11-01	The seventh batch	Taylor Blau	1	-0/+39

2024-11-01	rev-list: skip bitmap traversal for --left-right	Jeff King	2	-0/+19
	Running: git rev-list --left-right --use-bitmap-index one...two will produce output without any left-right markers, since the bitmap traversal returns only a single set of reachable commits. Instead we should refuse to use bitmaps here and produce the correct output using a traditional traversal. This is probably not the only remaining option that misbehaves with bitmaps, but it's particularly egregious in that it feels like it _could_ work. Doing two separate traversals for the left/right sides and then taking the symmetric set differences should yield the correct answer, but our traversal code doesn't know how to do that. It's not clear if naively doing two separate traversals would always be a performance win. A traditional traversal only needs to walk down to the merge base, but bitmaps always fill out the full reachability set. So depending on your bitmap coverage, we could end up walking old bits of history twice to fill out the same uninteresting bits on both sides. We'd also of course end up with a very large --boundary set, if the user asked for that. So this might or might not be something worth implementing later. But for now, let's make sure we don't produce the wrong answer if somebody tries it. The test covers this, but also the same thing with "--count" (which is what I originally tried in a real-world case). Ironically the try_bitmap_count() code already realizes that "--left-right" won't work there. But that just causes us to fall back to the regular bitmap traversal code, which itself doesn't handle counting (we produce a list of objects rather than a count). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-30	The sixth batch	Taylor Blau	1	-0/+11

2024-10-28	t6006: fix prereq handling with `test_format ()`	Patrick Steinhardt	1	-4/+4
	In df383b5842 (t/test-lib: wire up NO_ICONV prerequisite, 2024-10-16) we have introduced a new NO_ICONV prerequisite that makes us skip tests in case Git is not compiled with support for iconv. This change subtly broke t6006: while the test suite still passes, some of its tests won't execute because they run into an error. ./t6006-rev-list-format.sh: line 92: test_expect_%e: command not found The broken tests use `test_format ()`, and the mentioned commit simply prepended the new prerequisite to its arguments. But that does not work, as the function is not aware of prereqs at all and will now treat all of its arguments incorrectly. Fix this by making the function aware of prereqs by accepting an optional fourth argument. Adapt the callsites accordingly. Reported-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-28	compat/mingw: allow deletion of most opened files	Patrick Steinhardt	1	-0/+66
	On Windows, we emulate open(3p) via `mingw_open()`. This function implements handling of some platform-specific quirks that are required to make it behave as closely as possible like open(3p) would, but for most cases we just call the Windows-specific `_wopen()` function. This function has a major downside though: it does not allow us to specify the sharing mode. While there is `_wsopen()` that allows us to pass sharing flags, those sharing flags are not the same `FILE_SHARE_*` flags as `CreateFileW()` accepts. Instead, `_wsopen()` only allows concurrent read- and write-access, but does not allow for concurrent deletions. Unfortunately though, we have to allow concurrent deletions if we want to have POSIX-style atomic renames on top of an existing file that has open file handles. Implement a new function that emulates open(3p) for existing files via `CreateFileW()` such that we can set the required sharing flags. While we have the same issue when calling open(3p) with `O_CREAT`, implementing that mode would be more complex due to the required permission handling. Furthermore, atomic updates via renames typically write to exclusive lockfile and then perform the rename, and thus we don't have to handle the case where the locked path has been created with `O_CREATE`. So while it would be nice to have proper POSIX semantics in all paths, we instead aim for a minimum viable fix here. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-28	compat/mingw: share file handles created via `CreateFileW()`	Patrick Steinhardt	1	-2/+2
	Unless told otherwise, Windows will keep other processes from reading, writing and deleting files when one has an open handle that was created via `CreateFileW()`. This behaviour can be altered via `FILE_SHARE_*` flags: - `FILE_SHARE_READ` allows a concurrent process to open the file for reading. - `FILE_SHARE_WRITE` allows a concurrent process to open the file for writing. - `FILE_SHARE_DELETE` allows a concurrent process to delete the file or to replace it via an atomic rename. This sharing mechanism is quite important in the context of Git, as we assume POSIX semantics all over the place. But there are two callsites where we don't pass all three of these flags: - We don't set `FILE_SHARE_DELETE` when creating a file for appending via `mingw_open_append()`. This makes it impossible to delete the file from another process or to replace it via an atomic rename. The function was introduced via d641097589 (mingw: enable atomic O_APPEND, 2018-08-13) and has been using `FILE_SHARE_READ \| FILE_SHARE_WRITE` since the inception. There aren't any indicators that the omission of `FILE_SHARE_DELETE` was intentional. - We don't set any sharing flags in `mingw_utime()`, which changes the access and modification of a file. This makes it impossible to perform any kind of operation on this file at all from another process. While we only open the file for a short amount of time to update its timestamps, this still opens us up for a race condition with another process. `mingw_utime()` was originally implemented via `_wopen()`, which doesn't give you full control over the sharing mode. Instead, it calls `_wsopen()` with `_SH_DENYNO`, which ultimately translates to `FILE_SHARE_READ \| FILE_SHARE_WRITE`. It was then refactored via 090a3085bc (t/helper/test-chmtime: update mingw to support chmtime on directories, 2022-03-02) to use `CreateFileW()`, but we stopped setting any sharing flags at all, which seems like an unintentional side effect. By restoring `FILE_SHARE_READ \| FILE_SHARE_WRITE` we thus fix this and get back the old behaviour of `_wopen()`. The fact that we didn't set the equivalent of `FILE_SHARE_DELETE` can be explained, as well: neither `_wopen()` nor `_wsopen()` allow you to do so. So overall, it doesn't seem intentional that we didn't allow deletions here, either. Adapt both of these callsites to pass all three sharing flags. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: use oidread() instead of hashcpy() to fill object_id	Jeff King	1	-1/+1
	When chasing a REF_DELTA, we need to pull the raw hash bytes out of the mmap'd packfile into an object_id struct. We do that with a raw hashcpy() of the appropriate length (that happens directly now, though before the previous commit it happened inside find_pack_entry_one(), also using a hashcpy). But I think this creates a potentially dangerous situation due to d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in the latter part of the object_id.hash buffer, which could fool oideq(), etc. We should use oidread() instead, which correctly zero-pads the extra bytes, as of c98d762ed9 (global: ensure that object IDs are always padded, 2024-06-14). As far as I can see, this has not been a problem in practice because the object_id we feed to find_pack_entry_one() is never used with oideq(), etc. It is being compared to the bytes mmap'd from a pack idx file, which of course do not have the extra padding bytes themselves. So there's no bug here, but this just puzzled me while looking at the code. We should do the more obviously safe thing, both for future-proofing and to avoid confusing readers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: use object_id in find_pack_entry_one()	Jeff King	7	-18/+18
	The main function we use to search a pack index for an object is find_pack_entry_one(). That function still takes a bare pointer to the hash, despite the fact that its underlying bsearch_pack() function needs an object_id struct. And so we end up making an extra copy of the hash into the struct just to do a lookup. As it turns out, all callers but one already have such an object_id. So we can just take a pointer to that struct and use it directly. This avoids the extra copy and provides a more type-safe interface. The one exception is get_delta_base() in packfile.c, when we are chasing a REF_DELTA from inside the pack (and thus we have a pointer directly to the mmap'd pack memory, not a struct). We can just bump the hashcpy() from inside find_pack_entry_one() to this one caller that needs it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: convert find_sha1_pack() to use object_id	Jeff King	5	-12/+15
	The find_sha1_pack() function has a few problems: - it's badly named, since it works with any object hash - it takes the hash as a bare pointer rather than an object_id struct We can fix both of these easily, as all callers actually have a real object_id anyway. I also found the existence of this function somewhat confusing, as it is about looking in an arbitrary set of linked packed_git structs. It's good for things like dumb-http which are looking in downloaded remote packs, and not our local packs. But despite the name, it is not a good way to find the pack which contains a local object (it skips the use of the midx, the pack mru list, and so on). So let's also add an explanatory comment above the declaration that may point people in the right direction. I suspect the calls in fast-import.c, which use the packed_git list from the repository struct, could actually just be using find_pack_entry(). But since we'd need to keep it anyway for dumb-http, I didn't dig further there. If we eventually drop dumb-http support, then it might be worth examining them to see if we can get rid of the function entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	http-walker: use object_id instead of bare hash	Jeff King	3	-16/+17
	We long ago switched most code to using object_id structs instead of bare "unsigned char *" hashes. This gives us more type safety from the compiler, and generally makes it easier to understand what we expect in each parameter. But the dumb-http code has lagged behind. And indeed, the whole "walker" subsystem interface has the same problem, though http-walker is the only user left. So let's update the walker interface to pass object_id structs (which we already have anyway at all call sites!), and likewise use those within the http-walker methods that it calls. This cleans up the dumb-http code a bit, but will also let us fix a few more commonly used helper functions. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: warn people away from parse_packed_git()	Jeff King	1	-1/+10
	With a name like parse_packed_git(), you might think it's the right way to access a local pack index and its associated objects. But not so! It's a one-off used by the dumb-http code to access pack idx files we've downloaded from the remote, but whose packs we might not have. There's only one caller left for this function, and ideally we'd drop it completely and just inline it there. But that would require exposing other internals from packfile.[ch], like alloc_packed_git() and check_packed_git_idx(). So let's leave it be for now, and just warn people that it's probably not what they're looking for. Perhaps in the long run if we eventually drop dumb-http support, we can remove the function entirely then. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25	packfile: drop sha1_pack_index_name()	Jeff King	3	-14/+4
	Like sha1_pack_name() that we dropped in the previous commit, this function uses an error-prone static strbuf and the somewhat misleading name "sha1". The only caller left is in pack-redundant.c. While this command is marked for potential removal in our BreakingChanges document, we still have it for now. But it's simple enough to convert it to use its own strbuf with the underlying odb_pack_name() function, letting us drop the otherwise obsolete function. Note that odb_pack_name() does its own strbuf_reset(), so it's safe to use directly within a loop like this. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>