git - git

	Commit message (Collapse)	Author	Files	Lines
2022-10-26	Downmerge a handful of topics for 2.38.2	Junio C Hamano	2	-1/+48

2022-10-21	The fifth batch	Junio C Hamano	1	-1/+21
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-20	ci: use DC_SHA1=YesPlease on osx-clang job for CI	Junio C Hamano	1	-0/+2
	7b8cfe34 (Merge branch 'ed/fsmonitor-on-networked-macos', 2022-10-17) broke the build on macOS with sha1dc by bypassing our hash abstraction (git_SHA_CTX etc.), but it wasn't caught before the problematic topic was merged down to the 'master' branch. Nobody was even compile testing with DC_SHA1 set, although it is the recommended choice in these days for folks when they use SHA-1. This was because the default for macOS uses Apple Common Crypto, and both of the two CI jobs did not override the default. Tweak one of them to use DC_SHA1 to improve the coverage. We may want to give similar diversity for Linux jobs so that some of them build with other implementations of SHA-1; they currently all build and test with DC_SHA1 as that is the default on everywhere other than macOS. But let's start small to fill only the immediate need. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-20	ci: add address and undefined sanitizer tasks	Junio C Hamano	2	-0/+12
	The current code is clean with these two sanitizers, and we would like to keep it that way by running the checks for any new code. The signal of "passed with asan, but not ubsan" (or vice versa) is not that useful in practice, so it is tempting to run both santizers in a single task, but it seems to take forever, so tentatively let's try having two separate ones. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-20	The fourth batch	Junio C Hamano	1	-0/+9
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	cmake: increase time-out for a long-running test	Johannes Schindelin	1	-0/+4
	As suggested in https://github.com/git-for-windows/git/issues/3966#issuecomment-1221264238, t7112 can run for well over one hour, which seems to be the default maximum run time at least when running CTest-based tests in Visual Studio. Let's increase the time-out as a stop gap to unblock developers wishing to run Git's test suite in Visual Studio. Note: The actual run time is highly dependent on the circumstances. For example, in Git's CI runs, the Windows-based tests typically take a bit over 5 minutes to run. CI runs have the added benefit that Windows Defender (the common anti-malware scanner on Windows) is turned off, something many developers are not at liberty to do on their work stations. When Defender is turned on, even on this developer's high-end Ryzen system, t7112 takes over 15 minutes to run. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	cmake: avoid editing t/test-lib.sh	Johannes Schindelin	4	-6/+13
	In 7f5397a07c6c (cmake: support for testing git when building out of the source tree, 2020-06-26), we implemented support for running Git's test scripts even after building Git in a different directory than the source directory. The way we did this was to edit the file `t/test-lib.sh` to override `GIT_BUILD_DIR` to point somewhere else than the parent of the `t/` directory. This is unideal because it always leaves a tracked file marked as modified, and it is all too easy to commit that change by mistake. Let's change the strategy by teaching `t/test-lib.sh` to detect the presence of a file called `GIT-BUILD-DIR` in the source directory. If it exists, the contents are interpreted as the location to the _actual_ build directory. We then write this file as part of the CTest definition. To support building Git via a regular `make` invocation after building it using CMake, we ensure that the `GIT-BUILD-DIR` file is deleted (for convenience, this is done as part of the Makefile rule that is already run with every `make` invocation to ensure that `GIT-BUILD-OPTIONS` is up to date). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	add -p: avoid ambiguous signed/unsigned comparison	Johannes Schindelin	1	-1/+1
	In the interactive `add` operation, users can choose to jump to specific hunks, and Git will present the hunk list in that case. To avoid showing too many lines at once, only a maximum of 21 hunks are shown, skipping the "mode change" pseudo hunk. The comparison performed to skip the "mode change" pseudo hunk (if any) compares a signed integer `i` to the unsigned value `mode_change` (which can be 0 or 1 because it is a 1-bit type). According to section 6.3.1.8 of the C99 standard (see e.g. https://www.open-std.org/jtc1/sc22/WG14/www/docs/n1256.pdf), what should happen is an automatic conversion of the "lesser" type to the "greater" type, but since the types differ in signedness, it is ill-defined what is the correct "usual arithmetic conversion". Which means that Visual C's behavior can (and does) differ from GCC's: When compiling Git using the latter, `add -p`'s `goto` command shows no hunks by default because it casts a negative start offset to a pretty large unsigned value, breaking the "goto hunk" test case in `t3701-add-interactive.sh`. Let's avoid that by converting the unsigned bit explicitly to a signed integer. Note: This is a long-standing bug in the Visual C build of Git, but it has never been caught because t3701 is skipped when `NO_PERL` is set, which is the case in the `vs-test` jobs of Git's CI runs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	cmake: copy the merge tools for testing	Johannes Schindelin	1	-1/+2
	Even when running the tests via CTest, t7609 and t7610 rely on more than only a few mergetools to be copied to the build directory. Let's make it so. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	cmake: make it easier to diagnose regressions in CTest runs	Johannes Schindelin	1	-1/+1
	When a test script fails in Git's test suite, the usual course of action is to re-run it using options to increase the verbosity of the output, e.g. `-v` and `-x`. Like in Git's CI runs, when running the tests in Visual Studio via the CTest route, it is cumbersome or at least requires a very unintuitive approach to pass options to the test scripts: the CMakeLists.txt file would have to be modified, passing the desired options to _all_ test scripts, and then the CMake Cache would have to be reconfigured before running the test in question individually. Unintuitive at best, and opposite to the niceties IDE users expect. So let's just pass those options by default: This will not clutter any output window but the log that is written to a log file will have information necessary to figure out test failures. While at it, also imitate what the Windows jobs in Git's CI runs do to accelerate running the test scripts: pass the `--no-bin-wrappers` and `--no-chain-lint` options. This makes the test runs noticeably faster because the `bin-wrappers/` scripts as well as the `chain-lint` code make heavy use of POSIX shell scripting, which is really, really slow on Windows due to the need to emulate POSIX behavior via the MSYS2 runtime. In a test by Eric Sunshine, it added two minutes (!) just to perform the chain-lint task. The idea of adding a CMake config option (á la `GIT_TEST_OPTS`) was considered during the development of this patch, but then dropped: such a setting is global, across _all_ tests, where e.g. `--run=...` would not make sense. Users wishing to override these new defaults are better advised running the test script manually, in a Git Bash, with full control over the command line. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	fsmonitor OSX: compile with DC_SHA1=YesPlease	Ævar Arnfjörð Bjarmason	1	-5/+5
	As we'll address in subsequent commits the "DC_SHA1=YesPlease" is not on by default on OSX, instead we use Apple Common Crypto's SHA-1 implementation. In 6beb2688d33 (fsmonitor: relocate socket file if .git directory is remote, 2022-10-04) the build was broken with "DC_SHA1=YesPlease" (and probably other non-"APPLE_COMMON_CRYPTO" SHA-1 backends). So let's extract the fix for this from [1] to get the build working again with "DC_SHA1=YesPlease". In addition to the fix in [1] we also need to replace "SHA_DIGEST_LENGTH" with "GIT_MAX_RAWSZ". 1. https://lore.kernel.org/git/c085fc15b314abcb5e5ca6b4ee5ac54a28327cab.1665326258.git.gitgitgadget@gmail.com/ Signed-off-by: Eric DeCosta <edecosta@mathworks.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-19	Makefile: force -O0 when compiling with SANITIZE=leak	Jeff King	1	-0/+1
	Compiling with -O2 can interact badly with LSan's leak-checker, causing false positives. Imagine a simplified example like: char *str = allocate_some_string(); if (some_func(str) < 0) die("bad str"); free(str); The compiler may eliminate "str" as a stack variable, and just leave it in a register. The register is preserved through most of the function, including across the call to some_func(), since we'd eventually need to free it. But because die() is marked with NORETURN, the compiler knows that it doesn't need to save registers, and just clobbers it. When die() eventually exits, the leak-checker runs. It looks in registers and on the stack for any reference to the memory allocated by str (which would indicate that it's not leaked), but can't find one. So it reports it as a leak. Neither system is wrong, really. The C standard (mostly section 5.1.2.3) defines an abstract machine, and compilers are allowed to modify the program as long as the observable behavior of that abstract machine is unchanged. Looking at random memory values on the stack is undefined behavior, and not something that the optimizer needs to support. But there really isn't any other way for a leak checker to work; it inherently has to do undefined things like scouring memory for pointers. So the two things are inherently at odds with each other. We can't fix it by changing the code, because from the perspective of the program running in an abstract machine, there is no leak. This has caused real false positives in the past, like: - https://lore.kernel.org/git/patch-v3-5.6-9a44204c4c9-20211022T175227Z-avarab@gmail.com/ - https://lore.kernel.org/git/Yy4eo6500C0ijhk+@coredump.intra.peff.net/ - https://lore.kernel.org/git/Y07yeEQu+C7AH7oN@nand.local/ This patch makes those go away by forcing -O0 when compiling with LSan. There are a few ways we could do this: - we could just teach the linux-leaks CI job to set -O0. That's the smallest change, and means we wouldn't get spurious CI failures. But it doesn't help people looking for leaks manually or in a specific test (and because the problem depends on the vagaries of the optimizer, investigating these can waste a lot of time in head-scratching as the problem comes and goes) - we default to -O2 in CFLAGS; we could pull this out to a separate variable ("-O$(O)" or something) and modify "O" when LSan is in use. This is the most flexible, in that you could still build with "make O=2 SANITIZE=leak" if you really wanted to (say, for experimenting). But it would also fail to kick in if the user defines their own CFLAGS variable, which again leads to head-scratching. - we can just stick -O0 into BASIC_CFLAGS when enabling LSan. Since this comes after the user-provided CFLAGS, it will override any previous -O setting found there. This is more foolproof, albeit less flexible. If you want to experiment with an optimized leak-checking build, you'll have to put "-O2 -fsanitize=leak" into CFLAGS manually, rather than using our SANITIZE=leak Makefile magic. Since the final one is the least likely to break in normal use, this patch uses that approach. The resulting build is a little slower, of course, but since LSan is already about 2x slower than a regular build, another 10% slowdown isn't that big a deal. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	repack: don't remove .keep packs with `--pack-kept-objects`	Taylor Blau	3	-1/+34
	`git repack` supports a `--pack-kept-objects` flag which more or less translates to whether or not we pass `--honor-pack-keep` down to `git pack-objects` when assembling a new pack. This behavior has existed since ee34a2bead (repack: add `repack.packKeptObjects` config var, 2014-03-03). In that commit, the documentation was extended to say: [...] Note that we still do not delete `.keep` packs after `pack-objects` finishes. Unfortunately, this is not the case when `--pack-kept-objects` is combined with a `--geometric` repack. When doing a geometric repack, we include `.keep` packs when enumerating available packs only when `pack_kept_objects` is set. So this all works fine when `--no-pack-kept-objects` (or similar) is given. Kept packs are excluded from the geometric roll-up, so when we go to delete redundant packs (with `-d`), no `.keep` packs appear "below the split" in our geometric progression. But when `--pack-kept-objects` is given, things can go awry. Namely, when a kept pack is included in the list of packs tracked by the `pack_geometry` struct and part of the pack roll-up, we will delete the `.keep` pack when we shouldn't. Note that this doesn't result in object corruption, since the `.keep` pack's objects are still present in the new pack. But the `.keep` pack itself is removed, which violates our promise from back in ee34a2bead. But there's more. Because `repack` computes the geometric roll-up independently from selecting which packs belong in a MIDX (with `--write-midx`), this can lead to odd behavior. Consider when a `.keep` pack appears below the geometric split (ie., its objects will be part of the new pack we generate). We'll write a MIDX containing the new pack along with the existing `.keep` pack. But because the `.keep` pack appears below the geometric split line, we'll (incorrectly) try to remove it. While this doesn't corrupt the repository, it does cause us to remove the MIDX we just wrote, since removing that pack would invalidate the new MIDX. Funny enough, this behavior became far less noticeable after e4d0c11c04 (repack: respect kept objects with '--write-midx -b', 2021-12-20), which made `pack_kept_objects` be enabled by default only when we were writing a non-MIDX bitmap. But e4d0c11c04 didn't resolve this bug, it just made it harder to notice unless callers explicitly passed `--pack-kept-objects`. The solution is to avoid trying to remove `.keep` packs during `--geometric` repacks, even when they appear below the geometric split line, which is the approach this patch implements. Co-authored-by: Victoria Dye <vdye@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	ll-merge: mark unused parameters in callbacks	Jeff King	1	-9/+9
	We have a generic ll_merge_fn, but not every implementation needs every parameter. In particular, neither binary nor ext merges care about names (since they do not generate conflict markers), and most do not need to look at the ll_merge_driver itself. Ironically, neither ll_xdl_merge() nor ll_union_merge() needs to have their driver parameter annotated (even though both are named drv_unused!). This is because they may fall back to calling ll_binary_merge() directly. And even though that function won't look at it, we still pass it along, and hence it is "used" in the caller. We could get away with passing NULL, but that's likely more confusing and brittle than just passing along our own driver. And we have to keep the driver parameter in all callbacks, since ll_ext_merge() uses it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	diffcore-pickaxe: mark unused parameters in pickaxe functions	Jeff King	1	-2/+2
	We have a virtual pickaxe_fn for handling -G versus -S pickaxe options. They need to take the same set of parameters, but of course they care about different ones (e.g., a regex -G will never use a kwset). Mark the unused ones to appease -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	convert: mark unused parameter in null stream filter	Jeff King	1	-2/+2
	The null stream filter unsurprisingly does not look at its "filter" argument, since it just eats bytes. But we can't drop it, since it has to conform to the same virtual interface that real filters do. Mark the unused parameter to appease -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	apply: mark unused parameters in noop error/warning routine	Jeff King	1	-1/+1
	We squelch error/warning output by passing a noop handler to set_error_routine(). We need to tell the compiler that this is intended so that it doesn't trigger -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	apply: mark unused parameters in handlers	Jeff King	1	-8/+8
	In parse_git_diff_header(), we have a table-driven parser that maps strings to handler functions. Not all handlers need all of the parameters; let's mark the unused ones to appease -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	date: mark unused parameters in handler functions	Jeff King	1	-3/+3
	When parsing approxidates, we use a table to map special strings (like "noon") to functions which handle them. Not all functions need the "now" parameter, as they are not relative (e.g., "yesterday" does, but "pm" does not). Let's annotate those to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	string-list: mark unused callback parameters	Jeff King	7	-7/+8
	String-lists may be used with callbacks for clearing or iteration. These callbacks need to conform to a particular interface, even though not every callback needs all of its parameters. Mark the unused ones to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	object-file: mark unused parameters in hash_unknown functions	Jeff King	1	-5/+10
	The 0'th entry of our hash_algos array fills out the virtual methods with a series of functions which simply BUG(). This is the right thing to do, since the point is to catch use of an invalid algo parameter, but we need to annotate them to appease -Wunused-parameters. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	mark unused parameters in trivial compat functions	Jeff King	3	-8/+12
	When a platform feature isn't available or in use, we sometimes conditionally compile empty or trivial functions to turn these into noops. We need to annotate their parameters so that -Wunused-parameters won't complain about them. Note that there are many more of these in compat/mingw.h, but we'll leave them for now, as there's some trickery required to get the UNUSED macro available there. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	update-index: drop unused argc from do_reupdate()	Jeff King	1	-3/+3
	The parse-options callback for --again soaks up all remaining options by manipulating the parse_opt_ctx's argc and argv fields. Even though it has to look at both, the actual parsing happens via the do_reupdate() helper, which only looks at the argv half (by passing it along to parse_pathspec). So that helper doesn't need to see argc at all. Note that the helper does look at "argv + 1" without confirming that argc is greater than 0. We know this is correct because it is skipping past the actual "--again" string, which will always be present. However, to make what's going on more obvious, let's move that "+1" into the caller, which has the matching "-1" when fixing up the ctx's argc/argv. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	submodule--helper: drop unused argc from module_list_compute()	Jeff King	1	-9/+9
	The module_list_compute() function takes an argc/argv pair, but never looks at argc. This is OK, as the NULL terminator in argv is sufficient for our purposes (we feed it to parse_pathspec(), which takes only the array, not a count). Note that one of the callers _looks_ like it would be buggy, but isn't: we pass 0/NULL for argc/argv from module_foreach(), so finding the terminating NULL in that argv naively would segfault. However, parse_pathspec() is smart enough to interpret a bare NULL as an empty argv. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-18	diffstat_consume(): assert non-zero length	Jeff King	1	-0/+3
	The callback interface for xdiff_emit_line_fn gives us a line/len pair, but diffstat_consume() never looks at "len". At first glance this seems like a bug that could cause us to read further than xdiff intends. But in practice, we read only the first character, and xdiff would never pass us an empty line. Let's add a run-time assertion that this is true, which clarifies our assumption and silences -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-17	The third batch	Junio C Hamano	1	-0/+34
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-14	t1002: modernize outdated conditional	Nsengiyumva Wilberforce	1	-8/+8
	Tests in this script use an unusual and hard to reason about conditional construct if expression; then false; else :; fi Change them to use more idiomatic construct: ! expression Cc: Christian Couder <christian.couder@gmail.com> Cc: Hariom Verma <hariom18599@gmail.com> Signed-off-by: Nsengiyumva Wilberforce <nsengiyumvawilberforce@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13	pack-bitmap-write.c: instrument number of reused bitmaps	Taylor Blau	1	-1/+7
	When debugging bitmap generation performance, it is useful to know how many bitmaps were generated from scratch, and how many were the result of permuting the bit-order of an existing bitmap. Keep track of the latter, and emit the count as a trace2_data line to aid in debugging. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13	midx.c: instrument MIDX and bitmap generation with trace2 regions	Taylor Blau	1	-0/+28
	When debugging MIDX and MIDX-bitmap related issues, it is useful to figure out where Git is spending its time. GitHub has been using the below trace2 regions to instrument various components of generating a MIDX itself, as well time spent preparing to build a MIDX bitmap. These are limited to instrumenting the following functions: - midx.c::find_commits_for_midx_bitmap() - midx.c::midx_pack_order() - midx.c::prepare_midx_packing_data() - midx.c::write_midx_bitmap() - midx.c::write_midx_internal() - midx.c::write_midx_reverse_index() to start and end with a trace2_region_enter() and trace2_region_leave(), respectively. The category for all of these is "midx", which matches the existing convention. The region description matches the name of the function. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13	midx.c: consider annotated tags during bitmap selection	Taylor Blau	2	-0/+28
	When generating a multi-pack bitmap without a `--refs-snapshot` (e.g., by running `git multi-pack-index write --bitmap` directly), we determine the set of bitmap-able commits by enumerating each reference, and adding the referrent as the tip of a reachability traversal when it appears somewhere in the MIDX. (Any commit we encounter during the reachability traversal then becomes a candidate for bitmap selection). But we incorrectly avoid peeling the object at the tip of each reference. So if we see some reference that points at an annotated tag (which in turn points through zero or more additional annotated tags at a commit), that we will not add it as a tip for the reachability traversal. This means that if some commit C is only referenced through one or more annotated tag(s), then C won't become a bitmap candidate. Correct this by peeling the reference tips as we enumerate them to ensure that we consider commits which are the targets of annotated tags, in addition to commits which are referenced directly. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13	midx.c: fix whitespace typo	Taylor Blau	1	-1/+1
	This was unintentionally introduced via 893b563505 (midx: inline nth_midxed_pack_entry(), 2021-09-11) where "struct repository r" became "struct repository r". The latter does not adhere to our usual style conventions, so fix that up to look more like our usual declarations. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-13	config: respect includes in protected config	Glen Choo	3	-22/+26
	Protected config is implemented by reading a fixed set of paths, which ignores config [include]-s. Replace this implementation with a call to config_with_options(), which handles [include]-s and saves us from duplicating the logic of 1) identifying which paths to read and 2) reading command line config. As a result, git_configset_add_parameters() is unused, so remove it. It was introduced alongside protected config in 5b3c650777 (config: learn `git_protected_config()`, 2022-07-14) as a way to handle command line config. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: remove "max_processes", add "const" to signal() handler	Ævar Arnfjörð Bjarmason	1	-11/+26
	As with the *_fn members removed in a preceding commit, let's not copy the "processes" member of the "struct run_process_parallel_opts" over to the "struct parallel_processes". In this case we need the number of processes for the kill_children() function, which will be called from a signal handler. To do that adjust this code added in c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15) so that we use a dedicated "struct parallel_processes_for_signal" for passing data to the signal handler, in addition to the "struct parallel_process" it'll now have access to our "opts" variable. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: pass "opts" further down, and use "opts->processes"	Ævar Arnfjörð Bjarmason	1	-8/+11
	Continue the migration away from the "max_processes" member of "struct parallel_processes" to the "processes" member of the "struct run_process_parallel_opts", in this case we needed to pass the "opts" further down into pp_cleanup() and pp_buffer_stderr(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: use "opts->processes", not "pp->max_processes"	Ævar Arnfjörð Bjarmason	1	-7/+9
	Neither the "processes" nor "max_processes" members ever change after their initialization, and they're always equivalent, but some existing code used "pp->max_processes" when we were already passing the "opts" to the function, let's use the "opts" directly instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: don't copy "data" to "struct parallel_processes"	Ævar Arnfjörð Bjarmason	1	-6/+3
	As with the *_fn members removed in a preceding commit, let's not copy the "data" member of the "struct run_process_parallel_opts" over to the "struct parallel_processes". Now that we're passing the "opts" down there's no reason to do so. This makes the code easier to follow, as we have a "const" attribute on the "struct run_process_parallel_opts", but not "struct parallel_processes". We do not alter the "ungroup" argument, so storing it in the non-const structure would make this control flow less obvious. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: don't copy "ungroup" to "struct parallel_processes"	Ævar Arnfjörð Bjarmason	1	-10/+8
	As with the *_fn members removed in the preceding commit, let's not copy the "ungroup" member of the "struct run_process_parallel_opts" over to the "struct parallel_processes". Now that we're passing the "opts" down there's no reason to do so. This makes the code easier to follow, as we have a "const" attribute on the "struct run_process_parallel_opts", but not "struct parallel_processes". We do not alter the "ungroup" argument, so storing it in the non-const structure would make this control flow less obvious. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: don't copy *_fn to "struct parallel_processes"	Ævar Arnfjörð Bjarmason	1	-42/+25
	The only remaining reason for copying the callbacks in the "struct run_process_parallel_opts" over to the "struct parallel_processes" was to avoid two if/else statements in case the "start_failure" and "task_finished" callbacks were NULL. Let's handle those cases in pp_start_one() and pp_collect_finished() instead, and avoid the default_* stub functions, and the need to copy this data around. Organizing the code like this made more sense before the "struct run_parallel_parallel_opts" existed, as we'd have needed to pass each of these as a separate parameter. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: make "struct parallel_processes" const if possible	Ævar Arnfjörð Bjarmason	1	-2/+2
	Add a "const" to two "struct parallel_processes" parameters where we're not modifying anything in "pp". For kill_children() we'll call it from both the signal handler, and from run_processes_parallel() itself. Adding a "const" there makes it clear that we don't need to modify any state when killing our children. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command API: move *_tr2() users to "run_processes_parallel()"	Ævar Arnfjörð Bjarmason	5	-37/+36
	Have the users of the "run_processes_parallel_tr2()" function use "run_processes_parallel()" instead. In preceding commits the latter was refactored to take a "struct run_process_parallel_opts" argument, since the only reason for "run_processes_parallel_tr2()" to exist was to take arguments that are now a part of that struct we can do away with it. See ee4512ed481 (trace2: create new combined trace facility, 2019-02-22) for the addition of the "*_tr2()" variant of the function, it was used by every caller except "t/helper/test-run-command.c".. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command API: have run_process_parallel() take an "opts" struct	Ævar Arnfjörð Bjarmason	4	-59/+121
	As noted in fd3aaf53f71 (run-command: add an "ungroup" option to run_process_parallel(), 2022-06-07) which added the "ungroup" passing it to "run_process_parallel()" via the global "run_processes_parallel_ungroup" variable was a compromise to get the smallest possible regression fix for "maint" at the time. This follow-up to that is a start at passing that parameter and others via a new "struct run_process_parallel_opts", as the earlier version[1] of what became fd3aaf53f71 did. Since we need to change all of the occurrences of "n" to "opt->SOMETHING" let's take the opportunity and rename the terse "n" to "processes". We could also have picked "max_processes", "jobs", "threads" etc., but as the API is named "run_processes_parallel()" let's go with "processes". Since the new "run_processes_parallel()" function is able to take an optional "tr2_category" and "tr2_label" via the struct we can at this point migrate all of the users of "run_processes_parallel_tr2()" over to it. But let's not migrate all the API users yet, only the two users that passed the "ungroup" parameter via the "run_processes_parallel_ungroup" global 1. https://lore.kernel.org/git/cover-v2-0.8-00000000000-20220518T195858Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command.c: use designated init for pp_init(), add "const"	Ævar Arnfjörð Bjarmason	1	-20/+14
	Use a designated initializer to initialize those parts of pp_init() that don't need any conditionals for their initialization, this sets us on a path to pp_init() itself into mostly a validation and allocation function. Since we're doing that we can add "const" to some of the members of the "struct parallel_processes", which helps to clarify and self-document this code. E.g. we never alter the "data" pointer we pass t user callbacks, nor (after the preceding change to stop invoking online_cpus()) do we change "max_processes", the same goes for the "ungroup" option. We can also do away with a call to strbuf_init() in favor of macro initialization, and to rely on other fields being NULL'd or zero'd. Making members of a struct "const" rather that the pointer to the struct itself is usually painful, as e.g. it precludes us from incrementally setting up the structure. In this case we only set it up with the assignment in run_process_parallel() and pp_init(), and don't pass the struct pointer around as "const", so making individual members "const" is worth the potential hassle for extra safety. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command API: don't fall back on online_cpus()	Ævar Arnfjörð Bjarmason	4	-5/+14
	When a "jobs = 0" is passed let's BUG() out rather than fall back on online_cpus(). The default behavior was added when this API was implemented in c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15). Most of our code in-tree that scales up to "online_cpus()" by default calls that function by itself. Keeping this default behavior just for the sake of two callers means that we'd need to maintain this one spot where we're second-guessing the config passed down into pp_init(). The preceding commit has an overview of the API callers that passed "jobs = 0". There were only two of them (actually three, but they resolved to these two config parsing codepaths). The "fetch.parallel" caller already had a test for the "fetch.parallel=0" case added in 0353c688189 (fetch: do not run a redundant fetch from submodule, 2022-05-16), but there was no such test for "submodule.fetchJobs". Let's add one here. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command API: make "n" parameter a "size_t"	Ævar Arnfjörð Bjarmason	2	-26/+20
	Make the "n" variable added in c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15) a "size_t". As we'll see in a subsequent commit we do pass "0" here, but never "jobs < 0". We could have made it an "unsigned int", but as we're having to change this let's not leave another case in the codebase where a size_t and "unsigned int" size differ on some platforms. In this case it's likely to never matter, but it's easier to not need to worry about it. After this and preceding changes: make run-command.o DEVOPTS=extra-all CFLAGS=-Wno-unused-parameter Only has one (and new) -Wsigned-compare warning relevant to a comparison about our "n" or "{nr,max}_processes": About using our "n" (size_t) in the same expression as online_cpus() (int). A subsequent commit will adjust & deal with online_cpus() and that warning. The only users of the "n" parameter are: * builtin/fetch.c: defaults to 1, reads from the "fetch.parallel" config. As seen in the code that parses the config added in d54dea77dba (fetch: let --jobs=<n> parallelize --multiple, too, 2019-10-05) will die if the git_config_int() return value is < 0. It will however pass us n = 0, as we'll see in a subsequent commit. * submodule.c: defaults to 1, reads from "submodule.fetchJobs" config. Read via code originally added in a028a1930c6 (fetching submodules: respect `submodule.fetchJobs` config option, 2016-02-29). It now piggy-backs on the the submodule.fetchJobs code and validation added in f20e7c1ea24 (submodule: remove submodule.fetchjobs from submodule-config parsing, 2017-08-02). Like builtin/fetch.c it will die if the git_config_int() return value is < 0, but like builtin/fetch.c it will pass us n = 0. * builtin/submodule--helper.c: defaults to 1. Read via code originally added in 2335b870fa7 (submodule update: expose parallelism to the user, 2016-02-29). Since f20e7c1ea24 (submodule: remove submodule.fetchjobs from submodule-config parsing, 2017-08-02) it shares a config parser and semantics with the submodule.c caller. * hook.c: hardcoded to 1, see 96e7225b310 (hook: add 'run' subcommand, 2021-12-22). * t/helper/test-run-command.c: can be -1 after parsing the arguments, but will then be overridden to online_cpus() before passing it to this API. See be5d88e1128 (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command tests: use "return", not "exit"	Ævar Arnfjörð Bjarmason	1	-11/+22
	Change the "run-command" test helper to "return" instead of calling "exit", see 338abb0f045 (builtins + test helpers: use return instead of exit() in cmd_*, 2021-06-08) Because we'd previously gotten past the SANITIZE=leak check by using exit() here we need to move to "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command API: have "run_processes_parallel{,_tr2}()" return void	Ævar Arnfjörð Bjarmason	4	-41/+35
	Change the "run_processes_parallel{,_tr2}()" functions to return void, instead of int. Ever since c553c72eed6 (run-command: add an asynchronous parallel child processor, 2015-12-15) they have unconditionally returned 0. To get a "real" return value out of this function the caller needs to get it via the "task_finished_fn" callback, see the example in hook.c added in 96e7225b310 (hook: add 'run' subcommand, 2021-12-22). So the "result = " and "if (!result)" code added to "builtin/fetch.c" d54dea77dba (fetch: let --jobs=<n> parallelize --multiple, too, 2019-10-05) has always been redundant, we always took that "if" path. Likewise the "ret =" in "t/helper/test-run-command.c" added in be5d88e1128 (test-tool run-command: learn to run (parts of) the testsuite, 2019-10-04) wasn't used, instead we got the return value from the "if (suite.failed.nr > 0)" block seen in the context. Subsequent commits will alter this API interface, getting rid of this always-zero return value makes it easier to understand those changes. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-12	run-command test helper: use "else if" pattern	Ævar Arnfjörð Bjarmason	1	-8/+7
	Adjust the cmd__run_command() to use an "if/else if" chain rather than mutually exclusive "if" statements. This non-functional change makes a subsequent commit smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11	The second batch	Junio C Hamano	1	-0/+16
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11	CodingGuidelines: recommend against unportable C99 struct syntax	Ævar Arnfjörð Bjarmason	1	-0/+5
	Per 33665d98e6b (reftable: make assignments portable to AIX xlc v12.01, 2022-03-28) forms like ".a.b = c" can be replaced by using ".a = { .b = c }" instead. We'll probably allow these sooner than later, but since the workaround is trivial let's note it among the C99 features we'd like to hold off on for now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-10-11	grep.c: remove "extended" in favor of "pattern_expression", fix segfault	Ævar Arnfjörð Bjarmason	3	-9/+16
	Since 79d3696cfb4 (git-grep: boolean expression on pattern matching., 2006-06-30) the "pattern_expression" member has been used for complex queries (AND/OR...), with "pattern_list" being used for the simple OR queries. Since then we've used both "pattern_expression" and its associated boolean "extended" member to see if we have a complex expression. Since f41fb662f57 (revisions API: have release_revisions() release "grep_filter", 2022-04-13) we've had a subtle bug relating to that: If we supplied options that were only used for "complex queries", but didn't supply the query itself we'd set "opt->extended", but would have a NULL "pattern_expression". As a result these would segfault as we tried to call "free_grep_patterns()" from "release_revisions()": git -P log -1 --invert-grep git -P log -1 --all-match The root cause of this is that we were conflating the state management we needed in "compile_grep_patterns()" itself with whether or not we had an "opt->pattern_expression" later on. In this cases as we're going through "compile_grep_patterns()" we have no "opt->pattern_list" but have "opt->no_body_match" or "opt->all_match". So we'd set "opt->extended = 1", but not "return" on "opt->extended" as that's an "else if" in the same "if" statement. That behavior is intentional and required, as the common case is that we have an "opt->pattern_list" that we're about to parse into the "opt->pattern_expression". But we don't need to keep track of this "extended" flag beyond the state management in compile_grep_patterns() itself. It needs it, but once we're out of that function we can rely on "opt->pattern_expression" being non-NULL instead for using these extended patterns. As 79d3696cfb4 itself shows we've assumed that there's a one-to-one mapping between the two since the very beginning. I.e. "match_line()" would check "opt->extended" to see if it should call "match_expr()", and the first thing we do in that function is assume that we have a "opt->pattern_expression". We'd then call "match_expr_eval()", which would have died if that "opt->pattern_expression" was NULL. The "die" was added in c922b01f54c (grep: fix segfault when "git grep '('" is given, 2009-04-27), and can now be removed as it's now clearly unreachable. We still do the right thing in the case that prompted that fix: git grep '(' fatal: unmatched parenthesis Arguably neither the "--invert-grep" option added in [1] nor the earlier "--all-match" option added in [2] were intended to be used stand-alone, and another approach[3] would be to error out in those cases. But since we've been treating them as a NOOP when given without --grep for a long time let's keep doing that. We could also return in "free_pattern_expr()" if the argument is non-NULL, as an alternative fix for this segfault does [4]. That would be more elegant in making the "free_*()" function behave like "free()", but it would also remove a sanity check: The "free_pattern_expr()" function calls itself recursively, and only the top-level is allowed to be NULL, let's not conflate those two conditions. 1. 22dfa8a23de (log: teach --invert-grep option, 2015-01-12) 2. 0ab7befa31d (grep --all-match, 2006-09-27) 3. https://lore.kernel.org/git/patch-1.1-f4b90799fce-20221010T165711Z-avarab@gmail.com/ 4. http://lore.kernel.org/git/7e094882c2a71894416089f894557a9eae07e8f8.1665423686.git.me@ttaylorr.com Reported-by: orygaw <orygaw@protonmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>