git - git

	Commit message (Collapse)	Author	Files	Lines
2021-09-10	The fifth batch	Junio C Hamano	1	-0/+61
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-objects: rename .idx files into place after .bitmap files	Ævar Arnfjörð Bjarmason	1	-1/+2
	In preceding commits the race of renaming .idx files in place before .rev files and other auxiliary files was fixed in pack-write.c's finish_tmp_packfile(), builtin/repack.c's "struct exts", and builtin/index-pack.c's final(). As noted in the change to pack-write.c we left in place the issue of writing .bitmap files after the .idx, let's fix that issue. See 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21) for commentary at the time when .bitmap was implemented about how those files are written out, nothing in that commit contradicts what's being done here. Note that this commit and preceding ones only close any race condition with .idx files being written before their auxiliary files if we're optimistic about our lack of fsync()-ing in this are not tripping us over. See the thread at [1] for a rabbit hole of various discussions about filesystem races in the face of doing and not doing fsync() (and if doing fsync(), not doing it properly). We may want to fsync the containing directory once after renaming the *.idx file into place, but that is outside the scope of this series. 1. https://lore.kernel.org/git/8735qgkvv1.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-write: split up finish_tmp_packfile() function	Ævar Arnfjörð Bjarmason	4	-13/+39
	Split up the finish_tmp_packfile() function and use the split-up version in pack-objects.c in preparation for moving the step of renaming the .idx file later as part of a function change. Since the only other caller of finish_tmp_packfile() was in bulk-checkin.c, and it won't be needing a change to its .idx renaming, provide a thin wrapper for the old function as a static function in that file. If other callers end up needing the simpler version it could be moved back to "pack-write.c" and "pack.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	builtin/index-pack.c: move `.idx` files into place last	Taylor Blau	1	-2/+2
	In a similar spirit as preceding patches to `git repack` and `git pack-objects`, fix the identical problem in `git index-pack`. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	index-pack: refactor renaming in final()	Ævar Arnfjörð Bjarmason	1	-25/+23
	Refactor the renaming in final() into a helper function, this is similar in spirit to a preceding refactoring of finish_tmp_packfile() in pack-write.c. Before e37d0b8730b (builtin/index-pack.c: write reverse indexes, 2021-01-25) it probably wasn't worth it to have this sort of helper, due to the differing "else if" case for "pack" files v.s. "idx" files. But since we've got "rev" as well now, let's do the renaming via a helper, this is both a net decrease in lines, and improves the readability, since we can easily see at a glance that the logic for writing these three types of files is exactly the same, aside from the obviously differing cases of "*final_name" being NULL, and "make_read_only_if_same" being different. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	builtin/repack.c: move `.idx` files into place last	Taylor Blau	1	-1/+1
	In a similar spirit as the previous patch, fix the identical problem from `git repack` (which invokes `pack-objects` with a temporary location for output, and then moves the files into their final locations itself). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-write.c: rename `.idx` files after `*.rev`	Taylor Blau	1	-1/+1
	We treat the presence of an `.idx` file as the indicator of whether or not it's safe to use a packfile. But `finish_tmp_packfile()` (which is responsible for writing the pack and moving the temporary versions of all of its auxiliary files into place) is inconsistent about the write order. Specifically, it moves the `.rev` file into place after the `.idx`, leaving open the possibility to open a pack which looks "ready" (because the `.idx` file exists and is readable) but appears momentarily to not have a `.rev` file. This causes Git to fall back to generating the pack's reverse index in memory. Though racy, this amounts to an unnecessary slow-down at worst, and doesn't affect the correctness of the resulting reverse index. Close this race by moving the .rev file into place before moving the .idx file into place. This still leaves the issue of `.idx` files being renamed into place before the auxiliary `.bitmap` file is renamed when in pack-object.c's write_pack_file() "write_bitmap_index" is true. That race will be addressed in subsequent commits. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-write: refactor renaming in finish_tmp_packfile()	Ævar Arnfjörð Bjarmason	3	-24/+23
	Refactor the renaming in finish_tmp_packfile() into a helper function. The callers are now expected to pass a "name_buffer" ending in "pack-OID." instead of the previous "pack-", we then append "pack", "idx" or "rev" to it. By doing the strbuf_setlen() in rename_tmp_packfile() we reuse the buffer and avoid the repeated allocations we'd get if that function had its own temporary "struct strbuf". This approach of reusing the buffer does make the last user in pack-object.c's write_pack_file() slightly awkward, since we needlessly do a strbuf_setlen() before calling strbuf_release() for consistency. In subsequent changes we'll move that bitmap writing code around, so let's not skip the strbuf_setlen() now. The previous strbuf_reset() idiom originated with 5889271114a (finish_tmp_packfile():use strbuf for pathname construction, 2014-03-03), which in turn was a minimal adjustment of pre-strbuf code added in 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	bulk-checkin.c: store checksum directly	Taylor Blau	1	-6/+6
	finish_bulk_checkin() stores the checksum from finalize_hashfile() by writing to the `hash` member of `struct object_id`, but that hash has nothing to do with an object id (it's just a convenient location that happens to be sized correctly). Store the hash directly in an unsigned char array. This behaves the same as writing to the `hash` member, but makes the intent clearer (and avoids allocating an extra four bytes for the `algo` member of `struct object_id`). It likewise prevents the possibility of a segfault when reading `algo` (e.g., by calling `oid_to_hex()`) if it is uninitialized. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-bitmap: drop bitmap_index argument from try_partial_reuse()	Jeff King	1	-3/+2
	Starting in commit 0f533c7284 (pack-bitmap: read multi-pack bitmaps, 2021-08-31), we no longer look at the "struct bitmap_index" passed to try_partial_reuse(). This is because we only handle verbatim reuse from a single pack: either the pack whose bitmap we're looking at, or the "preferred" pack of a midx bitmap. And thus the primary item we look at is the "pack" parameter added by that same commit, and not the bitmap_git->pack parameter (which would be NULL for a midx bitmap). It's our caller, reuse_partial_packfile_from_bitmap(), which decides which pack to use and passes it in to us. Drop the unused parameter to prevent confusion. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-10	pack-bitmap: drop repository argument from prepare_midx_bitmap_git()	Jeff King	3	-5/+3
	We never look at the repository argument which is passed. This makes sense, since the multi_pack_index struct already tells us everything we need to access the files in its associated object directory. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09	pack.h: line-wrap the definition of finish_tmp_packfile()	Ævar Arnfjörð Bjarmason	1	-1/+6
	Line-wrap the definition of finish_tmp_packfile() to make subsequent changes easier to read. See 0e990530ae (finish_tmp_packfile(): a helper function, 2011-10-28) for the commit that introduced this overly long line. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09	entry: show finer-grained counter in "Filtering content" progress line	SZEDER Gábor	1	-7/+5
	The "Filtering content" progress in entry.c:finish_delayed_checkout() is unusual because of how it calculates the progress count and because it shows the progress of a nested loop. It works basically like this: start_delayed_progress(p, nr_of_paths_to_filter) for_each_filter { display_progress(p, nr_of_paths_to_filter - nr_of_paths_still_left_to_filter) for_each_path_handled_by_the_current_filter { checkout_entry() } } stop_progress(p) There are two issues with this approach: - The work done by the last filter (or the only filter if there is only one) is never counted, so if the last filter still has some paths to process, then the counter shown in the "done" progress line will not match the expected total. The partially-RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] helps spot this issue. Under it the 'missing file in delayed checkout' and 'invalid file in delayed checkout' tests in 't0021-conversion.sh' fail, because both use only one filter. (The test 'delayed checkout in process filter' uses two filters but the first one does all the work, so that test already happens to succeed even with GIT_TEST_CHECK_PROGRESS=1.) - The progress counter is updated only once per filter, not once per processed path, so if a filter has a lot of paths to process, then the counter might stay unchanged for a long while and then make a big jump (though the user still gets a sense of progress, because we call display_throughput() after each processed path to show the amount of processed data). Move the display_progress() call to the inner loop, right next to that checkout_entry() call that does the hard work for each path, and use a dedicated counter variable that is incremented upon processing each path. After this change the 'invalid file in delayed checkout' in 't0021-conversion.sh' would succeed with the GIT_TEST_CHECK_PROGRESS=1 assertion discussed above, but the 'missing file in delayed checkout' test would still fail. It'll fail because its purposefully buggy filter doesn't process any paths, so we won't execute that inner loop at all, see [2] for how to spot that issue without GIT_TEST_CHECK_PROGRESS=1. It's not straightforward to fix it with the current progress.c library (see [3] for an attempt), so let's leave it for now. Let's also initialize the progress to "NULL" while we're at it. Since 7a132c628e5 (checkout: make delayed checkout respect --quiet and --no-progress, 2021-08-26) we have had progress conditional on "show_progress", usually we use the idiom of a "NULL" initialization of the "progress", rather than the more verbose ternary added in 7a132c628e5. 1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/ 2. http://lore.kernel.org/git/20210802214827.GE23408@szeder.dev 3. https://lore.kernel.org/git/20210620200303.2328957-7-szeder.dev@gmail.com/ Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-09	commit-graph: fix bogus counter in "Scanning merged commits" progress line	SZEDER Gábor	1	-1/+1
	The final value of the counter of the "Scanning merged commits" progress line is always one less than its expected total, e.g.: Scanning merged commits: 83% (5/6), done. This happens because while iterating over an array the loop variable is passed to display_progress() as-is, but while C arrays (and thus the loop variable) start at 0 and end at N-1, the progress counter must end at N. Fix this by passing 'i + 1' to display_progress(), like most other callsites do. There's an RFC series to add a GIT_TEST_CHECK_PROGRESS=1 mode[1] which catches this issue in the 'fetch.writeCommitGraph' and 'fetch.writeCommitGraph with submodules' tests in 't5510-fetch.sh'. The GIT_TEST_CHECK_PROGRESS=1 mode is not part of this series, but future changes to progress.c may add it or similar assertions to catch this and similar bugs elsewhere. 1. https://lore.kernel.org/git/20210620200303.2328957-1-szeder.dev@gmail.com/ Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	The fourth batch	Junio C Hamano	1	-0/+33
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	t7814: show lack of alternate ODB-adding	Jonathan Tan	1	-0/+3
	The previous patches have made "git grep" no longer need to add submodule ODBs as alternates, at least for the code paths tested in t7814. Demonstrate this by making adding a submodule ODB as an alternate fatal in this test. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	submodule-config: pass repo upon blob config read	Jonathan Tan	3	-9/+19
	When reading the config of a submodule, if reading from a blob, read using an explicitly specified repository instead of by adding the submodule's ODB as an alternate and then reading an object from the_repository. This makes the "grep --recurse-submodules with submodules without .gitmodules in the working tree" test in t7814 work when GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB is true. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	grep: add repository to OID grep sources	Jonathan Tan	3	-12/+27
	Record the repository whenever an OID grep source is created, and teach the worker threads to explicitly provide the repository when accessing objects. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	grep: allocate subrepos on heap	Jonathan Tan	1	-9/+30
	Currently, struct repository objects corresponding to submodules are allocated on the stack in grep_submodule(). This currently works because they will not be used once grep_submodule() exits, but a subsequent patch will require these structs to be accessible for longer (perhaps even in another thread). Allocate them on the heap and clear them only at the very end. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	grep: read submodule entry with explicit repo	Jonathan Tan	1	-5/+5
	Replace an existing parse_object_or_die() call (which implicitly works on the_repository) with a function call that allows a repository to be passed in. There is no such direct equivalent to parse_object_or_die(), but we only need the type of the object, so replace with oid_object_info(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	grep: typesafe versions of grep_source_init	Jonathan Tan	3	-23/+34
	grep_source_init() can create "struct grep_source" objects and, depending on the value of the type passed, some void-pointer parameters have different meanings. Because one of these types (GREP_SOURCE_OID) will require an additional parameter in a subsequent patch, take the opportunity to increase clarity and type safety by replacing this function with individual functions for each type. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	grep: use submodule-ODB-as-alternate lazy-addition	Jonathan Tan	3	-1/+7
	In the parent commit, Git was taught to add submodule ODBs as alternates lazily, but grep does not use this because it computes the path to add directly, not going through add_submodule_odb(). Add an equivalent to add_submodule_odb() that takes the exact ODB path and teach grep to use it. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	submodule: lazily add submodule ODBs as alternates	Jonathan Tan	4	-1/+41
	Teach Git to add submodule ODBs as alternates to the object store of the_repository only upon the first access of an object not in the_repository, and not when add_submodule_odb() is called. This provides a means of gradually migrating from accessing a submodule's object through alternates to accessing a submodule's object by explicitly passing its repository object. Any Git command can declare that it might access submodule objects by calling add_submodule_odb() (as they do now), but the submodule ODBs themselves will not be added until needed, so individual commands and/or combinations of arguments can be migrated one by one. [The advantage of explicit repository-object passing is code clarity (it is clear which repository an object read is from), performance (there is no need to linearly search through all submodule ODBs whenever an object is accessed from any repository, whether superproject or submodule), and the possibility of future features like partial clone submodules (which right now is not possible because if an object is missing, we do not know which repository to lazy-fetch into).] This commit also introduces an environment variable that a test may set to make the actual registration of alternates fatal, in order to demonstrate that its codepaths do not need this registration. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Reviewed-by: Emily Shaffer <emilyshaffer@google.com> Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	pack-write: skip .rev work when not writing .rev	Ævar Arnfjörð Bjarmason	1	-0/+3
	Fix a performance regression introduced in a587b5a786 (pack-write.c: extract 'write_rev_file_order', 2021-03-30) and stop needlessly allocating the "pack_order" array and sorting it with "pack_order_cmp()", only to throw that work away when we discover that we're not writing .rev files after all. This redundant work was not present in the original version of this code added in 8ef50d9958 (pack-write.c: prepare to write 'pack-.rev' files, 2021-01-25). There we'd call write_rev_file() from e.g. finish_tmp_packfile(), but we'd "return NULL" early in write_rev_file() if not doing a "WRITE_REV" or "WRITE_REV_VERIFY". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-08	hash-object: prefix_filename() returns allocated memory these days	Junio C Hamano	1	-1/+1
	Back when a1be47e4 (hash-object: fix buffer reuse with --path in a subdirectory, 2017-03-20) was written, the prefix_filename() helper used a static piece of memory to the caller, making the caller responsible for copying it, if it wants to keep it across another call to the same function. Two callers of the prefix_filename() in hash-object were made to xstrdup() the value obtained from it. But in the same series, when e4da43b1 (prefix_filename: return newly allocated string, 2017-03-20) changed the rule to gave the caller possession of the memory, we forgot to revert one of the xstrdup() changes, allowing the returned value to leak. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	Documentation: fix default directory of git bugreport -o	Bagas Sanjaya	1	-2/+2
	git bugreport writes bug report to the current directory by default, instead of repository root. Fix the documentation. Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	Makefile: remove the check_bindir script	Ævar Arnfjörð Bjarmason	2	-15/+1
	This script was added in f28ac70f48 (Move all dashed-form commands to libexecdir, 2007-11-28) when commands such as "git-add" lived in the bin directory, instead of the git exec directory. This notice helped someone incorrectly installing version v1.6.0 and later into a directory built for a pre-v1.6.0 git version. We're now long past the point where anyone who'd be helped by this warning is likely to be doing that, so let's just remove this check and warning to simplify the Makefile. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	send-email: fix a "first config key wins" regression in v2.33.0	Ævar Arnfjörð Bjarmason	2	-1/+16
	Fix a regression in my c95e3a3f0b8 (send-email: move trivial config handling to Perl, 2021-05-28) where we'd pick the first config key out of multiple defined ones, instead of using the normal "last key wins" semantics of "git config --get". This broke e.g. cases where a .git/config would have a different sendemail.smtpServer than ~/.gitconfig. We'd pick the ~/.gitconfig over .git/config, instead of preferring the repository-local version. The same would go for /etc/gitconfig etc. The full list of impacted config keys (the %config_settings values which are references to scalars, not arrays) is: sendemail.smtpencryption sendemail.smtpserver sendemail.smtpserverport sendemail.smtpuser sendemail.smtppass sendemail.smtpdomain sendemail.smtpauth sendemail.smtpbatchsize sendemail.smtprelogindelay sendemail.tocmd sendemail.cccmd sendemail.aliasfiletype sendemail.envelopesender sendemail.confirm sendemail.from sendemail.assume8bitencoding sendemail.composeencoding sendemail.transferencoding sendemail.sendmailcmd I.e. having any of these set in say ~/.gitconfig and in-repo .git/config regressed in v2.33.0 to prefer the --global one over the --local. To test this add a test of config priority to one of these config variables, most don't have tests at all, but there was an existing one for sendemail.8bitEncoding. The "git config" (instead of "test_config") is somewhat of an anti-pattern, but follows established conventions in t9001-send-email.sh, likewise with any other pattern or idiom in this test. The populating of home/.gitconfig and setting of HOME= is copied from a test in t0017-env-helper.sh added in 1ff750b128e (tests: make GIT_TEST_GETTEXT_POISON a boolean, 2019-06-21). This test fails without this bugfix, but now it works. Reported-by: Eli Schwartz <eschwartz@archlinux.org> Tested-by: Eli Schwartz <eschwartz@archlinux.org> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	range-diff: avoid segfault with -I	René Scharfe	1	-0/+3
	output() reuses the same struct diff_options for multiple calls of diff_flush(). Set the option no_free to instruct it to keep the ignore regexes between calls and release them explicitly at the end. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	diff-index: restore -c/--cc options handling	Sergey Organov	3	-14/+8
	This fixes 19b2517f (diff-merges: move specific diff-index "-m" handling to diff-index, 2021-05-21). That commit disabled handling of all diff for merges options in diff-index on an assumption that they are unused. However, it later appeared that -c and --cc, even though undocumented and not being covered by tests, happen to have had particular effect on diff-index output. Restore original -c/--cc options handling by diff-index. Signed-off-by: Sergey Organov <sorganov@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: log N parent process names on Linux	Ævar Arnfjörð Bjarmason	1	-17/+132
	In 2f732bf15e6 (tr2: log parent process name, 2021-07-21) we started logging parent process names, but only logged all parents on Windows. on Linux only the name of the immediate parent process was logged. Extend the functionality added there to also log full parent chain on Linux. This requires us to lookup "/proc/<getppid()>/stat" instead of "/proc/<getppid()>/comm". The "comm" file just contains the name of the process, but the "stat" file has both that information, and the parent PID of that process, see procfs(5). We parse out the parent PID of our own parent, and recursively walk the chain of "/proc//stat" files all the way up the chain. A parent PID of 0 indicates the end of the chain. It's possible given the semantics of Linux's PID files that we end up getting an entirely nonsensical chain of processes. It could happen if e.g. we have a chain of processes like: 1 (init) => 321 (bash) => 123 (git) Let's assume that "bash" was started a while ago, and that as shown the OS has already cycled back to using a lower PID for us than our parent process. In the time it takes us to start up and get to trace2_collect_process_info(TRACE2_PROCESS_INFO_STARTUP) our parent process might exit, and be replaced by an entirely different process! We'd racily look up our own getppid(), but in the meantime our parent would exit, and Linux would have cycled all the way back to starting an entirely unrelated process as PID 321. If that happens we'll just silently log incorrect data in our ancestry chain. Luckily we don't need to worry about this except in this specific cycling scenario, as Linux does not have PID randomization. It appears it once did through a third-party feature, but that it was removed around 2006[1]. For anyone worried about this edge case raising PID_MAX via "/proc/sys/kernel/pid_max" will mitigate it, but not eliminate it. One thing we don't need to worry about is getting into an infinite loop when walking "/proc//stat". See 353d3d77f4f (trace2: collect Windows-specific process information, 2019-02-22) for the related Windows code that needs to deal with that, and [2] for an explanation of that edge case. Aside from potential race conditions it's also a bit painful to correctly parse the process name out of "/proc/*/stat". A simpler approach is to use fscanf(), see [3] for an implementation of that, but as noted in the comment being added here it would fail in the face of some weird process names, so we need our own parse_proc_stat() to parse it out. With this patch the "ancestry" chain for a trace2 event might look like this: $ GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version \| grep ancestry \| jq -r .ancestry [ "bash", "screen", "systemd" ] And in the case of naughty process names like the following. This uses perl's ability to use prctl(PR_SET_NAME, ...). See Perl/perl5@7636ea95c5 (Set the legacy process name with prctl() on assignment to $0 on Linux, 2010-04-15)[4]: $ perl -e '$0 = "(naughty\nname)"; system "GIT_TRACE2_EVENT=/dev/stdout ~/g/git/git version"' \| grep ancestry \| jq -r .ancestry [ "sh", "(naughty\nname)", "bash", "screen", "systemd" ] 1. https://grsecurity.net/news#grsec2110 2. https://lore.kernel.org/git/48a62d5e-28e2-7103-a5bb-5db7e197a4b9@jeffhostetler.com/ 3. https://lore.kernel.org/git/87o8agp29o.fsf@evledraar.gmail.com/ 4. https://github.com/Perl/perl5/commit/7636ea95c57762930accf4358f7c0c2dec086b5e Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: do compiler enum check in trace2_collect_process_info()	Ævar Arnfjörð Bjarmason	1	-6/+7
	Change code added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to use a switch statement without a "default" branch to have the compiler error if this code ever drifts out of sync with the members of the "enum trace2_process_info_reason". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: leave the parent list empty upon failure & don't leak memory	Ævar Arnfjörð Bjarmason	1	-3/+5
	In a subsequent commit I'll be replacing most of this code to log N parents, but let's first fix bugs introduced in the recent 2f732bf15e6 (tr2: log parent process name, 2021-07-21). It was using the strbuf_read_file() in the wrong way, its return value is either a length or a negative value on error. If we didn't have a procfs, or otherwise couldn't access it we'd end up pushing an empty string to the trace2 ancestry array. It was also using the strvec_push() API the wrong way. That API always does an xstrdup(), so by detaching the strbuf here we'd leak memory. Let's instead pass in our pointer for strvec_push() to xstrdup(), and then free our own strbuf. I do have some WIP changes to make strvec_push_nodup() non-static, which makes this and some other callsites nicer, but let's just follow the prevailing pattern of using strvec_push() for now. We'll also need to free that "procfs_path" strbuf whether or not strbuf_read_file() succeeds, which was another source of memory leaks in 2f732bf15e6, i.e. we'd leak that memory as well if we weren't on a system where we could read the file from procfs. Let's move all the freeing of the memory to the end of the function. If we're still at STRBUF_INIT with "name" due to not having taken the branch where the strbuf_read_file() succeeds freeing it is redundant. So we could move it into the body of the "if", but just handling freeing the same way for all branches of the function makes it more readable. In combination with the preceding commit this makes all of t[0-9]trace2.sh pass under SANITIZE=leak on Linux. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: stop leaking "thread_name" memory	Ævar Arnfjörð Bjarmason	1	-0/+1
	Fix a memory leak introduced in ee4512ed481 (trace2: create new combined trace facility, 2019-02-22), we were doing a free() of other memory allocated in tr2tls_create_self(), but not the "thread_name" "struct strbuf". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: clarify TRACE2_PROCESS_INFO_EXIT comment under Linux	Ævar Arnfjörð Bjarmason	1	-1/+5
	Rewrite a comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21) to describe what we might do under TRACE2_PROCESS_INFO_EXIT in the future, instead of vaguely referring to "something extra". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	tr2: remove NEEDSWORK comment for "non-procfs" implementations	Ævar Arnfjörð Bjarmason	1	-1/+0
	I'm fairly sure that there is no way on Linux to inspect the process tree without using procfs, any tool such as ps(1), top(1) etc. that shows this sort of information ultimately looks the information up in procfs. So let's remove this comment added in 2f732bf15e6 (tr2: log parent process name, 2021-07-21), it's setting us up for an impossible task. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	bundle: show progress on "unbundle"	Ævar Arnfjörð Bjarmason	2	-1/+8
	The "unbundle" command added in 2e0afafebd8 (Add git-bundle: move objects and references by archive, 2007-02-22) did not show progress output, even though the underlying API learned how to show progress in be042aff24c (Teach progress eye-candy to fetch_refs_from_bundle(), 2011-09-18). Now we'll show "Unbundling objects" using the new --progress-title option to "git index-pack", to go with its existing "Receiving objects" and "Indexing objects" (which it shows when invoked with "--stdin", and with a pack file, respectively). Unlike "git bundle create" we don't handle "--quiet" here, nor "--all-progress" and "--all-progress-implied". Those are all specific to "create" (and "verify", in the case of "--quiet"). The structure of the existing documentation is a bit unclear, e.g. the documentation for the "--quiet" option added in 79862b6b77c (bundle-create: progress output control, 2019-11-10) only describes how it works for "create", and not for "verify". That and other issues in it should be fixed, but I'd like to avoid untangling that mess right now. Let's just support the standard "--no-progress" implicitly here, and leave cleaning up the general behavior of "git bundle" for a later change. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	index-pack: add --progress-title option	Ævar Arnfjörð Bjarmason	2	-0/+12
	Add a --progress-title option to index-pack, when data is piped into index-pack its progress is a proxy for whatever's feeding it data. This option will allow us to set a more relevant progress bar title in "git bundle unbundle", and is also used in my "bundle-uri" RFC patches[1] by a new caller in fetch-pack.c. The code change in cmd_index_pack() won't handle "--progress-title=xyz", only "--progress-title xyz", and the "(i+1)" style (as opposed to "i + 1") is a bit odd. Not using the "--long-option=value" style is inconsistent with existing long options handled by cmd_index_pack(), but makes the code that needs to call it better (two strvec_push(), instead of needing a strvec_pushf()). Since the option is internal-only the inconsistency shouldn't matter. I'm copying the pattern to handle it as-is from the handling of the existing "-o" option in the same function, see 9cf6d3357aa (Add git-index-pack utility, 2005-10-12) for its addition. That's a short option, but the code to implement the two is the same in functionality and style. Eventually we'd like to migrate all of this this to parse_options(), which would make these differences in behavior go away. 1. https://lore.kernel.org/git/RFC-cover-00.13-0000000000-20210805T150534Z-avarab@gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	bundle API: change "flags" to be "extra_index_pack_args"	Ævar Arnfjörð Bjarmason	4	-10/+19
	Since the "flags" parameter was added in be042aff24c (Teach progress eye-candy to fetch_refs_from_bundle(), 2011-09-18) there's never been more than the one flag: BUNDLE_VERBOSE. Let's have the only caller who cares about that pass "-v" itself instead through new "extra_index_pack_args" parameter. The flexibility of being able to pass arbitrary arguments to "unbundle" will be used in a subsequent commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	maintenance: add support for systemd timers on Linux	Lénaïc Huard	3	-11/+330
	The existing mechanism for scheduling background maintenance is done through cron. On Linux systems managed by systemd, systemd provides an alternative to schedule recurring tasks: systemd timers. The main motivations to implement systemd timers in addition to cron are: * cron is optional and Linux systems running systemd might not have it installed. * The execution of `crontab -l` can tell us if cron is installed but not if the daemon is actually running. * With systemd, each service is run in its own cgroup and its logs are tagged by the service inside journald. With cron, all scheduled tasks are running in the cron daemon cgroup and all the logs of the user-scheduled tasks are pretended to belong to the system cron service. Concretely, a user that doesn’t have access to the system logs won’t have access to the log of their own tasks scheduled by cron whereas they will have access to the log of their own tasks scheduled by systemd timer. Although `cron` attempts to send email, that email may go unseen by the user because these days, local mailboxes are not heavily used anymore. In order to schedule git maintenance, we need two unit template files: * ~/.config/systemd/user/git-maintenance@.service to define the command to be started by systemd and * ~/.config/systemd/user/git-maintenance@.timer to define the schedule at which the command should be run. Those units are templates that are parameterized by the frequency. Based on those templates, 3 timers are started: * git-maintenance@hourly.timer * git-maintenance@daily.timer * git-maintenance@weekly.timer The command launched by those three timers are the same as with the other scheduling methods: /path/to/git for-each-repo --exec-path=/path/to --config=maintenance.repo maintenance run --schedule=%i with the full path for git to ensure that the version of git launched for the scheduled maintenance is the same as the one used to run `maintenance start`. The timer unit contains `Persistent=true` so that, if the computer is powered down when a maintenance task should run, the task will be run when the computer is back powered on. Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	maintenance: `git maintenance run` learned `--scheduler=<scheduler>`	Lénaïc Huard	3	-75/+354
	Depending on the system, different schedulers can be used to schedule the hourly, daily and weekly executions of `git maintenance run`: * `launchctl` for MacOS, * `schtasks` for Windows and * `crontab` for everything else. `git maintenance run` now has an option to let the end-user explicitly choose which scheduler he wants to use: `--scheduler=auto\|crontab\|launchctl\|schtasks`. When `git maintenance start --scheduler=XXX` is run, it not only registers `git maintenance run` tasks in the scheduler XXX, it also removes the `git maintenance run` tasks from all the other schedulers to ensure we cannot have two schedulers launching concurrent identical tasks. The default value is `auto` which chooses a suitable scheduler for the system. `git maintenance stop` doesn't have any `--scheduler` parameter because this command will try to remove the `git maintenance run` tasks from all the available schedulers. Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	cache.h: Introduce a generic "xdg_config_home_for(…)" function	Lénaïc Huard	2	-3/+17
	Current implementation of `xdg_config_home(filename)` returns `$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted between the `XDG_CONFIG_HOME` environment variable and the parameter. This patch introduces a `xdg_config_home_for(subdir, filename)` function which is more generic. It only concatenates "$XDG_CONFIG_HOME", or "$HOME/.config" if the former isn’t defined, with the parameters, without adding `git` in between. `xdg_config_home(filename)` is now implemented by calling `xdg_config_home_for("git", filename)` but this new generic function can be used to compute the configuration directory of other programs. Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	test-lib-functions: keep user's debugger config files and TERM in 'debug'	Philippe Blain	2	-16/+50
	The 'debug' function in test-lib-functions.sh is used to invoke a debugger at a specific line in a test. It inherits the value of HOME and TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb. Changing the value of HOME means that any customization configured in a developers' debugger configuration file (like $HOME/.gdbinit or $HOME/.lldbinit) are not available in the debugger invoked by 'test_pause'. Changing the value of TERM to 'dumb' means that colored output is disabled in the debugger. To make the debugging experience with 'debug' more pleasant, leverage the variable USER_HOME, added in the previous commit, to copy a developer's ~/.gdbinit and ~/.lldbinit to the test HOME. We do not set HOME to USER_HOME as in 'test_pause' to avoid user configuration in $USER_HOME/.gitconfig from interfering with the command being debugged. Also, add a flag to launch the debugger with the original value of TERM, and add the same warning as for 'test_pause'. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	test-lib-functions: optionally keep HOME, TERM and SHELL in 'test_pause'	Philippe Blain	3	-5/+53
	The 'test_pause' function, which is designed to help interactive debugging and exploration of tests, currently inherits the value of HOME and TERM set by 'test-lib.sh': HOME="$TRASH_DIRECTORY" and TERM=dumb. It also invokes the shell defined by TEST_SHELL_PATH, which defaults to /bin/sh (through SHELL_PATH). Changing the value of HOME means that any customization configured in a developers' shell startup files and any Git aliases defined in their global Git configuration file are not available in the shell invoked by 'test_pause'. Changing the value of TERM to 'dumb' means that colored output is disabled for all commands in that shell. Using /bin/sh as the shell invoked by 'test_pause' is not ideal since some platforms (i.e. Debian and derivatives) use Dash as /bin/sh, and this shell is usually compiled without readline support, which makes for a poor interactive command line experience. To make the interactive command line experience in the shell invoked by 'test_pause' more pleasant, save the values of HOME and TERM in USER_HOME and USER_TERM before changing them in test-lib.sh, and add options to 'test_pause' to optionally use these variables to invoke the shell. Also add an option to invoke SHELL instead of TEST_SHELL_PATH, so that developer's interactive shell is used. We use options instead of changing the behaviour unconditionally since these three variables can slightly change command behaviour. Moreover, using the original HOME means commands could overwrite files in a user's home directory. Be explicit about these caveats in the new 'Usage' section in test-lib-functions.sh. Finally, add '[options]' to the test_pause synopsys in t/README, and mention that the full list of helper functions and their options can be found in test-lib-functions.sh. Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-07	test-lib-functions: use 'TEST_SHELL_PATH' in 'test_pause'	Philippe Blain	1	-1/+1
	3f824e91c8 (t/Makefile: introduce TEST_SHELL_PATH, 2017-12-08) made it easy to use a different shell for the tests than 'SHELL_PATH' used at compile time. But 'test_pause' still invokes 'SHELL_PATH'. If TEST_SHELL_PATH is set, invoke that shell in 'test_pause' for consistency. Suggested-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-06	make: add INSTALL_STRIP option variable	Bagas Sanjaya	1	-3/+12
	Add $(INSTALL_STRIP), which allows passing stripping options to $(INSTALL). For this to work, installing executables must be split to installing compiled binaries and scripts portions, since $(INSTALL_STRIP) is only meaningful to the former. Users can set this variable depending on their system. For example, Linux users can use `-s --strip-program=strip`, while FreeBSD users can simply set to `-s` and choose strip program with $STRIPBIN. [original outline by Đoàn Trần Công Danh] Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-06	apply: resolve trivial merge without hitting ll-merge with "--3way"	Junio C Hamano	2	-0/+66
	The ll_binary_merge() function assumes that the ancestor blob is different from either side of the new versions, and always fails the merge in conflict, unless -Xours or -Xtheirs is in effect. The normal "merge" machineries all resolve the trivial cases (e.g. if our side changed while their side did not, the result is ours) without triggering the file-level merge drivers, so the assumption is warranted. The code path in "git apply --3way", however, does not check for the trivial three-way merge situation and always calls the file-level merge drivers. This used to be perfectly OK back when we always first attempted a straight patch application and used the three-way code path only as a fallback. Any binary patch that can be applied as a trivial three-way merge (e.g. the patch is based exactly on the version we happen to have) would always cleanly apply, so the ll_binary_merge() that is not prepared to see the trivial case would not have to handle such a case. This no longer is true after we made "--3way" to mean "first try three-way and then fall back to straight application", and made "git apply -3" on a binary patch that is based on the current version no longer apply. Teach "git apply -3" to first check for the trivial merge cases and resolve them without hitting the file-level merge drivers. Signed-off-by: Jerry Zhang <jerry@skydio.com> [jc: stolen tests from Jerry's patch] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03	The third batch	Junio C Hamano	1	-0/+25
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03	developer: enable pedantic by default	Carlo Marcelo Arenas Belón	2	-3/+5
	With the codebase firmly C99 compatible and most compilers supporting newer versions by default, it could help bring visibility to problems. Reverse the DEVOPTS=pedantic flag to provide a fallback for people stuck with gcc < 5 or some other compiler that either doesn't support this flag or has issues with it, and while at it also enable -Wpedantic which used to be controversial[1] when Apple compilers and clang had widely divergent version numbers. Ideally any compiler found to have issues with these flags will be added to an exception, and indeed, one was added to safely process windows headers that would use non standard print identifiers, but it is expected that more will be needed, so it could be considered a weather balloon. [1] https://lore.kernel.org/git/20181127100557.53891-1-carenas@gmail.com/ Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-03	win32: allow building with pedantic mode enabled	Carlo Marcelo Arenas Belón	3	-7/+10
	In preparation to building with pedantic mode enabled, change a couple of places where the current mingw gcc compiler provided with the SDK reports issues. A full fix for the incompatible use of (void *) to store function pointers has been punted, with the minimal change to instead use a generic function pointer (FARPROC), and therefore the (hopefully) temporary need to disable incompatible pointer warnings. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>