summaryrefslogtreecommitdiffstats
path: root/diff-merges.h (unfollow)
Commit message (Collapse)AuthorFilesLines
2022-01-06grep: extract grep_binexp() from grep_or_expr()Taylor Blau1-2/+9
When constructing an OR node, the grep.c code uses `grep_or_expr()` to make a node, assign its kind, and set its left and right children. The same is not done for AND nodes. Prepare to introduce a new `grep_and_expr()` function which will share code with the existing implementation of `grep_or_expr()` by introducing a new function which compiles either kind of binary expression, and reimplement `grep_or_expr()` in terms of it. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-06grep: use grep_not_expr() in compile_pattern_not()René Scharfe1-13/+11
Move the definition of grep_not_expr() up and use this function in compile_pattern_not() to simplify the code and reduce duplication. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-06grep: use grep_or_expr() in compile_pattern_or()René Scharfe1-15/+11
Move the definition of grep_or_expr() up and use this function in compile_pattern_or() to reduce code duplication. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-05The seventh batchJunio C Hamano1-0/+49
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-04The sixth batchJunio C Hamano1-0/+11
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23sparse-checkout: remove stray trailing spaceElijah Newren1-1/+1
Reported-by: Jiang Xin <worldhello.net@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-23The fifth batchJunio C Hamano1-0/+13
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-22The fourth batchJunio C Hamano1-1/+33
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20merge: allow to pretend a merge is made into a different branchJunio C Hamano7-5/+67
When a series of patches for a topic-B depends on having topic-A, the workflow to prepare the topic-B branch would look like this: $ git checkout -b topic-B main $ git merge --no-ff --no-edit topic-A $ git am <mbox-for-topic-B When topic-A gets updated, recreating the first merge and rebasing the rest of the topic-B, all on detached HEAD, is a useful technique. After updating topic-A with its new round of patches: $ git checkout topic-B $ prev=$(git rev-parse 'HEAD^{/^Merge branch .topic-A. into}') $ git checkout --detach $prev^1 $ git merge --no-ff --no-edit topic-A $ git rebase --onto HEAD $prev @{-1}^0 $ git checkout -B @{-1} This will (0) check out the current topic-B. (1) find the previous merge of topic-A into topic-B. (2) detach the HEAD to the parent of the previous merge. (3) merge the updated topic-A to it. (4) reapply the patches to rebuild the rest of topic-B. (5) update topic-B with the result. without contaminating the reflog of topic-B too much. topic-B@{1} is the "logically previous" state before topic-A got updated, for example. At (4), comparison (e.g. range-diff) between HEAD and @{-1} is a meaningful way to sanity check the result, and the same can be done at (5) by comparing topic-B and topic-B@{1}. But there is one glitch. The merge into the detached HEAD done in the step (3) above gives us "Merge branch 'topic-A' into HEAD", and does not say "into topic-B". Teach the "--into-name=<branch>" option to "git merge" and its underlying "git fmt-merge-message", to pretend as if we were merging into <branch>, no matter what branch we are actually merging into, when they prepare the merge message. The pretend name honors the usual "into <target>" suppression mechanism, which can be seen in the tests added here. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20grep/pcre2: factor out literal variableRené Scharfe1-2/+2
Patterns that contain no wildcards and don't have to be case-folded are literal. Give this condition a name to increase the readability of the boolean expression for enabling the option PCRE2_UTF. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20grep/pcre2: use PCRE2_UTF even with ASCII patternsRené Scharfe2-1/+7
compile_pcre2_pattern() currently uses the option PCRE2_UTF only for patterns with non-ASCII characters. Patterns with ASCII wildcards can match non-ASCII strings, though. Without that option PCRE2 mishandles UTF-8 input, though -- it matches parts of multi-byte characters. Fix that by using PCRE2_UTF even for ASCII-only patterns. This is a remake of the reverted ae39ba431a (grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data, 2021-10-15). The change to the condition and the test are simplified and more targeted. Original-patch-by: Hamza Mahfooz <someguy@effective-light.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20daemon: plug memory leak on overlong pathRené Scharfe1-1/+1
Release the strbuf containing the interpolated path after copying it to a stack buffer and before erroring out if it's too long. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20repack: make '--quiet' disable progressDerrick Stolee3-4/+14
While testing some ideas in 'git repack', I ran it with '--quiet' and discovered that some progress output was still shown. Specifically, the output for writing the multi-pack-index showed the progress. The 'show_progress' variable in cmd_repack() is initialized with isatty(2) and is not modified at all by the '--quiet' flag. The '--quiet' flag modifies the po_args.quiet option which is translated into a '--quiet' flag for the 'git pack-objects' child process. However, 'show_progress' is used to directly send progress information to the multi-pack-index writing logic which does not use a child process. The fix here is to modify 'show_progress' to be false if po_opts.quiet is true, and isatty(2) otherwise. This new expectation simplifies a later condition that checks both. Update the documentation to make it clear that '-q' will disable all progress in addition to ensuring the 'git pack-objects' child process will receive the flag. Use 'test_terminal' to check that this works to get around the isatty(2) check. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-20repack: respect kept objects with '--write-midx -b'Derrick Stolee3-1/+41
Historically, we needed a single packfile in order to have reachability bitmaps. This introduced logic that when 'git repack' had a '-b' option that we should stop sending the '--honor-pack-keep' option to the 'git pack-objects' child process, ensuring that we create a packfile containing all reachable objects. In the world of multi-pack-index bitmaps, we no longer need to repack all objects into a single pack to have valid bitmaps. Thus, we should continue sending the '--honor-pack-keep' flag to 'git pack-objects'. The fix is very simple: only disable the flag when writing bitmaps but also _not_ writing the multi-pack-index. This opens the door to new repacking strategies that might want to keep some historical set of objects in a stable pack-file while only repacking more recent objects. To test, create a new 'test_subcommand_inexact' helper that is more flexible than 'test_subcommand'. This allows us to look for the --honor-pack-keep flag without over-indexing on the exact set of arguments. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-18t4202: fix patternType setting in --invert-grep testRené Scharfe1-1/+1
Actually use extended regexes as indicated in the comment. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-18docs: add missing colon to Documentation/config/gpg.txtGreg Hurrell1-1/+1
Add missing colon to ensure correct rendering of definition list item. Without the proper number of colons, it renders as just another top-level paragraph rather than a list item. Signed-off-by: Greg Hurrell <greg@hurrell.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-17log: let --invert-grep only invert --grepRené Scharfe5-7/+42
The option --invert-grep is documented to filter out commits whose messages match the --grep filters. However, it also affects the header matches (--author, --committer), which is not intended. Move the handling of that option to grep.c, as only the code there can distinguish between matches in the header from those in the message body. If --invert-grep is given then enable extended expressions (not the regex type, we just need git grep's --not to work), negate the body patterns and check if any of them match by piggy-backing on the collect_hits mechanism of grep_source_1(). Collecting the matches in struct grep_opt is a bit iffy, but with "last_shown" we have a precedent for writing state information to that struct. Reported-by: Dotan Cohen <dotancohen@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-17format-patch: mark rev_info with UNLEAKJunio C Hamano1-0/+1
The comand uses a single instance of rev_info on stack, makes a single revision traversal and exit. Mark the resources held by the rev_info structure with UNLEAK(). We do not do this at lower level in revision.c or cmd_log_walk(), as a new caller of the revision traversal API can make unbounded number of rev_info during a single run, and UNLEAK() would not a be suitable mechanism to deal with such a caller. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-17t4204 is not sanitizer clean at allJunio C Hamano1-12/+17
Earlier we marked that this patch-id test is leak-sanitizer clean, but if we read the test script carefully, it is merely because we have too many invocations of commands in the "git log" family on the upstream side of the pipe, hiding breakages from them. Split the pipeline so that breakages from these commands can be caught (not limited to aborts due to leak-sanitizer) and unmark the script as not passing the test with leak-sanitizer in effect. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16git-p4: resolve RCS keywords in bytes not utf-8Joel Holdsworth2-7/+23
RCS keywords are strings that are replaced with information from Perforce. Examples include $Date$, $Author$, $File$, $Change$ etc. Perforce resolves these by expanding them with their expanded values when files are synced, but Git's data model requires these expanded values to be converted back into their unexpanded form. Previously, git-p4.py would implement this behaviour through the use of regular expressions. However, the regular expression substitution was applied using decoded strings i.e. the content of incoming commit diffs was first decoded from bytes into UTF-8, processed with regular expressions, then converted back to bytes. Not only is this behaviour inefficient, but it is also a cause of a common issue caused by text files containing invalid UTF-8 data. For files created in Windows, CP1252 Smart Quote Characters (0x93 and 0x94) are seen fairly frequently. These codes are invalid in UTF-8, so if the script encountered any file containing them, on Python 2 the symbols will be corrupted, and on Python 3 the script will fail with an exception. This patch replaces this decoding/encoding with bytes object regular expressions, so that the substitution is performed directly upon the source data with no conversions. A test for smart quote handling has been added to the t9810-git-p4-rcs.sh test suite. Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16git-p4: open temporary patch file for write onlyJoel Holdsworth1-1/+1
The patchRCSKeywords method creates a temporary file in which to store the patched output data. Previously this file was opened in "w+" mode (write and read), but the code never reads the contents of the file while open, so it only needs to be opened in "w" mode (write-only). Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16git-p4: add raw option to read_pipelinesJoel Holdsworth1-3/+5
Previously the read_lines function always decoded the result lines. In order to improve support for non-decoded binary processing of data in git-p4.py, this patch adds a raw option to the function that allows decoding to be disabled. Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16git-p4: pre-compile RCS keyword regexesJoel Holdsworth1-30/+18
Previously git-p4.py would compile one of two regular expressions for ever RCS keyword-enabled file. This patch improves simplifies the code by pre-compiling the two regular expressions when the script first loads. Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16git-p4: use with statements to close files after use in patchRCSKeywordsJoel Holdsworth1-8/+5
Python with statements are used to wrap the execution of a block of code so that an object can be safely released when execution leaves the scope. They are desirable for improving code tidyness, and to ensure that objects are properly destroyed even when exceptions are thrown. Signed-off-by: Joel Holdsworth <jholdsworth@nvidia.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16help: make auto-correction prompt more consistentKashav Madan1-1/+1
There are three callsites of git_prompt() helper function that ask a "yes/no" question to the end user, but one of them in help.c that asks if the suggested auto-correction is OK, which is given when the user makes a possible typo in a Git subcommand name, is formatted differently from the others. Update the format string to make the prompt string look more consistent. Signed-off-by: Kashav Madan <kshvmdn@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16am: support --allow-empty to record specific empty patches徐沛文 (Aleen)5-10/+95
This option helps to record specific empty patches in the middle of an am session, which does create empty commits only when: 1. the index has not changed 2. lacking a branch When the index has changed, "--allow-empty" will create a non-empty commit like passing "--continue" or "--resolved". Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16am: support --empty=<option> to handle empty patches徐沛文 (Aleen)3-5/+111
Since that the command 'git-format-patch' can include patches of commits that emit no changes, the 'git-am' command should also support an option, named as '--empty', to specify how to handle those empty patches. In this commit, we have implemented three valid options ('stop', 'drop' and 'keep'). Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-16doc: git-format-patch: describe the option --always徐沛文 (Aleen)1-1/+5
This commit has described how to use '--always' option in the command 'git-format-patch' to include patches for commits that emit no changes. Signed-off-by: 徐沛文 (Aleen) <aleen42@vip.qq.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15doc/config: mark ssh allowedSigners example as literalJeff King1-1/+1
The discussion for gpg.ssh.allowedSignersFile shows an example string that contains "user1@example.com,user2@example.com". Asciidoc thinks these are real email addresses and generates "mailto" footnotes for them. This makes the rendered content more confusing, as it has extra "[1]" markers: The file consists of one or more lines of principals followed by an ssh public key. e.g.: user1@example.com[1],user2@example.com[2] ssh-rsa AAAAX1... See ssh-keygen(1) "ALLOWED SIGNERS" for details. and also generates pointless notes at the end of the page: NOTES 1. user1@example.com mailto:user1@example.com 2. user2@example.com mailto:user2@example.com We can fix this by putting the example into a backtick literal block. That inhibits the mailto generation, and as a bonus typesets the example text in a way that sets it off from the regular prose (a tt font for html, or bold in the roff manpage). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15upload-pack.c: increase output buffer sizeJacob Vosmaer1-5/+12
When serving a fetch, git upload-pack copies data from a git pack-objects stdout pipe to its stdout. This commit increases the size of the buffer used for that copying from 8192 to 65515, the maximum sideband-64k packet size. Previously, this buffer was allocated on the stack. Because the new buffer size is nearly 64KB, we switch this to a heap allocation. On GitLab.com we use GitLab's pack-objects cache which does writes of 65515 bytes. Because of the default 8KB buffer size, propagating these cache writes requires 8 pipe reads and 8 pipe writes from git-upload-pack, and 8 pipe reads from Gitaly (our Git RPC service). If we increase the size of the buffer to the maximum Git packet size, we need only 1 pipe read and 1 pipe write in git-upload-pack, and 1 pipe read in Gitaly to transfer the same amount of data. In benchmarks with a pure fetch and 100% cache hit rate workload we are seeing CPU utilization reductions of over 30%. Signed-off-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15clone: avoid using deprecated `sparse-checkout init`Elijah Newren1-1/+1
The previous commits marked `sparse-checkout init` as deprecated; we can just use `set` instead here and pass it no paths. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15Documentation: clarify/correct a few sparsity related statementsElijah Newren2-5/+5
Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15git-sparse-checkout.txt: update to document init/set/reapply changesElijah Newren1-43/+55
As noted in the previous commit, using separate `init` and `set` steps with sparse-checkout result in a number of issues. The previous commits made `set` able to handle the work of both commands, and enabled reapply to tweak the {cone,sparse-index} settings. Update the documentation to reflect this, and mark `init` as deprecated. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: enable reapply to take --[no-]{cone,sparse-index}Elijah Newren1-1/+17
Folks may want to switch to or from cone mode, or to or from a sparse-index without changing their sparsity paths. Allow them to do so using the reapply command. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: enable `set` to initialize sparse-checkout modeElijah Newren1-1/+26
The previously suggested workflow: git sparse-checkout init ... git sparse-checkout set ... Suffered from three problems: 1) It would delete nearly all files in the first step, then restore them in the second. That was poor performance and forced unnecessary rebuilds. 2) The two-step process resulted in two progress bars, which was suboptimal from a UI point of view for wrappers that invoked both of these commands but only exposed a single command to their end users. 3) With cone mode, the first step would delete nearly all ignored files everywhere, because everything was considered to be outside of the specified sparsity paths. (The user was not allowed to specify any sparsity paths in the `init` step.) Avoid these problems by teaching `set` to understand the extra parameters that `init` takes and performing any necessary initialization if not already in a sparse checkout. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: split out code for tweaking settings configElijah Newren1-19/+37
`init` has some code for handling updates to either cone mode or the sparse-index setting. We would like to be able to reuse this elsewhere, namely in `set` and `reapply`. Split this function out, and make it slightly more general so it can handle being called from the new callers. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: disallow --no-stdin as an argument to setElijah Newren1-2/+3
We intentionally added --stdin as an option to `sparse-checkout set`, but didn't intend for --no-stdin to be permitted as well. Reported-by: Victoria Dye <vdye@github.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: add sanity-checks on initial sparsity stateElijah Newren2-1/+29
Most sparse-checkout subcommands (list, add, reapply) only make sense when already in a sparse state. Add a quick check that will error out early if this is not the case. Also document with a comment why we do not exit early in `disable` even when core.sparseCheckout starts as false. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: break apart functions for sparse_checkout_(set|add)Elijah Newren1-14/+40
sparse_checkout_set() was reused by sparse_checkout_add() with the only difference being a single parameter being passed to that function. However, we would like sparse_checkout_set() to do the same work that sparse_checkout_init() does if sparse checkouts are not already enabled. To facilitate this transition, give each mode their own copy of the function. This does not introduce any behavioral changes; that will come in a subsequent patch. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15sparse-checkout: pass use_stdin as a parameter instead of as a globalElijah Newren1-12/+16
add_patterns_from_input() has relied on a global variable, set_opts.use_stdin, which has been used by both the `set` and `add` subcommands of sparse-checkout. Once we introduce an add_opts.use_stdin, the hardcoding of set_opts.use_stdin will be incorrect. Pass the value as function parameter instead to allow us to make subsequent changes. Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Reviewed-by: Victoria Dye <vdye@github.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-15The third batchJunio C Hamano1-0/+65
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13git-apply: add --allow-empty flagJerry Zhang4-7/+30
Some users or scripts will pipe "git diff" output to "git apply" when replaying diffs or commits. In these cases, they will rely on the return value of "git apply" to know whether the diff was applied successfully. However, for empty commits, "git apply" will fail. This complicates scripts since they have to either buffer the diff and check its length, or run diff again with "exit-code", essentially doing the diff twice. Add the "--allow-empty" flag to "git apply" which allows it to handle both empty diffs and empty commits created by "git format-patch --always" by doing nothing and returning 0. Add tests for both with and without --allow-empty. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13git-apply: add --quiet flagJerry Zhang2-2/+7
Replace OPT_VERBOSE with OPT_VERBOSITY. This adds a --quiet flag to "git apply" so the user can turn down the verbosity. Signed-off-by: Jerry Zhang <jerry@skydio.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: stop splitting "(..." into separate lines "(" and "..."Eric Sunshine9-35/+37
Because `sed` is line-oriented, for ease of implementation, when chainlint.sed encounters an opening subshell in which the first command is cuddled with the "(", it splits the line into two lines: one containing only "(", and the other containing whatever follows "(". This allows chainlint.sed to get by with a single set of regular expressions for matching shell statements rather than having to duplicate each expression (one set for matching non-cuddled statements, and one set for matching cuddled statements). However, although syntactically and semantically immaterial, this transformation has no value to test authors and might even confuse them into thinking that the linter is misbehaving by inserting (whitespace) line-noise into the shell code it is validating. Moreover, it also allows an implementation detail of chainlint.sed to seep into the chainlint self-test "expect" files, which potentially makes it difficult to reuse the self-tests should a more capable chainlint ever be developed. To address these concerns, stop splitting cuddled "(..." into two lines. Note that, as an implementation artifact, due to sed's line-oriented nature, this change inserts a blank line at output time just before the "(..." line is emitted. It would be possible to suppress this blank line but doing so would add a fair bit of complexity to chainlint.sed. Therefore, rather than suppressing the extra blank line, the Makefile's `check-chainlint` target which runs the chainlint self-tests is instead modified to ignore blank lines when comparing chainlint output against the self-test "expect" output. This is a reasonable compromise for two reasons. First, the purpose of the chainlint self-tests is to verify that the ?!AMP?! annotations are being correctly added; precise whitespace is immaterial. Second, by necessity, chainlint.sed itself already throws away all blank lines within subshells since, when checking for a broken &&-chain, it needs to check the final _statement_ in a subshell, not the final _line_ (which might be blank), thus it has never made any attempt to precisely reproduce blank lines in its output. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: swallow comments consistentlyEric Sunshine6-4/+52
When checking for broken a &&-chain, chainlint.sed knows that the final statement in a subshell should not end with `&&`, so it takes care to make a distinction between the final line which is an actual statement and any lines which may be mere comments preceding the closing ')'. As such, it swallows comment lines so that they do not interfere with the &&-chain check. However, since `sed` does not provide any sort of real recursion, chainlint.sed only checks &&-chains in subshells one level deep; it doesn't do any checking in deeper subshells or in `{...}` blocks within subshells. Furthermore, on account of potential implementation complexity, it doesn't check &&-chains within `case` arms. Due to an oversight, it also doesn't swallow comments inside deep subshells, `{...}` blocks, or `case` statements, which makes its output inconsistent (swallowing comments in some cases but not others). Unfortunately, this inconsistency seeps into the chainlint self-test "expect" files, which potentially makes it difficult to reuse the self-tests should a more capable chainlint ever be developed. Therefore, teach chainlint.sed to consistently swallow comments in all cases. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: stop throwing away here-doc tagsEric Sunshine11-29/+35
The purpose of chainlint is to highlight problems it finds in test code by inserting annotations at the location of each problem. Arbitrarily eliding bits of the code it is checking is not helpful, yet this is exactly what chainlint.sed does by cavalierly and unnecessarily dropping the here-doc operator and tag; i.e. `cat <<TAG` becomes simply `cat` in the output. This behavior can make it more difficult for the test writer to align the annotated output of chainlint.sed with the original test code. Address this by retaining here-doc tags. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: don't mistake `<< word` in string as here-doc operatorEric Sunshine3-2/+36
Tighten here-doc recognition to prevent it from being fooled by text which looks like a here-doc operator but happens merely to be the content of a string, such as this real-world case from t7201: echo "<<<<<<< ours" && echo ourside && echo "=======" && echo theirside && echo ">>>>>>> theirs" This problem went unnoticed because chainlint.sed is not a real parser, but rather applies heuristics to pretend to understand shell code. In this case, it saw what it thought was a here-doc operator (`<< ours`), and fell off the end of the test looking for the closing tag "ours" which it never found, thus swallowed the remainder of the test without checking it for &&-chain breakage. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: make here-doc "<<-" operator recognition more POSIX-likeEric Sunshine1-4/+4
According to POSIX, "<<" and "<<-" are distinct shell operators. For the latter to be recognized, no whitespace is allowed before the "-", though whitespace is allowed after the operator. However, the chainlint patterns which identify here-docs are both too loose and too tight, incorrectly allowing whitespace between "<<" and "-" but disallowing it between "-" and the here-doc tag. Fix the patterns to better match POSIX. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: drop subshell-closing ">" annotationEric Sunshine39-89/+80
chainlint.sed inserts a ">" annotation at the beginning of a line to signal that its heuristics have identified an end-of-subshell. This was useful as a debugging aid during development of the script, but it has no value to test writers and might even confuse them into thinking that the linter is misbehaving by inserting line-noise into the shell code it is validating. Moreover, its presence also potentially makes it difficult to reuse the chainlint self-test "expect" output should a more capable linter ever be developed. Therefore, drop the ">" annotation. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-12-13chainlint.sed: drop unnecessary distinction between ?!AMP?! and ?!SEMI?!Eric Sunshine5-25/+24
>From inception, when chainlint.sed encountered a line using semicolon to separate commands rather than `&&`, it would insert a ?!SEMI?! annotation at the beginning of the line rather ?!AMP?! even though the &&-chain is also broken by the semicolon. Given a line such as: ?!SEMI?! cmd1; cmd2 && the ?!SEMI?! annotation makes it easier to see what the problem is than if the output had been: ?!AMP?! cmd1; cmd2 && which might confuse the test author into thinking that the linter is broken (since the line clearly ends with `&&`). However, now that the ?!AMP?! an ?!SEMI?! annotations are inserted at the point of breakage rather than at the beginning of the line, and taking into account that both represent a broken &&-chain, there is little reason to distinguish between the two. Using ?!AMP?! alone is sufficient to point the test author at the problem. For instance, in: cmd1; ?!AMP?! cmd2 && cmd3 it is clear that the &&-chain is broken between `cmd1` and `cmd2`. Likewise, in: cmd1 && cmd2 ?!AMP?! cmd3 it is clear that the &&-chain is broken between `cmd2` and `cmd3`. Finally, in: cmd1; ?!AMP?! cmd2 ?!AMP?! cmd3 it is clear that the &&-chain is broken between each command. Hence, there is no longer a good reason to make a distinction between a broken &&-chain due to a semicolon and a broken chain due to a missing `&&` at end-of-line. Therefore, drop the ?!SEMI?! annotation and use ?!AMP?! exclusively. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>