| Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The word "whitelist" has cultural implications that are not inclusive.
Thankfully, it is not difficult to reword and avoid its use.
The GIT_ALLOW_PROTOCOL environment variable was referred to as a
"whitelist", but the word "allow" is already part of the variable.
Replace "whitelist" with "allow_list" in these cases to demonstrate that
we are processing a list of allowed protocols.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The word "whitelist" has cultural implications that are not inclusive.
Thankfully, it is not difficult to reword and avoid its use.
Focus on changes in the test scripts, since most of the changes are in
comments and test names. The renamed test_allow_var helper is only used
once inside the widely-used test_proto helper.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The documentation for GIT_ALLOW_PROTOCOL has a sentence that adds no
value, since it repeats the meaning from the previous sentence (twice!).
The word "whitelist" has cultural implications that are not inclusive,
which brought attention to this sentence.
Helped-by: Jeff King <peff@peff.net>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The documentation and error messages for git-cvsserver include some
references to a "whitelist" that is not otherwise included in the
documentation. When different parts of the documentation do not use
common language, this can lead to confusion as to how things are meant
to operate.
Further, the word "whitelist" has cultural implications that make its
use non-inclusive. Thankfully, we can remove it while increasing
clarity.
Update Documentation/git-cvsserver.txt in a similar way to the previous
change to Documentation/git-daemon.txt. The optional '<directory>...'
list can specify a list of allowed directories. We refer to that list
directly inside of the documentation for the GIT_CVSSERVER_ROOT
environment variable.
While modifying this documentation, update the environment variables to
use a list format. We use the modern way of tabbing the description of
each variable in this section. We do _not_ update the description of
'<directory>...' to use tabs this way since the rest of the items in the
OPTIONS list do not use this modern formatting.
A single error message in the actual git-cvsserver.perl code refers to
the whitelist during argument parsing. Instead, refer to the directory
list that has been clarified in the documentation.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The undecorated arguments to the 'git-daemon' command provide a list of
directories. When at least one directory is specified, then 'git-daemon'
only serves requests that are within that directory list. The boolean
'--strict-paths' option makes the list more explicit in that
subdirectories are no longer included.
The existing documentation and error messages around this directory list
refer to it and its behavior as a "whitelist". The word "whitelist" has
cultural implications that are not inclusive. Thankfully, it is not
difficult to reword and avoid its use. In the process, we can define the
purpose of this directory list directly.
In Documentation/git-daemon.txt, rewrite the OPTIONS section around the
'<directory>' option. Add additional clarity to the other options that
refer to these directories.
Some error messages can also be improved in daemon.c. The
'--strict-paths' option requires '<directory>' arguments, so refer to
that section of the documentation directly. A logerror() call points out
that a requested directory is not in the specified directory list. We
can use "list" here without any loss of information.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "open_midx_bitmap()", we do a loop with the MIDX(es) in repo, when
the first one has been found, then will break out by a "return"
directly.
But actually, it's better to continue the loop until we have visited
both the MIDX in our repository, as well as any alternates (along with
_their_ alternates, recursively).
The reason for this is, there may exist more than one MIDX file in
a repo. The "multi_pack_index" struct is actually designed as a singly
linked list, and if a MIDX file has been already opened successfully,
then the other MIDX files will be skipped and left with a warning
"ignoring extra bitmap file." to the output.
The discussion link of community:
https://public-inbox.org/git/YjzCTLLDCby+kJrZ@nand.local/
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "open_pack_bitmap_1()" and "open_midx_bitmap_1()", it's better to
return error() instead of "-1" when some unexpected error occurs like
"stat bitmap file failed", "bitmap header is invalid" or "checksum
mismatch", etc.
There are places where we do not replace, such as when the bitmap
does not exist (no bitmap in repository is allowed) or when another
bitmap has already been opened (in which case it should be a warning
rather than an error).
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Calls to git_open() to open the pack bitmap file and
multi-pack bitmap file do not report any error when they
fail. These files are optional and it is not an error if
open failed due to ENOENT, but we shouldn't be ignoring
other kinds of errors.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In "open_pack_bitmap_1()" and "open_midx_bitmap_1()" we use
a var named "idx_name" to represent the bitmap filename which
is computed by "midx_bitmap_filename()" or "pack_bitmap_filename()"
before we open it.
There may bring some confusion in this "idx_name" naming, which
might lead us to think of ".idx "or" multi-pack-index" files,
although bitmap is essentially can be understood as a kind of index,
let's define this name a little more accurate here.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In pack-bitmap.c, some printed texts are translated, some are not.
Let's support the translations of the bitmap related output.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There are some text output issues in 'pack-bitmap.c', they exist in
die(), error() etc. This includes issues with capitalization the
first letter, newlines, error() instead of BUG(), and substitution
that don't have quotes around them.
Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Adapt the content from 'contrib/scalar/README.md' into a design document in
'Documentation/technical/'. In addition to reformatting for asciidoc,
elaborate on the background, purpose, and design choices that went into
Scalar.
Most of this document will persist in the 'Documentation/technical/' after
Scalar has been moved out of 'contrib/' and into the root of Git. Until that
time, it will also contain a temporary "Roadmap" section detailing the
remaining series needed to finish the initial version of Scalar. The section
will be removed once Scalar is moved to the repo root, but in the meantime
serves as a guide for readers to keep up with progress on the feature.
Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Rephrase documentation to describe scalar as a "large repo management tool"
rather than an "opinionated management tool". The new description is
intended to more directly reflect the utility of scalar to better guide
users in preparation for scalar being built and installed as part of Git.
Signed-off-by: Victoria Dye <vdye@github.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
While setting up an unresolved merge for `git rerere`, we run `git
rev-parse` and `git fmt-merge-msg` to create a variable `$fifth` and a
commit-message file `msg`, which we then never actually use. This has
been like that since these tests were added in 672d1b789b ("rerere:
migrate to parse-options API", 2010-08-05). This does exercise `git
rev-parse` and `git fmt-merge-msg`, but doesn't contribute to testing
`git rerere`. Drop these lines.
Reported-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Fix the overly verbose trace2 logging added in 9a4987677d3 (trace2:
add stats for fsync operations, 2022-03-30) (first released with
v2.36.0).
Since that change every single "git" command invocation has included
these "data" events, even though we'll only make use of these with
core.fsyncMethod=batch, and even then only have non-zero values if
we're writing object data to disk. See c0f4752ed2f (core.fsyncmethod:
batched disk flushes for loose-objects, 2022-04-04) for that feature.
As we're needing to indent the trace2_data_intmax() lines let's
introduce helper variables to ensure that our resulting lines (which
were already too) don't exceed the recommendations of the
CodingGuidelines. Doing that requires either wrapping them twice, or
introducing short throwaway variable names, let's do the latter.
The result was that e.g. "git version" would previously emit a total
of 6 trace2 events with the GIT_TRACE2_EVENT target (version, start,
cmd_ancestry, cmd_name, exit, atexit), but afterwards would emit
8. We'd emit 2 "data" events before the "exit" event.
The reason we didn't catch this was that the trace2 unit tests added
in a15860dca3f (trace2: t/helper/test-trace2, t0210.sh, t0211.sh,
t0212.sh, 2019-02-22) would omit any "data" events that weren't the
ones it cared about. Before this change to the C code 6/7 of our
"t/t0212-trace2-event.sh" tests would fail if this change was applied
to "t/t0212/parse_events.perl".
Let's make the trace2 testing more strict, and further append any new
events types we don't know about in "t/t0212/parse_events.perl". Since
we only invoke the "test-tool trace2" there's no guarantee that we'll
catch other overly verbose events in the future, but we'll at least
notice if we start emitting new events that are issues every time we
log anything with trace2's JSON target.
We exclude the "data_json" event type, we'd otherwise would fail on
both "win test" and "win+VS test" CI due to the logging added in
353d3d77f4f (trace2: collect Windows-specific process information,
2019-02-22). It looks like that logging should really be using
trace2_cmd_ancestry() instead, which was introduced later in
2f732bf15e6 (tr2: log parent process name, 2021-07-21), but let's
leave it for now.
The fix-up to aaf81223f48 (unpack-objects: use stream_loose_object()
to unpack large objects, 2022-06-11) is needed because we're changing
the behavior of these events as discussed above. Since we'd always
emit a "hardware-flush" event the test added in aaf81223f48 wasn't
testing anything except that this trace2 data was unconditionally
logged. Even if "core.fsyncMethod" wasn't set to "batch" we'd pass the
test.
Now we'll check the expected number of "writeout" v.s. "flush" calls
under "core.fsyncMethod=batch", but note that this doesn't actually
test if we carried out the sync using that method, on a platform where
we'd have to fall back to fsync() each of those "writeout" would
really be a "flush" (i.e. a full fsync()).
But in this case what we're testing is that the logic in
"unpack-objects" behaves as expected, not the OS-specific question of
whether we actually were able to use the "bulk" method.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The sparse checkout feature can be used in "cone mode" or "non-cone
mode". In this one instance in the documentation, we refer to the latter
as "non cone mode" with whitespace rather than a hyphen. Align this with
the rest of our documentation.
A few words later in the same paragraph, there's mention of "a more
flexible patterns". Drop that leading "a" to fix the grammar.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Acked-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The first section of 'Documentation/technical/index-format.txt'
mentions that "Git currently supports cache tree and resolve undo
extensions", but then goes on, and in the "Extensions" section
describes not only these two, but six other extensions [1].
Remove this sentence, as it's misleading about the status of all those
other extensions.
Alternatively we could keep that sentence and update the list of
extensions, but that might well lead to a recurring issue, because
apparently this list is never updated when a new index extension is
added.
[1] Split index, untracked cache, FS monitor cache, end of index
entry, index entry offset table and sparse directory entries.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git config's tab completion does not yet know about the "include"
and "includeIf" sections, nor the related "path" variable.
Add a description for these two sections in
'Documentation/config/includeif.txt', which points to git-config's
documentation, specifically the "Includes" and "Conditional Includes"
subsections.
As a side effect, tab completion can successfully complete the
'include', 'includeIf', and 'include.add' expressions.
This effect is tested by two new ad-hoc tests.
Variable completion only works for "include" for now.
Credit for the ideas behind this patch goes to
Ævar Arnfjörð Bjarmason.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Manuel Boni <ziosombrero@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The implementation of mkstemp() for MinGW uses mktemp() and open()
without the flag O_EXCL, which is racy. It's not a security problem
for now because all of its callers only create files within the
repository (incl. worktrees). Replace it with a call to our more
secure internal function, git_mkstemp_mode(), to prevent possible
future issues.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The parse_commit_graph() function takes a 'struct repository *' pointer,
but it only ever accesses config settings (either directly or through
the .settings field of the repo struct). Move all relevant config
settings into the repo_settings struct, and update parse_commit_graph()
and its existing callers so that it takes 'struct repo_settings *'
instead.
Callers of parse_commit_graph() will now need to call
prepare_repo_settings() themselves, or initialize a 'struct
repo_settings' directly.
Prior to ab14d0676c (commit-graph: pass a 'struct repository *' in more
places, 2020-09-09), parsing a commit-graph was a pure function
depending only on the contents of the commit-graph itself. Commit
ab14d0676c introduced a dependency on a `struct repository` pointer, and
later commits such as b66d84756f (commit-graph: respect
'commitGraph.readChangedPaths', 2020-09-09) added dependencies on config
settings, which were accessed through the `settings` field of the
repository pointer. This field was initialized via a call to
`prepare_repo_settings()`.
Additionally, this fixes an issue in fuzz-commit-graph: In 44c7e62
(2021-12-06, repo-settings:prepare_repo_settings only in git repos),
prepare_repo_settings was changed to issue a BUG() if it is called by a
process whose CWD is not a Git repository.
The combination of commits mentioned above broke fuzz-commit-graph,
which attempts to parse arbitrary fuzzing-engine-provided bytes as a
commit graph file. Prior to this change, parse_commit_graph() called
prepare_repo_settings(), but since we run the fuzz tests without a valid
repository, we are hitting the BUG() from 44c7e62 for every test case.
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.
A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.
Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:
- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
or GIT_DIR.
If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.
[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.
As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].
[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
`uploadpack.packObjectsHook` is the only 'protected configuration only'
variable today, but we've noted that `safe.directory` and the upcoming
`safe.bareRepository` should also be 'protected configuration only'. So,
for consistency, we'd like to have a single implementation for protected
configuration.
The primary constraints are:
1. Reading from protected configuration should be fast. Nearly all "git"
commands inside a bare repository will read both `safe.directory` and
`safe.bareRepository`, so we cannot afford to be slow.
2. Protected configuration must be readable when the gitdir is not
known. `safe.directory` and `safe.bareRepository` both affect
repository discovery and the gitdir is not known at that point [1].
The chosen implementation in this commit is to read protected
configuration and cache the values in a global configset. This is
similar to the caching behavior we get with the_repository->config.
Introduce git_protected_config(), which reads protected configuration
and caches them in the global configset protected_config. Then, refactor
`uploadpack.packObjectsHook` to use git_protected_config().
The protected configuration functions are named similarly to their
non-protected counterparts, e.g. git_protected_config_check_init() vs
git_config_check_init().
In light of constraint 1, this implementation can still be improved.
git_protected_config() iterates through every variable in
protected_config, which is wasteful, but it makes the conversion simple
because it matches existing patterns. We will likely implement constant
time lookup functions for protected configuration in a future series
(such functions already exist for non-protected configuration, i.e.
repo_config_get_*()).
An alternative that avoids introducing another configset is to continue
to read all config using git_config(), but only accept values that have
the correct config scope [2]. This technically fulfills constraint 2,
because git_config() simply ignores the local and worktree config when
the gitdir is not known. However, this would read incomplete config into
the_repository->config, which would need to be reset when the gitdir is
known and git_config() needs to read the local and worktree config.
Resetting the_repository->config might be reasonable while we only have
these 'protected configuration only' variables, but it's not clear
whether this extends well to future variables.
[1] In this case, we do have a candidate gitdir though, so with a little
refactoring, it might be possible to provide a gitdir.
[2] This is how `uploadpack.packObjectsHook` was implemented prior to
this commit.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For security reasons, there are config variables that are only trusted
when they are specified in certain configuration scopes, which are
sometimes referred to on-list as 'protected configuration' [1]. A future
commit will introduce another such variable, so let's define our terms
so that we can have consistent documentation and implementation.
In our documentation, define 'protected configuration' as the system,
global and command config scopes. As a shorthand, I will refer to
variables that are only respected in protected configuration as
'protected configuration only', but this term is not used in the
documentation.
This definition of protected configuration is based on whether or not
Git can reasonably protect the user by ignoring the configuration scope:
- System, global and command line config are considered protected
because an attacker who has control over any of those can do plenty of
harm without Git, so we gain very little by ignoring those scopes.
- On the other hand, local (and similarly, worktree) config are not
considered protected because it is relatively easy for an attacker to
control local config, e.g.:
- On some shared user environments, a non-admin attacker can create a
repository high up the directory hierarchy (e.g. C:\.git on
Windows), and a user may accidentally use it when their PS1
automatically invokes "git" commands.
`safe.directory` prevents attacks of this form by making sure that
the user intended to use the shared repository. It obviously
shouldn't be read from the repository, because that would end up
trusting the repository that Git was supposed to reject.
- "git upload-pack" is expected to run in repositories that may not be
controlled by the user. We cannot ignore all config in that
repository (because "git upload-pack" would fail), but we can limit
the risks by ignoring `uploadpack.packObjectsHook`.
Only `uploadpack.packObjectsHook` is 'protected configuration only'. The
following variables are intentionally excluded:
- `safe.directory` should be 'protected configuration only', but it does
not technically fit the definition because it is not respected in the
"command" scope. A future commit will fix this.
- `trace2.*` happens to read the same scopes as `safe.directory` because
they share an implementation. However, this is not for security
reasons; it is because we want to start tracing so early that
repository-level config and "-c" are not available [2].
This requirement is unique to `trace2.*`, so it does not makes sense
for protected configuration to be subject to the same constraints.
[1] For example,
https://lore.kernel.org/git/6af83767-576b-75c4-c778-0284344a8fe7@github.com/
[2] https://lore.kernel.org/git/a0c89d0d-669e-bf56-25d2-cbb09b012e70@jeffhostetler.com/
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a subsequent commit, we will introduce "protected configuration",
which is easiest to describe in terms of configuration scopes (i.e. it's
the union of the 'system', 'global', and 'command' scopes). This
description is fine for ML discussions, but it's inadequate for end
users because we don't provide a good description of "configuration
scopes" in the public docs.
145d59f482 (config: add '--show-scope' to print the scope of a config
value, 2020-02-10) introduced the word "scope" to our public docs, but
that only enumerates the scopes and assumes the user can figure out
what those values mean.
Add a SCOPES section to Documentation/git-config.txt that describes the
configuration scopes, their corresponding CLI options, and mentions that
some configuration options are only respected in certain scopes. Then,
use the word "scope" to simplify the FILES section and change some
confusing wording.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When sorting the output of `git shortlog` by count, a list of authors in
alphabetical order is then sorted by contribution count. Obviously, the
idea is to maintain the alphabetical order for items with identical
contribution count.
At the moment, this job is performed by `qsort()`. As that function is
not guaranteed to implement a stable sort algorithm, this can lead to
inconsistent and/or surprising behavior: items with identical
contribution count could lose their alphabetical sub-order.
The `qsort()` in MS Visual C's runtime does _not_ implement a stable
sort algorithm, and under certain circumstances this even causes a test
failure in t4201.21 "shortlog can match multiple groups", where two
authors both are listed with 2 contributions, and are listed in inverse
alphabetical order.
Let's instead use the stable sort provided by `git_stable_qsort()` to
avoid this inconsistency.
This is a companion to 2049b8dc65 (diffcore_rename(): use a stable sort,
2019-09-30).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 0041797449d (vimdiff: new implementation with layout support,
2022-03-30), we introduced a completely new implementation of the
`vimdiff` backend for `git mergetool`.
In this implementation, we no longer call `vim` directly but we
accumulate in the variable `FINAL_CMD` an arbitrary number of commands
for `vim` to execute, which necessitates the use of `eval` to split the
commands properly into multiple command-line arguments.
That same `eval` command also needs to pass the paths to `vim`, and
while it looks as if they are quoted correctly, that quoting only
reaches the `eval` instruction and is lost after that, therefore paths
that contain whitespace characters (or other characters that are
interpreted by the POSIX shell) are handled incorrectly.
This is a simple reproducer:
git init -b main bam-merge-fail
cd bam-merge-fail
echo a>"a file.txt"
git add "a file.txt"
git commit -m "added 'a file.txt'"
echo b>"a file.txt"
git add "a file.txt"
git commit -m "diverged b 'a file.txt'"
git checkout -b c HEAD~
echo c>"a file.txt"
git add "a file.txt"
git commit -m "diverged c 'a file.txt'"
git checkout main
git merge c
git mergetool --tool=vimdiff
With Git v2.37.0/v2.37.1, this will open 7 buffers, not four, and not
display the correct contents at all.
To fix this, let's not expand the variables containing the path
parameters before passing them to the `eval` command, but let that
command expand the variables instead.
This fixes https://github.com/git-for-windows/git/issues/3945
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In 78d5e4cfb4b (tests: refactor --write-junit-xml code, 2022-05-21),
this developer refactored the `--write-junit-xml` code a bit, including
the part where the current test case's title was used in a `set`
invocation, but failed to account for the fact that some test cases'
titles start with a long option, which the `set` misinterprets as being
intended for parsing.
Let's fix this by using the `set -- <...>` form.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
run_with_limited_processses() is used to end the loop faster when an
infinite loop happen. But "ulimit" is tied to the entire development
station, and the test will fail due to too many other processes or using
"--stress".
Without run_with_limited_processses() the infinite loop can also be
stopped due to global configrations or quotas, and the verification
still works fine. So let's remove run_with_limited_processses().
Signed-off-by: Han Xin <hanxin.hx@bytedance.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 0139c58ab9 (revisions API users: add "goto cleanup" for
release_revisions(), 2022-04-13) converted an early return in
cmd_diff_files() into a goto. But it put the cleanup label too early: if
read_cache_preload() returns an error, we'll set result to "-1", but
then jump to calling run_diff_files(), overwriting our result.
We should jump past the call to run_diff_files(). Likewise, we should go
past diff_result_code(), which is expecting to see a code from an actual
diff, not a negative error code.
In practice, I suspect this bug cannot actually be triggered, because
read_cache_preload() does not seem to ever return an error. Its return
value (eventually) comes from do_read_index(), which gives the number of
cache entries found, and calls die() on error. Still, it makes sense to
fix the inadvertent change from 0139c58ab9 first, and we can look into
the overall error handling of read_cache() separately (which is present
in many other callsites).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
When we found an invalid object recorded in the resolve-undo data,
we would have ended up dereferencing NULL while fsck. Reporting the
problem and going on to the next object is the right thing to do
here.
Noticed by SZEDER Gábor.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Various ref-filter options like "--contains" or "--merged" may cause us
to traverse large segments of the history graph. It's counter-productive
to have save_commit_buffer turned on, as that will instruct the commit
code to cache in-memory the object contents for each commit we traverse.
This increases the amount of heap memory used while providing little or
no benefit, since we're not actually planning to display those commits
(which is the usual reason that tools like git-log want to keep them
around). We can easily disable this feature while ref-filter is running.
This lowers peak heap (as measured by massif) for running:
git tag --contains 1da177e4c3
in linux.git from ~100MB to ~20MB. It also seems to improve runtime by
4-5% (600ms vs 630ms).
A few points to note:
- it should be safe to temporarily disable save_commit_buffer like
this. The saved buffers are accessed through get_commit_buffer(),
which treats the saved ones like a cache, and loads on-demand from
the object database on a cache miss. So any code that was using this
would not be wrong, it might just incur an extra object lookup for
some objects. But...
- I don't think any ref-filter related code is using the cache. While
it's true that an option like "--format=%(*contents:subject)" or
"--sort=*authordate" will need to look at the commit contents,
ref-filter doesn't use get_commit_buffer() to do so! It always reads
the objects directly via read_object_file(), though it does avoid
re-reading objects if the format can be satisfied without them.
Timing "git tag --format=%(*authordate)" shows that we're the same
before and after, as expected.
- Note that all of this assumes you don't have a commit-graph file. if
you do, then the heap usage is even lower, and the runtime is 10x
faster. So in that sense this is not urgent, as there's a much
better solution. But since it's such an obvious and easy win for
fallback cases (including commits which aren't yet in the graph
file), there's no reason not to.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Prior to 4f37d45706 (clone: respect remote unborn HEAD, 2021-02-05),
creation of the local HEAD was always done in update_head(). That commit
added code to handle an unborn head in an empty repository, and just did
all symref creation and config setup there.
This makes the code flow a little bit confusing, especially as new
corner cases have been covered (like the previous commit to match our
default branch name to a non-HEAD remote branch).
Let's move the creation of the unborn symref into update_head(). This
matches the other HEAD-creation cases, and now the logic is consistently
separated: the main cmd_clone() function only examines the situation and
sets variables based on what it finds, and update_head() actually
performs the update.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.
Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add new helper function `gpg_trust_level_to_str()` which will
convert a given member of `enum signature_trust_level` to its
corresponding string (in lowercase). For example, `TRUST_ULTIMATE`
will yield the string "ultimate".
This will abstract out some code in `pretty.c` relating to gpg
signature trust levels.
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Hariom Verma <hariom18599@gmail.com>
Signed-off-by: Jaydeep Das <jaydeepjd.8914@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Although parse_options() can handle unknown --options just fine, none
of 'git multi-pack-index's subcommands rely on it, but do it on their
own: they invoke parse_options() with the PARSE_OPT_KEEP_UNKNOWN flag,
then check whether there are any unparsed arguments left, and print
usage and quit if necessary.
Drop that PARSE_OPT_KEEP_UNKNOWN flag to let parse_options() handle
unknown options instead, which has the additional benefit that it
prints not only the usage but an "error: unknown option `foo'" message
as well.
Do leave the unparsed arguments check to catch any unexpected
non-option arguments, though, e.g. 'git multi-pack-index write foo'.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Some of the rules for using COPY_ARRAY instead of memcpy with sizeof are
intended to reduce the number of sizeof variants to deal with. They can
have unintended side effects if only they match, but not the one for the
COPY_ARRAY conversion at the end.
Avoid these side effects by instead using a self-contained rule for each
combination of array and pointer for source and destination which lists
all sizeof variants inline.
This lets "make contrib/coccinelle/array.cocci.patch" take 15% longer on
my machine, but gives peace of mind that no incomplete transformation
will be generated.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
For SHA-256, we currently have support for OpenSSL and libgcrypt because
these two libraries contain optimized implementations that can take
advantage of native processor instructions. However, OpenSSL is not
suitable for linking against for Linux distros due to licensing
incompatibilities with the GPLv2, and libgcrypt has been less favored by
cryptographers due to some security-related implementation issues,
which, while not affecting our use of hash algorithms, has affected its
reputation.
Let's add another option that's compatible with the GPLv2, which is
Nettle. This is an option which is generally better than libgcrypt
because on many distros GnuTLS (which uses Nettle) is used for HTTPS and
therefore as a practical matter it will be available on most systems.
As a result, prefer it over libgcrypt and our built-in implementation.
Nettle also has recently gained support for Intel's SHA-NI instructions,
which compare very favorably to other implementations, as well as
assembly implementations for when SHA-NI is not available.
A git gc on git.git sees a 12% performance improvement with Nettle over
our block SHA-256 implementation due to general assembly improvements.
With SHA-NI, the performance of raw SHA-256 on a 2 GiB file goes from
7.296 seconds with block SHA-256 to 1.523 seconds with Nettle.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The variables 'source', 'destination', and 'submodule_gitfile' are
all of type "const char **", and an element of such an array is of
"type const char *", but these memmove() calls were written as if
these variables are of type "char **".
Once these memmove() calls are fixed to use the correct type to
compute the number of bytes to be moved, e.g.
- memmove(source + i, source + i + 1, n * sizeof(char *));
+ memmove(source + i, source + i + 1, n * sizeof(const char *));
existing contrib/coccinelle/array.cocci rules can recognize them as
candidates for turning into MOVE_ARRAY().
While at it, use CALLOC_ARRAY() instead of xcalloc() to allocate the
modes[] array that is involved in the change.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'vim' has two configuration options ('splitbelow' and 'splitright') that
change the way the 'split' command behaves. When they are set, the
commands that the layout engine generates no longer work as expected.
In order to fix this we can append special keyword 'leftabove' to each
'split' and 'vertical split' subcommand found inside the command string
generated by the layout engine.
This works because whatever comes after 'leftabove' will temporally
ignore settings 'splitbelow' and 'splitright'.
Reported-by: Matthew Klein <mklein994@gmail.com>
Signed-off-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Usually clone tries to use the same local HEAD as the remote (unless the
user has given --branch explicitly). Even if the remote HEAD is detached
or unborn, we can detect those situations with modern versions of Git.
If the remote is too old to support the "unborn" extension (or it has
been disabled via config), then we can't know the name of the remote's
unborn HEAD, and we fall back whatever the local default branch name is
configured to be.
But that leads to one weird corner case. It's rare because it needs a
number of factors:
- the remote has an unborn HEAD
- the remote is too old to support "unborn", or has disabled it
- the remote has another branch "foo"
- the local default branch name is "foo"
In that case you end up with a local clone on an unborn "foo" branch,
disconnected completely from the remote's "foo". This is rare in
practice, but the result is quite confusing.
When choosing "foo", we can double check whether the remote has such a
name, and if so, start our local "foo" at the same spot, rather than
making it unborn.
Note that this causes a test failure in t5605, which is cloning from a
bundle that doesn't contain HEAD (so it behaves like a remote that
doesn't support "unborn"), but has a single "main" branch. That test
expects that we end up in the weird "unborn main" case, where we don't
actually check out the remote branch of the same name. Even though we
have to update the test, this seems like an argument in favor of this
patch: checking out main is what I'd expect from such a bundle.
So this patch updates the test for the new behavior and adds an adjacent
one that checks what the original was going for: if there's no HEAD and
the bundle _doesn't_ have a branch that matches our local default name,
then we end up with nothing checked out.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Unless "--branch" was given, clone generally tries to match the local
HEAD to the remote one. For most repositories, this is easy: the remote
tells us which branch HEAD was pointing to, and we call our local
checkout() function on that branch.
When cloning an empty repository, it's a little more tricky: we have
special code that checks the transport's "unborn" extension, or falls back
to our local idea of what the default branch should be. In either case,
we point the new HEAD to that, and set up the branch.* config.
But that leaves one case unhandled: when the remote repository _isn't_
empty, but its HEAD is unborn. The checkout() function is smart enough
to realize we didn't fetch the remote HEAD and it bails with a warning.
But we'll have ignored any information the remote gave us via the unborn
extension. This leads to nonsense outcomes:
- If the remote has its HEAD pointing to an unborn "foo" and contains
another branch "bar", cloning will get branch "bar" but leave the
local HEAD pointing at "master" (or whatever our local default is),
which is useless. The project does not use "master" as a branch.
- Worse, if the other branch "bar" is instead called "master" (but
again, the remote HEAD is not pointing to it), then we end up with a
local unborn branch "master", which is not connected to the remote
"master" (it shares no history, and there's no branch.* config).
Instead, we should try to use the remote's HEAD, even if its unborn, to
be consistent with the other cases.
The reason this case was missed is that cmd_clone() handles empty and
non-empty repositories on two different sides of a conditional:
if (we have any refs) {
fetch refs;
check for --branch;
otherwise, try to point our head at remote head;
otherwise, our head is NULL;
} else {
check for --branch;
otherwise, try to use "unborn" extension;
otherwise, fall back to our default name name;
}
So the smallest change would be to repeat the "unborn" logic at the end
of the first block. But we can note some other overlaps and
inconsistencies:
- both sides have to handle --branch (though note that it's always an
error for the empty repo case, since an empty repo by definition
does not have a matching branch)
- the fall back to the default name is much more explicit in the
empty-repo case. The non-empty case eventually ends up bailing
from checkout() with a warning, which produces a similar result, but
fails to set up the branch config we do in the empty case.
So let's pull the HEAD setup out of this conditional entirely. This
de-duplicates some of the code and the result is easy to follow, because
helper functions like find_ref_by_name() do the right thing even in the
empty-repo case (i.e., by returning NULL).
There are two subtleties:
- for a remote with a detached HEAD, it will advertise an oid for HEAD
(which we store in our "remote_head" variable), but we won't find a
matching refname (so our "remote_head_points_at" is NULL). In this
case we make a local detached HEAD to match. Right now this happens
implicitly by reaching update_head() with a non-NULL remote_head
(since we skip all of the unborn-fallback). We'll now need to
account for it explicitly before doing the fallback.
- for an empty repo, we issue a warning to the user that they've
cloned an empty repo. The text of that warning doesn't make sense
for a non-empty repo with an unborn HEAD, so we'll have to
differentiate the two cases there. We could just use different text,
but instead let's allow the code to continue down to checkout(),
which will issue an appropriate warning, like:
remote HEAD refers to nonexistent ref, unable to checkout
Continuing down to checkout() will make it easier to do more fixes
on top (see below).
Note that this patch fixes the case where the other side reports an
unborn head to us using the protocol extension. It _doesn't_ fix the
case where the other side doesn't tell us, we locally guess "master",
and the other side happens to have a "master" which its HEAD doesn't
point. But it doesn't make anything worse there, and it should actually
make it easier to fix that problem on top.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We don't need to put a "\n" in calls to warning(), since it adds one
itself (and the user sees an extra blank line). Drop it, and while we're
here, drop the full-stop from the message, which goes against our
guidelines.
This bug dates all the way back to 8434c2f1af (Build in clone,
2008-04-27), but presumably nobody noticed because it's hard to trigger:
you have to clone a repository whose HEAD is unborn, but which is not
otherwise empty.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Generalize the newly added "unused.cocci" rule to find more than just
"struct strbuf", let's have it find the same unused patterns for
"struct string_list", as well as other code that uses
similar-looking *_{release,clear,free}() and {release,clear,free}_*()
functions.
We're intentionally loose in accepting e.g. a "strbuf_init(&sb)"
followed by a "string_list_clear(&sb, 0)". It's assumed that the
compiler will catch any such invalid code, i.e. that our
constructors/destructors don't take a "void *".
See [1] for example of code that would be covered by the
"get_worktrees()" part of this rule. We'd still need work that the
series is based on (we were passing "worktrees" to a function), but
could now do the change in [1] automatically.
1. https://lore.kernel.org/git/Yq6eJFUPPTv%2Fzc0o@coredump.intra.peff.net/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Add a coccinelle rule to remove "struct strbuf" initialization
followed by calling "strbuf_release()" function, without any uses of
the strbuf in the same function.
See the tests in contrib/coccinelle/tests/unused.{c,res} for what it's
intended to find and replace.
The inclusion of "contrib/scalar/scalar.c" is because "spatch" was
manually run on it (we don't usually run spatch on contrib).
Per the "buggy code" comment we also match a strbuf_init() before the
xmalloc(), but we're not seeking to be so strict as to make checks
that the compiler will catch for us redundant. Saying we'll match
either "init" or "xmalloc" lines makes the rule simpler.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|