| Commit message (Collapse) | Author | Files | Lines |
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We've recently forbidden .gitmodules to be a symlink in
verify_path(). And it's an easy way to circumvent our fsck
checks for .gitmodules content. So let's complain when we
see it.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Now that the internal fsck code has all of the plumbing we
need, we can start checking incoming .gitmodules files.
Naively, it seems like we would just need to add a call to
fsck_finish() after we've processed all of the objects. And
that would be enough to cover the initial test included
here. But there are two extra bits:
1. We currently don't bother calling fsck_object() at all
for blobs, since it has traditionally been a noop. We'd
actually catch these blobs in fsck_finish() at the end,
but it's more efficient to check them when we already
have the object loaded in memory.
2. The second pass done by fsck_finish() needs to access
the objects, but we're actually indexing the pack in
this process. In theory we could give the fsck code a
special callback for accessing the in-pack data, but
it's actually quite tricky:
a. We don't have an internal efficient index mapping
oids to packfile offsets. We only generate it on
the fly as part of writing out the .idx file.
b. We'd still have to reconstruct deltas, which means
we'd basically have to replicate all of the
reading logic in packfile.c.
Instead, let's avoid running fsck_finish() until after
we've written out the .idx file, and then just add it
to our internal packed_git list.
This does mean that the objects are "in the repository"
before we finish our fsck checks. But unpack-objects
already exhibits this same behavior, and it's an
acceptable tradeoff here for the same reason: the
quarantine mechanism means that pushes will be
fully protected.
In addition to a basic push test in t7415, we add a sneaky
pack that reverses the usual object order in the pack,
requiring that index-pack access the tree and blob during
the "finish" step.
This already works for unpack-objects (since it will have
written out loose objects), but we'll check it with this
sneaky pack for good measure.
Signed-off-by: Jeff King <peff@peff.net>
|
|
As with the previous commit, we must call fsck's "finish"
function in order to catch any queued objects for
.gitmodules checks.
This second pass will be able to access any incoming
objects, because we will have exploded them to loose objects
by now.
This isn't quite ideal, because it means that bad objects
may have been written to the object database (and a
subsequent operation could then reference them, even if the
other side doesn't send the objects again). However, this is
sufficient when used with receive.fsckObjects, since those
loose objects will all be placed in a temporary quarantine
area that will get wiped if we find any problems.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Now that the internal fsck code is capable of checking
.gitmodules files, we just need to teach its callers to use
the "finish" function to check any queued objects.
With this, we can now catch the malicious case in t7415 with
git-fsck.
Signed-off-by: Jeff King <peff@peff.net>
|
|
This patch detects and blocks submodule names which do not
match the policy set forth in submodule-config. These should
already be caught by the submodule code itself, but putting
the check here means that newer versions of Git can protect
older ones from malicious entries (e.g., a server with
receive.fsckObjects will block the objects, protecting
clients which fetch from it).
As a side effect, this means fsck will also complain about
.gitmodules files that cannot be parsed (or were larger than
core.bigFileThreshold).
Signed-off-by: Jeff King <peff@peff.net>
|
|
If we have a tree that points to a .gitmodules blob but
don't have that blob, we can't check its contents. This
produces an fsck error when we encounter it.
But in the case of a promisor object, this absence is
expected, and we must not complain. Note that this can
technically circumvent our transfer.fsckObjects check.
Imagine a client fetches a tree, but not the matching
.gitmodules blob. An fsck of the incoming objects will show
that we don't have enough information. Later, we do fetch
the actual blob. But we have no idea that it's a .gitmodules
file.
The only ways to get around this would be to re-scan all of
the existing trees whenever new ones enter (which is
expensive), or to somehow persist the gitmodules_found set
between fsck runs (which is complicated).
In practice, it's probably OK to ignore the problem. Any
repository which has all of the objects (including the one
serving the promisor packs) can perform the checks. Since
promisor packs are inherently about a hierarchical topology
in which clients rely on upstream repositories, those
upstream repositories can protect all of their downstream
clients from broken objects.
Signed-off-by: Jeff King <peff@peff.net>
|
|
In preparation for performing fsck checks on .gitmodules
files, this commit plumbs in the actual detection of the
files. Note that unlike most other fsck checks, this cannot
be a property of a single object: we must know that the
object is found at a ".gitmodules" path at the root tree of
a commit.
Since the fsck code only sees one object at a time, we have
to mark the related objects to fit the puzzle together. When
we see a commit we mark its tree as a root tree, and when
we see a root tree with a .gitmodules file, we mark the
corresponding blob to be checked.
In an ideal world, we'd check the objects in topological
order: commits followed by trees followed by blobs. In that
case we can avoid ever loading an object twice, since all
markings would be complete by the time we get to the marked
objects. And indeed, if we are checking a single packfile,
this is the order in which Git will generally write the
objects. But we can't count on that:
1. git-fsck may show us the objects in arbitrary order
(loose objects are fed in sha1 order, but we may also
have multiple packs, and we process each pack fully in
sequence).
2. The type ordering is just what git-pack-objects happens
to write now. The pack format does not require a
specific order, and it's possible that future versions
of Git (or a custom version trying to fool official
Git's fsck checks!) may order it differently.
3. We may not even be fscking all of the relevant objects
at once. Consider pushing with transfer.fsckObjects,
where one push adds a blob at path "foo", and then a
second push adds the same blob at path ".gitmodules".
The blob is not part of the second push at all, but we
need to mark and check it.
So in the general case, we need to make up to three passes
over the objects: once to make sure we've seen all commits,
then once to cover any trees we might have missed, and then
a final pass to cover any .gitmodules blobs we found in the
second pass.
We can simplify things a bit by loosening the requirement
that we find .gitmodules only at root trees. Technically
a file like "subdir/.gitmodules" is not parsed by Git, but
it's not unreasonable for us to declare that Git is aware of
all ".gitmodules" files and make them eligible for checking.
That lets us drop the root-tree requirement, which
eliminates one pass entirely. And it makes our worst case
much better: instead of potentially queueing every root tree
to be re-examined, the worst case is that we queue each
unique .gitmodules blob for a second look.
This patch just adds the boilerplate to find .gitmodules
files. The actual content checks will come in a subsequent
commit.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Because fscking a blob has always been a noop, we didn't
bother passing around the blob data. In preparation for
content-level checks, let's fix up a few things:
1. The fsck_object() function just returns success for any
blob. Let's a noop fsck_blob(), which we can fill in
with actual logic later.
2. The fsck_loose() function in builtin/fsck.c
just threw away blob content after loading it. Let's
hold onto it until after we've called fsck_object().
The easiest way to do this is to just drop the
parse_loose_object() helper entirely. Incidentally,
this also fixes a memory leak: if we successfully
loaded the object data but did not parse it, we would
have left the function without freeing it.
3. When fsck_loose() loads the object data, it
does so with a custom read_loose_object() helper. This
function streams any blobs, regardless of size, under
the assumption that we're only checking the sha1.
Instead, let's actually load blobs smaller than
big_file_threshold, as the normal object-reading
code-paths would do. This lets us fsck small files, and
a NULL return is an indication that the blob was so big
that it needed to be streamed, and we can pass that
information along to fsck_blob().
Signed-off-by: Jeff King <peff@peff.net>
|
|
There's no need for us to manually check for ".git"; it's a
subset of the other filesystem-specific tests. Dropping it
makes our code slightly shorter. More importantly, the
existing code may make a reader wonder why ".GIT" is not
covered here, and whether that is a bug (it isn't, as it's
also covered in the filesystem-specific tests).
Signed-off-by: Jeff King <peff@peff.net>
|
|
If fsck reports an error, we say only "Error in object".
This isn't quite as bad as it might seem, since the fsck
code would have dumped some errors to stderr already. But it
might help to give a little more context. The earlier output
would not have even mentioned "fsck", and that may be a clue
that the "fsck.*" or "*.fsckObjects" config may be relevant.
Signed-off-by: Jeff King <peff@peff.net>
|
|
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:
1. It won't be portable to systems without symlinks.
2. It may behave inconsistently, since Git may look at
this file in the index or a tree without bothering to
resolve any symbolic links. We don't do this _yet_, but
the config infrastructure is there and it's planned for
the future.
With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:
a. A symlinked .gitmodules file may circumvent any fsck
checks of the content.
b. Git may read and write from the on-disk file without
sanity checking the symlink target. So for example, if
you link ".gitmodules" to "../oops" and run "git
submodule add", we'll write to the file "oops" outside
the repository.
Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.
Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.
Signed-off-by: Jeff King <peff@peff.net>
|
|
In the update_one(), we check verify_path() on the proposed
path before doing anything else. In preparation for having
verify_path() look at the file mode, let's stat the file
earlier, so we can check the mode accurately.
This is made a bit trickier by the fact that this function
only does an lstat in a few code paths (the ones that flow
down through process_path()). So we can speculatively do the
lstat() here and pass the results down, and just use a dummy
mode for cases where we won't actually be updating the index
from the filesystem.
Signed-off-by: Jeff King <peff@peff.net>
|
|
We're more restrictive than we need to be in matching ".GIT"
on case-sensitive filesystems; let's make a note that this
is intentional.
Signed-off-by: Jeff King <peff@peff.net>
|
|
We check ".git" and ".." in the same switch statement, and
fall through the cases to share the end-of-component check.
While this saves us a line or two, it makes modifying the
function much harder. Let's just write it out.
Signed-off-by: Jeff King <peff@peff.net>
|
|
We have the convenient skip_prefix() helper, but if you want
to do case-insensitive matching, you're stuck doing it by
hand. We could add an extra parameter to the function to
let callers ask for this, but the function is small and
somewhat performance-critical. Let's just re-implement it
for the case-insensitive version.
Signed-off-by: Jeff King <peff@peff.net>
|
|
This tests primarily for NTFS issues, but also adds one example of an
HFS+ issue.
Thanks go to Congyi Wu for coming up with the list of examples where
NTFS would possibly equate the filename with `.gitmodules`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
|
|
When we started to catch NTFS short names that clash with .git, we only
looked for GIT~1. This is sufficient because we only ever clone into an
empty directory, so .git is guaranteed to be the first subdirectory or
file in that directory.
However, even with a fresh clone, .gitmodules is *not* necessarily the
first file to be written that would want the NTFS short name GITMOD~1: a
malicious repository can add .gitmodul0000 and friends, which sorts
before `.gitmodules` and is therefore checked out *first*. For that
reason, we have to test not only for ~1 short names, but for others,
too.
It's hard to just adapt the existing checks in is_ntfs_dotgit(): since
Windows 2000 (i.e., in all Windows versions still supported by Git),
NTFS short names are only generated in the <prefix>~<number> form up to
number 4. After that, a *different* prefix is used, calculated from the
long file name using an undocumented, but stable algorithm.
For example, the short name of .gitmodules would be GITMOD~1, but if it
is taken, and all of ~2, ~3 and ~4 are taken, too, the short name
GI7EBA~1 will be used. From there, collisions are handled by
incrementing the number, shortening the prefix as needed (until ~9999999
is reached, in which case NTFS will not allow the file to be created).
We'd also want to handle .gitignore and .gitattributes, which suffer
from a similar problem, using the fall-back short names GI250A~1 and
GI7D29~1, respectively.
To accommodate for that, we could reimplement the hashing algorithm, but
it is just safer and simpler to provide the known prefixes. This
algorithm has been reverse-engineered and described at
https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but
still available via https://web.archive.org/.
These can be recomputed by running the following Perl script:
-- snip --
use warnings;
use strict;
sub compute_short_name_hash ($) {
my $checksum = 0;
foreach (split('', $_[0])) {
$checksum = ($checksum * 0x25 + ord($_)) & 0xffff;
}
$checksum = ($checksum * 314159269) & 0xffffffff;
$checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000);
$checksum -= (($checksum * 1152921497) >> 60) * 1000000007;
return scalar reverse sprintf("%x", $checksum & 0xffff);
}
print compute_short_name_hash($ARGV[0]);
-- snap --
E.g., running that with the argument ".gitignore" will
result in "250a" (which then becomes "gi250a" in the code).
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
|
|
Both verify_path() and fsck match ".git", ".GIT", and other
variants specific to HFS+. Let's allow matching other
special files like ".gitmodules", which we'll later use to
enforce extra restrictions via verify_path() and fsck.
Signed-off-by: Jeff King <peff@peff.net>
|
|
We walk through the "name" string using an int, which can
wrap to a negative value and cause us to read random memory
before our array (e.g., by creating a tree with a name >2GB,
since "int" is still 32 bits even on most 64-bit platforms).
Worse, this is easy to trigger during the fsck_tree() check,
which is supposed to be protecting us from malicious
garbage.
Note one bit of trickiness in the existing code: we
sometimes assign -1 to "len" at the end of the loop, and
then rely on the "len++" in the for-loop's increment to take
it back to 0. This is still legal with a size_t, since
assigning -1 will turn into SIZE_MAX, which then wraps
around to 0 on increment.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Submodule "names" come from the untrusted .gitmodules file,
but we blindly append them to $GIT_DIR/modules to create our
on-disk repo paths. This means you can do bad things by
putting "../" into the name (among other things).
Let's sanity-check these names to avoid building a path that
can be exploited. There are two main decisions:
1. What should the allowed syntax be?
It's tempting to reuse verify_path(), since submodule
names typically come from in-repo paths. But there are
two reasons not to:
a. It's technically more strict than what we need, as
we really care only about breaking out of the
$GIT_DIR/modules/ hierarchy. E.g., having a
submodule named "foo/.git" isn't actually
dangerous, and it's possible that somebody has
manually given such a funny name.
b. Since we'll eventually use this checking logic in
fsck to prevent downstream repositories, it should
be consistent across platforms. Because
verify_path() relies on is_dir_sep(), it wouldn't
block "foo\..\bar" on a non-Windows machine.
2. Where should we enforce it? These days most of the
.gitmodules reads go through submodule-config.c, so
I've put it there in the reading step. That should
cover all of the C code.
We also construct the name for "git submodule add"
inside the git-submodule.sh script. This is probably
not a big deal for security since the name is coming
from the user anyway, but it would be polite to remind
them if the name they pick is invalid (and we need to
expose the name-checker to the shell anyway for our
test scripts).
This patch issues a warning when reading .gitmodules
and just ignores the related config entry completely.
This will generally end up producing a sensible error,
as it works the same as a .gitmodules file which is
missing a submodule entry (so "submodule update" will
barf, but "git clone --recurse-submodules" will print
an error but not abort the clone.
There is one minor oddity, which is that we print the
warning once per malformed config key (since that's how
the config subsystem gives us the entries). So in the
new test, for example, the user would see three
warnings. That's OK, since the intent is that this case
should never come up outside of malicious repositories
(and then it might even benefit the user to see the
message multiple times).
Credit for finding this vulnerability and the proof of
concept from which the test script was adapted goes to
Etienne Stalmans.
Signed-off-by: Jeff King <peff@peff.net>
|
|
Add --dissociate option to add and update commands, both clone helper commands
that already have the --reference option --dissociate pairs with.
Signed-off-by: Casey Fitzpatrick <kcghost@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The '--progress' was introduced in 72c5f88311d (clone: pass --progress
decision to recursive submodules, 2016-09-22) to fix the progress reporting
of the clone command. Also add the progress option to the 'submodule add'
command. The update command already supports the progress flag, but it
is not documented.
Signed-off-by: Casey Fitzpatrick <kcghost@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
'recommend_shallow' and 'jobs' variables do not need quotes. They only hold a
single token value, and even if they were multi-token it is likely we would want
them split at IFS rather than pass a single string.
'progress' is a boolean value. Treat it like the other boolean values in the
script by using a substitution.
Signed-off-by: Casey Fitzpatrick <kcghost@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Switch from gcc-4.8 to gcc-8. Newer compilers come with more warning
checks (usually in -Wextra). Since -Wextra is enabled in developer
mode (which is also enabled in travis), this lets travis report more
warnings before other people do it.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
If we don't have a repository, then we can't initialize the
ref store. Prior to 64a741619d (refs: store the main ref
store inside the repository struct, 2018-04-11), we'd try to
access get_git_dir(), and outside a repository that would
trigger a BUG(). After that commit, though, we directly use
the_repository->git_dir; if it's NULL we'll just segfault.
Let's catch this case and restore the BUG() behavior.
Obviously we don't ever want to hit this code, but a BUG()
is a lot more helpful than a segfault if we do.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The tests added in 2f271cd9cf (t9902-completion: add tests
demonstrating issues with quoted pathnames, 2018-05-08) and in
2ab6eab4fe (completion: improve handling quoted paths in 'git
ls-files's output, 2018-03-28) have a few shortcomings:
- All these tests use the 'test_completion' helper function, thus
they are exercising the whole completion machinery, although they
are only interested in how git-aware path completion, specifically
the __git_complete_index_file() function deals with unusual
characters in pathnames and on the command line.
- These tests can't satisfactorily test the case of pathnames
containing spaces, because 'test_completion' gets the words on the
command line as a single argument and it uses space as word
separator.
- Some of the tests are protected by different FUNNYNAMES_* prereqs
depending on whether they put backslashes and double quotes or
separator characters (FS, GS, RS, US) in pathnames, although a
filesystem not allowing one likely doesn't allow the others
either.
- One of the tests operates on paths containing '|' and '&'
characters without being protected by a FUNNYNAMES prereq, but
some filesystems (notably on Windows) don't allow these characters
in pathnames, either.
Replace these tests with basically equivalent, more focused tests that
call __git_complete_index_file() directly. Since this function only
looks at the current word to be completed, i.e. the $cur variable, we
can easily include pathnames containing spaces in the tests, so use
such pathnames instead of pathnames containing '|' and '&'. Finally,
use only a single FUNNYNAMES prereq for all kinds of special
characters.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In __gitcomp_file_direct() we tell Bash that it should handle our
possible completion words as filenames with the following piece of
cleverness:
# use a hack to enable file mode in bash < 4
compopt -o filenames +o nospace 2>/dev/null ||
compgen -f /non-existing-dir/ > /dev/null
Unfortunately, this makes this function always return with error
when it is not invoked in real completion, but e.g. in tests of
't9902-completion.sh':
- First the 'compopt' line errors out
- either because in Bash v3.x there is no such command,
- or because in Bash v4.x it complains about "not currently
executing completion function",
- then 'compgen' just silently returns with error because of the
non-existing directory.
Since __gitcomp_file_direct() is now the last command executed in
__git_complete_index_file(), that function returns with error as well,
which prevents it from being invoked in tests directly as is, and
would require extra steps in test to hide its error code.
So let's make sure that __gitcomp_file_direct() doesn't return with
error, because in the tests coming in the following patch we do want
to exercise __git_complete_index_file() directly,
__gitcomp_file() contains the same construct, and thus it, too, always
returns with error. Update that function accordingly as well.
While at it, also remove the space from between the redirection
operator and the filename in both functions.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The last two tests 'editor with a space' and 'core.editor with a
space' in 't7005-editor.sh' need the SPACES_IN_FILENAMES prereq to
ensure that they are only run on filesystems that allow, well, spaces
in filenames. However, we have been putting a space in the name of
the trash directory for just over a decade now, so we wouldn't be able
to run any of our tests on such a filesystem in the first place.
This prereq is therefore unnecessary, remove it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Do the preparatory fetch inside the test of ls-remote --symref to avoid
cluttering the test output and to be able to catch unexpected fetch
failures.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
Commit 941ba8db57 (Eliminate recursion in setting/clearing
marks in commit list, 2012-01-14) used a clever double-loop
to avoid allocations for single-parent chains of history.
However, it did so only when following parents of parents
(which was an uncommon case), and _always_ incurred at least
one allocation to populate the list of pending parents in
the first place.
We can turn this into zero-allocation in the common case by
iterating directly over the initial parent list, and then
following up on any pending items we might have discovered.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The mark_parents_uninteresting() function uses two loops:
an outer one to process our queue of pending parents, and an
inner one to process first-parent chains. This is a clever
optimization from 941ba8db57 (Eliminate recursion in
setting/clearing marks in commit list, 2012-01-14) to limit
the number of linked-list allocations when following
single-parent chains.
Unfortunately, this makes the result a little hard to read.
Let's replace the list with a stack. Then we don't have to
worry about doing this double-loop optimization, as we'll
just reuse the top element of the stack as we pop/push.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We allow UNINTERESTING objects in a traversal to be
unavailable. As part of this, mark_parents_uninteresting()
checks whether we have a particular uninteresting parent; if
not, we will mark it as "parsed" so that later code skips
it.
This code is redundant and even a little bit harmful, so
let's drop it.
It's redundant because when our parse_object() call in
add_parents_to_list() fails, we already quietly skip
UNINTERESTING parents. This redundancy is a historical
artifact. The mark_parents_uninteresting() protection is
from 454fbbcde3 (git-rev-list: allow missing objects when
the parent is marked UNINTERESTING, 2005-07-10). Much later,
aeeae1b771 (revision traversal: allow UNINTERESTING objects
to be missing, 2009-01-27) covered more cases by making the
actual parse more gentle.
As an aside, even if this weren't redundant, it would be
insufficient. The gentle parsing handles both missing and
corrupted objects, whereas the has_object_file() check
we're getting rid of covers only missing ones.
And the code we're dropping is harmful for two reasons:
1. We spend extra time on the object lookup, even though
we don't actually need the information at this point
(and will just repeat that lookup later when we parse
for the common case that we _do_ have the object).
2. It "lies" about the commit by setting the parsed flag,
even though we didn't load any useful data into the
struct. This shouldn't matter for the UNINTERESTING
case, but we may later clear our flags and do another
traversal in the same process. While pretty unlikely,
it's possible that we could then look at the same
commit without the UNINTERESTING flag, in which case
we'd produce the wrong result (we'd think it's a commit
with no parents, when in fact we should probably die
due to the missing object).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
It's generally acceptable for UNINTERESTING objects in a
traversal to be unavailable (e.g., see aeeae1b771). When
marking trees UNINTERESTING, we access the object database
twice: once to check if the object is missing (and return
quietly if it is), and then again to actually parse it.
We can instead just try to parse; if that fails, we can then
return quietly. That halves the effort we spend on locating
the object.
Note that this isn't _exactly_ the same as the original
behavior, as the parse failure could be due to other
problems than a missing object: it could be corrupted, in
which case the original code would have died. But the new
behavior is arguably better, as it covers the object being
unavailable for any reason. We'll also still issue a warning
to stderr in such a case.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
On linux 64-bit architecture, pahole finds that there's a 4 bytes
padding after 'index'. Moving it to the end reduces this struct's size
from 72 to 64 bytes (because of another 4 bytes padding after
graph_pos). On linux 32-bit, the struct size remains 52 bytes like
before.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After performing a merge that has conflicts git status will, by default,
attempt to detect renames which causes many objects to be examined. In a
virtualized repo, those objects do not exist locally so the rename logic
triggers them to be fetched from the server. This results in the status call
taking hours to complete on very large repos vs seconds with this patch.
Add a new config status.renames setting to enable turning off rename
detection during status and commit. This setting will default to the value
of diff.renames.
Add a new config status.renamelimit setting to to enable bounding the time
spent finding out inexact renames during status and commit. This setting
will default to the value of diff.renamelimit.
Add --no-renames command line option to status that enables overriding the
config setting from the command line. Add --find-renames[=<n>] command line
option to status that enables detecting renames and optionally setting the
similarity index.
Reviewed-by: Elijah Newren <newren@gmail.com>
Original-Patch-by: Alejandro Pauly <alpauly@microsoft.com>
Signed-off-by: Ben Peart <Ben.Peart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
By default we want to fill the whole screen if possible, but we do not
want to use up _all_ terminal columns because the last character is
going hit the border, push the cursor over and wrap. Keep it at
default value zero, which will make print_columns() set the width at
term_columns() - 1.
This affects the test in t7004 because effective column width before
was 40 but now 39 so we need to compensate it by one or the output at
39 columns has a different layout.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
After we invoke the pager, our stdout goes to a pipe, not the
terminal, meaning we can no longer use an ioctl to get the
terminal width. For that reason, ad6c3739a3 (pager: find out
the terminal width before spawning the pager, 2012-02-12)
started caching the terminal width.
But that cache is only an in-process variable. Any programs
we spawn will also not be able to run that ioctl, but won't
have access to our cache. They'll end up falling back to our
80-column default.
You can see the problem with:
git tag --column=row
Since git-tag spawns a pager these days, its spawned
git-column helper will see neither the terminal on stdout
nor a useful COLUMNS value (assuming you do not export it
from your shell already). And you'll end up with 80-column
output in the pager, regardless of your terminal size.
We can fix this by setting COLUMNS right before spawning the
pager. That fixes this case, as well as any more complicated
ones (e.g., a paged program spawns another script which then
generates columnized output).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
According to the documentation, it is possible to "specify 40 '0' or an
empty string as <oldvalue> to make sure that the ref you are creating
does not exist." But in the code for pseudorefs, we do not implement
this, as demonstrated by the failing tests added in the previous commit.
If we fail to read the old ref, we immediately die. But a failure to
read would actually be a good thing if we have been given the zero oid.
With the zero oid, allow -- and even require -- the ref-reading to fail.
This implements the "make sure that the ref ... does not exist" part of
the documentation and fixes both failing tests from the previous commit.
Since we have a `strbuf err` for collecting errors, let's use it and
signal an error to the caller instead of dying hard.
Reported-by: Rafael Ascensão <rafa.almas@gmail.com>
Helped-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
I have not been able to find any tests around adding pseudorefs using
`git update-ref`. Add some as outlined in this table (original design by
Michael Haggerty; modified and extended by me):
Pre-update value | ref-update old OID | Expected result
-------------------|----------------------|----------------
missing | value | reject
missing | none given | accept
set | none given | accept
set | correct value | accept
set | wrong value | reject
missing | zero | accept *
set | zero | reject *
The tests marked with a * currently fail, despite git-update-ref(1)
claiming that it is possible to "specify 40 '0' or an empty string as
<oldvalue> to make sure that the ref you are creating does not exist."
These failing tests will be fixed in the next commit.
It is only natural to test deletion as well. Test deletion without an
old OID, with a correct one and with an incorrect one.
Suggested-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We have two error messages that complain about the "sha1". Because we
are about to touch one of these sites and add some tests, let's first
modernize the messages to say "object ID" instead.
While at it, make the second one use `error()` instead of `warning()`.
After printing the message, we do not continue, but actually drop the
lock and return -1 without deleting the pseudoref.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
The current document mentions OBJ_* constants without their actual
values. A git developer would know these are from cache.h but that's
not very friendly to a person who wants to read this file to implement
a pack file parser.
Similarly, the deltified representation is not documented at all (the
"document" is basically patch-delta.c). Translate that C code to
English with a bit more about what ofs-delta and ref-delta mean.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
We're not really removing slashes, but slash-separated path
components. Let's make that more clear.
Reported-by: kelly elton <its.the.doc@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
b2dc968e60 (t5516: refactor oddball tests, 2008-11-07) accidentaly
broke the &&-chain in the test 'push does not update local refs on
failure', but since it was in a subshell, chain-lint couldn't notice
it.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|
|
In a while-at-it cleanup replacing a 'cd dir && <...> && cd ..' with a
subshell, commit 28391a80a9 (receive-pack: allow deletion of corrupt
refs, 2007-11-29) also moved the assignment of the $old_commit
variable to that subshell. This variable, however, is used outside of
that subshell as a parameter of check_push_result(), to check that a
ref still points to the commit where it is supposed to. With the
variable remaining unset outside the subshell check_push_result()
doesn't perform that check at all.
Use 'git -C <dir> cmd...', so we don't need to change directory, and
thus don't need the subshell either when setting $old_commit.
Furthermore, change check_push_result() to require at least three
parameters (the repo, the oid, and at least one ref), so it will catch
similar issues earlier should they ever arise.
Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
|