git - git

	Commit message	Author	Files	Lines
2016-08-08	Eleventh batch for 2.10	Junio C Hamano	1	-20/+65
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-08	Hopefully final batch for 2.9.3	Junio C Hamano	1	-0/+34
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-06	nedmalloc: work around overzealous GCC 6 warning	René Scharfe	1	-5/+4
	With GCC 6, the strdup() function is declared with the "nonnull" attribute, stating that it is not allowed to pass a NULL value as parameter. In nedmalloc()'s reimplementation of strdup(), Postel's Law is heeded and NULL parameters are handled gracefully. GCC 6 complains about that now because it thinks that NULL cannot be passed to strdup() anyway. The callers of strdup() in this project must be prepared to call any implementation of strdup() supplied by the platform, so it is pointless to pretend that it is OK to call it with NULL. Remove the conditional based on NULL-ness of the input; this squelches the warning. Check the return value of malloc() instead to make sure we actually got the memory to write to. See https://gcc.gnu.org/gcc-6/porting_to.html for details. Diagnosed-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
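The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual nedmalloc code: the name `sketch_strdup` is hypothetical, and plain malloc() stands in for whatever allocator nedmalloc uses internally.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a strdup() replacement.
 * The caller must pass a non-NULL string, matching the "nonnull"
 * contract that GCC 6 attaches to strdup().
 */
char *sketch_strdup(const char *s1)
{
	size_t len = strlen(s1) + 1;	/* include the terminating NUL */
	char *s2 = malloc(len);

	if (s2)				/* check the allocation, not the input */
		memcpy(s2, s1, len);
	return s2;			/* NULL only if malloc() failed */
}
```

The only conditional left guards the result of the allocation, so there is no longer a NULL test on the input for the compiler to complain about.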
2016-08-06	use strbuf_addstr() instead of strbuf_addf() with "%s"	René Scharfe	3	-3/+3
	Call strbuf_addstr() for adding a simple string to a strbuf instead of using the heavier strbuf_addf(). This is shorter and documents the intent more clearly. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-04	Tenth batch for 2.10	Junio C Hamano	1	-0/+6
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-04	pager: move pager-specific setup into the build	Eric Wong	5	-8/+67
	Allowing PAGER_ENV to be set at build-time allows us to move pager-specific knowledge out of our code. This allows us to set a better default for FreeBSD more(1), which pretends not to understand ANSI color escapes if the MORE environment variable is left empty, but accepts the same variables as less(1). Originally-from: https://public-inbox.org/git/xmqq61piw4yf.fsf@gitster.dls.corp.google.com/ Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-04	nedmalloc: fix misleading indentation	Johannes Schindelin	1	-4/+4
	Some code in nedmalloc is indented in a funny way that could be misinterpreted as if a line after a for loop was included in the loop body, when it is not. GCC 6 complains about this in DEVELOPER=YepSure mode. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
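The kind of layout that triggers this warning can be sketched as follows. This is a hypothetical illustration, not the nedmalloc code itself; `sum_sizes` and its parameters are invented for the example, and the misleading form is shown only in a comment so the snippet compiles without warnings.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Misleading form (comment only):
 *
 *	for (i = 0; i < n; i++)
 *		total += sizes[i];
 *		*count = i;	<- indented like the loop body, but runs once
 *
 * Corrected form: the statement after the loop is indented at the
 * level of the "for", so the layout matches what the compiler sees.
 */
size_t sum_sizes(const size_t *sizes, size_t n, size_t *count)
{
	size_t total = 0;
	size_t i;

	assert(count != NULL);
	for (i = 0; i < n; i++)
		total += sizes[i];
	*count = i;	/* executed once, after the loop has finished */
	return total;
}
```

Both forms behave identically; only the indentation changes, which is why such a fix does not alter the generated code.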
2016-08-04	t7063: work around FreeBSD's lazy mtime update feature	Nguyễn Thái Ngọc Duy	1	-1/+16
	Let's start with the commit message of [1] from freebsd.git [2]: "Sync timestamp changes for inodes of special files to disk as late as possible (when the inode is reclaimed). Temporarily only do this if option UFS_LAZYMOD configured and softupdates aren't enabled. UFS_LAZYMOD is intentionally left out of /sys/conf/options. This is mainly to avoid almost useless disk i/o on battery powered machines. It's silly to write to disk (on the next sync or when the inode becomes inactive) just because someone hit a key or something wrote to the screen or /dev/null. PR: 5577" [3] The short version of that, in the context of t7063, is that when a directory is updated, its mtime may be updated later, not immediately. This can be shown with a simple command sequence: `date; sleep 1; touch abc; rm abc; sleep 10; ls -lTd .` One would expect that the date shown in `ls` would be one second from `date`, but it's 10 seconds later. If we put another `ls -lTd .` in front of `sleep 10`, then the date of the last `ls` comes as expected. The first `ls` somehow forces mtime to be updated. t7063 is really sensitive to directory mtime. When mtime is too "new", git code suspects racy timestamps and will not trigger the shortcut in untracked cache; this happens in t7063.24 and is eventually detected in t7063.27. We have two options thanks to this special FreeBSD feature: 1) Stop supporting untracked cache on FreeBSD and skip t7063 entirely when running on FreeBSD; 2) Work around this problem (using the same 'ls' trick) and continue to support untracked cache on FreeBSD. I initially wanted to go with 1) because I didn't know the exact nature of this feature and feared that it would make untracked cache work unreliably, using the cached version when it should not. Now that the behavior of this thing is clearer, the picture is not that bad. If this indeed happens often, untracked cache would assume racy condition more often and _fall back_ to non-untracked cache code paths, which means it may be less effective, but it will not show wrong things. This patch goes with option 2. PS. For those who want to look further in FreeBSD source code, this flag is now called IN_LAZYMOD. I can see it's effective in ext2 and ufs. zfs is not affected. [1] 660e6408e6df99a20dacb070c5e7f9739efdf96d [2] git://github.com/freebsd/freebsd.git [3] https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=5577 Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-04	Ninth batch of topics for 2.10	Junio C Hamano	1	-0/+61
	Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03	clarify %f documentation	Joey Hess	1	-0/+5
	It's natural to expect %f to be an actual file on disk; help avoid that mistake. Signed-off-by: Joey Hess <joeyh@joeyh.name> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03	gitmodules: document shallow recommendation	Stefan Beller	1	-0/+5
	Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03	blame: drop strdup of string literal	Eric Sunshine	1	-1/+1
	This strdup was added as part of 58dbfa2 (blame: accept multiple -L ranges, 2013-08-06) to be consistent with parse_opt_string_list(), which appends to the same list. But as of 7a7a517 (parse_opt_string_list: stop allocating new strings, 2016-06-13), we should stop using strdup (to match parse_opt_string_list, and for all the reasons described in that commit; namely that it does nothing useful and causes us to leak the memory). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-03	t4130: work around Windows limitation	Johannes Sixt	1	-3/+7
	On Windows, it is already pretty expensive to try to recreate the stat() data that Git assumes is cheap to obtain. To make things halfway decent in performance, we even have to skip emulating the inode and determining the number of hard links. This is not a huge problem, usually, as either the size or the mtime or the ctime are tell-tale enough to say when a file has changed, and even if not, those changes are typically made after the index file was written, triggering a rehashing of the files' contents. The t4130-apply-criss-cross-rename test case, however, requires the inode to determine that files of equal size were swapped, as renaming files does not update their mtime. Every once in a while, t4130 fails on Windows because of this missing piece. Equal file sizes are not crucial for the test cases, however. Hence, generate files with different sizes so that there is some property by which the swapped files can be discovered reliably even on Windows. Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-02	hashmap: clarify that hashmap_entry can safely be discarded	Junio C Hamano	1	-0/+5
	The API documentation said that the hashmap_entry structure to be embedded in the caller's structure is to be treated as opaque, which left the reader wondering if it can safely be discarded when it no longer is necessary. If the hashmap_entry structure had references to external resources such as allocated memory or an open file descriptor, merely free(3)ing the containing structure (when the caller's structure is on the heap) or letting it go out of scope (when it is on the stack) would end up leaking the external resource. Document that there is no need for hashmap_entry_clear() that corresponds to hashmap_entry_init() to give the API users a little bit of peace of mind. Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01	t3700: add a test_mode_in_index helper function	Ingo Brückl	1	-32/+22
	The case statement to check the file mode of a staged file appears a number of times. Simplify the test by utilizing a test_mode_in_index helper function. Signed-off-by: Ingo Brückl <ib@wupperonline.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01	t3700: merge two tests into one	Ingo Brückl	1	-12/+6
	Depending on the underlying platform a chmod may be a noop. Although it wouldn't harm the result of the '--chmod=-x' test, there is a more robust way to make sure the --chmod option works both ways. Merge the two separate tests for the --chmod option into one, checking both permissions on the same file. Signed-off-by: Ingo Brückl <ib@wupperonline.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01	t3700: remove unwanted leftover files before running new tests	Ingo Brückl	1	-0/+3
	When an earlier test that has a prerequisite is skipped, files used by later tests may be left in the working tree in an unexpected state. For example, a test runs this sequence: echo foo >xfoo1 && chmod 755 xfoo1 to create an executable file xfoo1, expecting that xfoo1 does not exist before it runs in the test sequence. However, the absence of this file depends on "git reset --hard" done in an earlier test, which is skipped when the SANITY prerequisite is not met, and worse yet, xfoo1 originally is created as a symbolic link, which means the chmod does not affect the modes of xfoo1 as this test expects. Fix this by starting the test with "rm -f xfoo1" to make sure the file is created from scratch, and do the same to other similar tests. Signed-off-by: Ingo Brückl <ib@wupperonline.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01	pass constants as first argument to st_mult()	René Scharfe	3	-3/+3
	The result of st_mult() is the same no matter the order of its arguments. It invokes the macro unsigned_mult_overflows(), which divides the second parameter by the first one. Pass constants first to allow that division to be done already at compile time. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
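Why the argument order matters can be sketched like this. The names `sketch_mult_overflows` and `sketch_st_mult` are hypothetical, and the exact expression used in Git may differ; the only property assumed here is that the first argument ends up as the divisor in the overflow check.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical overflow check: the division uses the first argument
 * as its divisor. If "a" is a compile-time constant such as
 * sizeof(int), the compiler can fold SIZE_MAX / (a) into a constant
 * and only a comparison remains at run time.
 */
#define sketch_mult_overflows(a, b) \
	((a) && (b) > SIZE_MAX / (a))

/* Stores a * b in *out and returns 0, or returns -1 on overflow. */
int sketch_st_mult(size_t a, size_t b, size_t *out)
{
	assert(out != NULL);
	if (sketch_mult_overflows(a, b))
		return -1;
	*out = a * b;
	return 0;
}
```

A call such as `sketch_st_mult(sizeof(int), nr, &bytes)` therefore costs one comparison, whereas `sketch_st_mult(nr, sizeof(int), &bytes)` would need a division by the run-time value `nr`. The result is the same either way.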
2016-08-01	use strbuf_addstr() for adding constant strings to a strbuf	René Scharfe	4	-6/+6
	Replace uses of strbuf_addf() for adding strings with more lightweight strbuf_addstr() calls. In http-push.c it becomes easier to see what's going on without having to verify that the definition of PROPFIND_ALL_REQUEST doesn't contain any format specifiers. Signed-off-by: Rene Scharfe <l.s.r@web.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-01	gitweb: escape link body in format_ref_marker	Andreas Brauchli	1	-1/+1
	Fix a case where an html link can be generated from unescaped input resulting in invalid strict xhtml or potentially injected code. An overview of a repo with a tag "1.0.0&0.0.1" would previously result in an unescaped ampersand in the link body. Signed-off-by: Andreas Brauchli <a.brauchli@elementarea.net> Acked-by: Jakub Narębski <jnareb@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
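The effect of escaping the link body can be illustrated with a small sketch. gitweb itself is written in Perl, so this C function is only an analogy: `sketch_esc_html` is a hypothetical name, it covers just four characters, and it is not a port of gitweb's own escaping routine.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated copy of "in" with &, <, > and " replaced
 * by HTML entities, or NULL if the allocation fails. The caller
 * frees the result. Size overflow for huge inputs is ignored here.
 */
char *sketch_esc_html(const char *in)
{
	size_t len = strlen(in);
	/* Worst case: every byte becomes "&quot;" (6 bytes), plus NUL. */
	char *out = malloc(len * 6 + 1);
	char *p = out;
	size_t i;

	if (!out)
		return NULL;
	for (i = 0; i < len; i++) {
		const char *rep = NULL;

		switch (in[i]) {
		case '&': rep = "&amp;"; break;
		case '<': rep = "&lt;"; break;
		case '>': rep = "&gt;"; break;
		case '"': rep = "&quot;"; break;
		default: break;
		}
		if (rep) {
			size_t n = strlen(rep);
			memcpy(p, rep, n);
			p += n;
		} else {
			*p++ = in[i];
		}
	}
	*p = '\0';
	return out;
}
```

With this kind of escaping, the tag name "1.0.0&0.0.1" from the commit message would be emitted with its ampersand written as an entity, which keeps the generated markup valid.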
2016-07-29	pack-objects: compute local/ignore_pack_keep early	Jeff King	1	-1/+25
	In want_object_in_pack(), we can exit early from our loop if neither "local" nor "ignore_pack_keep" are set. If they are, however, we must examine each pack to see if it has the object and is non-local or has a ".keep". It's quite common for there to be no non-local or .keep packs at all, in which case we know ahead of time that looking further will be pointless. We can pre-compute this by simply iterating over the list of packs ahead of time, and dropping the flags if there are no packs that could match. Another similar strategy would be to modify the loop in want_object_in_pack() to notice that we have already found the object once, and that we are looping only to check for "local" and "keep" attributes. If a pack has neither of those, we can skip the call to find_pack_entry_one(), which is the expensive part of the loop. This has two advantages: - it isn't all-or-nothing; we still get some improvement when there's a small number of kept or non-local packs, and a large number of non-kept local packs - it eliminates any possible race where we add new non-local or kept packs after our initial scan. In practice, I don't think this race matters; we already cache the packed_git information, so somebody who adds a new pack or .keep file after we've started will not be noticed at all, unless we happen to need to call reprepare_packed_git() because a lookup fails. In other words, we're already racy, and the race is not a big deal (losing the race means we might include an object in the pack that would not otherwise be, which is an acceptable outcome). However, it also has a disadvantage: we still loop over the rest of the packs for each object to check their flags. This is much less expensive than doing the object lookup, but still not free. So if we wanted to implement that strategy to cover the non-all-or-nothing cases, we could do so in addition to this one (so you get the most speedup in the all-or-nothing case, and the best we can do in the other cases). 
But given that the all-or-nothing case is likely the most common, it is probably not worth the trouble, and we can revisit this later if evidence points otherwise. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-29	pack-objects: break out of want_object loop early	Jeff King	1	-0/+16
	When pack-objects collects the list of objects to pack (either from stdin, or via its internal rev-list), it filters each one through want_object_in_pack(). This function loops through each existing packfile, looking for the object. When we find it, we mark the pack/offset combo for later use. However, we can't just return "yes, we want it" at that point. If --honor-pack-keep is in effect, we must keep looking to find it in _all_ packs, to make sure none of them has a .keep. Likewise, if --local is in effect, we must make sure it is not present in any non-local pack. As a result, the sum effort of these calls is effectively O(nr_objects * nr_packs). In an ordinary repository, we have only a handful of packs, and this doesn't make a big difference. But in pathological cases, it can slow the counting phase to a crawl. This patch notices the case that we have neither "--local" nor "--honor-pack-keep" in effect and breaks out of the loop early, after finding the first instance. Note that our worst case is still "objects * packs" (i.e., we might find each object in the last pack we look in), but in practice we will often break out early. On an "average" repo, my git.git with 8 packs, this shows a modest 2% (a few dozen milliseconds) improvement in the counting-objects phase of "git pack-objects --all <foo" (hackily instrumented by sticking exit(0) right after list_objects). But in a much more pathological case, it makes a bigger difference. I ran the same command on a real-world example with ~9 million objects across 1300 packs. The counting time dropped from 413s to 45s, an improvement of about 89%. Note that this patch won't do anything by itself for a normal "git gc", as it uses both --honor-pack-keep and --local. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
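The early exit described above can be sketched as follows. This is a simplified model under stated assumptions, not the code in pack-objects: `struct sketch_pack`, its fields, and `sketch_want_object` are hypothetical, and a per-pack flag stands in for the real index lookup.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a packfile. */
struct sketch_pack {
	int is_local;	/* 0 if the pack comes from an alternate store */
	int has_keep;	/* 1 if a .keep file accompanies the pack */
	int has_object;	/* stand-in for looking the object up in the index */
};

/*
 * Returns 1 if the object should go into the new pack, 0 otherwise.
 * *lookups reports how many packs were probed for this object.
 */
int sketch_want_object(const struct sketch_pack *packs, size_t nr,
		       int local, int honor_pack_keep, size_t *lookups)
{
	size_t i;

	assert(lookups != NULL);
	*lookups = 0;
	for (i = 0; i < nr; i++) {
		(*lookups)++;
		if (!packs[i].has_object)
			continue;
		if (!local && !honor_pack_keep)
			break;		/* no later pack could veto the object */
		if (local && !packs[i].is_local)
			return 0;	/* present in a non-local pack */
		if (honor_pack_keep && packs[i].is_local && packs[i].has_keep)
			return 0;	/* present in a kept pack */
	}
	return 1;
}
```

Without either option the loop stops at the first pack that contains the object; with one of them set, every pack still has to be examined, which matches the note that a normal "git gc" does not benefit from this patch alone.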
2016-07-29	find_pack_entry: replace last_found_pack with MRU cache	Jeff King	2	-18/+25
	Each pack has an index for looking up entries in O(log n) time, but if we have multiple packs, we have to scan through them linearly. This can produce a measurable overhead for some operations. We dealt with this long ago in f7c22cc (always start looking up objects in the last used pack first, 2007-05-30), which keeps what is essentially a 1-element most-recently-used cache. In theory, we should be able to do better by keeping a similar but longer cache, that is the same length as the pack-list itself. Since we now have a convenient generic MRU structure, we can plug it in and measure. Here are the numbers for running p5303 against linux.git:
	Test                      HEAD^                   HEAD
	------------------------------------------------------------------------
	5303.3: rev-list (1)      31.56(31.28+0.27)       31.30(31.08+0.20) -0.8%
	5303.4: repack (1)        40.62(39.35+2.36)       40.60(39.27+2.44) -0.0%
	5303.6: rev-list (50)     31.31(31.06+0.23)       31.23(31.00+0.22) -0.3%
	5303.7: repack (50)       58.65(69.12+1.94)       58.27(68.64+2.05) -0.6%
	5303.9: rev-list (1000)   38.74(38.40+0.33)       31.87(31.62+0.24) -17.7%
	5303.10: repack (1000)    367.20(441.80+4.62)     342.00(414.04+3.72) -6.9%
	The main numbers of interest here are the rev-list ones (since that is exercising the normal object lookup code path). The single-pack case shouldn't improve at all; the 260ms speedup there is just part of the run-to-run noise (but it's important to note that we didn't make anything worse with the overhead of maintaining our cache). In the 50-pack case, we see similar results. There may be a slight improvement, but it's mostly within the noise. The 1000-pack case does show a big improvement, though. That carries over to the repack case, as well. Even though we haven't touched its pack-search loop yet, it does still do a lot of normal object lookups (e.g., for the internal revision walk), and so improves. As a point of reference, I also ran the 1000-pack test against a version of HEAD^ with the last_found_pack optimization disabled. It takes ~60s, so that gives an indication of how much even the single-element cache is helping. For comparison, here's a smaller repository, git.git:
	Test                      HEAD^               HEAD
	---------------------------------------------------------------------
	5303.3: rev-list (1)      1.56(1.54+0.01)     1.54(1.51+0.02) -1.3%
	5303.4: repack (1)        1.84(1.80+0.10)     1.82(1.80+0.09) -1.1%
	5303.6: rev-list (50)     1.58(1.55+0.02)     1.59(1.57+0.01) +0.6%
	5303.7: repack (50)       2.50(3.18+0.04)     2.50(3.14+0.04) +0.0%
	5303.9: rev-list (1000)   2.76(2.71+0.04)     2.24(2.21+0.02) -18.8%
	5303.10: repack (1000)    13.21(19.56+0.25)   11.66(18.01+0.21) -11.7%
	You can see that the percentage improvement is similar. That's because the lookup we are optimizing is roughly O(nr_objects * nr_packs). Since the number of packs is constant in both tests, we'd expect the improvement to be linear in the number of objects. But the whole process is also linear in the number of objects, so the improvement is a constant factor. The exact improvement does also depend on the contents of the packs. In p5303, the extra packs all have 5 first-parent commits in them, which is a reasonable simulation of a pushed-to repository. But it also means that only 250 first-parent commits are in those packs (compared to almost 50,000 total in linux.git), and the rest are in the huge "base" pack. So once we start looking at history in that big pack, that's where we'll find most everything, and even the 1-element cache gets close to 100% cache hits. You could almost certainly show better numbers with a more pathological case (e.g., distributing the objects more evenly across the packs). But that's simply not that realistic a scenario, so it makes more sense to focus on these numbers. The implementation itself is a straightforward application of the MRU code. We provide an MRU-ordered list of packs that shadows the packed_git list. This is easy to do because we only create and revise the pack list in one place. The "reprepare" code path actually drops the whole MRU and replaces it for simplicity. It would be more efficient to just add new entries, but there's not much point in optimizing here; repreparing happens rarely, and only after doing a lot of other expensive work. The key things to keep optimized are traversal (which is just a normal linked list, albeit with one extra level of indirection over the regular packed_git list), and marking (which is a constant number of pointer assignments, though slightly more than the old last_found_pack was; it doesn't seem to create a measurable slowdown, though). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-29	add generic most-recently-used list	Jeff King	3	-0/+96
	There are a few places in Git that would benefit from a fast most-recently-used cache (e.g., the list of packs, which we search linearly but would like to order based on locality). This patch introduces a generic list that can be used to store arbitrary pointers in most-recently-used order. The implementation is just a doubly-linked list, where "marking" an item as used moves it to the front of the list. Insertion and marking are O(1), and iteration is O(n). There's no lookup support provided; if you need fast lookups, you are better off with a different data structure in the first place. There is also no deletion support. This would not be hard to do, but it's not necessary for handling pack structs, which are created and never removed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
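A most-recently-used list of this shape can be sketched as follows. The type and function names (`sketch_mru`, `sketch_mru_append`, `sketch_mru_mark`, `sketch_mru_clear`) are hypothetical and need not match the API added by the patch; the sketch only aims to show why appending and marking are O(1) on a doubly-linked list.

```c
#include <assert.h>
#include <stdlib.h>

struct sketch_mru_entry {
	void *item;			/* arbitrary pointer, not owned */
	struct sketch_mru_entry *prev, *next;
};

struct sketch_mru {
	struct sketch_mru_entry *head, *tail;
};

/* Append an item at the tail (the least recently used end); O(1). */
struct sketch_mru_entry *sketch_mru_append(struct sketch_mru *mru, void *item)
{
	struct sketch_mru_entry *e = malloc(sizeof(*e));

	if (!e)
		return NULL;
	e->item = item;
	e->prev = mru->tail;
	e->next = NULL;
	if (mru->tail)
		mru->tail->next = e;
	else
		mru->head = e;
	mru->tail = e;
	return e;
}

/* Mark an entry as used by moving it to the front; O(1). */
void sketch_mru_mark(struct sketch_mru *mru, struct sketch_mru_entry *e)
{
	assert(mru && e);
	if (mru->head == e)
		return;
	/* Unlink: e is not the head, so e->prev is non-NULL. */
	e->prev->next = e->next;
	if (e->next)
		e->next->prev = e->prev;
	else
		mru->tail = e->prev;
	/* Relink in front of the current head. */
	e->prev = NULL;
	e->next = mru->head;
	mru->head->prev = e;
	mru->head = e;
}

/* Free every entry and reset the list; the items are left untouched. */
void sketch_mru_clear(struct sketch_mru *mru)
{
	struct sketch_mru_entry *e = mru->head;

	while (e) {
		struct sketch_mru_entry *next = e->next;
		free(e);
		e = next;
	}
	mru->head = mru->tail = NULL;
}
```

A caller would walk the list from `head` following `next`, and call `sketch_mru_mark()` on the entry where a lookup succeeded so that the next search tries it first. Dropping the whole list is the only removal offered here, in keeping with the absence of per-item deletion noted above.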
2016-07-29	sha1_file: drop free_pack_by_name	Jeff King	3	-32/+0
	The point of this function is to drop an entry from the "packed_git" cache that points to a file we might be overwriting, because our contents may not be the same (and hence the only caller was pack-objects as it moved a temporary packfile into place). In older versions of git, this could happen because the names of packfiles were derived from the set of objects they contained, not the actual bits on disk. But since 1190a1a (pack-objects: name pack files after trailer hash, 2013-12-05), the name reflects the actual bits on disk, and any two packfiles with the same name can be used interchangeably. Dropping this function not only saves a few lines of code, it makes the lifetime of "struct packed_git" much easier to reason about: namely, we now do not ever free these structs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>