summaryrefslogtreecommitdiffstats
path: root/delta-islands.c (unfollow)
Commit message (Collapse)AuthorFilesLines
2024-04-17credential: add method for querying capabilitiesbrian m. carlson5-1/+60
Right now, there's no specific way to determine whether a credential helper or git credential itself supports a given set of capabilities. It would be helpful to have such a way, so let's let credential helpers and git credential take an argument, "capability", which has it list the capabilities and a version number on standard output. Specifically choose a format that is slightly different from regular credential output and assume that no capabilities are supported if a non-zero exit status occurs or the data deviates from the format. It is common for users to write small shell scripts as the argument to credential.helper, which will almost never be designed to emit capabilities. We want callers to gracefully handle this case by assuming that they are not capable of extended support because that is almost certainly the case, and specifying the error behavior up front does this and preserves backwards compatibility in a graceful way. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential-cache: implement authtype capabilitybrian m. carlson4-6/+28
Now that we have full support in Git for the authtype capability, let's add support to the cache credential helper. When parsing data, we always set the initial capabilities because we're the helper, and we need both the initial and helper capabilities to be set in order to have the helper capabilities take effect. When emitting data, always emit the supported capability and make sure we emit items only if we have them and they're supported by the caller. Since we may no longer have a username or password, be sure to emit those conditionally as well so we don't segfault on a NULL pointer. Similarly, when comparing credentials, consider both the password and credential fields when we're matching passwords. Adjust the partial credential detection code so that we can store credentials missing a username or password as long as they have an authtype and credential. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17t: add credential tests for authtypebrian m. carlson1-0/+123
It's helpful to have some basic tests for credential helpers supporting the authtype and credential fields. Let's add some tests for this case so that we can make sure newly supported helpers work correctly. Note that we explicitly check that credential helpers can produce different sets of authtype and credential values based on the username. While the username is not used in the HTTP protocol with authtype and credential, it can still be specified in the URL and thus may be part of the protocol. Additionally, because it is common for users to have multiple accounts on one service (say, both personal and professional accounts), it's very helpful to be able to store different credentials for different accounts in the same helper, and that doesn't become less useful if one is using, say, Bearer authentication instead of Basic. Thus, credential helpers should be expected to support this functionality as basic functionality, so verify here that they do so. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: add support for multistage credential roundsbrian m. carlson6-35/+184
Over HTTP, NTLM and Kerberos require two rounds of authentication on the client side. It's possible that there are custom authentication schemes that also implement this same approach. Since these are tricky schemes to implement and the HTTP library in use may not always handle them gracefully on all systems, it would be helpful to allow the credential helper to implement them instead for increased portability and robustness. To allow this to happen, add a boolean flag, continue, that indicates that instead of failing when we get a 401, we should retry another round of authentication. However, this necessitates some changes in our current credential code so that we can make this work. Keep the state[] headers between iterations, but only use them to send to the helper and only consider the new ones we read from the credential helper to be valid on subsequent iterations. That avoids us passing stale data when we finally approve or reject the credential. Similarly, clear the multistage and wwwauth[] values appropriately so that we don't pass stale data or think we're trying a multiround response when we're not. Remove the credential values so that we can actually fill a second time with new responses. Limit the number of iterations of reauthentication we do to 3. This means that if there's a problem, we'll terminate with an error message instead of retrying indefinitely and not informing the user (and possibly conducting a DoS on the server). In our tests, handle creating multiple response output files from our helper so we can verify that each of the messages sent is correct. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17t5563: refactor for multi-stage authenticationbrian m. carlson2-47/+66
Some HTTP authentication schemes, such as NTLM- and Kerberos-based options, require more than one round trip to authenticate. Currently, these can only be supported in libcurl, since Git does not have support for this in the credential helper protocol. However, in a future commit, we'll add support for this functionality into the credential helper protocol and Git itself. Because we don't really want to implement either NTLM or Kerberos, both of which are complex protocols, we'll want to test this using a fake credential authentication scheme. In order to do so, update t5563 and its backend to allow us to accept multiple sets of credentials and respond with different behavior in each case. Since we can now provide any number of possible status codes, provide a non-specific reason phrase so we don't have to generate a more specific one based on the response. The reason phrase is mandatory according to the status-line production in RFC 7230, but clients SHOULD ignore it, and curl does (except to print it). Each entry in the authorization and challenge fields contains an ID, which indicates a corresponding credential and response. If the response is a 200 status, then we continue to execute git-http-backend. Otherwise, we print the corresponding status and response. If no ID is matched, we use the default response with a status of 401. Note that there is an implicit order to the parameters. The ID is always first and the creds or response value is always last, and therefore may contain spaces, equals signs, or other arbitrary data. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17docs: set a limit on credential line lengthbrian m. carlson1-1/+3
We recently introduced a way for credential helpers to add arbitrary state as part of the protocol. Set some limits on line length to avoid helpers passing extremely large amounts of data. While Git doesn't have a fixed parsing length, there are other tools which support this protocol and it's kind to allow them to use a reasonable fixed-size buffer for parsing. In addition, we would like to be moderate in our memory usage and imposing reasonable limits is helpful for that purpose. In the event a credential helper is incapable of storing its serialized state in 64 KiB, it can feel free to serialize it on disk and store a reference instead. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: enable state capabilitybrian m. carlson2-0/+11
Now that we've implemented the state capability, let's send it along by default when filling credentials so we can make use of it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: add an argument to keep statebrian m. carlson4-12/+71
Until now, our credential code has mostly deal with usernames and passwords and we've let libcurl deal with the variant of authentication to be used. However, now that we have the credential value, the credential helper can take control of the authentication, so the value provided might be something that's generated, such as a Digest hash value. In such a case, it would be helpful for a credential helper that gets an erase or store command to be able to keep track of an identifier for the original secret that went into the computation. Furthermore, some types of authentication, such as NTLM and Kerberos, actually need two round trips to authenticate, which will require that the credential helper keep some state. In order to allow for these use cases and others, allow storing state in a field called "state[]". This value is passed back to the credential helper that created it, which avoids confusion caused by parsing values from different helpers. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17http: add support for authtype and credentialbrian m. carlson4-12/+176
Now that we have the credential helper code set up to handle arbitrary authentications schemes, let's add support for this in the HTTP code, where we really want to use it. If we're using this new functionality, don't set a username and password, and instead set a header wherever we'd normally do so, including for proxy authentication. Since we can now handle this case, ask the credential helper to enable the appropriate capabilities. Finally, if we're using the authtype value, set "Expect: 100-continue". Any type of authentication that requires multiple rounds (such as NTLM or Kerberos) requires a 100 Continue (if we're larger than http.postBuffer) because otherwise we send the pack data before we're authenticated, the push gets a 401 response, and we can't rewind the stream. We don't know for certain what other custom schemes might require this, the HTTP/1.1 standard has required handling this since 1999, the broken HTTP server for which we disabled this (Google's) is now fixed and has been for some time, and libcurl has a 1-second fallback in case the HTTP server is still broken. In addition, it is not unreasonable to require compliance with a 25-year old standard to use new Git features. For all of these reasons, do so here. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17docs: indicate new credential protocol fieldsbrian m. carlson1-1/+48
Now that we have new fields (authtype and credential), let's document them for users and credential helper implementers. Indicate specifically what common values of authtype are and what values are allowed. Note that, while common, digest and NTLM authentication are insecure because they require unsalted, uniterated password hashes to be stored. Tell users that they can continue to use a username and password even if the new capability is supported. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: add a field called "ephemeral"brian m. carlson3-0/+35
Now that we have support for a wide variety of types of authentication, it's important to indicate to other credential helpers whether they should store credentials, since not every credential helper may intuitively understand all possible values of the authtype field. Do so with a boolean field called "ephemeral", to indicate whether the credential is expected to be temporary. For example, in HTTP Digest authentication, the Authorization header value is based off a nonce. It isn't useful to store this value for later use because reusing the credential long term will not result in successful authentication due to the nonce necessarily differing. An additional case is potentially short-lived credentials, which may last only a few hours. It similarly wouldn't be helper for other credential helpers to attempt to provide these much later. We do still pass the value to "git credential store" or "git credential erase", since it may be helpful to the original helper to know whether the operation was successful. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: gate new fields on capabilitybrian m. carlson9-24/+215
We support the new credential and authtype fields, but we lack a way to indicate to a credential helper that we'd like them to be used. Without some sort of indication, the credential helper doesn't know if it should try to provide us a username and password, or a pre-encoded credential. For example, the helper might prefer a more restricted Bearer token if pre-encoded credentials are possible, but might have to fall back to more general username and password if not. Let's provide a simple way to indicate whether Git (or, for that matter, the helper) is capable of understanding the authtype and credential fields. We send this capability when we generate a request, and the other side may reply to indicate to us that it does, too. For now, don't enable sending capabilities for the HTTP code. In a future commit, we'll introduce appropriate handling for that code, which requires more in-depth work. The logic for determining whether a capability is supported may seem complex, but it is not. At each stage, we emit the capability to the following stage if all preceding stages have declared it. Thus, if the caller to git credential fill didn't declare it, then we won't send it to the helper, and if fill's caller did send but the helper doesn't understand it, then we won't send it on in the response. If we're an internal user, then we know about all capabilities and will request them. For "git credential approve" and "git credential reject", we set the helper capability before calling the helper, since we assume that the input we're getting from the external program comes from a previous call to "git credential fill", and thus we'll invoke send a capability to the helper if and only if we got one from the standard input, which is the correct behavior. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: add a field for pre-encoded credentialsbrian m. carlson2-4/+11
At the moment, our credential code wants to find a username and password for access, which, for HTTP, it will pass to libcurl to encode and process. However, many users want to use authentication schemes that libcurl doesn't support, such as Bearer authentication. In these schemes, the secret is not a username and password pair, but some sort of token that meets the production for authentication data in the RFC. In fact, in general, it's useful to allow our credential helper to have knowledge about what specifically to put in the protocol header. Thus, add a field, credential, which contains data that's preencoded to be suitable for the protocol in question. If we have such data, we need neither a username nor a password, so make that adjustment as well. It is in theory possible to reuse the password field for this. However, if we do so, we must know whether the credential helper supports our new scheme before sending it data, which necessitates some sort of capability inquiry, because otherwise an uninformed credential helper would store our preencoded data as a password, which would fail the next time we attempted to connect to the remote server. This design is substantially simpler, and we can hint to the credential helper that we support this approach with a simple new field instead of needing to query it first. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17http: use new headers for each object requestbrian m. carlson2-9/+13
Currently we create one set of headers for all object requests and reuse it. However, we'll need to adjust the headers for authentication purposes in the future, so let's create a new set for each request so that we can adjust them if the authentication changes. Note that the cost of allocation here is tiny compared to the fact that we're making a network call, not to mention probably a full TLS connection, so this shouldn't have a significant impact on performance. Moreover, nobody who cares about performance is using the dumb HTTP protocol anyway, since it often makes huge numbers of requests compared to the smart protocol. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17remote-curl: reset headers on new requestbrian m. carlson1-2/+4
When we retry a post_rpc request, we currently reuse the same headers as before. In the future, we'd like to be able to modify them based on the result we get back, so let's reset them on each retry so we can avoid sending potentially duplicate headers if the values change. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17credential: add an authtype fieldbrian m. carlson2-0/+11
When Git makes an HTTP request, it can negotiate the type of authentication to use with the server provided the authentication scheme is one of a few well-known types (Basic, Digest, NTLM, or Negotiate). However, some servers wish to use other types of authentication, such as the Bearer type from OAuth2. Since libcurl doesn't natively support this type, it isn't possible to use it, and the user is forced to specify the Authorization header using the http.extraheader setting. However, storing a plaintext token in the repository configuration is not very secure, especially if a repository can be shared by multiple parties. We already have support for many types of secure credential storage by using credential helpers, so let's teach credential helpers how to produce credentials for an arbitrary scheme. If the credential helper specifies an authtype field, then it specifies an authentication scheme (e.g., Bearer) and the password field specifies the raw authentication token, with any encoding already specified. We reuse the password field for this because some credential helpers store the metadata without encryption even though the password is encrypted, and we'd like to avoid insecure storage if an older version of the credential helper gets ahold of the data. The username is not used in this case, but it is still preserved for the purpose of finding the right credential if the user has multiple accounts. If the authtype field is not specified, then the password behaves as normal and it is passed along with the username to libcurl. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-21The tenth batchJunio C Hamano1-0/+35
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-18The ninth batchJunio C Hamano1-0/+8
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-16The eighth batchJunio C Hamano1-0/+12
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14The seventh batchJunio C Hamano1-0/+32
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-14t0006: add more tests with a negative TZ offsetBeat Bolli1-0/+8
This test doesn't systematically check a negative timezone offset. Add a test for each format that outputs the offset to improve our test coverage. Signed-off-by: Beat Bolli <dev+git@drbeat.li>
2024-03-14date: make "iso-strict" conforming for the UTC timezoneBeat Bolli2-5/+10
ISO 8601-1:2020-12 specifies that a zero timezone offset must be denoted with a "Z" suffix instead of the numeric "+00:00". Add the correponding special case to show_date() and a new test. Changing an established output format which might be depended on by scripts is always problematic, but here we choose to adhere more closely to the published standard. Reported-by: Michael Osipov <michael.osipov@innomotics.com> Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-13doc: status.showUntrackedFiles does not take "false"Jonas Wunderlich1-1/+1
The `status.showUntrackedFiles` config option only accepts the values "no", "normal" or "all", but not as this part of the man page suggested "false". While we are at it, camel-case the name of the variable. Signed-off-by: Jonas Wunderlich <git@03j.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-12Documentation/user-manual.txt: example for generating object hashesDirk Gouders1-2/+34
Add a simple example on how object hashes can be generated manually. Further, because the document suggests to have a look at the initial commit, clarify that some details changed since that time. Signed-off-by: Dirk Gouders <dirk@gouders.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11The sixth batchJunio C Hamano1-0/+4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11setup: notice more types of implicit bare repositoriesJunio C Hamano2-5/+49
Setting the safe.bareRepository configuration variable to explicit stops git from using a bare repository, unless the repository is explicitly specified, either by the "--git-dir=<path>" command line option, or by exporting $GIT_DIR environment variable. This may be a reasonable measure to safeguard users from accidentally straying into a bare repository in unexpected places, but often gets in the way of users who need valid accesses to the repository. Earlier, 45bb9162 (setup: allow cwd=.git w/ bareRepository=explicit, 2024-01-20) loosened the rule such that being inside the ".git" directory of a non-bare repository does not really count as accessing a "bare" repository. The reason why such a loosening is needed is because often hooks and third-party tools run from within $GIT_DIR while working with a non-bare repository. More importantly, the reason why this is safe is because a directory whose contents look like that of a "bare" repository cannot be a bare repository that came embedded within a checkout of a malicious project, as long as its directory name is ".git", because ".git" is not a name allowed for a directory in payload. There are at least two other cases where tools have to work in a bare-repository looking directory that is not an embedded bare repository, and accesses to them are still not allowed by the recent change. - A secondary worktree (whose name is $name) has its $GIT_DIR inside "worktrees/$name/" subdirectory of the $GIT_DIR of the primary worktree of the same repository. - A submodule worktree (whose name is $name) has its $GIT_DIR inside "modules/$name/" subdirectory of the $GIT_DIR of its superproject. As long as the primary worktree or the superproject in these cases are not bare, the pathname of these "looks like bare but not really" directories will have "/.git/worktrees/" and "/.git/modules/" as a substring in its leading part, and we can take advantage of the same security guarantee allow git to work from these places. Extend the earlier "in a directory called '.git' we are OK" logic used for the primary worktree to also cover the secondary worktree's and non-embedded submodule's $GIT_DIR, by moving the logic to a helper function "is_implicit_bare_repo()". We deliberately exclude secondary worktrees and submodules of a bare repository, as these are exactly what safe.bareRepository=explicit setting is designed to forbid accesses to without an explicit GIT_DIR/--git-dir=<path> Helped-by: Kyle Lippincott <spectral@google.com> Helped-by: Kyle Meyer <kyle@kyleam.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11ci(github): make Windows test artifacts name uniquePhilippe Blain1-2/+2
If several jobs in the windows-test or vs-test matrices fail, the upload-artifact action in each job tries to upload the test directories of the failed tests as "failed-tests-windows.zip", which fails for all jobs except the one which finishes first with the following error: Error: Failed to CreateArtifact: Received non-retryable error: Failed request: (409) Conflict: an artifact with this name already exists on the workflow run Make the artifacts name unique by using the 'matrix.nr' token, and disambiguate the vs-test artifacts from the windows-test ones. Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-clone: format placeholdersJean-Noël Avila2-15/+15
With the new formatting rules, we use _<placeholders>_. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-clone: format verbatim wordsJean-Noël Avila2-6/+6
We also apply the formatting to urls.txt which is included. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-init: rework config item init.templateDirJean-Noël Avila2-3/+8
When included into a the manpage of git-init, the param section must not refer to the manpage. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-init: rework definition listsJean-Noël Avila1-4/+8
In all cases of option description, each option is in its own term. Use the same format here. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-init: format placeholdersJean-Noël Avila1-5/+5
With the new doc format conventions, we use _<placeholders>_. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-11doc: git-init: format verbatim partsJean-Noël Avila1-16/+16
Verbatim parts are all formatted as `fixed font`. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-09merge-ort/merge-recursive: do report errors in `merge_submodule()`Johannes Schindelin2-0/+13
In 24876ebf68b (commit-reach(repo_in_merge_bases_many): report missing commits, 2024-02-28), I taught `merge_submodule()` to handle errors reported by `repo_in_merge_bases_many()`. However, those errors were not passed through to the callers. That was unintentional, and this commit remedies that. Note that `find_first_merges()` can now also return -1 (because it passes through that return value from `repo_in_merge_bases()`), and this commit also adds the forgotten handling for that scenario. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-09merge-recursive: prepare for `merge_submodule()` to report errorsJohannes Schindelin1-6/+7
The `merge_submodule()` function returns an integer that indicates whether the merge was clean (returning 1) or unclean (returning 0). Like the version in `merge-ort.c`, the version in `merge-recursive.c` does not report any errors (such as repository corruption) by returning -1 as of time of writing, even if the callers in `merge-ort.c` are prepared for exactly such errors. However, we want to teach (both variants of) the `merge_submodule()` function that trick: to report errors by returning -1. Therefore, prepare the caller in `merge-recursive.c` to handle that scenario. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Acked-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-08The fifth batchJunio C Hamano1-0/+34
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07reftable/block: fix binary search over restart counterPatrick Steinhardt1-1/+1
Records store their keys prefix-compressed. As many records will share a common prefix (e.g. "refs/heads/"), this can end up saving quite a bit of disk space. The downside of this is that it is not possible to just seek into the middle of a block and consume the corresponding record because it may depend on prefixes read from preceding records. To help with this usecase, the reftable format writes every n'th record without using prefix compression, which is called a "restart". The list of restarts is stored at the end of each block so that a reader can figure out entry points at which to read a full record without having to read all preceding records. This allows us to do a binary search over the records in a block when searching for a particular key by iterating through the restarts until we have found the section in which our record must be located. From thereon we perform a linear search to locate the desired record. This mechanism is broken though. In `block_reader_seek()` we call `binsearch()` over the count of restarts in the current block. The function we pass to compare records with each other computes the key at the current index and then compares it to our search key by calling `strbuf_cmp()`, returning its result directly. But `binsearch()` expects us to return a truish value that indicates whether the current index is smaller than the searched-for key. And unless our key exactly matches the value at the restart counter we always end up returning a truish value. The consequence is that `binsearch()` essentially always returns 0, indicacting to us that we must start searching right at the beginning of the block. This works by chance because we now always do a linear scan from the start of the block, and thus we would still end up finding the desired record. But needless to say, this makes the optimization quite useless. Fix this bug by returning whether the current key is smaller than the searched key. As the current behaviour was correct it is not possible to write a test. Furthermore it is also not really possible to demonstrate in a benchmark that this fix speeds up seeking records. This may cause the reader to question whether this binary search makes sense in the first place if it doesn't even help with performance. But it would end up helping if we were to read a reftable with a much larger block size. Blocks can be up to 16MB in size, in which case it will become much more important to avoid the linear scan. We are not yet ready to read or write such larger blocks though, so we have to live without a benchmark demonstrating this. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07reftable/record: fix memory leak when decoding object recordsPatrick Steinhardt1-0/+2
When decoding records it is customary to reuse a `struct reftable_ref_record` across calls. Thus, it may happen that the record already holds some allocated memory. When decoding ref and log records we handle this by releasing or reallocating held memory. But we fail to do this for object records, which causes us to leak memory. Fix this memory leak by releasing object records before we decode into them. We may eventually want to reuse memory instead to avoid needless reallocations. But for now, let's just plug the leak and be done. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07wt-status: don't find scissors line beyond buf lenFlorian Schmidt2-2/+19
If (a) There is a "---" divider in a commit message, (b) At some point beyond that divider, there is a cut-line (that is, "# ------------------------ >8 ------------------------") in the commit message, (c) the user does not explicitly set the "no-divider" option, then "git interpret-trailers" will hang indefinitively. This is because when (a) is true, find_end_of_log_message() will invoke ignored_log_message_bytes() with a len that is intended to make it ignore the part of the commit message beyond the divider. However, ignored_log_message_bytes() calls wt_status_locate_end(), and that function ignores the length restriction when it tries to locate the cut line. If it manages to find one, the returned cutoff value is greater than len. At this point, ignored_log_message_bytes() goes into an infinite loop, because it won't advance the string parsing beyond len, but the exit condition expects to reach cutoff. Make wt_status_locate_end() honor the length parameter passed in, to fix this issue. In general, if wt_status_locate_end() is given a piece of the memory that lacks NUL at all, strstr() may continue across page boundaries and run into an unmapped page. For our current callers, this is not a problem, as all of them except one uses a memory owned by a strbuf (which guarantees an implicit NUL-termination after its payload), and the one exception in trailer.c:find_end_of_log_message() uses strlen() to compute the length before calling this function. Signed-off-by: Florian Schmidt <flosch@nutanix.com> Reviewed-by: Jonathan Davies <jonathan.davies@nutanix.com> [jc: tweaked the commit log message and the implementation a bit] Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07reftable/stack: register compacted tables as tempfilesPatrick Steinhardt1-24/+30
We do not register tables resulting from stack compaction with the tempfile API. Those tables will thus not be deleted in case Git gets killed. Refactor the code to register compacted tables as tempfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07reftable/stack: register lockfiles during compactionPatrick Steinhardt2-134/+123
We do not register any of the locks we acquire when compacting the reftable stack via our lockfiles interfaces. These locks will thus not be released when Git gets killed. Refactor the code to register locks as lockfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07reftable/stack: register new tables as tempfilesPatrick Steinhardt1-17/+12
We do not register new tables which we're about to add to the stack with the tempfile API. Those tables will thus not be deleted in case Git gets killed. Refactor the code to register tables as tempfiles. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07lockfile: report when rollback failsPatrick Steinhardt3-12/+17
We do not report to the caller when rolling back a lockfile fails, which will be needed by the reftable compaction logic in a subsequent commit. It also cannot really report on all errors because the function calls `delete_tempfile()`, which doesn't return an error either. Refactor the code so that both `delete_tempfile()` and `rollback_lock_file()` return an error code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07doc/gitremote-helpers: fix missing single-quoteJeff King1-1/+1
The formatting around "option push-option" was missing its closing quote, leading to the output having a stray opening quote, rather than rendering the item in italics (as we do for all of the other options in the list). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07trace2: emit 'def_param' set with 'cmd_name' eventJeff Hostetler3-9/+6
Some commands do not cause a set of 'def_param' events to be emitted. This includes "git-remote-https", "git-http-fetch", and various "query" commands, like "git --man-path". Since all of these commands do emit a 'cmd_name' event, add code to the "trace2_cmd_name()" function to generate the set of 'def_param' events. Remove explicit calls to "trace2_cmd_list_config()" and "trace2_cmd_list_env_vars()" in git.c since they are no longer needed. Reviewed-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07trace2: avoid emitting 'def_param' set more than onceJeff Hostetler2-1/+13
During nested alias expansion it is possible for "trace2_cmd_list_config()" and "trace2_cmd_list_env_vars()" to be called more than once. This causes a full set of 'def_param' events to be emitted each time. Let's avoid that. Add code to those two functions to only emit them once. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07t0211: demonstrate missing 'def_param' events for certain commandsJeff Hostetler1-0/+231
Some Git commands fail to emit 'def_param' events for interesting config and environment variable settings. Add unit tests to demonstrate this. Most commands are considered "builtin" and are based upon git.c. These typically do emit 'def_param' events. Exceptions are some of the "query" commands, the "run-dashed" mechanism, and alias handling. Commands built from remote-curl.c (instead of git.c), such as "git-remote-https", do not emit 'def_param' events. Likewise, "git-http-fetch" is built http-fetch.c and does not emit them. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-07t7301: use test_path_is_(missing|file)Vincenzo Mezzela1-245/+245
Replace "test -f" and friends to use the test_path_is_file helper function and friends from test-lib-functions.sh. These functions perform identical operations while enhancing debugging capabilities in case of test failures. The original used 'test ! -f' to check if the file has been correctly cleaned, so 'test ! -e' would have been a better choice. Replace them with 'test_path_is_missing'. Signed-off-by: Vincenzo Mezzela <vincenzo.mezzela@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06fsmonitor: support case-insensitive eventsJeff Hostetler2-10/+137
Teach fsmonitor_refresh_callback() to handle case-insensitive lookups if case-sensitive lookups fail on case-insensitive systems. This can cause 'git status' to report stale status for files if there are case issues/errors in the worktree. The FSMonitor daemon sends FSEvents using the observed spelling of each pathname. On case-insensitive file systems this may be different than the expected case spelling. The existing code uses index_name_pos() to find the cache-entry for the pathname in the FSEvent and clear the CE_FSMONITOR_VALID bit so that the worktree scan/index refresh will revisit and revalidate the path. On a case-insensitive file system, the exact match lookup may fail to find the associated cache-entry. This causes status to think that the cached CE flags are correct and skip over the file. Update event handling to optionally use the name-hash and dir-name-hash if necessary. Also update t7527 to convert the "test_expect_failure" to "_success" now that we have fixed the bug. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-06fsmonitor: refactor bit invalidation in refresh callbackJeff Hostetler1-2/+18
Refactor code in the fsmonitor_refresh_callback() call chain dealing with invalidating the CE_FSMONITOR_VALID bit and add a trace message. During the refresh, we clear the CE_FSMONITOR_VALID bit in response to data from the FSMonitor daemon (so that a later phase will lstat() and verify the true state of the file). Create a new function to clear the bit and add some unique tracing for it to help debug edge cases. This is similar to the existing `mark_fsmonitor_invalid()` function, but it also does untracked-cache invalidation and we've already handled that in the refresh-callback handlers, so but we don't need to repeat that. Signed-off-by: Jeff Hostetler <jeffhostetler@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>