summaryrefslogtreecommitdiffstats
path: root/convert.h (follow)
Commit message (Collapse)AuthorAgeFilesLines
* convert: stream from fd to required clean filter to reduce used address spaceSteffen Prohaska2014-08-281-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The data is streamed to the filter process anyway. Better avoid mapping the file if possible. This is especially useful if a clean filter reduces the size, for example if it computes a sha1 for binary data, like git media. The file size that the previous implementation could handle was limited by the available address space; large files for example could not be handled with (32-bit) msysgit. The new implementation can filter files of any size as long as the filter output is small enough. The new code path is only taken if the filter is required. The filter consumes data directly from the fd. If it fails, the original data is not immediately available. The condition can easily be handled as a fatal error, which is expected for a required filter anyway. If the filter was not required, the condition would need to be handled in a different way, like seeking to 0 and reading the data. But this would require more restructuring of the code and is probably not worth it. The obvious approach of falling back to reading all data would not help achieving the main purpose of this patch, which is to handle large files with limited address space. If reading all data is an option, we can simply take the old code path right away and mmap the entire file. The environment variable GIT_MMAP_LIMIT, which has been introduced in a previous commit is used to test that the expected code path is taken. A related test that exercises required filters is modified to verify that the data actually has been modified on its way from the file system to the object store. Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* convert: drop arguments other than 'path' from would_convert_to_git()Steffen Prohaska2014-08-221-3/+2
| | | | | | | | | It is only the path that matters in the decision whether to filter or not. Clarify this by making path the only argument of would_convert_to_git(). Signed-off-by: Steffen Prohaska <prohaska@zib.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* typofix: in-code commentsOndřej Bílka2013-07-231-1/+1
| | | | | Signed-off-by: Ondřej Bílka <neleai@seznam.cz> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* teach convert_to_git a "dry run" modeJeff King2012-02-241-0/+5
| | | | | | | | | | | | | | | | | Some callers may want to know whether convert_to_git will actually do anything before performing the conversion itself (e.g., to decide whether to stream or handle blobs in-core). This patch lets callers specify the dry run mode by passing a NULL destination buffer. The return value, instead of indicating whether conversion happened, will indicate whether conversion would occur. For readability, we also include a wrapper function which makes it more obvious we are not actually performing the conversion. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* stream filter: add "no more input" to the filtersJunio C Hamano2011-05-271-0/+7
| | | | | | | | | | | Some filters may need to buffer the input and look-ahead inside it to decide what to output, and they may consume more than zero bytes of input and still not produce any output. After feeding all the input, pass NULL as input as keep calling stream_filter() to let such filters know there is no more input coming, and it is time for them to produce the remaining output based on the buffered input. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Add streaming filter APIJunio C Hamano2011-05-271-1/+22
| | | | | | | | | | | | | | | | This introduces an API to plug custom filters to an input stream. The caller gets get_stream_filter("path") to obtain an appropriate filter for the path, and then uses it when opening an input stream via open_istream(). After that, the caller can read from the stream with read_istream(), and close it with close_istream(), just like an unfiltered stream. This only adds a "null" filter that is a pass-thru filter, but later changes can add LF-to-CRLF and other filters, and the callers of the streaming API do not have to change. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* convert.h: move declarations for conversion from cache.hJunio C Hamano2011-05-271-0/+44
Before adding the streaming filter API to the conversion layer, move the existing declarations related to the conversion to its own header file. Signed-off-by: Junio C Hamano <gitster@pobox.com>