Bring in dependencies of ds/maintenance-part-3 #292

derrickstolee · 2020-09-30T19:02:18Z

These commits on master are all reachable from the maintenance builtin code. Merging these in on their own makes it easier to review the maintenance code coming in.

Submodules should be handled the same as regular directories with respect to the presence of a trailing slash, i.e. commands like: git diff rev1 rev2 -- $path git rev-list HEAD -- $path should produce the same output whether $path is 'submod' or 'submod/'. This has been fixed in commit 74b4f7f (tree-walk.c: ignore trailing slash on submodule in tree_entry_interesting(), 2014-01-23). Unfortunately, that commit had the unintended side effect to handle 'submod/anything' the same as 'submod' and 'submod/' as well, e.g.: $ git log --oneline --name-only -- sha1collisiondetection/whatever 4125f78 sha1dc: update from upstream sha1collisiondetection 07a20f5 Makefile: fix unaligned loads in sha1dc with UBSan sha1collisiondetection 23e37f8 sha1dc: update from upstream sha1collisiondetection 86cfd61 sha1dc: optionally use sha1collisiondetection as a submodule sha1collisiondetection Fix this by rejecting submodules as partial pathnames when their trailing slash is followed by anything. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The commit-graph file format specifies that the chunks may be in any order. However, if the OID Lookup chunk happens to be the last one in the file, then any command attempting to access the commit-graph data will fail with: fatal: invalid commit position. commit-graph is likely corrupt In this case the error is wrong, the commit-graph file does conform to the specification, but the parsing of the Chunk Lookup table is a bit buggy, and leaves the field holding the number of commits in the commit-graph zero-initialized. The number of commits in the commit-graph is determined while parsing the Chunk Lookup table, by dividing the size of the OID Lookup chunk with the hash size. However, the Chunk Lookup table doesn't actually store the size of the chunks, but it stores their starting offset. Consequently, the size of a chunk can only be calculated by subtracting the starting offsets of that chunk from the offset of the subsequent chunk, or in case of the last chunk from the offset recorded in the terminating label. This is currenly implemented in a bit complicated way: as we iterate over the entries of the Chunk Lookup table, we check the ID of each chunk and store its starting offset, then we check the ID of the last seen chunk and calculate its size using its previously saved offset if necessary (at the moment it's only necessary for the OID Lookup chunk). Alas, while parsing the Chunk Lookup table we only interate through the "real" chunks, but never look at the terminating label, thus don't even check whether it's necessary to calulate the size of the last chunk. Consequently, if the OID Lookup chunk is the last one, then we don't calculate its size and turn don't run the piece of code determining the number of commits in the commit graph, leaving the field holding that number unchanged (i.e. zero-initialized), eventually triggering the sanity check in load_oid_from_graph(). Fix this by iterating through all entries in the Chunk Lookup table, including the terminating label. Note that this is the minimal fix, suitable for the maintenance track. A better fix would be to simplify how the chunk sizes are calculated, but that is a more invasive change, less suitable for 'maint', so that will be done in later patches. This additional flexibility of scanning more chunks breaks a test for "git commit-graph verify" so alter that test to mutate the commit-graph to have an even lower chunk count. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

…rder The commit-graph format specifies that "All 4-byte numbers are in network order", but the commit-graph contains 8-byte integers as well (file offsets in the Chunk Lookup table), and their byte order is unspecified. Clarify that all multi-byte integers are in network byte order. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

clear_##slabname() frees only the memory allocated for a commit slab itself, but entries in the commit slab might own additional memory outside the slab that should be freed as well. We already have (at least) one such commit slab, and this patch series is about to add one more. To free all additional memory owned by entries on the commit slab the user of such a slab could iterate over all commits it knows about, peek whether there is a valid entry associated with each commit, and free the additional memory, if any. Or it could rely on intimate knowledge about the internals of the commit slab implementation, and could itself iterate directly through all entries in the slab, and free the additional memory. Or it could just leak the additional memory... Introduce deep_clear_##slabname() to allow releasing memory owned by commit slab entries by invoking the 'void free_fn(elemtype *ptr)' function specified as parameter for each entry in the slab. Use it in get_shallow_commits() in 'shallow.c' to replace an open-coded iteration over a commit slab's entries. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

ll_diff_tree_oid() has only ever returned 0 [1], so it's return value is basically useless. It's only caller diff_tree_oid() has only ever returned the return value of ll_diff_tree_oid() as-is [2], so its return value is just as useless. Most of diff_tree_oid()'s callers simply ignore its return value, except: - diff_root_tree_oid() is a thin wrapper around diff_tree_oid() and returns with its return value, but all of diff_root_tree_oid()'s callers ignore its return value. - rev_compare_tree() and rev_same_tree_as_empty() do look at the return value in a condition, but, since the return value is always 0, the former's < 0 condition is never fulfilled, while the latter's >= 0 condition is always fulfilled. So let's drop the return value of ll_diff_tree_oid(), diff_tree_oid() and diff_root_tree_oid(), and drop those conditions from rev_compare_tree() and rev_same_tree_as_empty() as well. [1] ll_diff_tree_oid() and its ancestors have been returning only 0 ever since it was introduced as diff_tree() in 9174026 (Add "diff-tree" program to show which files have changed between two trees., 2005-04-09). [2] diff_tree_oid() traces back to diff-tree.c:main() in 9174026 as well. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Our CodingGuidelines says that it's sufficient to include one of 'git-compat-util.h' and 'cache.h', but both 'commit-graph.c' and 'commit-graph.h' include both. Let's include only 'git-compat-util.h' to loose a bunch of unnecessary dependencies; but include 'hash.h', because 'commit-graph.h' does require the definition of 'struct object_id'. 'commit-graph.h' explicitly includes 'repository.h' and 'string-list.h', but only needs the declaration of a few structs from them. Drop these includes and forward-declare the necessary structs instead. 'commit-graph.c' includes 'dir.h', but doesn't actually use anything from there, so let's drop that #include as well. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

While we iterate over all entries of the Chunk Lookup table we make sure that we don't attempt to read past the end of the mmap-ed commit-graph file, and check in each iteration that the chunk ID and offset we are about to read is still within the mmap-ed memory region. However, these checks in each iteration are not really necessary, because the number of chunks in the commit-graph file is already known before this loop from the just parsed commit-graph header. So let's check that the commit-graph file is large enough for all entries in the Chunk Lookup table before we start iterating over those entries, and drop those per-iteration checks. While at it, take into account the size of everything that is necessary to have a valid commit-graph file, i.e. the size of the header, the size of the mandatory OID Fanout chunk, and the size of the signature in the trailer as well. Note that this necessitates the change of the error message as well, and, consequently, have to update the 'detect incorrect chunk count' test in 't5318-commit-graph.sh' as well. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The Chunk Lookup table stores the chunks' starting offset in the commit-graph file, not their sizes. Consequently, the size of a chunk can only be calculated by subtracting its offset from the offset of the subsequent chunk (or that of the terminating label). This is currenly implemented in a bit complicated way: as we iterate over the entries of the Chunk Lookup table, we check the id of each chunk and store its starting offset, then we check the id of the last seen chunk and calculate its size using its previously saved offset. At the moment there is only one chunk for which we calculate its size, but this patch series will add more, and the repeated chunk id checks are not that pretty. Instead let's read ahead the offset of the next chunk on each iteration, so we can calculate the size of each chunk right away, right where we store its starting offset. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

In write_commit_graph_file() one block of code fills the array of chunk IDs, another block of code fills the array of chunk offsets, then the chunk IDs and offsets are written to the Chunk Lookup table, and finally a third block of code writes the actual chunks. In case of optional chunks like Extra Edge List and Base Graphs List there is also a condition checking whether that chunk is necessary/desired, and that same condition is repeated in all those three blocks of code. This patch series is about to add more optional chunks, so there would be even more repeated conditions. Those chunk offsets are relative to the beginning of the file, so they inherently depend on the size of the Chunk Lookup table, which in turn depends on the number of chunks that are to be written to the commit-graph file. IOW at the time we set the first chunk's ID we can't yet know its offset, because we don't yet know how many chunks there are. Simplify this by initially filling an array of chunk sizes, not offsets, and calculate the offsets based on the chunk sizes only later, while we are writing the Chunk Lookup table. This way we can fill the arrays of chunk IDs and sizes in one go, eliminating one set of repeated conditions. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Unify the 'chunk_ids' and 'chunk_sizes' arrays into an array of 'struct chunk_info'. This will allow more cleanups in the following patches. Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

At the moment, the recommended way to configure Git's builds is to simply run `make`. If that does not work, the recommended strategy is to look at the top of the `Makefile` to see whether any "Makefile knob" has to be turned on/off, e.g. `make NO_OPENSSL=YesPlease`. Alternatively, Git also has an `autoconf` setup which allows configuring builds via `./configure [<option>...]`. Both of these options are fine if the developer works on Unix or Linux. But on Windows, we have to jump through hoops to configure a build (read: we force the user to install a full Git for Windows SDK, which occupies around two gigabytes (!) on disk and downloads about three quarters of a gigabyte worth of Git objects). The build infrastructure for Git is written around being able to run make, which is not supported natively on Windows. To help Windows developers a CMake build script is introduced here. With a working support CMake, developers on Windows need only install CMake, configure their build, load the generated Visual Studio solution and immediately start modifying the code and build their own version of Git. Likewise, developers on other platforms can use the convenient GUI tools provided by CMake to configure their build. So let's start building CMake support for Git. This is only the first step, and to make it easier to review, it only allows for configuring builds on the platform that is easiest to configure for: Linux. The CMake script checks whether the headers are present(eg. libgen.h), whether the functions are present(eg. memmem), whether the funtions work properly (eg. snprintf) and generate the required compile definitions for the platform. The script also searches for the required libraries, if it fails to find the required libraries the respective executables won't be built.(eg. If libcurl is not found then git-remote-http won't be built). This will help building Git easier. With a CMake script an out of source build of git is possible resulting in a clean source tree. Note: this patch asks for the minimum version v3.14 of CMake (which is not all that old as of time of writing) because that is the first version to offer a platform-independent way to generate hardlinks as part of the build. This is needed to generate all those hardlinks for the built-in commands of Git. Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Place an instance of struct bloom_settings into the struct write_commit_graph_context. This allows simplifying the function prototype of write_graph_chunk_bloom_data(). This will allow us to combine the function prototypes and use function pointers to simplify write_commit_graph_file(). By using a pointer, we can later replace the settings to match those that exist in the current commit-graph, in case a future Git version allows customization of these parameters. Reported-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

43d3561 (commit-graph write: don't die if the existing graph is corrupt, 2019-03-25) introduced the GIT_TEST_COMMIT_GRAPH_DIE_ON_LOAD environment variable. This was created to verify that commit-graph was not loaded when writing a new non-incremental commit-graph. An upcoming change wants to load a commit-graph in some valuable cases, but we want to maintain that we don't trust the commit-graph data when writing our new file. Instead of dying on load, instead die if we ever try to parse a commit from the commit-graph. This functionally verifies the same intended behavior, but allows a more advanced feature in the next change. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

…ions Implement the placeholder substitution to generate scripted Porcelain commands, e.g. git-request-pull out of git-request-pull.sh Generate shell/perl/python scripts and template using CMake instead of using sed like the build procedure in the Makefile does. The text translations are only build if `msgfmt` is found in your path. NOTE: The scripts and templates are generated during configuration. Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Install the built binaries and scripts using CMake This is very similar to `make install`. By default the destination directory(DESTDIR) is /usr/local/ on Linux To set a custom installation path do this: cmake `relative-path-to-srcdir` -DCMAKE_INSTALL_PREFIX=`preferred-install-path` Then run `make install` Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

This patch provides an alternate way to test git using ctest. CTest ships with CMake, so there is no additional dependency being introduced. To perform the tests with ctest do this after building: ctest -j[number of jobs] NOTE: -j is optional, the default number of jobs is 1 Each of the jobs does this: cd t/ && sh t[something].sh The reason for using CTest is that it logs the output of the tests in a neat way, which can be helpful during diagnosis of failures. After the tests have run ctest generates three log files located in `build-directory`/Testing/Temporary/ These log files are: CTestCostData.txt: This file contains the time taken to complete each test. LastTestsFailed.log: This log file contains the names of the tests that have failed in the run. LastTest.log: This log file contains the log of all the tests that have run. A snippet of the file is given below. 10/901 Testing: D:/my/git-master/t/t0009-prio-queue.sh 10/901 Test: D:/my/git-master/t/t0009-prio-queue.sh Command: "sh.exe" "D:/my/git-master/t/t0009-prio-queue.sh" Directory: D:/my/git-master/t "D:/my/git-master/t/t0009-prio-queue.sh" Output: ---------------------------------------------------------- ok 1 - basic ordering ok 2 - mixed put and get ok 3 - notice empty queue ok 4 - stack order passed all 4 test(s) 1..4 <end of output> Test time = 1.11 sec NOTE: Testing only works when building in source for now. Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

This patch allows git to be tested when performin out of source builds. This involves changing GIT_BUILD_DIR in t/test-lib.sh to point to the build directory. Also some miscellaneous copies from the source directory to the build directory. The copies are: t/chainlint.sed needed by a bunch of test scripts po/is.po needed by t0204-gettext-rencode-sanity mergetools/tkdiff needed by t7800-difftool contrib/completion/git-prompt.sh needed by t9903-bash-prompt contrib/completion/git-completion.bash needed by t9902-completion contrib/svn-fe/svnrdump_sim.py needed by t9020-remote-svn NOTE: t/test-lib.sh is only modified when tests are run not during the build or configure. The trash directory is still srcdir/t Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

This patch facilitates building git on Windows with CMake using MinGW NOTE: The funtions unsetenv and hstrerror are not checked in Windows builds. Reasons NO_UNSETENV is not compatible with Windows builds. lines 262-264 compat/mingw.h compat/mingw.h(line 25) provides a definition of hstrerror which conflicts with the definition provided in git-compat-util.h(lines 733-736). To use CMake on Windows with MinGW do this: cmake `relative-path-to-srcdir` -G "MinGW Makefiles" Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

This patch adds support for Visual Studio and Clang builds The minimum required version of CMake is upgraded to 3.15 because this version offers proper support for Clang builds on Windows. Libintl is not searched for when building with Visual Studio or Clang because there is no binary compatible version available yet. NOTE: In the link options invalidcontinue.obj has to be included. The reason for this is because by default, Windows calls abort()'s instead of setting errno=EINVAL when invalid arguments are passed to standard functions. This commit explains it in detail: 4b623d8 On Windows the default generator is Visual Studio,so for Visual Studio builds do this: cmake `relative-path-to-srcdir` NOTE: Visual Studio generator is a multi config generator, which means that Debug and Release builds can be done on the same build directory. For Clang builds do this: On bash CC=clang cmake `relative-path-to-srcdir` -G Ninja -DCMAKE_BUILD_TYPE=[Debug or Release] On cmd set CC=Clang cmake `relative-path-to-srcdir` -G Ninja -DCMAKE_BUILD_TYPE=[Debug or Release] Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Teach .github/workflows/main.yml to use CMake for VS builds. Modified the vs-test step to match windows-test step. This speeds up the vs-test. Calling git-cmd from powershell and then calling git-bash to perform the tests slows things down(factor of about 6). So git-bash is directly called from powershell to perform the tests using prove. NOTE: Since GitHub keeps the same directory for each job (with respect to path) absolute paths are used in the bin-wrapper scripts. GitHub has switched to CMake 3.17.1 which changed the behaviour of FindCURL module. An extra definition (-DCURL_NO_CURL_CMAKE=ON) has been added to revert to the old behaviour. In the configuration phase CMake looks for the required libraries for building git (eg zlib,libiconv). So we extract the libraries before we configure. To check for ICONV_OMITS_BOM libiconv.dll needs to be in the working directory of script or path. So we copy the dlls before we configure. Signed-off-by: Sibi Siddharthan <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The get_bloom_filter() method is a bit complicated in some parts where it does not need to be. In particular, it needs to return a NULL filter only when compute_if_not_present is zero AND the filter data cannot be loaded from a commit-graph file. This currently happens by accident because the commit-graph does not load changed-path Bloom filters from an existing commit-graph when writing a new one. This will change in a later patch. Also clean up some style issues while we are here. One side-effect of returning a NULL filter is that the filters that are reported as "too large" will now be reported as NULL insead of length zero. This case was not properly covered before, so add a test. Further, remote the counting of the zero-length filters from revision.c and the trace2 logs. Helped-by: René Scharfe <[email protected]> Helped-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The changed-path Bloom filters were released in v2.27.0, but have a significant drawback. A user can opt-in to writing the changed-path filters using the "--changed-paths" option to "git commit-graph write" but the next write will drop the filters unless that option is specified. This becomes even more important when considering the interaction with gc.writeCommitGraph (on by default) or fetch.writeCommitGraph (part of features.experimental). These config options trigger commit-graph writes that the user did not signal, and hence there is no --changed-paths option available. Allow a user that opts-in to the changed-path filters to persist the property of "my commit-graph has changed-path filters" automatically. A user can drop filters using the --no-changed-paths option. In the process, we need to be extremely careful to match the Bloom filter settings as specified by the commit-graph. This will allow future versions of Git to customize these settings, and the version with this change will persist those settings as commit-graphs are rewritten on top. Use the trace2 API to signal the settings used during the write, and check that output in a test after manually adjusting the correct bytes in the commit-graph file. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

…ions Update the write_graph_chunk_*() helper functions to have the same signature: - Return an int error code from all these functions. write_graph_chunk_base() already has an int error code, now the others will have one, too, but since they don't indicate any error, they will always return 0. - Drop the hash size parameter of write_graph_chunk_oids() and write_graph_chunk_data(); its value can be read directly from 'the_hash_algo' inside these functions as well. This opens up the possibility for further cleanups and foolproofing in the following two patches. Helped-by: René Scharfe <[email protected]> Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

In write_commit_graph_file() we now have one block of code filling the array of 'struct chunk_info' with the IDs and sizes of chunks to be written, and an other block of code calling the functions responsible for writing individual chunks. In case of optional chunks like Extra Edge List an Base Graphs List there is also a condition checking whether that chunk is necessary/desired, and that same condition is repeated in both blocks of code. Other, newer chunks have similar optional conditions. Eliminate these repeated conditions by storing the function pointers responsible for writing individual chunks in the 'struct chunk_info' array as well, and calling them in a loop to write the commit-graph file. This will open up the possibility for a bit of foolproofing in the following patch. Helped-by: René Scharfe <[email protected]> Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

In my experience while experimenting with new commit-graph chunks, early versions of the corresponding new write_commit_graph_my_chunk() functions are, sadly but not surprisingly, often buggy, and write more or less data than they are supposed to, especially if the chunk size is not directly proportional to the number of commits. This then causes all kinds of issues when reading such a bogus commit-graph file, raising the question of whether the writing or the reading part happens to be buggy this time. Let's catch such issues early, already when writing the commit-graph file, and check that each write_graph_chunk_*() function wrote the amount of data that it was expected to, and what has been encoded in the Chunk Lookup table. Now that all commit-graph chunks are written in a loop we can do this check in a single place for all chunks, and any chunks added in the future will get checked as well. Helped-by: René Scharfe <[email protected]> Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Here, four spaces were used instead of tab characters. Reported-by: Taylor Blau <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

The prepare_to_use_bloom_filter() method was not intended to be called on an empty pathspec. However, 'git log -- .' and 'git log' are subtly different: the latter reports all commits while the former will simplify commits that do not change the root tree. This means that the path used to construct the bloom_key might be empty, and that value is not added to the Bloom filter during construction. That means that the results are likely incorrect! To resolve the issue, be careful about the length of the path and stop filling Bloom filters. To be completely sure we do not use them, drop the pointer to the bloom_filter_settings from the commit-graph. That allows our test to look at the trace2 logs to verify no Bloom filter statistics are reported. Signed-off-by: Taylor Blau <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

…ters The file 'dir/subdir/file' can only be modified if its leading directories 'dir' and 'dir/subdir' are modified as well. So when checking modified path Bloom filters looking for commits modifying a path with multiple path components, then check not only the full path in the Bloom filters, but all its leading directories as well. Take care to check these paths in "deepest first" order, because it's the full path that is least likely to be modified, and the Bloom filter queries can short circuit sooner. This can significantly reduce the average false positive rate, by about an order of magnitude or three(!), and can further speed up pathspec-limited revision walks. The table below compares the average false positive rate and runtime of git rev-list HEAD -- "$path" before and after this change for 5000+ randomly* selected paths from each repository: Average false Average Average positive rate runtime runtime before after before after difference ------------------------------------------------------------------ git 3.220% 0.7853% 0.0558s 0.0387s -30.6% linux 2.453% 0.0296% 0.1046s 0.0766s -26.8% tensorflow 2.536% 0.6977% 0.0594s 0.0420s -29.2% *Path selection was done with the following pipeline: git ls-tree -r --name-only HEAD | sort -R | head -n 5000 The improvements in runtime are much smaller than the improvements in average false positive rate, as we are clearly reaching diminishing returns here. However, all these timings depend on that accessing tree objects is reasonably fast (warm caches). If we had a partial clone and the tree objects had to be fetched from a promisor remote, e.g.: $ git clone --filter=tree:0 --bare file://.../webkit.git webkit.notrees.git $ git -C webkit.git -c core.modifiedPathBloomFilters=1 \ commit-graph write --reachable $ cp webkit.git/objects/info/commit-graph webkit.notrees.git/objects/info/ $ git -C webkit.notrees.git -c core.modifiedPathBloomFilters=1 \ rev-list HEAD -- "$path" then checking all leading path component can reduce the runtime from over an hour to a few seconds (and this is with the clone and the promisor on the same machine). This adjusts the tracing values in t4216-log-bloom.sh, which provides a concrete way to notice the improvement. Helped-by: Taylor Blau <[email protected]> Helped-by: René Scharfe <[email protected]> Signed-off-by: SZEDER Gábor <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Signed-off-by: Han-Wen Nienhuys <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

In a future patch, we plan on making the test_must_fail()-family of functions accept only git commands. Even though force_color() wraps an invocation of `env git`, test_must_fail() will not be able to figure this out since it will assume that force_color() is just some random function which is disallowed. Instead of using `env` in force_color() (which does not support shell functions), export the environment variables in a subshell. Write the invocation as `force_color test_must_fail git ...` since shell functions are now supported. Signed-off-by: Denton Liu <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

Stop when "sendmail.*" configuration variables are defined, which could be a mistaken attempt to define "sendemail.*" variables. * dd/send-email-config: git-send-email: die if sendmail.* config is set

Further preliminary change to refs API. * hn/reftable-prep-part-2: Make HEAD a PSEUDOREF rather than PER_WORKTREE. Modify pseudo refs through ref backend storage t1400: use git rev-parse for testing PSEUDOREF existence

The existing backends for "git mergetool" based on variants of vim have been refactored and then support for "nvim" has been added. * pd/mergetool-nvimdiff: mergetools: add support for nvimdiff (neovim) family mergetool--lib: improve support for vimdiff-style tool variants

A no-op replacement function implemented as a C preprocessor macro does not perform as good a job as one implemented as a "static inline" function in catching errors in parameters; replace the former with the latter in <git-compat-util.h> header. * jc/noop-with-static-inline: compat-util: type-check parameters of no-op replacement functions

Mark error message for i18n. * jk/sideband-error-l10n: sideband: mark "remote error:" prefix for translation

"git bisect" learns the "--first-parent" option to find the first breakage along the first-parent chain. * al/bisect-first-parent: bisect: combine args passed to find_bisection() bisect: introduce first-parent flag cmd_bisect__helper: defer parsing no-checkout flag rev-list: allow bisect and first-parent flags t6030: modernize "git bisect run" tests

Recent versions of "git diff-files" shows a diff between the index and the working tree for "intent-to-add" paths as a "new file" patch; "git apply --cached" should be able to take "git diff-files" and should act as an equivalent to "git add" for the path, but the command failed to do so for such a path. * rp/apply-cached-with-i-t-a: t4140: test apply with i-t-a paths apply: make i-t-a entries never match worktree apply: allow "new file" patches on i-t-a entries

Test framework update. * es/test-cmp-typocatcher: test_cmp: diagnose incorrect arguments

NULL dereference fix. * ma/stop-progress-null-fix: progress: don't dereference before checking for NULL

"git log --first-parent -p" showed patches only for single-parent commits on the first-parent chain; the "--first-parent" option has been made to imply "-m". Use "--no-diff-merges" to restore the previous behaviour to omit patches for merge commits. * jk/log-fp-implies-m: doc/git-log: clarify handling of merge commit diffs doc/git-log: move "-t" into diff-options list doc/git-log: drop "-r" diff option doc/git-log: move "Diff Formatting" from rev-list-options log: enable "-m" automatically with "--first-parent" revision: add "--no-diff-merges" option to counteract "-m" log: drop "--cc implies -m" logic

Earlier, to countermand the implicit "-m" option when the "--first-parent" option is used with "git log", we added the "--[no-]diff-merges" option in the jk/log-fp-implies-m topic. To leave the door open to allow the "--diff-merges" option to take values that instructs how patches for merge commits should be computed (e.g. "cc"? "-p against first parent?"), redefine "--diff-merges" to take non-optional value, and implement "off" that means the same thing as "--no-diff-merges". * so/log-diff-merges-opt: t/t4013: add test for --diff-merges=off doc/git-log: describe --diff-merges=off revision: change "--diff-merges" option to require parameter

Signed-off-by: Junio C Hamano <[email protected]>

If you run fetch but record the result in remote-tracking branches, and either if you do nothing with the fetched refs (e.g. you are merely mirroring) or if you always work from the remote-tracking refs (e.g. you fetch and then merge origin/branchname separately), you can get away with having no FETCH_HEAD at all. Teach "git fetch" a command line option "--[no-]write-fetch-head". The default is to write FETCH_HEAD, and the option is primarily meant to be used with the "--no-" prefix to override this default, because there is no matching fetch.writeFetchHEAD configuration variable to flip the default to off (in which case, the positive form may become necessary to defeat it). Note that under "--dry-run" mode, FETCH_HEAD is never written; otherwise you'd see list of objects in the file that you do not actually have. Passing `--write-fetch-head` does not force `git fetch` to write the file. Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

dscho · 2020-09-30T20:48:57Z

This test failure looks familiar:

++ test_cmp large1 another
++ test 2 -eq 2
++ eval 'test-tool cmp' '"$@"'
+++ test-tool cmp large1 another
fatal: attempting to allocate 1814542 over limit 1536000
++ test xlarge1 = x-
++ test -e large1
++ test xanother = x-
++ test -e another
++ return 1
error: last command exited with $?=1
not ok 11 - checkout a large file
#

It is precisely what I tried to fix in 5557d92#diff-36a8ca43965bbcd41dbec77b366077e8L908:

git/t/test-lib-functions.sh

Lines 907 to 909 in 5557d92

    
           test_cmp() { 
        
           	GIT_ALLOC_LIMIT=0 eval "$GIT_TEST_CMP" '"$@"' 
        
           }

But I see that that got lost somewhere along the lines:

git/t/test-lib-functions.sh

Lines 954 to 962 in 5f981e4

    
           test_cmp() { 
        
           	test $# -eq 2 || BUG "test_cmp requires two arguments" 
        
           	if ! eval "$GIT_TEST_CMP" '"$@"' 
        
           	then 
        
           		test "x$1" = x- || test -e "$1" || BUG "test_cmp '$1' missing" 
        
           		test "x$2" = x- || test -e "$2" || BUG "test_cmp '$2' missing" 
        
           		return 1 
        
           	fi 
        
           }

Signed-off-by: Derrick Stolee <[email protected]>

derrickstolee · 2020-09-30T20:51:35Z

But I see that that got lost somewhere along the lines:

Yeah, this was a merge conflict. I've tried to update it again and see if that works. It doesn't help that this doesn't trigger on Linux ;)

dscho · 2020-09-30T22:01:08Z

Seems to have worked!

This includes all changes from #292, but then also `ds/maintenance-part-3` from upstream. This is _not_ the final maintenance builtin, but is very close. It's time to start making full updates in Scalar that depend on them.

szeder and others added 30 commits June 8, 2020 12:28

revision.c: fix whitespace

dc8e95b

Here, four spaces were used instead of tab characters. Reported-by: Taylor Blau <[email protected]> Signed-off-by: Derrick Stolee <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

lib-t6000.sh: write tag using git-update-ref

9e35a6a

Signed-off-by: Han-Wen Nienhuys <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>

gitster added 13 commits August 17, 2020 17:02

Merge branch 'dd/send-email-config'

a00bda2

Stop when "sendmail.*" configuration variables are defined, which could be a mistaken attempt to define "sendemail.*" variables. * dd/send-email-config: git-send-email: die if sendmail.* config is set

Merge branch 'hn/reftable-prep-part-2'

95c687b

Further preliminary change to refs API. * hn/reftable-prep-part-2: Make HEAD a PSEUDOREF rather than PER_WORKTREE. Modify pseudo refs through ref backend storage t1400: use git rev-parse for testing PSEUDOREF existence

Merge branch 'jk/sideband-error-l10n'

789279e

Mark error message for i18n. * jk/sideband-error-l10n: sideband: mark "remote error:" prefix for translation

Merge branch 'es/test-cmp-typocatcher'

07f14d3

Test framework update. * es/test-cmp-typocatcher: test_cmp: diagnose incorrect arguments

Merge branch 'ma/stop-progress-null-fix'

e6ec620

NULL dereference fix. * ma/stop-progress-null-fix: progress: don't dereference before checking for NULL

Eighth batch

2befe97

Signed-off-by: Junio C Hamano <[email protected]>

derrickstolee force-pushed the maintenance/vfs-base branch from fd22bff to 13255e0 Compare September 30, 2020 19:30

This was referenced Sep 30, 2020

Merge in current ds/maintenance-part-3 #293

Merged

Update Git to include maintenance dependencies microsoft/scalar#442

Closed

Merge branch 'maintenance/base' into HEAD

e88fb40

Signed-off-by: Derrick Stolee <[email protected]>

derrickstolee force-pushed the maintenance/vfs-base branch from 13255e0 to e88fb40 Compare September 30, 2020 20:50

dscho approved these changes Oct 1, 2020

View reviewed changes

derrickstolee merged commit 467bbe9 into microsoft:vfs-2.28.0 Oct 1, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Bring in dependencies of ds/maintenance-part-3 #292

Bring in dependencies of ds/maintenance-part-3 #292

Uh oh!

derrickstolee commented Sep 30, 2020

Uh oh!

dscho commented Sep 30, 2020

Uh oh!

derrickstolee commented Sep 30, 2020

Uh oh!

dscho commented Sep 30, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

31 participants

Bring in dependencies of ds/maintenance-part-3 #292

Bring in dependencies of ds/maintenance-part-3 #292

Uh oh!

Conversation

derrickstolee commented Sep 30, 2020

Uh oh!

dscho commented Sep 30, 2020

Uh oh!

derrickstolee commented Sep 30, 2020

Uh oh!

dscho commented Sep 30, 2020

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

31 participants