-
Notifications
You must be signed in to change notification settings - Fork 105
sparse-checkout: take en/sparse-checkout early #267
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sparse-checkout: take en/sparse-checkout early #267
Conversation
Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
commit 08402b0 ("merge-recursive: distinguish "removed" and "overwritten" messages", 2010-08-11) split ERROR_WOULD_LOSE_UNTRACKED into both ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN ERROR_WOULD_LOSE_UNTRACKED_REMOVED and also split ERROR_WOULD_LOSE_ORPHANED into both ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN ERROR_WOULD_LOSE_ORPHANED_REMOVED However, despite the split only three of these four types were used. ERROR_WOULD_LOSE_ORPHANED_REMOVED was not put into use when it was introduced and nothing else has used it in the intervening decade either. Remove it. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
verify_absent_sparse() was introduced in commit 08402b0 ("merge-recursive: distinguish "removed" and "overwritten" messages", 2010-08-11), and has always had exactly one caller which always passes error_type == ERROR_WOULD_LOSE_UNTRACKED_OVERWRITTEN. This function then checks whether error_type is this value, and if so, sets it instead to ERROR_WOULD_LOSE_ORPHANED_OVERWRITTEN. It has been nearly a decade and no other caller has been created, and no other value has ever been passed, so just pass the expected value to begin with. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
commit e091228 ("sparse-checkout: update working directory in-process", 2019-11-21) allowed passing a pre-defined set of patterns to unpack_trees(). However, if o->pl was NULL, it would still read the existing patterns and use those. If those patterns were read into a data structure that was allocated, naturally they needed to be free'd. However, despite the same function being responsible for knowing about both the allocation and the free'ing, the logic for tracking whether to free the pattern_list was hoisted to an outer function with an additional flag in unpack_trees_options. Put the logic back in the relevant function and discard the now unnecessary flag. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
check_updates() previously assumed it was working on o->result. We want to use this function in combination with a different index_state, so take the intended index_state as a parameter. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
If a path is dirty, removing from the working tree risks losing data. As such, we want to make sure any such path is not marked with SKIP_WORKTREE. While the current callers of this code detect this case and re-populate with a previous set of sparsity patterns, we want to allow some paths to be marked with SKIP_WORKTREE while others are left unmarked without it being considered an error. The reason this shouldn't be considered an error is that SKIP_WORKTREE has always been an advisory-only setting; merge and rebase for example were free to materialize paths and clear the SKIP_WORKTREE bit in order to accomplish their work even though they kept the SKIP_WORKTREE bit set for other paths. Leaving dirty working files in the working tree is thus a natural extension of what we have already been doing. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
Create a populate_from_existing_patterns() function for reading the path_patterns from $GIT_DIR/info/sparse-checkout so that we can re-use it elsewhere. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
Previously, the only way to update the SKIP_WORKTREE bits for various
paths was invoking `git read-tree -mu HEAD` or calling the same code
that this codepath invoked. This however had a number of problems if
the index or working directory were not clean. First, let's consider
the case:
Flipping SKIP_WORKTREE -> !SKIP_WORKTREE (materializing files)
If the working tree was clean this was fine, but if there were files or
directories or symlinks or whatever already present at the given path
then the operation would abort with an error. Let's label this case
for later discussion:
A) There is an untracked path in the way
Now let's consider the opposite case:
Flipping !SKIP_WORKTREE -> SKIP_WORKTREE (removing files)
If the index and working tree was clean this was fine, but if there were
any unclean paths we would run into problems. There are three different
cases to consider:
B) The path is unmerged
C) The path has unstaged changes
D) The path has staged changes (differs from HEAD)
If any path fell into case B or C, then the whole operation would be
aborted with an error. With sparse-checkout, the whole operation would
be aborted for case D as well, but for its predecessor of using `git
read-tree -mu HEAD` directly, any paths that fell into case D would be
removed from the working copy and the index entry for that path would be
reset to match HEAD -- which looks and feels like data loss to users
(only a few are even aware to ask whether it can be recovered, and even
then it requires walking through loose objects trying to match up the
right ones).
Refusing to remove files that have unsaved user changes is good, but
refusing to work on any other paths is very problematic for users. If
the user is in the middle of a rebase or has made modifications to files
that bring in more dependencies, then for their build to work they need
to update the sparse paths. This logic has been preventing them from
doing so. Sometimes in response, the user will stage the files and
re-try, to no avail with sparse-checkout or to the horror of losing
their changes if they are using its predecessor of `git read-tree -mu
HEAD`.
Add a new update_sparsity() function which will not error out in any of
these cases but behaves as follows for the special cases:
A) Leave the file in the working copy alone, clear the SKIP_WORKTREE
bit, and print a warning (thus leaving the path in a state where
status will report the file as modified, which seems logical).
B) Do NOT mark this path as SKIP_WORKTREE, and leave it as unmerged.
C) Do NOT mark this path as SKIP_WORKTREE and print a warning about
the dirty path.
D) Mark the path as SKIP_WORKTREE, but do not revert the version
stored in the index to match HEAD; leave the contents alone.
I tried a different behavior for A (leave the SKIP_WORKTREE bit set),
but found it very surprising and counter-intuitive (e.g. the user sees
it is present along with all the other files in that directory, tries to
stage it, but git add ignores it since the SKIP_WORKTREE bit is set). A
& C seem like optimal behavior to me. B may be as well, though I wonder
if printing a warning would be an improvement. Some might be slightly
surprised by D at first, but given that it does the right thing with
`git commit` and even `git commit -a` (`git add` ignores entries that
are marked SKIP_WORKTREE and thus doesn't delete them, and `commit -a`
is similar), it seems logical to me.
Reviewed-by: Derrick Stolee <[email protected]>
Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
Remove the equivalent of 'git read-tree -mu HEAD' in the sparse-checkout codepaths for setting the SKIP_WORKTREE bits and instead use the new update_sparsity() function. Note that when an issue is hit, the error message splits 'error' and 'Cannot update sparse checkout' on separate lines. For now, we use two greps to find both pieces of the error message but subsequent commits will clean up the messages reported to the user. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
setup_unpack_trees_porcelain() provides much improved error/warning
messages; instead of a message that assumes that there is only one path
with a given problem despite being used by code that intentionally is
grouping and showing errors together, it uses a message designed to be
used with groups of paths. For example, this transforms
error: Entry ' folder1/a
folder2/a
' not uptodate. Cannot update sparse checkout.
into
error: Cannot update sparse checkout: the following entries are not up to date:
folder1/a
folder2/a
In the past the suboptimal messages were never actually triggered
because we would error out if the working directory wasn't clean before
we even called unpack_trees(). The previous commit changed that,
though, so let's use the better error messages.
Reviewed-by: Derrick Stolee <[email protected]>
Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
A minor change, but we want to convert the sparse messages to warnings and this allows us to group warnings and errors. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
We want to treat issues with setting the SKIP_WORKTREE bit as a warning rather than an error; rename the enum values to reflect this intent as a simple step towards that goal. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
display_error_msgs() is never called to show messages of both ERROR_* and WARNING_* types at the same time; it is instead called multiple times, separately for each type. Since we want to display these types differently, make two slightly different versions of this function. A subsequent commit will further modify unpack_trees() and how it calls the new display_warning_msgs(). Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
The messages for problems with sparse paths are phrased as errors that cause the operation to abort, even though we are not making the operation abort. Reword the messages to make sense in their new context. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
When sparse-checkout runs to update the list of sparsity patterns, it gives warnings if it can't remove paths from the working tree because those files have dirty changes. Add a similar warning for unmerged paths as well. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
Setting and clearing of the SKIP_WORKTREE bit is not only done when
users run 'sparse-checkout'; other commands such as 'checkout' also run
through unpack_trees() which has logic for handling this special bit.
As such, we need to consider how they handle special cases. A couple
comparison points should help explain the rationale for changing how
unpack_trees() handles these bits:
Ignoring sparse checkouts for a moment, if you are switching
branches and have dirty changes, it is only considered an error that
will prevent the branch switching from being successful if the dirty
file happens to be one of the paths with different contents.
SKIP_WORKTREE has always been considered advisory; for example, if
rebase or merge need or even want to materialize a path as part of
their work, they have always been allowed to do so regardless of the
SKIP_WORKTREE setting. This has been used for unmerged paths, but
it was often used for paths it wasn't needed just because it made
the code simpler. It was a best-effort consideration, and when it
materialized paths contrary to the SKIP_WORKTREE setting, it was
never required to even print a warning message.
In the past if you trying to run e.g. 'git checkout' and:
1) you had a path that was materialized and had some dirty changes
2) the path was listed in $GITDIR/info/sparse-checkout
3) this path did not different between the current and target branches
then despite the comparison points above, the inability to set
SKIP_WORKTREE was treated as a *hard* error that would abort the
checkout operation. This is completely inconsistent with how
SKIP_WORKTREE is handled elsewhere, and rather annoying for users as
leaving the paths materialized in the working copy (with a simple
warning) should present no problem at all.
Downgrade any errors from inability to toggle the SKIP_WORKTREE bit to a
warning and allow the operations to continue.
Reviewed-by: Derrick Stolee <[email protected]>
Signed-off-by: Elijah Newren <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
If commands like merge or rebase materialize files as part of their work, or a previous sparse-checkout command failed to update individual files due to dirty changes, users may want a command to simply 'reapply' the sparsity rules. Provide one. Reviewed-by: Derrick Stolee <[email protected]> Signed-off-by: Elijah Newren <[email protected]> Signed-off-by: Junio C Hamano <[email protected]>
builtin/sparse-checkout.c
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(In case anyone is paying attention, this commit is actually a revert of the commit Revert "sparse-checkout: check for dirty status" which will get obliterated in the next rebase with this fixup.)
jeffhostetler
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is a rubber stamp ok here?
dscho
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was about to use range-diff to make sure that the patches did not change dramatically, but then I realized that you merged the commits verbatim from Junio's branch 😄
The loop in warn_conflicted_path() that checks for the count of entries with the same path uses "i+count" for the array entry. However, the loop only verifies that the value of count is below the array size. Fix this by adding i to the condition. I hit this condition during a test of the in-tree sparse-checkout feature, so it is exercised by the end of the series. Signed-off-by: Derrick Stolee <[email protected]> [jc: readability fix] Signed-off-by: Junio C Hamano <[email protected]>
Thank you both for the approvals. I need to figure out the errors this introduces in the Scalar and VFS for Git test suites. So far, it seems like it's a test bug in microsoft/scalar#374. Perhaps the same bug is at fault for microsoft/VFSForGit#1661, but I think it's more likely to be a merge issue that I need to resolve here. |
175cff7 to
ef29841
Compare
ef29841 to
f2e991d
Compare
Signed-off-by: Derrick Stolee <[email protected]>
Includes en/sparse-checkout with a small fix on top. Signed-off-by: Derrick Stolee <[email protected]>
f2e991d to
2dbaa83
Compare
Also fixes a bug when running `scalar clone` followed by `git sparse-checkout disable`. See microsoft/git#267.
These commits are in
nextand are valuable to us. In particular, it does the right thing when presented with a dirty status.Resolves #234.