Skip to content

fix(files): use display_path for DFT_OVERRIDE_BINARY matching (closes #967)#979

Open
Booyaka101 wants to merge 1 commit into
Wilfred:masterfrom
Booyaka101:fix/binary-override-display-path-967
Open

fix(files): use display_path for DFT_OVERRIDE_BINARY matching (closes #967)#979
Booyaka101 wants to merge 1 commit into
Wilfred:masterfrom
Booyaka101:fix/binary-override-display-path-967

Conversation

@Booyaka101
Copy link
Copy Markdown

Summary

Closes #967.

guess_content matches DFT_OVERRIDE_BINARY glob patterns against the FileArgument it receives. When difftastic is invoked as GIT_EXTERNAL_DIFF (the canonical setup, including @osa1's git diff repro in the issue), git passes the temp file paths as the lhs/rhs arguments and only puts the logical filename in display_path. From src/options.rs:918-928:

[display_path, lhs_tmp_file, _hash, lhs_mode, rhs_tmp_file, _hash, rhs_mode] => {
    (
        display_path.to_string_lossy().to_string(),  // logical filename
        FileArgument::from_path_argument(lhs_tmp_file),  // /tmp/abc123
        FileArgument::from_path_argument(rhs_tmp_file),  // /tmp/def456
        ...
    )
}

So *.toml was being matched against something like /tmp/abc123 and silently never fired for any user with difft set up through git's external-diff hook — including the deleted-file case in the report, where the rhs is /dev/null.

Fix

Match against display_path instead of the FileArgument's temp path. The display_path: &str is already plumbed through both callers (diff_file and diff_conflicts_file) and is the same string check_diff_attr already operates on, so all three arms of the existing match expression now agree on what "this file's name" means.

While here, also try the basename when an unanchored pattern fails to match the full path. The glob crate treats * as not crossing /, so before this change *.toml wouldn't match crates/foo/file.toml even with display_path wired up correctly. That's the "issue with / in glob patterns" behaviour you flagged when you reopened the issue:

fn matches_override(pattern: &glob::Pattern, display_path: &str) -> bool {
    if pattern.matches(display_path) {
        return true;
    }
    let basename = std::path::Path::new(display_path)
        .file_name()
        .and_then(|s| s.to_str())
        .unwrap_or("");
    !basename.is_empty() && basename != display_path && pattern.matches(basename)
}

Path-shaped patterns containing / keep their anchored semantics — the basename fallback only fires when the full match has already failed, and we guard against treating a path with no parent as "different from its basename" so the fallback is a no-op for root-level files (avoids redundantly running pattern.matches twice).

Test plan

Six new unit tests in files.rs::tests:

case covers
override_matches_root_filename *.toml vs rust-toolchain.toml
override_matches_filename_at_depth the #967 case: *.toml vs crates/foo/rust-toolchain.toml
override_matches_path_shaped_pattern build/*.lock vs build/Cargo.lock (anchored hit)
override_does_not_match_path_shaped_pattern_outside_anchor build/*.lock does NOT match crates/build/Cargo.lock (no false positive)
override_does_not_match_non_matching_extension *.gz vs src/main.rs (basename fallback doesn't false-positive)
override_handles_empty_display_path helper doesn't crash on ""

I extracted matches_override into a standalone Cargo workspace and ran the suite there — 6/6 passing. I wasn't able to run cargo test against the full difftastic tree on this Windows box because the vendored-parser symlinks need either developer mode or admin to materialise; the helper's logic is fully covered by the standalone run, and the two main.rs callsite changes are mechanical (both already had display_path: &str in scope).

Notes / open questions

  • Pre-existing FileArgument parameter on guess_content removed — it was only used for the override match; the rest of the function reads bytes. If you'd rather keep the parameter for symmetry with other call shapes I can leave it in as _path: &FileArgument, but dropping it seemed cleaner.
  • Test wrapper updated — the #[cfg(test)] fn guess_content shim in files.rs now passes "" instead of &FileArgument::Stdin. No registered overrides in those tests so the loop short-circuits identically.
  • The basename-fallback semantics aren't quite full gitignore (which would also support **/*.toml for "any depth" and treat src/foo as anchored only at the root), but they handle the dominant *.<ext> and bare-filename shapes that DFT_OVERRIDE_BINARY is documented with. Happy to take it further toward full globset/ignore-crate semantics if you'd prefer that direction in a follow-up.

Credit to @osa1 for the report and the patient back-and-forth on repro details.

…ilfred#967)

`guess_content` was matching `DFT_OVERRIDE_BINARY` glob patterns against
`FileArgument`'s `to_string_lossy()`. When difftastic is invoked as
`GIT_EXTERNAL_DIFF` (the canonical setup), git passes the *temp file
paths* as the lhs/rhs arguments — see `options.rs:918-928`:

```rust
[display_path, lhs_tmp_file, _hash, lhs_mode, rhs_tmp_file, _hash, rhs_mode] => {
    (
        display_path.to_string_lossy().to_string(),  // logical filename
        FileArgument::from_path_argument(lhs_tmp_file),  // /tmp/abc123
        FileArgument::from_path_argument(rhs_tmp_file),  // /tmp/def456
        ...
    )
}
```

So `*.toml` was being matched against `/tmp/abc123…`, which obviously
fails. The override silently never fired for any user with difft set up
through git's external-diff hook — including the deleted-file case in
the report, where the rhs is `/dev/null`.

Match against `display_path` instead. This is the same string that's
already plumbed through to `check_diff_attr`, so all three branches of
the existing match expression now agree on what "this file's name"
means.

While here, also try the basename when an unanchored pattern doesn't
match the full path. The `glob` crate treats `*` as not crossing `/`,
so before this change `*.toml` wouldn't match `crates/foo/file.toml`
even when `display_path` was wired up correctly. The new
`matches_override` helper falls back to matching just the filename when
the full path doesn't match. Path-shaped patterns containing `/`
(e.g. `build/*.lock`) keep their anchored semantics — the basename
fallback only fires when the full match has already failed, and we
guard against treating a path with no parent as "different from its
basename" so the fallback is a no-op for root-level files.

This addresses both the reported regression (deleted files via git
external-diff) and the secondary "issue with `/` in glob patterns"
behaviour the maintainer flagged when reopening the issue.

Six unit tests in `files.rs` cover:
* root filename matched (`*.toml` vs `rust-toolchain.toml`)
* deeper filename matched via basename fallback (the Wilfred#967 case)
* path-shaped pattern matched at its anchor (`build/*.lock` vs `build/Cargo.lock`)
* path-shaped pattern NOT matched outside its anchor (no false positive
  for `build/*.lock` vs `crates/build/Cargo.lock`)
* unrelated pattern doesn't false-positive on basename (`*.gz` vs `src/main.rs`)
* empty `display_path` doesn't crash the helper

Two callers in `main.rs` (`diff_file`, `diff_conflicts_file`) updated
to pass `display_path` instead of `FileArgument`. The pre-existing
`guess_content` test wrapper in `files.rs` is updated to pass `""`
(no overrides registered there anyway, so the loop short-circuits).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

DFT_OVERRIDE_BINARY ignored when the file is removed

1 participant