The issue is that in `chunks.rs`, `sep_iter` is relative to `search_start`.
But the returned value needs to be absolute position relative to the `buffer`.
sort -m takes: 3 lines, 96003 bytes
and emits: 4 lines, 96004 bytes
The output line lengths before the fix are:
```
a x 32000
b x 23809
b x 8191
c x 32000
```
So it splits one of the lines into two (23809 + 8191 = 32000).
In addition, the output becomes unsorted because the shorter 'b' fragment sorts before the longer 'b' fragment.
The issue is that in `chunks.rs`, `sep_iter` is relative to `search_start`. But the returned value needs to be an absolute position relative to the `buffer`.
We end up with these particular numbers because
- in merge.rs, initial chunk is created as `RecycledChunk::new(8 * 1024)` (8192 bytes)
- `search_start = 8192`; newline is at absolute buffer index `32000`
- `memchr_iter` returns `32000 - 8192 = 23808`, and including the newline adds 1 byte, giving a first fragment of `23809` bytes
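
The arithmetic above can be reproduced with a minimal sketch. The helper names `line_end_buggy` and `line_end_fixed` are hypothetical and not taken from `chunks.rs`, and the search uses `Iterator::position` from the standard library in place of `memchr_iter`; only the offset handling is meant to mirror the bug.

```rust
/// Hypothetical helper (not the code in `chunks.rs`): returns the index one
/// past the first separator found at or after `search_start`.
/// Bug: the position is relative to `search_start`, not to `buffer`.
fn line_end_buggy(buffer: &[u8], search_start: usize, sep: u8) -> Option<usize> {
    buffer[search_start..]
        .iter()
        .position(|&b| b == sep)
        // `pos` counts from `search_start`, so the result is too small.
        .map(|pos| pos + 1)
}

/// Same search, but the relative position is converted back to an absolute
/// index into `buffer` by adding `search_start`.
fn line_end_fixed(buffer: &[u8], search_start: usize, sep: u8) -> Option<usize> {
    buffer[search_start..]
        .iter()
        .position(|&b| b == sep)
        .map(|pos| search_start + pos + 1)
}
```

With a 32000-byte line followed by a newline and `search_start = 8192`, the buggy variant reports `23809` while the fixed variant reports `32001`, which matches the split seen in the output.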
Use splice_unbounded_auto instead of splice_unbounded_broker to avoid
creating a broker pipe when stdout is already a pipe (e.g., when piping
to another command with |).
Fix doc comments in safe_traversal that still described the old behavior
of replacing symlinks with real directories. Rename and deduplicate tests
that were originally written as race-condition regression tests but now
just verify symlink-following behavior.
Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
All comment / doc-comment changes.
- src/uucore/src/lib/features/checksum/validate.rs L471: "must set it the the detected" -> "must set it to the detected" (also adds the missing "to")
- src/uucore/src/lib/features/uptime.rs L416: "logged in in a human-readable format" -> "logged in, in a human-readable format"
- src/uu/od/src/output_info.rs L115: "the width is width of the the type" -> "the width is the width of the type"
- src/uu/split/src/filenames.rs L113: "Determine if the the output file names" -> "Determine if the output file names"
- src/uu/tail/src/paths.rs L228: "if `path` is is a file type" -> "if `path` is a file type"
- src/uu/chcon/src/chcon.rs L527: "then is is assumed to be valid" -> "then it is assumed to be valid"
## Summary
Fixes #11469
`install -D` was replacing pre-existing symlinks in the destination path with real directories instead of following them. This broke any workflow where part of the install prefix is a symlink, including BOSH deployments, Homebrew, Nix, stow, and any `make install` targeting a symlinked prefix.
**Reproduction (from the issue):**
```sh
mkdir -p /tmp/target
ln -s /tmp/target /tmp/link
echo hello > /tmp/file.txt
install -D -m 644 /tmp/file.txt /tmp/link/subdir/file.txt
# GNU coreutils 8.32: /tmp/link stays a symlink, file lands in /tmp/target/subdir/file.txt
# uutils 0.7.0: /tmp/link is replaced with a real directory — wrong
```
## Root cause
PR #10140 introduced `create_dir_all_safe()` in `safe_traversal.rs` to prevent TOCTOU symlink race conditions. The fix was correct in intent but too aggressive: `open_or_create_subdir()` unconditionally unlinked and recreated any symlink it encountered, including pre-existing legitimate ones.
## Changes
**`src/uucore/src/lib/features/safe_traversal.rs`**
- `open_or_create_subdir`: when `stat_at` returns `S_IFLNK`, call `open_subdir(Follow)` instead of `unlink_at + mkdir_at`. The `O_DIRECTORY` flag already in `open_subdir` means dangling or non-directory symlinks still return an error cleanly.
- `find_existing_ancestor`: switch from `fs::symlink_metadata` to `fs::metadata` so that a symlink-to-directory is recognised as an existing ancestor rather than a component to recreate (this was already the stated intent in the function's doc comment).
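
The difference between the two metadata calls can be illustrated with a small sketch. The function names below are hypothetical and this is not the code in `safe_traversal.rs`; it only shows how `fs::symlink_metadata` and `fs::metadata` classify a symlink that points at a directory.

```rust
use std::fs;
use std::path::Path;

/// Hypothetical check: inspects the path itself without following symlinks,
/// so a symlink to a directory is NOT reported as a directory.
fn is_dir_without_following(path: &Path) -> bool {
    fs::symlink_metadata(path)
        .map(|m| m.is_dir())
        .unwrap_or(false)
}

/// Hypothetical check: follows symlinks, so a symlink to a directory is
/// reported as a directory. A dangling symlink yields an error, hence false.
fn is_dir_following(path: &Path) -> bool {
    fs::metadata(path).map(|m| m.is_dir()).unwrap_or(false)
}
```

For a layout like `/tmp/link -> /tmp/target`, the first check returns `false` for `/tmp/link` and the second returns `true`, which is why the non-following variant treated the symlink as a component to recreate.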
**`src/uu/install/src/install.rs`**
- Align the `dir_exists` check and the `DirFd::open` call to also follow symlinks, consistent with the above.
**`tests/by-util/test_install.rs`**
- Update the two tests added by #10140 — they were asserting the buggy behavior (symlink replaced). Flip the assertions to document the correct GNU behavior.
- Add `test_install_d_follows_symlink_prefix` as a direct regression test for the issue's reproduction case.
## TOCTOU / security note
The true TOCTOU race (a symlink *injected during the operation* into a not-yet-existing path component) is still blocked: `mkdirat` fails with `EEXIST` if an attacker creates a symlink between `stat_at` returning `ENOENT` and our `mkdir_at`. Newly-created directories are still opened with `O_NOFOLLOW`.
What changes is that *pre-existing* symlinks are now followed — which is exactly what GNU coreutils 8.32 does. The previous behavior was stricter than GNU in this regard.
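
The `EEXIST` behavior described above can be sketched with the standard library, assuming a Unix target where `fs::create_dir` maps to `mkdir(2)`. The helper `create_dir_exclusive` is a hypothetical name for illustration and does not use `mkdirat` or the `DirFd` API from `safe_traversal.rs`.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: tries to create `path` as a new directory.
/// Returns Ok(true) if the directory was created, and Ok(false) if the name
/// is already taken by anything, including a symlink (dangling or not),
/// because mkdir does not follow a symlink in the final path component.
fn create_dir_exclusive(path: &Path) -> io::Result<bool> {
    match fs::create_dir(path) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::AlreadyExists => Ok(false),
        Err(e) => Err(e),
    }
}
```

If a symlink appears at the target name before the call, the creation fails instead of writing through the link, and the symlink itself is left untouched.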
rustix::fs::Mode::from_bits_truncated accepts u32 on Linux and u16 on
macOS. Casting to rustix::fs::RawMode makes the value compatible on
either platform.
Workflow for permission setting and ACLs failed in several scenarios,
most notably when passing -p. Parent directories in the mkdir call would
not appropriately set ACLs and could end up with more open permissions.
Generally, there was a misunderstanding that GNU coreutils was setting
umask (0) and that was the default -- the real flow was using a shaped
umask that takes current umask and ensures that the user has the ability
to execute mkdir commands through the tree. The umask (0) call was part
of a read setup for the equivalent of our UmaskGuard. New workflow
focuses on safe defaults, shaped umask, and allowing the kernel to
apply ACLs. Adds a test specifically to guard against regression,
ensuring a more restrictive ACL is respected with mkdir -p.
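
A minimal sketch of the shaped-umask arithmetic, assuming the bits being cleared are user write and user execute (0o300) so that the owner can always create and traverse intermediate directories. The function names are hypothetical and this does not show the actual UmaskGuard code.

```rust
/// Hypothetical helper: derive the umask used while creating parent
/// directories from the caller's current umask. Assumption: only the
/// user-write and user-execute bits (0o300) are cleared.
fn shaped_umask(current_umask: u32) -> u32 {
    current_umask & !0o300
}

/// Mode a parent directory ends up with when mkdir is asked for 0o777
/// under the shaped umask.
fn parent_mode(current_umask: u32) -> u32 {
    0o777 & !shaped_umask(current_umask)
}
```

For example, a umask of 0o022 yields 0o755 for parents, and a restrictive umask of 0o277 still yields 0o700, so the tree can be built without opening permissions for group or others.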
Replace the `has_alt_access` boolean (which widened the link-count column
to reserve space for the ACL/context indicator) with a `permissions`
field on `PaddingCollection`, so each column owns its own width.
Also make ACL detection in the padding-calculation pass use `item.path()`
to match the render path (previously used `item.display_name()`).
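
A simplified sketch of the idea that each column owns its own width. The struct below is a hypothetical stand-in with only two fields, and `Entry` and `compute_padding` are illustrative names; the real `PaddingCollection` has more columns and different inputs.

```rust
/// Hypothetical, reduced stand-in for `PaddingCollection`: one width per column.
#[derive(Debug, Default, PartialEq)]
struct PaddingCollection {
    permissions: usize,
    link_count: usize,
}

/// Hypothetical per-entry data: the rendered permission string, which may
/// carry a trailing ACL/context indicator such as '+', and the link count.
struct Entry<'a> {
    permissions: &'a str,
    link_count: u64,
}

/// Each column's width is the maximum rendered width seen in that column,
/// so the indicator widens `permissions` rather than `link_count`.
fn compute_padding(entries: &[Entry<'_>]) -> PaddingCollection {
    let mut padding = PaddingCollection::default();
    for entry in entries {
        padding.permissions = padding.permissions.max(entry.permissions.len());
        padding.link_count = padding
            .link_count
            .max(entry.link_count.to_string().len());
    }
    padding
}
```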
Refs #11790
- remove note about pre-commit hook being broken on macOS
- update clippy command to include `--workspace`, matching `pre-commit`
- remove outdated paragraph about `msrv` in `clippy.toml`