
fix(cli): preserve symlinks during sandbox upload #1561

Open
mjamiv wants to merge 1 commit into NVIDIA:main from mjamiv:fix/upload-preserve-symlinks

Contributor

@mjamiv mjamiv commented May 25, 2026

Summary

  • build sandbox upload archives with symlink metadata instead of following links
  • preserve symlink entries for whole-directory and git-style file-list uploads
  • keep existing directory archive shape and file-list missing-entry behavior
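
To illustrate the first two points, here is a minimal std-only sketch of walking a directory without following links. It is not the PR's code: `EntryKind` and `collect_entries` are hypothetical names, and the real change writes archive entries rather than returning a list. The idea it shows is that `symlink_metadata` plus `read_link` lets a symlinked directory be recorded as a link instead of being descended into.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical classification of one upload entry.
#[derive(Debug, PartialEq)]
enum EntryKind {
    File,
    Dir,
    /// Holds the link target exactly as stored on disk (not resolved).
    Symlink(PathBuf),
}

/// Walk `root` and record every entry relative to it, never following links.
fn collect_entries(root: &Path) -> io::Result<Vec<(PathBuf, EntryKind)>> {
    let mut out = Vec::new();
    let mut stack = vec![root.to_path_buf()];
    while let Some(dir) = stack.pop() {
        for entry in fs::read_dir(&dir)? {
            let path = entry?.path();
            // symlink_metadata inspects the entry itself (lstat semantics).
            let meta = fs::symlink_metadata(&path)?;
            let rel = path
                .strip_prefix(root)
                .expect("entry is under root")
                .to_path_buf();
            let kind = if meta.file_type().is_symlink() {
                // Record the link; do not descend even if it points at a directory.
                EntryKind::Symlink(fs::read_link(&path)?)
            } else if meta.is_dir() {
                stack.push(path.clone());
                EntryKind::Dir
            } else {
                EntryKind::File
            };
            out.push((rel, kind));
        }
    }
    // Sort for a deterministic order; read_dir order is unspecified.
    out.sort_by(|a, b| a.0.cmp(&b.0));
    Ok(out)
}
```

With this shape, a directory containing `link_dir -> real_dir` yields one `Symlink` entry for `link_dir` rather than a second copy of the files under `real_dir`.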

Fixes #1425.

Testing

  • cargo fmt --all -- --check
  • git diff --check
  • cargo test -p openshell-cli archive_preserves_symlink
  • cargo test -p openshell-cli

@mjamiv mjamiv requested review from a team, derekwaynecarr, maxamillion and mrunalp as code owners May 25, 2026 21:50
@copy-pr-bot

copy-pr-bot Bot commented May 25, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@johntmyers
Collaborator

[Warning] The archive writer preserves symlinks once entries reach it, but single symlink uploads can still be dereferenced or skipped before that path runs.

The common case is:

openshell sandbox upload <sandbox> ./link

That command still goes through the default git-filtered upload path. git_sync_files canonicalizes the requested path, which follows the symlink. That means:

  • If ./link -> tracked-target.txt, the upload can send tracked-target.txt rather than the symlink entry.
  • If ./link -> untracked-target.txt or the link is dangling, the upload can produce an empty file list or fail the current Path::exists() guard.
  • The create-time upload path has a similar issue for missing/dangling symlinks: the entry can be skipped while the command still reports that files were uploaded.
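
The behaviors listed above come from standard-library semantics and can be reproduced outside the CLI. The sketch below is Unix-only and uses hypothetical names (`make_fixture`, `Probe`, `probe`). It only shows what `Path::exists()`, `fs::canonicalize`, and `fs::symlink_metadata` report for a live link and a dangling one. It does not call any openshell code.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::{Path, PathBuf};

/// Build a scratch directory with `target.txt`, `link -> target.txt`,
/// and `dangling -> missing.txt`. All names are illustrative.
fn make_fixture(tag: &str) -> io::Result<PathBuf> {
    let dir = std::env::temp_dir()
        .join(format!("upload-probe-{}-{}", tag, std::process::id()));
    let _ = fs::remove_dir_all(&dir);
    fs::create_dir_all(&dir)?;
    fs::write(dir.join("target.txt"), b"data")?;
    symlink("target.txt", dir.join("link"))?;
    symlink("missing.txt", dir.join("dangling"))?;
    Ok(dir)
}

/// What three std calls report about one path.
struct Probe {
    /// `Path::exists()` follows links, so it is false for a dangling symlink.
    exists: bool,
    /// `fs::canonicalize` resolves links; `None` when resolution fails.
    canonical: Option<PathBuf>,
    /// `fs::symlink_metadata` looks at the entry itself.
    is_symlink: bool,
}

fn probe(path: &Path) -> Probe {
    Probe {
        exists: path.exists(),
        canonical: fs::canonicalize(path).ok(),
        is_symlink: fs::symlink_metadata(path)
            .map(|meta| meta.file_type().is_symlink())
            .unwrap_or(false),
    }
}
```

For `link`, `canonical` is the resolved path of `target.txt`, which is why a canonicalizing code path ends up uploading the target. For `dangling`, `exists` is false and `canonical` is `None` even though `is_symlink` is true.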

So the whole-directory/archive implementation looks good, but the single-path entry point still leaves an important symlink-preservation case unfixed.

Suggested fix:

  • Detect a single local symlink with symlink_metadata before entering git_sync_files / canonicalization.
  • Route that case through sandbox_sync_up directly so the symlink itself reaches write_upload_archive.
  • Replace upload existence checks based on Path::exists() with a symlink-aware helper. exists() returns false for dangling symlinks, but a dangling symlink is still a filesystem entry that should either be preserved intentionally or rejected with a clear error.
  • Add regression coverage for both a tracked single symlink and a dangling symlink.
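
A rough sketch of the first three suggestions, using std only. `UploadRoute`, `entry_exists`, and `choose_route` are hypothetical names introduced for illustration. The two enum variants merely stand in for calling `sandbox_sync_up` directly versus entering `git_sync_files`, whose real signatures are not shown here.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical routing decision for a single upload path.
#[derive(Debug, PartialEq)]
enum UploadRoute {
    /// Send the entry as-is (stand-in for the direct `sandbox_sync_up` call).
    Direct { link_target: PathBuf },
    /// Fall through to the git-filtered path (stand-in for `git_sync_files`).
    GitFiltered,
}

/// Symlink-aware replacement for a `Path::exists()` guard: true for any
/// filesystem entry, including a dangling symlink.
fn entry_exists(path: &Path) -> bool {
    fs::symlink_metadata(path).is_ok()
}

/// Decide the route before any canonicalization, so a link is never followed.
fn choose_route(path: &Path) -> io::Result<UploadRoute> {
    // Fails only when there is no entry at the path at all (or it is unreadable).
    let meta = fs::symlink_metadata(path)?;
    if meta.file_type().is_symlink() {
        // Keep the stored target verbatim; it may be relative or dangling.
        let link_target = fs::read_link(path)?;
        Ok(UploadRoute::Direct { link_target })
    } else {
        Ok(UploadRoute::GitFiltered)
    }
}
```

Whether a dangling link should be uploaded or rejected is a policy choice. This sketch preserves it, and a caller that prefers rejection could check `path.exists()` after `choose_route` returns `Direct` and emit a clear error.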

Development

Successfully merging this pull request may close these issues.

sandbox upload dereferences symlinks — uploaded as real directories, breaking git
