
Protect CLI bundle versions with leases #17282

Merged
danegsta merged 8 commits into microsoft:main from danegsta:bundle-version-leases
May 20, 2026


@danegsta
Member

@danegsta danegsta commented May 19, 2026

Description

Fixes #16306

This prevents Aspire CLI bundle upgrades from deleting or invalidating a bundle version while another CLI/AppHost process is still using it. Before this change, upgraded CLIs could flip the bundle reparse point and clean up older versions\<id> directories while a concurrent process was about to launch aspire-managed or DCP from the previous bundle version.

The CLI now acquires per-version leases under versions\<id>\.leases, returns leased layouts rooted directly at versions\<id>, and makes cleanup skip versions with active lease files. Bundle-owned child processes receive ASPIRE_BUNDLE_VERSION_DIR, and aspire-managed self-leases on startup so long-running dashboard/server/NuGet helper processes keep their backing version alive after parent startup.
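The lease-and-skip idea described above can be sketched in a few lines. This is a Python illustration only, not the PR's C# implementation in BundleVersionLease.cs: the helper names (`acquire_lease`, `has_active_lease`, `cleanup_old_versions`), the lease file naming, and the rule that any file under `.leases` counts as active are all assumptions. A real implementation would also need some way to detect leases left behind by crashed processes, which is not shown here.

```python
import os
import shutil
import tempfile
import uuid
from contextlib import contextmanager
from pathlib import Path

LEASES_DIR = ".leases"  # directory name taken from the PR description


@contextmanager
def acquire_lease(version_dir: Path):
    """Hold a lease on version_dir for the duration of the with-block (hypothetical helper)."""
    leases = version_dir / LEASES_DIR
    leases.mkdir(parents=True, exist_ok=True)
    # One file per holder; pid plus a random suffix keeps names unique across processes.
    lease_file = leases / f"{os.getpid()}-{uuid.uuid4().hex}.lease"
    lease_file.touch(exist_ok=False)
    try:
        yield lease_file
    finally:
        lease_file.unlink(missing_ok=True)


def has_active_lease(version_dir: Path) -> bool:
    """Assumption: any file under .leases means the version is still in use."""
    leases = version_dir / LEASES_DIR
    return leases.is_dir() and any(leases.iterdir())


def cleanup_old_versions(versions_root: Path, current_id: str) -> list[str]:
    """Delete every version except current_id unless it is leased; return the deleted ids."""
    removed = []
    for entry in sorted(versions_root.iterdir()):
        if not entry.is_dir() or entry.name == current_id or has_active_lease(entry):
            continue
        shutil.rmtree(entry)
        removed.append(entry.name)
    return removed


def demo() -> list[str]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp) / "versions"
        for version_id in ("v1", "v2", "v3"):
            (root / version_id).mkdir(parents=True)
        with acquire_lease(root / "v1"):
            # v1 is leased and v3 is current, so only v2 is eligible for deletion.
            return cleanup_old_versions(root, current_id="v3")
```

In this sketch `demo()` removes only `v2`: the leased `v1` survives the cleanup pass even though it is no longer the current version, which is the race the PR sets out to close.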

Behavior by install/layout type

Bundled installs always extract into a versioned bundle layout before bundle-owned component paths are used. The install route only changes the extraction root:

  • winget, brew, and dotnet-tool installs extract beside the CLI binary, then use versions\<id> plus the public bundle\ pointer under that binary directory.
  • script and pr installs extract under the install prefix parent of bin\, then use the same versions\<id> plus bundle\ pointer shape there.
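As a rough illustration of the two extraction roots listed above, the following Python sketch maps an install route to the directory that would hold `versions\<id>` and the `bundle\` pointer. The function names, the `install_route` parameter, and the example paths are hypothetical; how the CLI actually detects its install route is not described in this PR.

```python
from pathlib import PureWindowsPath

# Route names come from the PR description; the grouping mirrors its two bullets.
BESIDE_BINARY_ROUTES = {"winget", "brew", "dotnet-tool"}
PREFIX_PARENT_ROUTES = {"script", "pr"}


def extraction_root(cli_path: str, install_route: str) -> PureWindowsPath:
    """Return the directory that holds versions\\<id> and the bundle\\ pointer."""
    binary_dir = PureWindowsPath(cli_path).parent
    if install_route in BESIDE_BINARY_ROUTES:
        return binary_dir
    if install_route in PREFIX_PARENT_ROUTES:
        # Assumes the CLI lives at <prefix>\bin\aspire.exe, so the root is <prefix>.
        return binary_dir.parent
    raise ValueError(f"unknown install route: {install_route}")


def bundle_paths(root: PureWindowsPath, version_id: str):
    """Return (version-rooted directory, public bundle pointer) under root."""
    return root / "versions" / version_id, root / "bundle"
```

For example, a script install at the hypothetical path `C:\tools\aspire\bin\aspire.exe` resolves to the root `C:\tools\aspire`, while a winget install at `C:\pkgs\aspire\aspire.exe` resolves to `C:\pkgs\aspire`.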

Layouts without a versioned bundle folder are treated as non-owned fallback layouts. That covers dev/SDK/external layouts where the running CLI has no embedded bundle payload. In those cases EnsureExtractedAndAcquireLayoutAsync falls back to normal layout discovery and returns an unleased layout because BundleService does not own or clean up those files, so there is no bundle cleanup race to protect.
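The owned versus non-owned decision can be pictured as a function that returns a layout with an optional lease. This Python sketch is an assumption-laden illustration, not the behavior of EnsureExtractedAndAcquireLayoutAsync itself: `LayoutLease`, `acquire_layout`, and the lease file naming are hypothetical, and extraction is left out entirely.

```python
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class LayoutLease:
    layout_root: Path
    lease_file: Optional[Path]  # None for non-owned fallback layouts

    @property
    def is_leased(self) -> bool:
        return self.lease_file is not None


def acquire_layout(versions_root: Optional[Path], version_id: Optional[str],
                   fallback_layout: Path) -> LayoutLease:
    """Lease the versioned layout when one exists, else return the fallback unleased."""
    if versions_root is not None and version_id is not None:
        version_dir = versions_root / version_id
        if version_dir.is_dir():
            leases = version_dir / ".leases"
            leases.mkdir(exist_ok=True)
            lease_file = leases / f"{os.getpid()}.lease"
            lease_file.touch()
            # Owned layout: rooted directly at versions/<id>, protected by a lease.
            return LayoutLease(version_dir, lease_file)
    # Non-owned layout (dev/SDK/external): nothing to clean up, so nothing to lease.
    return LayoutLease(fallback_layout, None)


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "versions" / "abc").mkdir(parents=True)
        owned = acquire_layout(root / "versions", "abc", fallback_layout=root / "sdk")
        external = acquire_layout(None, None, fallback_layout=root / "sdk")
        return owned.is_leased, external.is_leased
```

Callers can treat both results uniformly and only need to check `is_leased` when deciding whether there is lease metadata to hand to a child process.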

User-facing usage

No command syntax changes are required. Existing bundle flows such as setup/update, dashboard launch, NuGet restore/search, AppHost server startup, and AspireUseCliBundle now use stable version-rooted paths internally instead of launching through the mutable bundle pointer.

Validation

  • .\restore.cmd
  • dotnet build .\src\Aspire.Cli\Aspire.Cli.csproj --no-restore
  • dotnet build .\src\Aspire.Managed\Aspire.Managed.csproj --no-restore
  • dotnet test --project .\tests\Aspire.Cli.Tests\Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-class "*.BundleServiceIntegrationTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"
  • dotnet test --project .\tests\Aspire.Cli.Tests\Aspire.Cli.Tests.csproj --no-launch-profile -- --filter-class "*.BundleServiceIntegrationTests" --filter-class "*.DotNetAppHostProjectTests" --filter-class "*.BundleNuGetPackageCacheTests" --filter-class "*.BundleNuGetServiceTests" --filter-class "*.DashboardRunCommandTests" --filter-class "*.ProfileCaptureServiceTests" --filter-class "*.PrebuiltAppHostServerTests" --filter-not-trait "quarantined=true" --filter-not-trait "outerloop=true"

Full Aspire.Cli.Tests was also run with quarantine/outerloop exclusions; it failed only in two unrelated git safecrlf tests (fatal: LF would be replaced by CRLF in .gitignore).

Checklist

  • Is this feature complete?
    • Yes. Ready to ship.
    • No. Follow-up changes expected.
  • Are you including unit tests for the changes and scenario tests if relevant?
    • Yes
    • No
  • Did you add public API?
    • Yes
      • If yes, did you have an API Review for it?
        • Yes
        • No
      • Did you add <remarks /> and <code /> elements on your triple slash comments?
        • Yes
        • No
    • No
  • Does the change make any security assumptions or guarantees?
    • Yes
      • If yes, have you done a threat model and had a security review?
        • Yes
        • No
    • No

Add per-version bundle leases so cleanup skips versions that are actively used during an upgrade. Launch bundle-owned child processes from version-rooted layouts and pass lease handoff metadata so aspire-managed can acquire its own lease.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

github-actions Bot commented May 19, 2026

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.sh | bash -s -- 17282

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/microsoft/aspire/main/eng/scripts/get-aspire-cli-pr.ps1) } 17282"

danegsta and others added 2 commits May 19, 2026 16:25
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@danegsta
Member Author

Update race smoke test validation

I ran a fresh smoke test focused on the original issue in #16306: concurrent bundle update/extraction (setup --force) while another CLI process launches bundle-owned child tools (sdk dump / aspire-managed).

The PR dogfood workflow for the latest artifact was still pending at the time of testing and cli-native-archives-win-x64 was not yet downloadable, so I used a freshly built local win-x64 bundled CLI from this branch as the fallback validation path.

Scenario: two timestamp-distinct copies of the bundled CLI shared one install root, forcing alternating bundle version IDs and cleanup decisions. Four workers ran concurrently for 90 seconds:

| Worker | Command path | Iterations | Failed |
|---|---|---|---|
| v1-dump | dump | 18 | False |
| v2-setup | setup | 25 | False |
| v2-dump | dump | 17 | False |
| v1-setup | setup | 30 | False |

Result: Passed - 90 total iterations, 0 failures.
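A harness of the shape described above (several workers looping concurrently until a deadline, each counting iterations and recording failure) might look like the following Python sketch. It is not the script used for this smoke test: `run_workers` and `WorkerResult` are hypothetical names, each action is an arbitrary callable rather than a real CLI invocation, and stopping a worker on its first failure is an assumption.

```python
import threading
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class WorkerResult:
    name: str
    iterations: int = 0
    failed: bool = False


def run_workers(actions: dict[str, Callable[[], None]],
                duration_seconds: float) -> list[WorkerResult]:
    """Run every action in its own thread until the shared deadline passes."""
    deadline = time.monotonic() + duration_seconds
    results = {name: WorkerResult(name) for name in actions}

    def loop(name: str, action: Callable[[], None]) -> None:
        result = results[name]  # each thread only touches its own result object
        while time.monotonic() < deadline and not result.failed:
            try:
                action()
                result.iterations += 1
            except Exception:
                # Assumption: a worker stops at its first failure.
                result.failed = True

    threads = [threading.Thread(target=loop, args=item) for item in actions.items()]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return list(results.values())
```

To approximate the run reported here, each action would shell out to one of the two CLI copies (the setup and dump command paths in the table) with `duration_seconds=90`.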

CLI version tested: 13.4.0-pr.17282.g85cda484b2 (local Bundle.proj win-x64 build from current branch)
Smoke artifacts: C:\Users\danegsta.copilot\session-state\bca2d521-ddd4-4f6e-abd3-7251972f2f5f\files\update-smoke-local-bundle-20260519-164340

This directly exercises the original repro shape (setup --force racing with sdk dump) plus the cross-version cleanup path that motivated the lease changes.

@danegsta danegsta marked this pull request as ready for review May 20, 2026 00:12
Copilot AI review requested due to automatic review settings May 20, 2026 00:12
Contributor

Copilot AI left a comment


Pull request overview

This PR introduces per-version “lease” files for Aspire CLI bundled layouts so upgrades/cleanup won’t delete a bundle version directory while another CLI/AppHost/child process is still using it. It shifts bundle-owned process launches to stable, version-rooted paths and propagates the version directory via ASPIRE_BUNDLE_VERSION_DIR so long-running child processes can self-protect.

Changes:

  • Add BundleVersionLease (shared) and BundleLayoutLease (CLI) to acquire/hold per-version leases and pass lease handoff env vars to child processes.
  • Update BundleService to return a leased, version-rooted layout and to skip cleanup of versions with active leases.
  • Update CLI and aspire-managed to use/propagate leases in key flows (AppHost server, dashboard, NuGet helper, profiling, DCP stop).
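The handoff summarized above has two halves: the parent adds ASPIRE_BUNDLE_VERSION_DIR to the child's environment, and the child takes its own lease at startup so the version stays protected after the parent releases its lease. The Python sketch below illustrates that flow under stated assumptions; `child_environment`, `self_lease_on_startup`, and the lease file naming are hypothetical, and only the environment variable name and the `.leases` directory come from the PR.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional

ENV_VAR = "ASPIRE_BUNDLE_VERSION_DIR"  # name taken from the PR description


def child_environment(version_dir: Optional[Path],
                      base_env: Mapping[str, str]) -> dict[str, str]:
    """Parent side: copy the environment and add the handoff variable for leased layouts."""
    env = dict(base_env)
    if version_dir is not None:
        env[ENV_VAR] = str(version_dir)
    return env


def self_lease_on_startup(env: Mapping[str, str]) -> Optional[Path]:
    """Child side: take an independent lease if the parent handed over a version dir."""
    value = env.get(ENV_VAR)
    if not value:
        return None  # launched from a non-owned layout, nothing to protect
    version_dir = Path(value)
    if not version_dir.is_dir():
        return None
    leases = version_dir / ".leases"
    leases.mkdir(exist_ok=True)
    lease_file = leases / f"{os.getpid()}.lease"
    lease_file.touch()
    return lease_file


def demo() -> tuple[bool, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        version_dir = Path(tmp) / "versions" / "abc"
        version_dir.mkdir(parents=True)
        lease = self_lease_on_startup(child_environment(version_dir, base_env={}))
        leased = lease is not None and lease.parent == version_dir / ".leases"
        unleased = self_lease_on_startup(child_environment(None, base_env={})) is None
        return leased, unleased
```

Passing a directory rather than a lease handle means the child never depends on the parent's lifetime, which matches the stated goal of keeping long-running dashboard, server, and NuGet helper processes safe after parent startup.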

Reviewed changes

Copilot reviewed 21 out of 21 changed files in this pull request and generated 2 comments.

| File | Description |
|---|---|
| tests/Aspire.Cli.Tests/Utils/CliTestHelper.cs | Updates test bundle service fakes to the new lease-returning API. |
| tests/Aspire.Cli.Tests/Processes/ProcessShutdownServiceTests.cs | Adds coverage that DCP stop uses the version-rooted (leased) DCP path when available. |
| tests/Aspire.Cli.Tests/Commands/AppHostLauncherTests.cs | Fixes test wiring for the new ProcessShutdownService dependency. |
| tests/Aspire.Cli.Tests/BundleServiceIntegrationTests.cs | Adds integration tests for lease acquisition and lease-aware stale version cleanup. |
| src/Shared/BundleVersionLease.cs | Introduces the cross-process lease primitive (lease files under .leases). |
| src/Shared/BundleDiscovery.cs | Adds ASPIRE_BUNDLE_VERSION_DIR env var constant for lease handoff. |
| src/Aspire.Managed/Program.cs | Makes aspire-managed self-acquire a lease based on ASPIRE_BUNDLE_VERSION_DIR. |
| src/Aspire.Managed/Aspire.Managed.csproj | Links shared bundle discovery + lease sources into Aspire.Managed. |
| src/Aspire.Cli/Projects/PrebuiltAppHostServer.cs | Carries and disposes a bundle layout lease; forwards ASPIRE_BUNDLE_VERSION_DIR to the server process. |
| src/Aspire.Cli/Projects/DotNetAppHostProject.cs | Updates AspireUseCliBundle flow to acquire/hold a lease and pass lease env vars to children. |
| src/Aspire.Cli/Projects/AppHostServerSession.cs | Adds disposal plumbing so disposable projects can be cleaned up with the session lifecycle. |
| src/Aspire.Cli/Projects/AppHostServerProject.cs | Uses lease-aware layout acquisition and hands the lease to PrebuiltAppHostServer. |
| src/Aspire.Cli/Profiling/ProfileCaptureService.cs | Acquires a lease for aspire-managed dashboard profiling and passes lease env vars. |
| src/Aspire.Cli/Processes/ProcessShutdownService.cs | Uses leased version-rooted DCP path when stopping a process tree. |
| src/Aspire.Cli/NuGet/BundleNuGetService.cs | Uses a lease (when available) and passes lease env vars to the NuGet helper. |
| src/Aspire.Cli/NuGet/BundleNuGetPackageCache.cs | Leases the layout before launching aspire-managed for NuGet search. |
| src/Aspire.Cli/Commands/DashboardRunCommand.cs | Leases the layout before launching dashboard to avoid races with bundle cleanup. |
| src/Aspire.Cli/Bundles/IBundleService.cs | Replaces “get layout” API with “acquire leased layout” API. |
| src/Aspire.Cli/Bundles/BundleService.cs | Implements lease acquisition + lease-aware cleanup and returns version-rooted layouts. |
| src/Aspire.Cli/Bundles/BundleLayoutLease.cs | Adds the CLI-side disposable wrapper that carries layout + optional lease + env handoff. |
| src/Aspire.Cli/Aspire.Cli.csproj | Links the shared lease implementation into the CLI build. |

Comment thread src/Aspire.Cli/Projects/AppHostServerSession.cs
Comment thread src/Aspire.Cli/Projects/AppHostServerProject.cs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@github-actions
Contributor

Re-running the failed jobs in the CI workflow for this pull request because 1 job was identified as a retry-safe transient failure in the CI run attempt.
GitHub was asked to rerun all failed jobs for that attempt, and the rerun is being tracked in the rerun attempt.
The job links below point to the failed attempt jobs that matched the retry-safe transient failure rules.

Matched test failure patterns (1 test)
  • Aspire.Cli.EndToEnd.Tests.KubernetesDeployWithGarnetTests.DeployK8sWithGarnet — Unable to access container registry during publish

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/Shared/BundleVersionLease.cs Outdated
Comment thread src/Shared/BundleVersionLease.cs Outdated
danegsta and others added 3 commits May 20, 2026 10:40
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@IEvangelist
Member

PR Testing Report

PR Information

CLI Version Verification

  • Expected commit: 76f914843fd074ff73a6487e347415128e77561d
  • Installed CLI: 13.4.0-pr.17282.g76f91484
  • Status: Passed - the dogfood CLI matched the PR head commit suffix.
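The commit-suffix check reported above can be expressed as a small helper. This Python sketch assumes, based only on the version strings shown in this thread, that PR builds end in `.g<short sha>`; the function name is hypothetical and this is not the tooling used to produce the report.

```python
import re


def version_matches_commit(cli_version: str, commit_sha: str) -> bool:
    """True when the version's trailing .g<sha> suffix is a prefix of commit_sha."""
    match = re.search(r"\.g([0-9a-f]+)$", cli_version.lower())
    return match is not None and commit_sha.lower().startswith(match.group(1))
```

With the values from this report, `version_matches_commit("13.4.0-pr.17282.g76f91484", "76f914843fd074ff73a6487e347415128e77561d")` is true, while the earlier local build `13.4.0-pr.17282.g85cda484b2` does not match that head commit.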

Changes Analyzed

The PR changes CLI bundle extraction/layout lease handling, bundle NuGet cache/service behavior, AppHost server/session startup handoff, process shutdown, and related tests.

Test Scenarios Executed

Scenario 1: Dogfood CLI install and version check

Objective: Verify the PR dogfood artifact is available and corresponds to the current PR head.

Result: Passed

Evidence:

Scenario 2: Fresh starter AppHost lifecycle with detached run

Objective: Exercise the changed bundle/server/process paths by creating a project, building it, starting the AppHost in detached mode, listing the running AppHost, and stopping it.

Steps:

  1. Ran aspire new aspire-starter --name BundleLeaseSmoke --output . --non-interactive.
  2. Ran dotnet build .\BundleLeaseSmoke.AppHost\BundleLeaseSmoke.AppHost.csproj --nologo.
  3. Ran aspire run --project .\BundleLeaseSmoke.AppHost\BundleLeaseSmoke.AppHost.csproj --detach --non-interactive --nologo.
  4. Ran aspire ps --non-interactive --nologo.
  5. Ran aspire stop --project .\BundleLeaseSmoke.AppHost\BundleLeaseSmoke.AppHost.csproj --non-interactive --nologo.

Result: Passed

Evidence:

  • Template version selected: 13.4.0-pr.17282.g76f91484
  • Build result: Build succeeded. with 0 warnings and 0 errors.
  • Detached AppHost started successfully with SDK 13.4.0-pr.17282.g76f91484.
  • aspire ps listed the detached AppHost.
  • aspire stop stopped the AppHost successfully.

Summary

| Scenario | Status | Notes |
|---|---|---|
| Dogfood CLI install/version check | Passed | Artifact version matched PR head commit suffix. |
| Fresh starter AppHost lifecycle with detached run | Passed | Start/list/stop succeeded with the PR CLI and package channel. |

Overall Result

Passed. The tested CLI lifecycle path worked with the PR dogfood build.

@danegsta danegsta merged commit 59e83e2 into microsoft:main May 20, 2026
303 checks passed
@microsoft-github-policy-service microsoft-github-policy-service Bot added this to the 13.4 milestone May 20, 2026
Development

Successfully merging this pull request may close these issues.

Bundled CLI can mutate the shared extracted layout while another command is using it

4 participants