
wslc: idle-terminate per-user session VMs when inactive#40781

Open
benhillis wants to merge 6 commits into master from user/benhill/wslc-idle-terminate-vm

Conversation

@benhillis


Summary

Idle-terminates a per-user WSLC session's backing VM when it has been inactive, freeing memory while the session object (and its persistent storage) lives on. The VM is transparently recreated on the next operation.

Builds on #40770 (IWSLCVirtualMachineFactory).

Behavior

  • Only sessions with persistent storage (StoragePath set) idle-terminate.
  • An idle worker thread tears the VM down after a grace period (currently 30s) once there is no in-flight activity and no active container lock.
  • In-flight work holds an activity reference so the VM cannot be torn down mid-operation:
    • VmLease wraps CLI/container operations.
    • BeginContainerOperation hands clients an activity token (IFastRundown so a client crash reclaims it promptly).
    • Long-lived root-namespace processes (e.g. plugin hosts) created via CreateRootNamespaceProcess hold a keep-alive token for their lifetime.
  • Activity bookkeeping (count + wake event) lives in a shared IdleState held via shared_ptr, decoupled from the session's lifetime. A held token therefore suppresses idle teardown without extending the session object's lifetime, which preserves the explicit-reset-invalidates-held-processes invariant from #40366 (Add WSLC (WSL Containers) feature).
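A minimal sketch of the bookkeeping in the last bullet, assuming standard-library primitives in place of the Windows wake event. Only the idea of an `IdleState` held via `shared_ptr` comes from the description; its fields, `ActivityToken`, and `IdleForGracePeriod` are hypothetical names used for illustration, and the real implementation may differ.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <utility>

// Hypothetical stand-in for the shared activity bookkeeping.
struct IdleState
{
    std::mutex lock;
    std::condition_variable wake;      // stands in for the wake event
    int activityCount = 0;             // number of live tokens
    unsigned long long generation = 0; // bumped on every new activity
};

// RAII token: holds one activity reference for its lifetime.
// It owns the IdleState (via shared_ptr), not the session object.
class ActivityToken
{
public:
    explicit ActivityToken(std::shared_ptr<IdleState> state) : m_state(std::move(state))
    {
        {
            std::lock_guard<std::mutex> guard(m_state->lock);
            ++m_state->activityCount;
            ++m_state->generation;
        }
        m_state->wake.notify_all(); // interrupt a pending grace-period wait
    }

    ~ActivityToken()
    {
        {
            std::lock_guard<std::mutex> guard(m_state->lock);
            --m_state->activityCount;
        }
        m_state->wake.notify_all(); // let the idle worker re-check
    }

    ActivityToken(const ActivityToken&) = delete;
    ActivityToken& operator=(const ActivityToken&) = delete;

private:
    std::shared_ptr<IdleState> m_state;
};

// Returns true only if no activity existed or started during the grace period.
inline bool IdleForGracePeriod(IdleState& state, std::chrono::milliseconds grace)
{
    std::unique_lock<std::mutex> guard(state.lock);
    const auto startGeneration = state.generation;
    const bool disturbed = state.wake.wait_for(guard, grace, [&] {
        return state.activityCount != 0 || state.generation != startGeneration;
    });
    return !disturbed;
}
```

The generation counter is one way to make a short-lived token that starts and ends inside the grace window still count as activity; whether the PR restarts the grace period in that case is not stated in the description.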

Testing

  • New WSLCE2EVmIdleTests E2E suite (5 tests) including WSLCE2E_VmIdle_RootProcessKeepsVmAlive.
  • WSLCTests::CreateRootNamespaceProcess still passes.
  • Full x64 Debug build clean.

Notes / follow-ups (deferred)

  • Grace period is a hardcoded constexpr; making it injectable would enable deterministic race tests.
  • No automated coverage yet for the crash path (client dies while holding a token).

Note

Draft for early review.

Copilot AI review requested due to automatic review settings June 11, 2026 19:50

Copilot AI left a comment


Pull request overview

This PR adds on-demand creation and idle-termination of per-user WSLC session VMs (for sessions with persistent storage), so memory can be reclaimed while keeping the session object and storage intact. It also introduces VM-liveness/activity bookkeeping to prevent teardown during in-flight operations and adds new E2E coverage around VM lifecycle behavior.

Changes:

  • Implement lazy VM bring-up and idle shutdown in wslcsession via an idle worker, activity counting/tokens, and a VmLease used by VM-requiring operations.
  • Add client-side “operation keep-alive” usage in wslc.exe container operations to prevent VM teardown between OpenContainer and subsequent calls/streaming.
  • Add a new E2E test suite validating lazy start, idle stop, persistence across restarts, keep-alive for root-namespace processes, and teardown/recreate races.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/windows/wslc/e2e/WSLCE2EVmIdleTests.cpp | New E2E tests covering lazy VM start, idle stop, persistence, keep-alive, and race scenarios. |
| test/windows/wslc/e2e/WSLCE2EHelpers.h | Exposes the underlying IWSLCSession* for diagnostics/test-only calls. |
| src/windows/wslcsession/WSLCSession.h | Adds VM lifecycle state, idle worker/tokens/lease declarations, and new session methods. |
| src/windows/wslcsession/WSLCSession.cpp | Implements lazy VM creation, idle teardown, activity tokens, and VM diagnostics reporting. |
| src/windows/wslcsession/WSLCProcessControl.cpp | Preserves a real exit code when signaling container release, only synthesizing SIGKILL when needed. |
| src/windows/wslcsession/WSLCProcess.h | Stores a keep-alive token on root-namespace processes to keep the VM alive for their lifetime. |
| src/windows/wslcsession/WSLCContainer.cpp | Signals idle re-checks on terminal container transitions; holds a VM lease during delete. |
| src/windows/wslcsession/IORelay.h | Adds IsRelayThread() to safely avoid destroying the relay on its own thread. |
| src/windows/wslcsession/IORelay.cpp | Co-initializes the relay thread into the MTA; implements IsRelayThread(). |
| src/windows/wslc/services/SessionModel.h | Adds a helper to acquire/hold a keep-alive token for client-side container operations. |
| src/windows/wslc/services/ContainerService.cpp | Uses the keep-alive token across container operations (attach/start/stop/kill/delete/exec/etc.). |
| src/windows/service/inc/wslc.idl | Adds VM diagnostics type + new session methods for diagnostics and operation keep-alive. |
| src/windows/service/exe/WSLCSessionManager.cpp | Updates comments to reflect on-demand VM creation and recreation after idle termination. |

Comment thread src/windows/wslcsession/WSLCSession.h
Comment thread src/windows/service/inc/wslc.idl Outdated
Copilot AI review requested due to automatic review settings June 11, 2026 20:10

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 3 comments.

Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread src/windows/wslcsession/WSLCSession.cpp
Comment thread src/windows/wslcsession/WSLCSession.cpp Outdated
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from c12d7e1 to fa2eb47 Compare June 12, 2026 01:28
Copilot AI review requested due to automatic review settings June 12, 2026 17:46
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from fa2eb47 to ea2254c Compare June 12, 2026 17:46

Copilot AI left a comment


Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 3 comments.

Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl Outdated
Copilot AI review requested due to automatic review settings June 12, 2026 20:55

Copilot AI left a comment


Pull request overview

Copilot reviewed 32 out of 32 changed files in this pull request and generated 4 comments.

Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/service/inc/wslc.idl
Comment thread src/windows/WslcSDK/wslcsdk.def
Comment thread src/windows/WslcSDK/winrt/Session.cpp
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from b870044 to 4bcd87f Compare June 17, 2026 00:22
Copilot AI review requested due to automatic review settings June 17, 2026 16:21
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from 4bcd87f to 348e2e1 Compare June 17, 2026 16:21

Copilot AI left a comment


Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.

Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread test/windows/WSLCTests.cpp
Comment thread test/windows/WSLCTests.cpp

@benhillis benhillis left a comment


Reviewed VM-related comments - all have been addressed:

Comment on WSLCSession.h:84: Already correct - lines 77-78 say "IWSLCVirtualMachineFactory" and "lazily on first use"

Comment on WSLCSession.cpp:571: Fixed by AddRef/Release activity tracking - idle worker checks ActivityCount (line 779), and container proxies increment it on AddRef 1→2 transition. VM will not idle-terminate while clients hold container proxies. See lines 672-674 comment.

Comment on WSLCSession.cpp:380: Already has exception handling - IdleWorker() is wrapped in CATCH_LOG() at lines 375-379

Comment on Session.cpp:65: Already correct - wil::unique_threadpool_wait (line 56) calls WaitForThreadpoolWaitCallbacks in destructor automatically

Copilot AI review requested due to automatic review settings June 17, 2026 16:53

Copilot AI left a comment


Pull request overview

Copilot reviewed 19 out of 19 changed files in this pull request and generated 14 comments.

Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread test/windows/WSLCTests.cpp
Comment thread src/windows/service/exe/HcsVirtualMachine.cpp Outdated
Comment thread test/windows/wslc/e2e/WSLCE2EVmIdleTests.cpp Outdated
Comment thread test/windows/wslc/e2e/WSLCE2EVmIdleTests.cpp Outdated
Copilot AI review requested due to automatic review settings June 17, 2026 17:48

Copilot AI left a comment


Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 4 comments.

Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread test/windows/WSLCTests.cpp
Comment thread test/windows/WSLCTests.cpp
Copilot AI review requested due to automatic review settings June 17, 2026 21:58

Copilot AI left a comment


Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 12 comments.

Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread src/windows/service/inc/wslc.idl Outdated
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread src/windows/wslc/services/ContainerService.cpp
Comment thread test/windows/WSLCTests.cpp
Copilot AI review requested due to automatic review settings June 18, 2026 16:14
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from e99a7f2 to ea6bad8 Compare June 23, 2026 17:33
Copilot AI review requested due to automatic review settings June 23, 2026 19:33
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from ea6bad8 to 6e9729a Compare June 23, 2026 19:33

Copilot AI left a comment


Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Comment thread src/windows/service/inc/wslc.idl
@benhillis benhillis marked this pull request as ready for review June 24, 2026 14:23
@benhillis benhillis requested a review from a team as a code owner June 24, 2026 14:23
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from 6e9729a to fff0325 Compare June 24, 2026 22:50
Copilot AI review requested due to automatic review settings June 26, 2026 20:11
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from fff0325 to dd57d4c Compare June 26, 2026 20:11

Copilot AI left a comment


Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Comment thread src/windows/wslcsession/WSLCIdleState.h
Comment thread src/windows/wslcsession/WSLCContainer.h Outdated
Comment thread src/windows/wslcsession/WSLCContainer.h Outdated

@OneBlue OneBlue left a comment


From a first read, I think this is looking good. One thing that might make this a bit easier would be to fully separate the VM state, related fields, and timing logic out of WSLCSession, with something like:

class WSLCSessionRuntime
{
    WSLCVirtualMachine m_vm;
    IORelay m_ioRelay;
    [...]
};

That class can handle the ref counting, timeouts, and such. Then, whenever the session needs to access the VM, it can get something like:

struct LockedRuntime
{
    WSLCVirtualMachine& vm;
    IORelay& relay;
    [...]
};

Which then allows something like:

auto runtime = AcquireRuntime(...);

runtime->vm.Operation(...);

AcquireRuntime starts the VM as needed and handles timeouts.
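One way the suggestion above could look, as a rough sketch rather than the reviewer's or the PR's actual design. `FakeVm`, `SessionRuntime`, `Locked`, `Acquire`, and `TryStopIfIdle` are hypothetical names; the real `WSLCVirtualMachine` and `IORelay` types are replaced by a trivial placeholder so the example stands alone.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical placeholder for the real VM type.
struct FakeVm
{
    int operations = 0;
    void Operation() { ++operations; }
};

// Owns the VM plus the user count that gates idle teardown.
class SessionRuntime
{
public:
    // Move-only handle: while it lives, the VM cannot be stopped.
    class Locked
    {
    public:
        explicit Locked(SessionRuntime& owner) : m_owner(&owner) {}
        Locked(Locked&& other) noexcept : m_owner(other.m_owner) { other.m_owner = nullptr; }
        Locked(const Locked&) = delete;
        Locked& operator=(const Locked&) = delete;
        ~Locked()
        {
            if (m_owner != nullptr)
            {
                m_owner->Release();
            }
        }

        FakeVm& Vm() const { return *m_owner->m_vm; }

    private:
        SessionRuntime* m_owner;
    };

    // Starts the VM on demand and pins it for the caller.
    Locked Acquire()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (!m_vm)
        {
            m_vm = std::make_unique<FakeVm>();
            ++m_starts;
        }
        ++m_users;
        return Locked(*this);
    }

    // Called by the idle policy; stops the VM only when nobody holds it.
    bool TryStopIfIdle()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        if (m_users != 0 || !m_vm)
        {
            return false;
        }
        m_vm.reset();
        return true;
    }

    int Starts() const
    {
        std::lock_guard<std::mutex> guard(m_lock);
        return m_starts;
    }

private:
    void Release()
    {
        std::lock_guard<std::mutex> guard(m_lock);
        --m_users;
    }

    mutable std::mutex m_lock;
    std::unique_ptr<FakeVm> m_vm;
    int m_users = 0;
    int m_starts = 0;
};
```

Keeping the user count and the VM pointer behind one lock in one class means the session code never has to reason about teardown races directly; it only holds a `Locked` handle for as long as it needs the VM.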

Comment thread src/windows/wslcsession/WSLCExecutionContext.h Outdated
Comment thread src/windows/wslcsession/WSLCSession.cpp Outdated
Comment thread src/windows/wslcsession/WSLCSession.cpp
Comment thread src/windows/wslcsession/WSLCProcessControl.cpp
Comment thread src/windows/wslcsession/WSLCSession.cpp
Copilot AI review requested due to automatic review settings June 30, 2026 00:32

Copilot AI left a comment


Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Comment thread src/windows/wslcsession/WSLCSession.cpp
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from ea84deb to 025b755 Compare June 30, 2026 00:43
Copilot AI review requested due to automatic review settings June 30, 2026 16:00

Copilot AI left a comment


Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 3 comments.

Comment thread src/windows/wslc/services/SessionService.cpp Outdated
Comment thread src/windows/wslcsession/WSLCSession.cpp
Comment thread src/windows/wslc/services/SessionModel.h Outdated
Ben Hillis and others added 6 commits July 1, 2026 07:44
Per-user WSLC container session VMs now idle-terminate when no
container is in a non-terminal (Created/Running) state, freeing host
memory, and lazily restart on the next operation that needs the VM.

- Centralize VM lifecycle in WSLCSession via TearDownVmLockHeld /
  StartVmLockHeld and an atomic VmExitDisposition (Active /
  StopRequested / ExitClaimed) to arbitrate expected stops vs.
  spontaneous VM exits without a polling thread.
- Gate VM-requiring entrypoints behind AcquireVmLease(), which brings
  the VM up on demand and keeps it alive for the operation's duration.
- Add IWSLCSession::BeginContainerOperation so a CLI command can hold
  the VM alive across resolve + operate + streamed output.
- Preserve the session WarningCallback for the lifetime of the session
  so warnings emitted by the lazy VM start (e.g. resource recovery)
  are still delivered to the CLI invocation.
- Remove the dtor lock in HcsVirtualMachine; OnExit/OnCrash are
  lock-free.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
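The three-state arbitration named in this commit message could work roughly as follows. This is an interpretation, not the PR's code: the enum values come from the message, but `RequestStop` and `ClaimSpontaneousExit` are hypothetical helpers, and the assumption is that both paths race to move the state out of `Active` with a compare-exchange.

```cpp
#include <atomic>
#include <cassert>

enum class VmExitDisposition
{
    Active,        // VM running, nobody has asked for a stop
    StopRequested, // teardown path won: the upcoming exit is expected
    ExitClaimed    // exit callback won: the VM exited on its own
};

// Teardown path. Returns true if this caller now owns an expected stop.
inline bool RequestStop(std::atomic<VmExitDisposition>& disposition)
{
    auto expected = VmExitDisposition::Active;
    return disposition.compare_exchange_strong(expected, VmExitDisposition::StopRequested);
}

// Exit callback. Returns true if the exit was spontaneous, meaning no stop
// had been requested before the VM went away.
inline bool ClaimSpontaneousExit(std::atomic<VmExitDisposition>& disposition)
{
    auto expected = VmExitDisposition::Active;
    return disposition.compare_exchange_strong(expected, VmExitDisposition::ExitClaimed);
}
```

Because exactly one compare-exchange can succeed from `Active`, the loser learns which case it is in without a lock or a polling thread.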

Only Running containers now hold an activity reference that keeps the
per-user session VM alive. Previously a container in either Created or
Running state held the reference, so a `create`d-but-never-started
container pinned the VM indefinitely and defeated idle termination.

A created container's metadata persists on the containerd VHD across VM
teardown and is rebuilt by RecoverExistingContainers on the next
VM-requiring operation, so create -> idle-terminate -> start later works;
the 30s grace period covers the common create-then-start gap.

Also fix m_stateChangedAt recovery for created containers: docker inspect
reports FinishedAt as the zero date ("0001-01-01T00:00:00Z") for a
never-started container, which parsed to year 1 and rendered as "created
2026 years ago". Use the container's Created time for the Created state.
This recovery path was previously unreachable, since created containers
never got torn down.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
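The timestamp fix in this commit message can be illustrated with a small sketch. `InspectTimes` and `StateChangedAt` are hypothetical names; the zero-date string and the rule for the Created state come from the message, while using `StartedAt` for Running and falling back to the creation time whenever `FinishedAt` is the zero date are assumptions made for the example.

```cpp
#include <cassert>
#include <string>

enum class ContainerState
{
    Created,
    Running,
    Exited
};

// Hypothetical subset of the timestamps reported by docker inspect.
struct InspectTimes
{
    std::string created;
    std::string startedAt;
    std::string finishedAt;
};

// Zero date reported for a container that never finished.
const char* const c_zeroDate = "0001-01-01T00:00:00Z";

// Picks the timestamp that should back the "state changed at" field.
inline const std::string& StateChangedAt(ContainerState state, const InspectTimes& times)
{
    switch (state)
    {
    case ContainerState::Created:
        // A never-started container has no meaningful FinishedAt.
        return times.created;
    case ContainerState::Running:
        // Assumption: a running container reports when it started.
        return times.startedAt;
    default:
        // Assumption: guard against the zero date for any other state too.
        return times.finishedAt == c_zeroDate ? times.created : times.finishedAt;
    }
}
```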

Addresses review feedback: WSLCContainer.h still described the activity
hold as held while Created/Running, but it now only pins the VM while
Running. Update the two header comments to match the implementation.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- Make the VM idle grace period configurable via settings.yaml
  (session.idleTimeout, default 30s) instead of a hardcoded constant.
- Assert UserSid is non-null in PersistSettings rather than tolerating
  a null SID.
- Drop the session warning-callback GIT fallback; warnings emitted
  outside a callback-bearing operation are logged and event-logged only.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Recovery warnings emitted during lazy VM start run outside the user's
current command, so they are now logged (and written to the event log)
instead of being routed back to the session-creation warning callback.
Update the three WarningCallback*Recovery unit tests and the e2e test to
assert the warning is no longer delivered to the session callback /
printed on stderr.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

- BeginContainerOperation: reject new operations once the session is
  terminating/terminated, mirroring EnsureVmRunning's gate, so a started
  operation cannot pin a VM that is being torn down.
- TearDownVmLockHeld: reset m_storageMounted after unmounting so the
  flag does not stay stale across an idle teardown.
- CLI Session model: stop retaining the IWarningCallback for the session
  lifetime. Recovery warnings from lazy VM start are logged rather than
  delivered to the session callback, so the stashed callback (and its
  now-misleading comments) is dead state. The callback is still passed to
  CreateSession, where it is consumed during initialization.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings July 1, 2026 14:59
@benhillis benhillis force-pushed the user/benhill/wslc-idle-terminate-vm branch from 08096cd to f341cf2 Compare July 1, 2026 14:59

Copilot AI left a comment


Pull request overview

Copilot reviewed 21 out of 21 changed files in this pull request and generated 1 comment.

Comment on lines +390 to +402
if (WI_IsFlagSet(Settings->StorageFlags, WSLCSessionStorageFlagsNoCreate))
{
    // The storage VHD must already exist (ConfigureStorage will not create it).
    THROW_HR_WITH_USER_ERROR_IF(
        HRESULT_FROM_WIN32(ERROR_PATH_NOT_FOUND),
        Localization::MessageWslcSessionStorageNotFound(Settings->StoragePath),
        !std::filesystem::exists(storagePath / c_storageVhdFilename));
}
else if (!std::filesystem::exists(storagePath / c_storageVhdFilename))
{
    // New session: the target path (if it exists) must be an empty directory.
    ValidateNewSessionStorageDirectory(storagePath);
}


3 participants