feat(cloud): Add managed relay tunnels and APN service#2837
feat(cloud): Add managed relay tunnels and APN service#2837juliusmarminge wants to merge 11 commits into
Conversation
|
Important Review skippedAuto reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
554a1d2 to
8480c92
Compare
65c36c0 to
a7ed828
Compare
8480c92 to
e3ab348
Compare
a7ed828 to
b868fee
Compare
e3ab348 to
436b1b9
Compare
b868fee to
589e2ed
Compare
436b1b9 to
d20a8ce
Compare
63a525d to
8027af0
Compare
6c0e54d to
f15e2ba
Compare
8027af0 to
1a912f6
Compare
f15e2ba to
71e0186
Compare
1a912f6 to
90bf2b3
Compare
71e0186 to
e721336
Compare
e63e3f4 to
ba9802d
Compare
22e103a to
60b7d8d
Compare
ba9802d to
8789910
Compare
60b7d8d to
ee4ec05
Compare
8789910 to
f7ac694
Compare
bf992b4 to
d374900
Compare
a668bac to
a42a3f5
Compare
597e56d to
f9c9f4d
Compare
a42a3f5 to
2627b33
Compare
There was a problem hiding this comment.
🟡 Medium
The cache invalidation logic at line 290 only checks expectedMetadata, which omits the environment variables embedded by writeDevelopmentLauncherScript. When the metadata matches, the function returns early at line 293 without regenerating the launcher script, even if VITE_DEV_SERVER_URL or T3CODE_PORT has changed. This causes the app to use stale environment variables from a previous run — for example, attempting to connect to a Vite dev server port that is no longer running.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/desktop/scripts/electron-launcher.mjs around line 277:
The cache invalidation logic at line 290 only checks `expectedMetadata`, which omits the environment variables embedded by `writeDevelopmentLauncherScript`. When the metadata matches, the function returns early at line 293 without regenerating the launcher script, even if `VITE_DEV_SERVER_URL` or `T3CODE_PORT` has changed. This causes the app to use stale environment variables from a previous run — for example, attempting to connect to a Vite dev server port that is no longer running.
Evidence trail:
apps/desktop/scripts/electron-launcher.mjs lines 277-294 (expectedMetadata definition and early return), lines 97-130 (writeDevelopmentLauncherScript embedding env vars into the launcher script), lines 300-302 (writeDevelopmentLauncherScript only called when cache is invalidated)
| Effect.provide(runtimeLayer), | ||
| ), |
There was a problem hiding this comment.
🟡 Medium src/worker.ts:249
The queue message handler and cron handler create spans using Stream.withSpan and Effect.withSpan, but they only provide runtimeLayer which does not include the tracer. This means spans from background jobs are silently dropped instead of being exported to Axiom, despite the explicit instrumentation. Consider providing relayTraceLayer to these handlers, similar to the HTTP fetch handler at line 270.
Effect.provide(runtimeLayer),
+ Effect.provide(relayTraceLayer),🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file infra/relay/src/worker.ts around lines 249-250:
The queue message handler and cron handler create spans using `Stream.withSpan` and `Effect.withSpan`, but they only provide `runtimeLayer` which does not include the tracer. This means spans from background jobs are silently dropped instead of being exported to Axiom, despite the explicit instrumentation. Consider providing `relayTraceLayer` to these handlers, similar to the HTTP fetch handler at line 270.
Evidence trail:
infra/relay/src/worker.ts lines 170-176 (relayTraceLayer definition with OtlpTracer), lines 178-213 (runtimeLayer definition without tracer), line 242 (Stream.withSpan in queue handler), line 246 (Effect.withSpan in queue handler), line 249 (Effect.provide(runtimeLayer) - no tracer), line 256 (Effect.withSpan in cron handler), line 257 (Effect.provide(runtimeLayer) - no tracer), line 270 (HTTP handler uses relayTraceLayer). infra/relay/src/observability.ts lines 53-72 (makeRelayTraceLayer creates OtlpTracer.layer for Axiom export).
| Effect.filterMapOrFail((tunnel) => | ||
| tunnel.id && tunnel.name | ||
| ? Result.succeed({ id: tunnel.id, name: tunnel.name }) | ||
| : Result.fail(new ManagedEndpointProvisioningFailed({ cause: tunnel })), | ||
| ), | ||
| Effect.mapError((cause) => new ManagedEndpointProvisioningFailed({ cause })), |
There was a problem hiding this comment.
🟢 Low environments/ManagedEndpointProvider.ts:232
The Effect.filterMapOrFail at line 232 uses the single-argument form, so when Result.fail(new ManagedEndpointProvisioningFailed({ cause: tunnel })) is returned, the error inside is discarded and the effect fails with Cause.NoSuchElementError instead. The subsequent mapError at line 237 then wraps that NoSuchElementError into ManagedEndpointProvisioningFailed, losing the original tunnel object that was intended for debugging. Use the two-argument overload of filterMapOrFail to preserve the tunnel as the error cause.
- Effect.filterMapOrFail((tunnel) =>
- tunnel.id && tunnel.name
- ? Result.succeed({ id: tunnel.id, name: tunnel.name })
- : Result.fail(new ManagedEndpointProvisioningFailed({ cause: tunnel })),
- ),
- Effect.mapError((cause) => new ManagedEndpointProvisioningFailed({ cause })),
+ Effect.filterMapOrFail(
+ (tunnel) =>
+ tunnel.id && tunnel.name
+ ? Result.succeed({ id: tunnel.id, name: tunnel.name })
+ : Result.fail(tunnel),
+ (failedTunnel) => new ManagedEndpointProvisioningFailed({ cause: failedTunnel }),
+ ),🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file infra/relay/src/environments/ManagedEndpointProvider.ts around lines 232-237:
The `Effect.filterMapOrFail` at line 232 uses the single-argument form, so when `Result.fail(new ManagedEndpointProvisioningFailed({ cause: tunnel }))` is returned, the error inside is discarded and the effect fails with `Cause.NoSuchElementError` instead. The subsequent `mapError` at line 237 then wraps that `NoSuchElementError` into `ManagedEndpointProvisioningFailed`, losing the original `tunnel` object that was intended for debugging. Use the two-argument overload of `filterMapOrFail` to preserve the `tunnel` as the error cause.
Evidence trail:
1. Effect.filterMapOrFail type signature and implementation: packages/effect/src/Effect.ts lines 5210-5227 and packages/effect/src/internal/effect.ts lines 4789-4810 in https://github.com/Effect-TS/effect-smol (the source for effect@4.0.0-beta.73)
2. Single-argument overload fails with NoSuchElementError: `() => fail(new NoSuchElementError() as E2)` at packages/effect/src/internal/effect.ts ~line 4810
3. Code under review: infra/relay/src/environments/ManagedEndpointProvider.ts lines 232-237 in https://github.com/pingdotgg/t3code at REVIEWED_COMMIT
4. Correct two-argument usage in same codebase: apps/server/src/provider/opencodeRuntime.ts lines 513-524 in https://github.com/pingdotgg/t3code at REVIEWED_COMMIT
5. effect version: package.json line 80 shows effect@4.0.0-beta.73
Co-authored-by: codex <codex@users.noreply.github.com>
refactor: remove cloudflare settings from RelayConfigurationShape and related tests test: clean up tests by removing cloudflare settings from configuration feat: implement ManagedEndpointDnsClient for DNS operations with Cloudflare refactor: simplify ManagedEndpointProvider by removing HTTP client dependencies fix: update ManagedEndpointProvider to use new DNS client for CNAME record management chore: adjust Api to use new DNS binding and remove deprecated cloudflare settings refactor: streamline ManagedEndpointZone creation by removing unnecessary token generation
Co-authored-by: codex <codex@users.noreply.github.com>
| export function updatePrimaryCloudPreferences(input: { | ||
| readonly publishAgentActivity: boolean; | ||
| }): Effect.Effect<CloudLinkState, CloudEnvironmentLinkError, HttpClient.HttpClient> { | ||
| return Effect.gen(function* () { | ||
| const client = yield* makeEnvironmentHttpApiClient(resolvePrimaryEnvironmentHttpUrl("/")); |
There was a problem hiding this comment.
🟡 Medium cloud/linkEnvironment.ts:408
updatePrimaryCloudPreferences calls resolvePrimaryEnvironmentHttpUrl without checking if a primary environment exists. When none is configured, resolvePrimaryEnvironmentHttpUrl throws a raw Error("Unable to resolve the primary environment HTTP base URL.") instead of the declared CloudEnvironmentLinkError. Callers in CloudSettings.tsx catch and handle CloudEnvironmentLinkError, so raw Error throws produce unhandled exception messages instead of proper user-facing errors.
- return Effect.gen(function* () {
- const client = yield* makeEnvironmentHttpApiClient(resolvePrimaryEnvironmentHttpUrl("/"));
+ return Effect.gen(function* () {
+ if (!readPrimaryCloudLinkTarget()) {
+ return yield* new CloudEnvironmentLinkError({
+ message: "Local environment is not ready yet.",
+ });
+ }
+ const client = yield* makeEnvironmentHttpApiClient(resolvePrimaryEnvironmentHttpUrl("/"));🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/web/src/cloud/linkEnvironment.ts around lines 408-412:
`updatePrimaryCloudPreferences` calls `resolvePrimaryEnvironmentHttpUrl` without checking if a primary environment exists. When none is configured, `resolvePrimaryEnvironmentHttpUrl` throws a raw `Error("Unable to resolve the primary environment HTTP base URL.")` instead of the declared `CloudEnvironmentLinkError`. Callers in `CloudSettings.tsx` catch and handle `CloudEnvironmentLinkError`, so raw `Error` throws produce unhandled exception messages instead of proper user-facing errors.
Evidence trail:
apps/web/src/cloud/linkEnvironment.ts lines 408-423 (updatePrimaryCloudPreferences — no guard before calling resolvePrimaryEnvironmentHttpUrl); apps/web/src/cloud/linkEnvironment.ts lines 394-397 (readPrimaryCloudLinkState — has the guard); apps/web/src/cloud/linkEnvironment.ts lines 429-433 (unlinkPrimaryEnvironmentFromCloud — has the guard yielding CloudEnvironmentLinkError); apps/web/src/environments/primary/target.ts lines 135-150 (resolvePrimaryEnvironmentHttpUrl — throws raw Error at line 141); apps/web/src/components/settings/CloudSettings.tsx lines 158-181 (caller using runPromise with generic catch); apps/web/src/components/settings/CloudSettings.tsx lines 49-58 (cloudErrorMessage — checks CloudSettingsOperationError then Error)
| desktopClerkFetchInstalled = true; | ||
| } | ||
|
|
||
| function loadDesktopClerkUi(publishableKey: string): Promise<DesktopClerkUiCtor> { |
There was a problem hiding this comment.
🟡 Medium cloud/desktopClerk.tsx:128
If existingScript is found in the DOM but has already finished loading (its load event already fired), the event listeners added on lines 167-176 will never fire. Since window.__internal_ClerkUICtor check at line 150 already returned false, the promise will hang until the 15-second timeout rejects it, even though the script element exists and has completed loading.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/web/src/cloud/desktopClerk.tsx around line 128:
If `existingScript` is found in the DOM but has already finished loading (its `load` event already fired), the event listeners added on lines 167-176 will never fire. Since `window.__internal_ClerkUICtor` check at line 150 already returned false, the promise will hang until the 15-second timeout rejects it, even though the script element exists and has completed loading.
Evidence trail:
apps/web/src/cloud/desktopClerk.tsx lines 128-187 (REVIEWED_COMMIT): loadDesktopClerkUi function. Line 129/150: checks window.__internal_ClerkUICtor. Line 138-140: queries for existing script. Line 154: reuses existing script if found. Lines 155-158: sets attributes on script (won't retrigger load per HTML spec for already-started scripts). Lines 167-176: adds load/error event listeners that won't fire on already-loaded scripts. Lines 164-166: 15-second timeout is the only fallback. Line 177: existing script is not re-appended. HTML spec: setting src on an already-inserted script element does not cause re-fetch or re-execution due to the 'already started' flag.
| return liveActivitySyncSemaphore.withPermits(1)(operation); | ||
| } | ||
|
|
||
| function syncAgentLiveActivities(): Effect.Effect<void, unknown> { |
There was a problem hiding this comment.
🟡 Medium agent-awareness/liveActivityController.ts:217
When activeActivity.missingSince is set at line 233, the function returns Effect.void without scheduling a timer to re-check after MISSING_STATE_GRACE_MS (30s). The timeout condition at line 239 only evaluates when syncAgentLiveActivities is invoked again by an external event. If no new snapshots arrive, the live activity remains visible indefinitely despite the grace period having elapsed.
🚀 Reply "fix it for me" or copy this AI Prompt for your agent:
In file apps/mobile/src/features/agent-awareness/liveActivityController.ts around line 217:
When `activeActivity.missingSince` is set at line 233, the function returns `Effect.void` without scheduling a timer to re-check after `MISSING_STATE_GRACE_MS` (30s). The timeout condition at line 239 only evaluates when `syncAgentLiveActivities` is invoked again by an external event. If no new snapshots arrive, the live activity remains visible indefinitely despite the grace period having elapsed.
Evidence trail:
apps/mobile/src/features/agent-awareness/liveActivityController.ts lines 217-246 (syncAgentLiveActivities function with missing timer), line 28 (MISSING_STATE_GRACE_MS = 30_000), lines 434-465 (scheduleRemoteTokenRegistrationRetry showing the correct pattern with Effect.sleep), lines 161-182 (syncAgentLiveActivitiesForSnapshot - the only external caller that triggers re-sync), apps/mobile/src/features/agent-awareness/shellLiveActivitySync.ts lines 28-42 (showing sync only happens on atom subscription changes, no polling).
Stack
codex/collection-performance-refactors->main(merged)t3code/mobile-remote-connect->maincodex/relay-managed-tunnels-auth-infra->t3code/mobile-remote-connectSummary
This stacked draft PR adds the relay-managed tunnel and cloud authentication work on top of the mobile remote-runtime PR. General collection/performance rewrites from #2854 and the TypeScript/Effect tooling base are now on
main.effect/Cryptowhile retaining Expo-compatible implementationsValidation
bun fmtbun lint(passes with 8 existing web warnings)bun lint:mobilebun typecheckcd infra/relay && bun run test(103passed,5skipped)cd apps/mobile && bun run test(135passed)cd apps/web && bun run test(1005passed)cd apps/server && bun run test(1075passed,4skipped)Rebase Note
General collection/performance rewrites from #2854 are now merged into
main; mobile command metadata, pairing-URL redaction, and shared-runtime Crypto cleanup remain in #2013. This PR retains the managed-relay changes to the mobile connection contract and runtime above those inherited lower-layer changes.