perf(viewer): cache the parsed SSR bundle per isolate #254

Merged
jamesdabbs merged 1 commit into main from perf/viewer-ssr-isolate-cache on Jun 28, 2026

@jamesdabbs

Member

What & why

A warm Cloudflare Worker isolate serves many SSR requests but re-downloaded and re-parsed the entire data bundle on every one. This caches the parsed + transformed bundle in isolate (module) scope, keyed by host/branch, and revalidates it against the S3 ETag — so an unchanged source costs a conditional request that 304s and reuses the existing transform instead of re-parsing the whole dataset.

The cache is bounded to the 2 most-recently-used sources (a site serves a single host/branch, so in practice one entry), so parsed bundles can't accumulate without bound in a long-lived isolate. SSR only: the client already persists to localStorage.
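
The approach described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `ParsedBundle`, `parseBundle`, `reconcile`, `syncBundle`, and the JSON body format are all hypothetical stand-ins, and the real viewer may structure this differently. It assumes the source returns an `ETag` header and honors `If-None-Match`.

```typescript
// All names here are illustrative assumptions, not the viewer's real API.
interface ParsedBundle {
  spaces: number; // stand-in for the real transformed dataset
}

interface CacheEntry {
  etag: string;
  bundle: ParsedBundle;
}

interface SyncResult {
  bundle: ParsedBundle;
  changed: boolean; // the source differed from what this isolate held
  cached: boolean; // the existing parse was reused
}

const MAX_ENTRIES = 2;

// Module scope: lives as long as the isolate, across many requests.
// A Map iterates in insertion order, so its first key is the least recently used.
const cache = new Map<string, CacheEntry>();

// Move an entry to the most-recently-used position and evict any overflow.
function touch(key: string, entry: CacheEntry): void {
  cache.delete(key);
  cache.set(key, entry);
  while (cache.size > MAX_ENTRIES) {
    const oldest = cache.keys().next().value as string;
    cache.delete(oldest);
  }
}

// Placeholder for the expensive parse + transform step.
function parseBundle(body: string): ParsedBundle {
  return JSON.parse(body) as ParsedBundle;
}

// Decide between reusing the cached parse (304) and re-parsing (200).
function reconcile(
  key: string,
  status: number,
  etag: string | null,
  body: string | null,
): SyncResult {
  const hit = cache.get(key);
  if (status === 304 && hit) {
    touch(key, hit);
    return { bundle: hit.bundle, changed: false, cached: true };
  }
  if (body === null) {
    throw new Error(`no body and no cached entry for ${key} (status ${status})`);
  }
  const entry: CacheEntry = { etag: etag ?? "", bundle: parseBundle(body) };
  touch(key, entry);
  return { bundle: entry.bundle, changed: true, cached: false };
}

// Conditional fetch: send the stored ETag so an unchanged source answers 304.
async function syncBundle(host: string, branch: string, url: string): Promise<SyncResult> {
  const key = `${host}/${branch}`;
  const hit = cache.get(key);
  const headers: Record<string, string> = {};
  if (hit && hit.etag !== "") headers["If-None-Match"] = hit.etag;
  const res = await fetch(url, { headers });
  const body = res.status === 304 ? null : await res.text();
  return reconcile(key, res.status, res.headers.get("ETag"), body);
}
```

Keeping the reuse-or-reparse decision in a synchronous `reconcile` step is a choice made for this sketch so the cache behavior can be exercised without a network; a third source evicts the least-recently-used entry, which matches the bound of two described above.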

First of a two-PR stack; the lazy-deduction PR is the larger CPU win and builds on this.

Also in here

Declares @cloudflare/workers-types as a viewer devDependency. app.d.ts (added with the telemetry in #253) references it for IncomingRequestCfProperties, but it was only present transitively, so tsc --noEmit couldn't resolve it. pnpm --filter viewer tsc is now clean.

Instrumentation

Adds a cached flag to the bundle_sync Workers Logs event so a warm-isolate cache hit (changed:false, cached:true) is distinguishable from a cold parse (changed:true, cached:false).
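
For illustration, the two flag combinations could be emitted and told apart along these lines. Only the `bundle_sync` event name and the `changed` / `cached` flags come from this PR; the `fetchMs` field name, the `logBundleSync` and `classify` helpers, and the use of a JSON `console.log` line are assumptions made for the sketch.

```typescript
// Hypothetical event shape: only `changed` and `cached` are taken from the PR text.
interface BundleSyncEvent {
  event: "bundle_sync";
  changed: boolean;
  cached: boolean;
  fetchMs: number; // assumed field name for the network fetch duration
}

type SyncKind = "warm_hit" | "cold_parse" | "other";

// Map the flag pair to the two cases the PR describes; anything else is "other".
function classify(e: Pick<BundleSyncEvent, "changed" | "cached">): SyncKind {
  if (!e.changed && e.cached) return "warm_hit";
  if (e.changed && !e.cached) return "cold_parse";
  return "other";
}

// Emit one structured log line per bundle sync and return the event.
function logBundleSync(changed: boolean, cached: boolean, fetchMs: number): BundleSyncEvent {
  const e: BundleSyncEvent = { event: "bundle_sync", changed, cached, fetchMs };
  console.log(JSON.stringify(e));
  return e;
}
```

A warm request would then log `changed:false, cached:true`, and `classify` gives a single label to group by when querying.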

Validation

  • pnpm --filter viewer tsc and validate (svelte-check) — clean
  • pnpm --filter viewer test — 70 passed, 5 todo
  • VITE_SITE=topology pnpm --filter viewer build

Manual deploy + telemetry check (topology worker pi-base-topology):

  1. VITE_SITE=topology pnpm --filter viewer build
  2. pnpm --filter viewer cf:deploy:topology (= wrangler deploy --env topology)
  3. Drive several SSR requests to a non-prerendered detail page (e.g. repeatedly GET a /spaces/<id>/properties/<id> URL) so at least one lands on a warm isolate.
  4. In Workers Logs / observability, query the bundle_sync event:
    • Expect: the first request on a cold isolate → changed:true, cached:false (one parse); subsequent warm requests → changed:false, cached:true (ETag 304, transform reused, no re-parse).
    • Cross-check the request event's cold flag and the platform cpuTimeMs/wallTimeMs (joined by $metadata.requestId): warm cached:true requests should carry lower bundle-attributable cost.

Risk / tradeoff

A stale parse can only persist within a single isolate and is bounded by ETag revalidation — any source change returns a 200 and re-parses. Memory is bounded to ≤2 parsed bundles per isolate.

A warm Cloudflare Worker isolate serves many SSR requests but re-downloaded
and re-parsed the whole data bundle on every one. Cache the parsed +
transformed bundle in isolate (module) scope, keyed by host/branch, and
revalidate it against the S3 ETag: an unchanged source now costs a
conditional request that 304s and reuses the existing transform instead of
re-parsing the entire dataset.

The cache is bounded to the few most-recently-used sources (a site serves a
single host/branch, so in practice one entry) so a parsed bundle can't
accumulate unbounded in a long-lived isolate. SSR only — the client store
already persists to localStorage.

Instrumentation: add a `cached` flag to the `bundle_sync` Workers Logs event
so warm-isolate cache hits (changed:false, cached:true) can be told apart
from cold parses (changed:true) when validating against telemetry.

Also declare `@cloudflare/workers-types` as a viewer devDependency: app.d.ts
(added with the telemetry in #253) references it for `IncomingRequestCfProperties`,
but it was only present transitively, so `tsc --noEmit` failed to resolve it.
@cloudflare-workers-and-pages

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! View logs | pi-base-topology | 461b208 | Commit Preview URL, Branch Preview URL | Jun 28 2026, 04:03 PM |

@cloudflare-workers-and-pages

Deploying topology with Cloudflare Pages

Latest commit: 461b208
Status: ✅  Deploy successful!
Preview URL: https://fe84dc95.topology.pages.dev
Branch Preview URL: https://perf-viewer-ssr-isolate-cach.topology.pages.dev

View logs

@jamesdabbs jamesdabbs self-assigned this Jun 28, 2026
@jamesdabbs

Member Author

Telemetry validation — deployed to pi-base-topology, ~3h soak

Deployed this branch to the topology worker (version f0404709) and let it soak under live traffic (≈99% AI-crawler walk of the space×property matrix). Compared a 3h post-deploy window (16:30–19:30 UTC) against a 3h pre-deploy baseline (13:00–16:00 UTC, main).

✅ The cache works (82.8% warm-hit rate)

bundle_sync events over the soak:

| | count | network fetch avg / p90 |
| --- | --- | --- |
| cached=true (warm isolate, 304 revalidate, no re-parse) | 929 (82.8%) | 112 ms / 137 ms |
| cached=false (cold isolate, full 200 + parse) | 193 (17.2%) | 280 ms / 560 ms |

cached=false coincides with changed=true (cold-isolate first parses); every warm request after that reuses the parse. The warm path also cuts the bundle network fetch ~2.5× (conditional 304 vs full download), on top of skipping the whole-bundle parse/transform CPU.

The hit rate is lower than the 95% I measured in a back-to-back burst right after deploy — realistic crawler traffic is bursty, so idle gaps evict isolates and the next burst re-parses once. ~83% is the steady-state number; it should drift up over a longer soak.
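
The headline figures follow directly from the table; a small sketch of the arithmetic, using only the counts and averages reported above (the `hitRate` helper is illustrative):

```typescript
// Share of bundle syncs that reused the cached parse.
function hitRate(warm: number, cold: number): number {
  const total = warm + cold;
  return total === 0 ? 0 : warm / total;
}

const warmCount = 929; // cached=true events in the soak window
const coldCount = 193; // cached=false events in the soak window
const warmShare = hitRate(warmCount, coldCount);

// Ratio of average network fetch time, cold (280 ms) over warm (112 ms).
const fetchSpeedup = 280 / 112;

console.log(`${(warmShare * 100).toFixed(1)}% warm`); // 82.8% warm
console.log(`${fetchSpeedup.toFixed(1)}x faster fetch`); // 2.5x faster fetch
```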

✅ No CPU regression — and a clean baseline for #251

cpuTimeMs on the heavy route /spaces/[id]/properties/[propertyId], pre vs post:

| percentile | pre (main) | post (#254) |
| --- | --- | --- |
| p50 | 956 ms | 889 ms |
| p90 | 2843 ms | 2621 ms |
| p99 | 3799 ms | 3654 ms |

Flat within noise — expected, since this PR doesn't touch deduction. It confirms the cache didn't regress CPU and establishes the baseline the lazy-deduction PR (#251) should move: ~0.9 s median, ~3.7 s p99 of CPU per heavy render, dominated by the eager full-database prover.

⚠️ Cancellations (pre-existing, not caused by this PR)

The only non-ok outcomes are canceled (no exceptions/5xx on the new version), clustered at ~10 s wall time (p99 wallTimeMs ≈ 10,039 ms) — slow renders that hit the wall cap or that crawlers abandon.

| | canceled | total | rate |
| --- | --- | --- | --- |
| pre (main) | 129 | 4,028 | 3.2% |
| post (#254) | 506 | 7,597 | 6.7% |

The post window had ~1.9× the traffic (and heavier bursts). The cache can only reduce request latency (it speeds the bundle path; CPU is unchanged), so it can't be the cause of the higher cancellation count — this is the eager-deduction slowness under crawler bursts, which is exactly what #251 removes. Worth re-checking the cancellation rate after #251 deploys.
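
As a rough cross-check for the re-measurement after #251, the two windows can be compared on rate as well as raw count. The sketch below is plain arithmetic on the table's figures; the `TrafficWindow` type and `cancelRate` helper are illustrative, and the numbers say nothing on their own about cause.

```typescript
interface TrafficWindow {
  canceled: number;
  total: number;
}

const preWindow: TrafficWindow = { canceled: 129, total: 4028 }; // main
const postWindow: TrafficWindow = { canceled: 506, total: 7597 }; // #254

// Fraction of requests in the window that ended as canceled.
function cancelRate(w: TrafficWindow): number {
  return w.total === 0 ? 0 : w.canceled / w.total;
}

const trafficRatio = postWindow.total / preWindow.total; // how much more traffic
const countRatio = postWindow.canceled / preWindow.canceled; // how many more cancellations
const rateRatio = cancelRate(postWindow) / cancelRate(preWindow); // change in the rate itself

console.log(`pre ${(cancelRate(preWindow) * 100).toFixed(1)}%, post ${(cancelRate(postWindow) * 100).toFixed(1)}%`);
console.log(`traffic x${trafficRatio.toFixed(1)}, cancellations x${countRatio.toFixed(1)}, rate x${rateRatio.toFixed(1)}`);
```

Comparing the rate rather than the count removes the effect of traffic volume, though not of burst shape, so the same three ratios are a reasonable thing to recompute on a post-#251 window.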

Verdict: cache is doing its job (≈83% of bundle syncs skip the re-parse), no regression, and we have a clean CPU baseline. Good to proceed to #251.

@jamesdabbs jamesdabbs marked this pull request as ready for review June 28, 2026 20:24
@jamesdabbs jamesdabbs merged commit 7fb32e8 into main Jun 28, 2026
4 checks passed
@jamesdabbs jamesdabbs deleted the perf/viewer-ssr-isolate-cache branch June 28, 2026 20:24