Skip to content

feat(server): redirect blob GETs to presigned object-store URLs#6

Open
tonicmuroq wants to merge 2 commits into
mainfrom
feat/blob-redirect-presigned-url
Open

feat(server): redirect blob GETs to presigned object-store URLs#6
tonicmuroq wants to merge 2 commits into
mainfrom
feat/blob-redirect-presigned-url

Conversation

@tonicmuroq

@tonicmuroq tonicmuroq commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

Problem

v2GetBlob proxies every blob byte through the epoch pod: it reads from the object store (GCS) and io.Copys the body back to the client, paying TLS on both sides for multi-GiB VM disk/memory blobs. Consequences:

  • A single pull stream tops out around ~20 MB/s even with 3 CPUs → a 10.5 GB snapshot takes ~9 min.
  • Concurrent transfers saturate the single pod and have tipped over the shared ingress controller (liveness-kill → connection refused).

Change

When EPOCH_BLOB_REDIRECT is set, v2GetBlob responds with a 307 to a presigned object-store URL (minio-go PresignedGetObject) instead of proxying. Blob bytes then flow client ↔ storage directly and epoch leaves the data path entirely. GCS honors the SigV4-signed URL and serves Range requests natively.

  • No client changes. Any redirect-following OCI client — registryclient/vk-cocoon, oras, crane, docker — benefits transparently. CopyBlobExact still verifies the digest end to end. Go strips Authorization on the cross-host redirect, so the registry token is never leaked to GCS.
  • Scoped to /v2/.../blobs/{digest} GET only. /dl/ (server-side reassembly of multi-part cloud images) is untouched — it can't be a simple redirect and still proxies.

Benefit

Single-blob throughput from a same-region GCE VM:

path single-stream 10.5 GB pull
proxy (1 CPU, before) 3.4 MB/s ~51 min
proxy (3 CPU) 20 MB/s ~9 min
presigned direct-to-GCS 275 MB/s ~38 s

~13× over the (already bumped) proxy, zero client changes. 8-connection intra-blob Range parallelism measured 1.28 GB/s — but that needs a client-side ranged downloader and is intentionally out of scope here.

Verified live (deployed to cocoonstack-us, image redirect-stream-20260618)

  • GET /v2/.../blobs/<digest> now returns 307…storage.googleapis.com in ~30 ms (server out of the byte path); following it serves 200/206 straight from GCS.
  • A real cocoon snapshot pull (win11, ~9.6 GiB) ran with every layer GET as a 307; end-to-end ~2 min, of which the GCS download is only ~35 s — the remainder is cocoon snapshot import writing to local disk (network is no longer the bottleneck).
  • Server idle at ~1m CPU / ~8 MiB during pulls (bytes bypass it).

Safety / rollout

  • Flag defaults off (EPOCH_BLOB_REDIRECT unset). The deploy manifest turns it on.
  • On any server-side failure (existence check or presign error) the handler falls back to streaming.
  • No fallback once the 307 is sent — every client hitting /v2/blobs must reach the object-store host (storage.googleapis.com). Verified reachable from US + SG cocoonset nodes before enabling. Confirm node egress before flipping the flag in a new environment.
  • Push is unaffected by this PR (see feat(server): stream monolithic blob PUT straight to object store #7); epoch-server kept at 3 CPU / 4 Gi for the push proxy + redirect fallback paths.

Test

  • resolveBlobRedirectTTL unit test (default / parse / invalid / non-positive).
  • Validated end to end on staging + live: presigned GCS URL returns 206 for ranged GETs; deployed and confirmed 307→GCS on cocoonstack-us.

v2GetBlob proxies every blob byte through the epoch process: it reads
from the object store and io.Copy's the body back to the client, paying
TLS on both sides for multi-GiB VM disk/memory blobs. A single stream
tops out at ~20 MB/s even with 3 CPUs, so a 10.5 GB snapshot pull takes
~9 min, and concurrent transfers saturate the pod.

When EPOCH_BLOB_REDIRECT is set, v2GetBlob now responds with a 307 to a
presigned object-store URL (minio-go PresignedGetObject), so blob bytes
flow client<->storage directly and epoch leaves the data path entirely.
GCS honors the SigV4-signed URL and serves Range requests natively.

Measured on a same-region GCE VM pulling the 5.7 GB memory-ranges blob:

  path                         single-stream    10.5 GB pull
  proxy (1 CPU, before)        3.4 MB/s         ~51 min
  proxy (3 CPU)                20 MB/s          ~9 min
  presigned direct-to-GCS      275 MB/s         ~38 s

~13x over the proxy with zero client changes — any redirect-following
OCI client (registryclient/vk-cocoon, oras, crane, docker) benefits
transparently, and CopyBlobExact still verifies the digest end to end.
Intra-blob Range parallelism (measured 1.28 GB/s at 8 connections) would
need a client-side downloader and is out of scope here.

The flag defaults off. On any server-side failure (existence check or
presign) the handler falls back to streaming. Note there is no fallback
once the 307 is sent: every client hitting /v2/blobs must have egress to
the object-store host when the flag is enabled.
Turn on EPOCH_BLOB_REDIRECT so pulls bypass the proxy. Bump resources
from 1 CPU / 512Mi to 3 CPU / 4Gi: pushes still stream through epoch
(this PR only redirects GETs) and the redirect fallback path also
proxies, so the pod still needs headroom for multi-GiB transfers.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant