fix(site): raise gateway-proxy rate limit so encrypted chat works #134
Merged
Regression from the rate-limiting in #111. One encrypted inference through /api/gw/* is not one request: it is ~5 setup calls (auth challenge + verify, session select, prepare, blob upload) PLUS a relay-token poll that fires up to 30 times (once a second until the worker is ready), and the playground retries up to 3 workers. The 30/min per-IP cap 429'd the token poll whenever the worker took more than ~25s (the common case - a full run is ~50s), so the relay token never resolved and the answer never streamed: the 'starts the session, returns nothing' chat bug. A direct-gateway SDK call (no proxy, no cap) returns a worker verdict in ~50s, which isolates the cap as the cause. Raise /api/gw/ to 600/min per IP (the upstream gateway has its own limits; this only stops gross abuse of the open proxy). DAO/operator-preview keep 30/min.
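The timing described above can be checked with a small simulation. This is only a sketch: it assumes a sliding 60-second window per IP (the actual limiter from #111 may count differently), and the function name and parameters are hypothetical. The inputs (5 setup calls, one poll per second, up to 30 polls) are taken from the description.

```typescript
// Sketch: when does a per-IP cap start rejecting the relay-token poll?
// Assumption: sliding 60 s window; the real limiter may use another scheme.

/** Returns the second at which the first request is rejected, or null if none is. */
function firstRejectedAt(
  capPerMinute: number,
  setupCalls: number,
  maxPolls: number,
): number | null {
  const accepted: number[] = []; // timestamps (in seconds) of accepted requests

  const tryRequest = (t: number): boolean => {
    // Count accepted requests inside the trailing 60 s window.
    const inWindow = accepted.filter((ts) => ts > t - 60).length;
    if (inWindow >= capPerMinute) return false; // would be answered with 429
    accepted.push(t);
    return true;
  };

  // Setup calls all land at roughly t = 0.
  for (let i = 0; i < setupCalls; i++) {
    if (!tryRequest(0)) return 0;
  }
  // Worst case: the poll fires once a second for the full maxPolls attempts.
  for (let poll = 1; poll <= maxPolls; poll++) {
    if (!tryRequest(poll)) return poll;
  }
  return null;
}

const oldCap = firstRejectedAt(30, 5, 30); // 26: the 31st request is rejected
const newCap = firstRejectedAt(600, 5, 30); // null: all 35 requests fit
console.log({ oldCap, newCap });
```

Under these assumptions the old cap rejects the poll at second 26, in line with the ~25s threshold, while the 600/min cap never rejects a single inference's ~35 calls.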
The bug
On lightnode.app the encrypted chat/playground 'starts creating the session and returns nothing.'
Root cause (a regression I introduced in #111)
One encrypted inference through /api/gw/* is not one request: it is ~5 setup calls (auth challenge + verify, session select, prepare, blob upload) plus a relay-token poll that fires up to 30 times (once a second until the worker is ready), and the playground retries up to 3 workers.
The rate limiting in #111 capped /api/gw/ at 30/min per IP, so as soon as the worker takes more than ~25s (a full run is ~50s) the token poll gets a 429, the relay token never resolves, and the answer never streams.
Proof
I ran a LightChallenge-style judge prompt through lightnode-sdk 0.19.1 directly against the mainnet gateway (no proxy, no cap): a real worker returned a correct verdict in ~50s (PASS for a valid 10,234-step proof, {"passed": false} for a 4,201-step one). So the worker network and SDK are healthy; the only thing breaking the browser path was my proxy cap.
Fix
Raise /api/gw/ to 600/min per IP (one inference is ~35 calls; this leaves room for retries and a few inferences while still bounding abuse of the open proxy; the upstream LightChain gateway enforces its own limits). DAO/operator-preview keep 30/min.
Tests updated; tsc clean.
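For readers unfamiliar with per-route limits, the shape of the fix might look roughly like the sketch below. It is not the code in this PR: the rule table, the fixed-window counter, the `check` helper and the example paths are all hypothetical, and the fallback rule merely stands in for the DAO/operator-preview routes, whose real paths are not shown here. Only the numbers 600 and 30 come from the description.

```typescript
// Hypothetical sketch of per-prefix, per-IP limits (fixed 60 s window).
type Rule = { prefix: string; perMinute: number };

const RULES: Rule[] = [
  // Gateway proxy: one inference is ~35 calls, so the cap must sit well above that.
  { prefix: "/api/gw/", perMinute: 600 },
];
// Assumption: every other limited route keeps the 30/min cap.
const FALLBACK: Rule = { prefix: "*", perMinute: 30 };

const counters = new Map<string, { windowStart: number; count: number }>();

function ruleFor(path: string): Rule {
  return RULES.find((r) => path.startsWith(r.prefix)) ?? FALLBACK;
}

/** Returns 200 if the request is allowed, 429 if the IP exceeded its budget. */
function check(ip: string, path: string, nowMs: number): 200 | 429 {
  const rule = ruleFor(path);
  const key = `${ip}|${rule.prefix}`; // separate budget per IP and rule
  const entry = counters.get(key);
  if (!entry || nowMs - entry.windowStart >= 60000) {
    counters.set(key, { windowStart: nowMs, count: 1 }); // start a new window
    return 200;
  }
  if (entry.count >= rule.perMinute) return 429;
  entry.count += 1;
  return 200;
}

// Usage: 105 proxy calls (3 workers x ~35 calls, assuming each retry repeats
// the full sequence) fit in one window; the 31st call to another route does not.
let proxyRejected = 0;
for (let i = 0; i < 105; i++) {
  if (check("203.0.113.7", "/api/gw/example", 0) === 429) proxyRejected++;
}
let otherRejected = 0;
for (let i = 0; i < 31; i++) {
  if (check("203.0.113.7", "/api/other/example", 0) === 429) otherRejected++;
}
console.log({ proxyRejected, otherRejected });
```

Keying the counter by both IP and rule keeps a busy chat session from consuming the budget of the stricter routes.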