fix: recover mTLS auth after settings races#976
Conversation
4afb9d2 to
0e9e93b
Compare
0e9e93b to
0487124
Compare
| if (this.shouldRetryAfterAuthConfigChange(error)) { | ||
| return this.retryAfterAuthConfigChange(error); | ||
| } |
There was a problem hiding this comment.
These feel like they would be natural to be located within recoverFromUnauthorized, so we could see all the possible 401 recovery paths in one place.
That would let us avoid duplicating the _retryAttempted check as well.
| return false; | ||
| } | ||
|
|
||
| return this.client.hasAuthConfigChangedSince(config.authConfigVersion); |
There was a problem hiding this comment.
This makes sense but I wonder if it would be possible to compare the relevant settings instead? If any have changed, then we re-run the request.
Or I suppose the tracker could hash the values and use that as the version.
Hmm but although that might technically cover more cases, this is probably close enough, worst case is we run it once more with values that just failed (like if a user edits and then reverts the edit).
| return false; | ||
| } | ||
|
|
||
| const config = error.config as RequestConfigWithMeta & { |
There was a problem hiding this comment.
Not a blocker, but there must be a way we can convince Typescript these properties exist. I know in code-server we do something similar with Express to pass around things on the request
declare global {
// eslint-disable-next-line @typescript-eslint/no-namespace
namespace Express {
export interface Request {
args: DefaultedArgs
heart: Heart
settings: SettingsProvider<CoderSettings>
updater: UpdateProvider
cookieSessionName: string
}
}
}| this.logger.debug( | ||
| "Authentication settings changed after session suspended, recovering", | ||
| ); | ||
| await this.setDeploymentIfValid( |
There was a problem hiding this comment.
I wonder if we need some renaming because just looking at this function it kinda looks like we get the current deployment, then set the deployment to the current deployment, which looks like it should be a no-op.
I know as a side effect this might un-suspend the deployment, but just hard to tell from here.
| for (const key of configWithAuthMeta.headerCommandKeys ?? []) { | ||
| config.headers.delete(key); | ||
| } |
There was a problem hiding this comment.
To make sure I understand what is happening, since this could be a retry, we are deleting headers from the previous run of the header command? Since the next run may not output the same headers.
I think this property could use a bit of documentation.
| // Token from other window was invalid, ignore and keep waiting | ||
| // (or user can click Login/Cancel in the dialog) | ||
| } finally { | ||
| client.dispose(); |
There was a problem hiding this comment.
Ooo good catch there are probably at least a few clients that we were not disposing.
I wish we had Go's defer
| * reconnect, reload prompts). Keeps rapid edits in settings.json from | ||
| * flushing each side-effect per keystroke. | ||
| */ | ||
| export const CONFIG_CHANGE_DEBOUNCE_MS = 250; |
There was a problem hiding this comment.
kinda wild to me that VS Code is not already debouncing setting updates 💀
There was a problem hiding this comment.
They do but it's too short, also Netflix manually modifies the settings one by one so this coalesces them into one batch
| * opens a fixed collection window; subsequent events during the window are | ||
| * coalesced. This bounds latency even when events arrive faster than the | ||
| * window length (a reset-style debounce would starve). |
There was a problem hiding this comment.
As I understand it, a debounce means every time the function is called, it resets the timer.
So this is not actually a debounce? A throttle then? It fires at most once per debounceMs (the window)?
What does it mean to bind latency? Is this just saying we use a throttle to avoid the case where a user keeps editing settings within 250ms over and over, keeping the function from running? Is this really a problem? If the user keeps editing, we really probably do want to avoid using the now-stale values.
| if (windowTimer) { | ||
| return; // already collecting in the open window | ||
| } | ||
| windowTimer = setTimeout(() => { |
There was a problem hiding this comment.
yeah this looks like a throttle
I do feel like a debounce would make more sense though
There was a problem hiding this comment.
I had it at denounce first but I worried about denouncing for a long time since this can be triggered by editing the file OR the setting through the UI.
I can bring back the denounce but should we limit this to some max time or am I being overly cautious? Do keep in mind that this is used for the CoderApi/Remote settings watch/deployment watch... Etc
Perhaps denounce but reduce it from 250ms?
coder.headerCommand/coder.tls*changes settleCloses #973