fix: resolve refresh token race condition causing false invalid_grant errors
When multiple goroutines/workers concurrently refresh the same OAuth token,
the first succeeds but invalidates the old refresh_token (rotation). Subsequent
attempts using the stale token get invalid_grant, which was incorrectly treated
as non-retryable, permanently marking the account as ERROR.
Three complementary fixes:
1. Race-aware recovery: after invalid_grant, re-read DB to check if another
worker already refreshed (refresh_token changed) — return success instead
of error
2. In-process mutex (sync.Map of per-account locks): prevents concurrent
refreshes within the same process, complementing the Redis distributed lock
3. Increase default lock TTL from 30s to 60s to reduce TTL-expiry races
Co-Authored-By:
Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Please register or sign in to comment