- 08 Feb, 2026 3 commits
-
-
Wesley Liddick authored
feat(antigravity): comprehensive enhancements — rate limiting, scheduling & smart retry
-
shaw authored
Standardize filter bar layout across admin pages to place search/filters on left and action buttons on right within the same row, improving visual consistency and space utilization.
-
shaw authored
-
- 07 Feb, 2026 34 commits
-
-
erio authored
- Change antigravitySmartRetryMaxAttempts from 3 to 1 to prevent repeated rate limiting and long waits
- Clear sticky session binding (DeleteSessionAccountID) after smart retry exhaustion, so subsequent requests don't hit the same rate-limited account
- Add flow diagrams to Forward/ForwardGemini doc comments
- Add comprehensive unit tests covering:
  - Sticky session cleared on retry failure (429, 503, network error)
  - Sticky session NOT cleared on retry success
  - Sticky session NOT cleared for non-sticky requests (empty hash)
  - Sticky session NOT cleared on long delay path (handled by handler)
  - Nil cache safety (no panic)
  - MaxAttempts constant verification
  - End-to-end retryLoop → switchError propagation with session clear
-
shaw authored
-
shaw authored
-
erio authored
-
erio authored
-
erio authored
The digest chain fallback is only needed for Gemini endpoints, not for the Anthropic Messages API path. Remove the handler integration while keeping the reusable service/repository layer for future use.
-
erio authored
The previous fallback (step 3) in GenerateSessionHash hashed system + all messages together, producing a different hash each round as the conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky sessions ineffective for multi-turn conversations. Implement per-message Trie digest chain matching (reusing Gemini's Trie infrastructure) so that the previous round's chain is always a prefix of the current round's chain, enabling reliable session affinity.
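To illustrate why a per-message digest chain keeps the previous round as a prefix of the current one, here is a minimal in-memory sketch. It is not the repository's implementation: the commit reuses Gemini's Redis-backed Trie, whereas `digestChain`, `trieNode`, `insert`, and `longestPrefix` below are hypothetical names, and hashing each message independently with a truncated SHA-256 is an assumption.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestChain returns one short digest per message, in order. Because each
// message is hashed on its own, appending messages only extends the chain.
func digestChain(messages []string) []string {
	chain := make([]string, 0, len(messages))
	for _, m := range messages {
		sum := sha256.Sum256([]byte(m))
		chain = append(chain, hex.EncodeToString(sum[:4]))
	}
	return chain
}

// trieNode stores an optional session ID at the end of an inserted chain.
type trieNode struct {
	children map[string]*trieNode
	session  string
}

func newNode() *trieNode { return &trieNode{children: map[string]*trieNode{}} }

// insert records session at the node reached by walking chain.
func (n *trieNode) insert(chain []string, session string) {
	cur := n
	for _, d := range chain {
		next, ok := cur.children[d]
		if !ok {
			next = newNode()
			cur.children[d] = next
		}
		cur = next
	}
	cur.session = session
}

// longestPrefix walks chain and returns the session stored at the deepest
// matching node, in time proportional to the chain length.
func (n *trieNode) longestPrefix(chain []string) (string, bool) {
	cur, found, ok := n, "", false
	for _, d := range chain {
		next, exists := cur.children[d]
		if !exists {
			break
		}
		cur = next
		if cur.session != "" {
			found, ok = cur.session, true
		}
	}
	return found, ok
}

func main() {
	root := newNode()
	// Round 1: system prompt plus the first message.
	root.insert(digestChain([]string{"system", "a"}), "session-1")
	// Round 2: the conversation grew, yet round 1 is still a prefix.
	session, ok := root.longestPrefix(digestChain([]string{"system", "a", "b"}))
	fmt.Println(session, ok)
}
```

Hashing the whole conversation at once, as the old fallback did, changes the key every round. Hashing per message and matching on the longest stored prefix is what lets a later round find the session recorded by an earlier one.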
-
shaw authored
-
shaw authored
- avoid panic by using safe UUID prefix truncation in Gemini digest fallback logs
- remove unconditional Antigravity 429 full-body debug logs and honor log truncation config
- align Antigravity quick preset mappings to opus 4.6-thinking targets only
- restore scope rate-limit aggregation/output in ops availability stats
-
erio authored
Remove threshold-based waiting in both sticky session and antigravity pre-check paths. When a model is rate-limited, immediately clear the sticky session and switch accounts instead of waiting for short durations.
-
Wesley Liddick authored
feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops
-
erio authored
-
erio authored
1. Frontend: replace hardcoded antigravityDefaultMappings with async fetch from GET /admin/accounts/antigravity/default-model-mapping, eliminating the duplicate data source that caused frontend/backend mapping inconsistency.
2. Backend: convert handleSmartRetry and antigravityRetryLoop from standalone functions to AntigravityGatewayService methods, enabling Redis cache sync (updateAccountModelRateLimitInCache) after both rate-limit write paths: the long-delay branch and the retry-exhausted branch.
-
shaw authored
-
erio authored
Remove extra blank line that caused golangci-lint gofmt check to fail.
-
erio authored
-
erio authored
- Add GetAccessToken upstream branch tests (success/failure/empty/nil)
- Add mapAntigravityModel wildcard-target-equals-request edge case tests
- Add upstream account smart retry test case
- Add GeminiMessagesCompatService custom model_mapping and empty model tests
-
erio authored
- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases
-
erio authored
-
erio authored
-
erio authored
Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification
-
Wesley Liddick authored
feat(frontend): show seconds in rate limit time display
-
Wesley Liddick authored
fix(antigravity): reduce 429 fallback cooldown from 5min to 30s
-
Wesley Liddick authored
fix(antigravity): auto-fix max_tokens <= budget_tokens causing 400 error
-
Wesley Liddick authored
chore: add .gitattributes to enforce LF line endings
-
erio authored
Change formatTime() to include seconds (HH:MM:SS) instead of only hours and minutes (HH:MM). This gives users more precise information about when rate limits will reset.
-
erio authored
The default fallback cooldown when rate limit reset time cannot be parsed was 5 minutes, which is too aggressive and causes accounts to be unnecessarily locked out. Reduce to 30 seconds for faster recovery. Config override still works (unit remains minutes).
-
erio authored
When extended thinking is enabled, Claude API requires max_tokens > thinking.budget_tokens. If misconfigured, this auto-adjusts max_tokens to budget_tokens + 1000 instead of returning a 400 error.
- Add ensureMaxTokensGreaterThanBudget helper function
- Extract Gemini25FlashThinkingBudgetLimit constant (24576)
- Log adjustment for debugging
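The adjustment rule can be sketched as below. The helper name `ensureMaxTokensGreaterThanBudget` and the +1000 margin come from the commit message; the signature, the boolean return value, and the constant name `maxTokensBudgetPadding` are assumptions made for illustration.

```go
package main

import "fmt"

// Margin taken from the commit message (budget_tokens + 1000).
// The constant name itself is hypothetical.
const maxTokensBudgetPadding = 1000

// ensureMaxTokensGreaterThanBudget returns a max_tokens value strictly greater
// than budgetTokens, plus a flag reporting whether an adjustment was made so
// that the caller can log it.
func ensureMaxTokensGreaterThanBudget(maxTokens, budgetTokens int) (int, bool) {
	if budgetTokens > 0 && maxTokens <= budgetTokens {
		return budgetTokens + maxTokensBudgetPadding, true
	}
	return maxTokens, false
}

func main() {
	adjusted, changed := ensureMaxTokensGreaterThanBudget(4096, 8192)
	fmt.Println(adjusted, changed)
}
```

With max_tokens of 4096 and a budget of 8192, the sketch raises max_tokens to 9192. A request whose max_tokens already exceeds the budget passes through unchanged.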
-
erio authored
Ensures consistent line endings for SQL migration files, Go source, shell scripts, YAML configs, and Dockerfiles. Fixes checksum mismatches on Windows where CRLF line endings cause migration hash differences.
-
shaw authored
- OAuth accounts: use the full DefaultBetaHeader and the Claude Code client headers
- API Key accounts: use APIKeyBetaHeader (without the oauth beta)
-
shaw authored
-
shaw authored
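- Merge the filters and action buttons into a single row
- Filters on the left, action buttons on the right
- Add responsive support: wrap automatically on narrow screens and shorten button text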
-
shaw authored
-
shaw authored
-
- 06 Feb, 2026 3 commits
-
-
shaw authored
The Token usage trend chart on the user Dashboard now shows three types (Input/Output/Cache) and displays the Actual and Standard prices in the Tooltip, consistent with the admin page.
-
shaw authored
Change /api/health to /health to match the route actually registered by the backend.
-
shaw authored
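Add a whitelist to sensitive-field detection that excludes API parameters and usage-statistics fields:
- max_tokens, max_completion_tokens, max_output_tokens
- completion_tokens, prompt_tokens, total_tokens
- input_tokens, output_tokens
- cache_creation_input_tokens, cache_read_input_tokens
Although these field names contain "token", they are only numeric parameters and should not be redacted.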
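A minimal sketch of a whitelist check placed in front of a keyword-based detector. The whitelisted field names are the ones listed in the commit message. The detector itself is an assumption: it is reduced here to a substring match on "token", and `tokenFieldWhitelist` and `isSensitiveField` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// Field names listed in the commit message: numeric API parameters and usage
// counters that contain "token" but carry no secret.
var tokenFieldWhitelist = map[string]struct{}{
	"max_tokens":                  {},
	"max_completion_tokens":       {},
	"max_output_tokens":           {},
	"completion_tokens":           {},
	"prompt_tokens":               {},
	"total_tokens":                {},
	"input_tokens":                {},
	"output_tokens":               {},
	"cache_creation_input_tokens": {},
	"cache_read_input_tokens":     {},
}

// isSensitiveField reports whether a field should be redacted. Whitelisted
// names are exempt. Everything else falls through to a simplified keyword
// check that stands in for the real detector.
func isSensitiveField(name string) bool {
	key := strings.ToLower(name)
	if _, ok := tokenFieldWhitelist[key]; ok {
		return false
	}
	return strings.Contains(key, "token")
}

func main() {
	for _, field := range []string{"access_token", "max_tokens", "model"} {
		fmt.Printf("%s sensitive=%v\n", field, isSensitiveField(field))
	}
}
```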
-