- 08 Feb, 2026 3 commits
-
-
erio authored
ForwardUpstream/ForwardUpstreamGemini should pipe the upstream response directly to the client (headers + body), not parse it as SSE stream.
-
erio authored
-
erio authored
Replace manual header setting (Content-Type, anthropic-version, anthropic-beta) with full client header passthrough in ForwardUpstream/ForwardUpstreamGemini. Only authentication headers (Authorization, x-api-key) are overridden with upstream account credentials. Hop-by-hop headers are excluded. Add unit tests covering header passthrough, auth override, and hop-by-hop filtering.
-
- 07 Feb, 2026 17 commits
-
-
erio authored
v0.1.74 merged upstream accounts into the OAuth path, causing requests to hit the wrong protocol and endpoint. Add three upstream-specific methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini) that use base_url + apiKey auth and passthrough the original body, while reusing the existing response handling and error/retry logic.
-
shaw authored
-
shaw authored
- avoid panic by using safe UUID prefix truncation in Gemini digest fallback logs\n- remove unconditional Antigravity 429 full-body debug logs and honor log truncation config\n- align Antigravity quick preset mappings to opus 4.6-thinking targets only\n- restore scope rate-limit aggregation/output in ops availability stats
-
erio authored
-
erio authored
1. Frontend: replace hardcoded antigravityDefaultMappings with async fetch from GET /admin/accounts/antigravity/default-model-mapping, eliminating the duplicate data source that caused frontend/backend mapping inconsistency. 2. Backend: convert handleSmartRetry and antigravityRetryLoop from standalone functions to AntigravityGatewayService methods, enabling Redis cache sync (updateAccountModelRateLimitInCache) after both rate-limit write paths — long-delay branch and retry-exhausted branch.
-
erio authored
Remove extra blank line that caused golangci-lint gofmt check to fail.
-
erio authored
-
erio authored
- Add GetAccessToken upstream branch tests (success/failure/empty/nil) - Add mapAntigravityModel wildcard-target-equals-request edge case tests - Add upstream account smart retry test case - Add GeminiMessagesCompatService custom model_mapping and empty model tests
-
erio authored
- GetAccessToken: add upstream branch to read api_key from credentials - shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based - isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported whitelist with mapAntigravityModel for unified scheduling/forwarding logic - mapAntigravityModel: fix edge case where wildcard target equals request model - Update tests for new behavior and add custom model_mapping test cases
-
erio authored
-
erio authored
-
erio authored
Key changes: - Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching - Unified rate limiting: scope-level → model-level with Redis snapshot sync - Load-balanced scheduling by call count with smart retry mechanism - Force cache billing support - Model identity injection in prompts with leak prevention - Thinking mode auto-handling (max_tokens/budget_tokens fix) - Frontend: whitelist mode toggle, model mapping validation, status indicators - Gemini session fallback with Redis Trie O(L) matching - Ops: enhanced concurrency monitoring, account availability, retry logic - Migration scripts: 049-051 for model mapping unification
-
erio authored
The default fallback cooldown when rate limit reset time cannot be parsed was 5 minutes, which is too aggressive and causes accounts to be unnecessarily locked out. Reduce to 30 seconds for faster recovery. Config override still works (unit remains minutes).
-
erio authored
When extended thinking is enabled, Claude API requires max_tokens > thinking.budget_tokens. If misconfigured, this auto-adjusts max_tokens to budget_tokens + 1000 instead of returning a 400 error. - Add ensureMaxTokensGreaterThanBudget helper function - Extract Gemini25FlashThinkingBudgetLimit constant (24576) - Log adjustment for debugging
-
shaw authored
- OAuth 账号:使用完整的 DefaultBetaHeader 和 Claude Code 客户端 headers - API Key 账号:使用 APIKeyBetaHeader(不含 oauth beta)
-
shaw authored
-
shaw authored
-
- 06 Feb, 2026 6 commits
-
-
shaw authored
在敏感字段检测中添加白名单,排除 API 参数和用量统计字段: - max_tokens, max_completion_tokens, max_output_tokens - completion_tokens, prompt_tokens, total_tokens - input_tokens, output_tokens - cache_creation_input_tokens, cache_read_input_tokens 这些字段名虽然包含 "token" 但只是数值参数,不应被脱敏处理。
-
shaw authored
移除响应阶段的工具名/schema/description 转换逻辑,修复第三方工具调用时 工具名被错误转换的问题(如 Task → task)。 移除内容: - 工具名相关正则变量(toolPrefixRe, toolNameBoundaryRe 等) - openCodeToolOverrides 和 claudeToolNameOverrides 映射表 - 工具名转换函数(normalizeToolNameForClaude, normalizeToolNameForOpenCode 等) - 响应体工具名替换函数(replaceToolNamesInText, replaceToolNamesInResponseBody 等) - 参数名转换函数(normalizeParamNameForOpenCode, rewriteParamKeysInValue) - 工具描述清理函数(sanitizeToolDescription) - 输入 schema 转换函数(normalizeToolInputSchema) - 模型 ID 正则替换函数(replaceModelIDInText) 保留内容: - 系统提示词清理(sanitizeSystemText) - Claude Code 指纹 headers 处理 - 模型 ID 映射(通过 JSON 对象操作)
-
yangjianbo authored
-
yangjianbo authored
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段, 而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且 内部计费缓存折扣为 0。 新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且 cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和 非流式两种响应路径。对 Claude 原生上游无影响。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
shaw authored
-
yangjianbo authored
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段, 而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且 内部计费缓存折扣为 0。 新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且 cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和 非流式两种响应路径。对 Claude 原生上游无影响。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 05 Feb, 2026 14 commits
-
-
yangjianbo authored
-
yangjianbo authored
-
yangjianbo authored
-
yangjianbo authored
-
iBenzene authored
-
shaw authored
问题原因:Redis Pipeline 执行 Lua 脚本时出现 NOSCRIPT 错误, 因为 redis.NewScript 使用 EVALSHA 执行脚本,当 Redis 重启或 脚本未被缓存时,Pipeline 模式无法自动回退到 EVAL。 解决方案:在 NewSessionLimitCache 初始化时预加载所有 Lua 脚本 到 Redis,确保后续 Pipeline 执行时脚本已被缓存。
-
shaw authored
支持管理员配置上游错误如何返回给客户端: - 新增 ErrorPassthroughRule 数据模型和 Ent Schema - 实现规则的 CRUD API(/admin/error-passthrough-rules) - 支持按错误码、关键词匹配,支持 any/all 匹配模式 - 支持按平台过滤(anthropic/openai/gemini/antigravity) - 支持透传或自定义响应状态码和错误消息 - 实现两级缓存(Redis + 本地内存)和多实例同步 - 集成到 gateway_handler 的错误处理流程 - 新增前端管理界面组件 - 新增单元测试覆盖核心匹配逻辑 优化: - 移除 refreshLocalCache 中的冗余排序(数据库已排序) - 后端 Validate() 增加匹配条件非空校验
-
ianshaw authored
-
ianshaw authored
当 Gemini for Google Cloud API 未启用时(SERVICE_DISABLED 错误), 系统现在会: - 自动检测 403 PERMISSION_DENIED 错误 - 从错误响应中提取 API 激活 URL - 向用户显示清晰的错误消息和可点击的激活链接 - 提供操作指引(启用后等待几分钟) 新增文件: - internal/pkg/googleapi/error.go: Google API 错误解析器 - internal/pkg/googleapi/error_test.go: 完整的测试覆盖 - GEMINI_API_ERROR_HANDLING.md: 实现文档 修改文件: - internal/repository/geminicli_codeassist_client.go: 在 LoadCodeAssist 和 OnboardUser 中增强错误处理 这大大改善了用户体验,用户不再需要手动从错误日志中查找激活 URL。
-
LLLLLLiulei authored
-
LLLLLLiulei authored
-
LLLLLLiulei authored
-
LLLLLLiulei authored
-
LLLLLLiulei authored
-