- 09 Feb, 2026 1 commit
-
-
erio authored
Merge functional changes from develop branch: - Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image) - Replace with per-model rate limiting using resolveAntigravityModelKey - Remove model load statistics (IncrModelCallCount/GetModelLoadBatch) - Simplify account selection to unified priority→load→LRU algorithm - Remove SetAntigravityQuotaScopeLimit from AccountRepository - Clean up scope-related UI indicators and API fields
-
- 08 Feb, 2026 4 commits
-
-
erio authored
Add post-sort shuffle for accounts with identical (priority, loadRate, lastUsedAt) to break deterministic ordering when concurrent requests read the same scheduler snapshot. Applies to both Antigravity and OpenAI scheduling paths, plus the sortAccountsByPriorityAndLastUsed helper. Keeps upstream CallCount/ModelLoadInfo scheduling intact; shuffle is additive and only randomises within equivalent-rank groups.
-
erio authored
-
erio authored
ParseGatewayRequest only parsed Anthropic format (system/messages), ignoring Gemini native format (systemInstruction/contents). This caused GenerateSessionHash to produce identical hashes for all Gemini sessions. Add protocol parameter to ParseGatewayRequest to branch between Anthropic and Gemini parsing. Update GenerateSessionHash message traversal to extract text from both formats.
-
erio authored
Mix SessionContext (ClientIP, UserAgent, APIKeyID) into GenerateSessionHash 3rd-level fallback to differentiate requests from different users sending identical content. Also switch hashContent from SHA256-truncated to XXHash64 for better performance, and optimize Trie Lua script to match from longest prefix first.
-
- 07 Feb, 2026 10 commits
-
-
erio authored
-
erio authored
The previous fallback (step 3) in GenerateSessionHash hashed system + all messages together, producing a different hash each round as the conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky sessions ineffective for multi-turn conversations. Implement per-message Trie digest chain matching (reusing Gemini's Trie infrastructure) so that the previous round's chain is always a prefix of the current round's chain, enabling reliable session affinity.
-
yangjianbo authored
SSE 热路径中 replaceModelInSSELine 和 replaceModelInResponseBody 原来 使用 json.Unmarshal/Marshal 对每个事件做全量反序列化再序列化,现改为 gjson.Get/sjson.Set 精确字段操作,消除 O(n) 中间 map 分配,保持 JSON 字段顺序不变。涉及 OpenAIGatewayService 和 GatewayService 两个服务。 新增 23 个单元测试覆盖:顶层/嵌套 model 替换、不匹配跳过、空行/[DONE]/ 非法 JSON 等边界情况。 Fixes: P1-08 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
erio authored
Remove threshold-based waiting in both sticky session and antigravity pre-check paths. When a model is rate-limited, immediately clear the sticky session and switch accounts instead of waiting for short durations.
-
erio authored
Remove extra blank line that caused golangci-lint gofmt check to fail.
-
erio authored
-
erio authored
- GetAccessToken: add upstream branch to read api_key from credentials - shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based - isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported whitelist with mapAntigravityModel for unified scheduling/forwarding logic - mapAntigravityModel: fix edge case where wildcard target equals request model - Update tests for new behavior and add custom model_mapping test cases
-
erio authored
-
erio authored
Key changes: - Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching - Unified rate limiting: scope-level → model-level with Redis snapshot sync - Load-balanced scheduling by call count with smart retry mechanism - Force cache billing support - Model identity injection in prompts with leak prevention - Thinking mode auto-handling (max_tokens/budget_tokens fix) - Frontend: whitelist mode toggle, model mapping validation, status indicators - Gemini session fallback with Redis Trie O(L) matching - Ops: enhanced concurrency monitoring, account availability, retry logic - Migration scripts: 049-051 for model mapping unification
-
shaw authored
-
- 06 Feb, 2026 4 commits
-
-
yangjianbo authored
将流式响应中 bufio.Scanner 的 64KB buffer 从每次 make 分配改为 sync.Pool 复用,统一切片表达式为 [:0]、变量命名为 scanBuf, 并补充对应的单元测试。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
shaw authored
移除响应阶段的工具名/schema/description 转换逻辑,修复第三方工具调用时 工具名被错误转换的问题(如 Task → task)。 移除内容: - 工具名相关正则变量(toolPrefixRe, toolNameBoundaryRe 等) - openCodeToolOverrides 和 claudeToolNameOverrides 映射表 - 工具名转换函数(normalizeToolNameForClaude, normalizeToolNameForOpenCode 等) - 响应体工具名替换函数(replaceToolNamesInText, replaceToolNamesInResponseBody 等) - 参数名转换函数(normalizeParamNameForOpenCode, rewriteParamKeysInValue) - 工具描述清理函数(sanitizeToolDescription) - 输入 schema 转换函数(normalizeToolInputSchema) - 模型 ID 正则替换函数(replaceModelIDInText) 保留内容: - 系统提示词清理(sanitizeSystemText) - Claude Code 指纹 headers 处理 - 模型 ID 映射(通过 JSON 对象操作)
-
yangjianbo authored
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段, 而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且 内部计费缓存折扣为 0。 新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且 cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和 非流式两种响应路径。对 Claude 原生上游无影响。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
yangjianbo authored
Kimi 等 Claude 兼容 API 返回缓存信息使用 OpenAI 风格的 cached_tokens 字段, 而非 Claude 标准的 cache_read_input_tokens,导致客户端收不到缓存命中信息且 内部计费缓存折扣为 0。 新增 reconcileCachedTokens 辅助函数,在 cache_read_input_tokens == 0 且 cached_tokens > 0 时自动填充,覆盖流式(message_start/message_delta)和 非流式两种响应路径。对 Claude 原生上游无影响。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 05 Feb, 2026 3 commits
-
-
shaw authored
支持管理员配置上游错误如何返回给客户端: - 新增 ErrorPassthroughRule 数据模型和 Ent Schema - 实现规则的 CRUD API(/admin/error-passthrough-rules) - 支持按错误码、关键词匹配,支持 any/all 匹配模式 - 支持按平台过滤(anthropic/openai/gemini/antigravity) - 支持透传或自定义响应状态码和错误消息 - 实现两级缓存(Redis + 本地内存)和多实例同步 - 集成到 gateway_handler 的错误处理流程 - 新增前端管理界面组件 - 新增单元测试覆盖核心匹配逻辑 优化: - 移除 refreshLocalCache 中的冗余排序(数据库已排序) - 后端 Validate() 增加匹配条件非空校验
-
shaw authored
-
shaw authored
未知工具名不再进行 PascalCase/snake_case 转换,保持原样透传。 修复 text_editor_20250728 等 Anthropic 特殊工具被错误转换的问题。
-
- 04 Feb, 2026 1 commit
-
-
shaw authored
问题:normalizeClaudeModelForAnthropic 函数错误地将长模型ID截断为短ID, 导致 APIKey 账号的模型名被错误修改。 修复: - 删除错误的 normalizeClaudeModelForAnthropic 函数和 anthropicPrefixMappings 变量 - 直接使用 claude.NormalizeModelID(正确的短ID->长ID扩展) - APIKey 账号无显式映射时透传原始模型名
-
- 03 Feb, 2026 3 commits
-
-
bayma888 authored
-
bayma888 authored
This feature allows API Keys to have their own quota limits and expiration times, independent of the user's balance. Backend: - Add quota, quota_used, expires_at fields to api_key schema - Implement IsExpired() and IsQuotaExhausted() checks in middleware - Add ResetQuota and ClearExpiration API endpoints - Integrate quota billing in gateway handlers (OpenAI, Anthropic, Gemini) - Include quota/expiration fields in auth cache for performance - Expiration check returns 403, quota exhausted returns 429 Frontend: - Add quota and expiration inputs to key create/edit dialog - Add quick-select buttons for expiration (+7, +30, +90 days) - Add reset quota confirmation dialog - Add expires_at column to keys list - Add i18n translations for new features (en/zh) Migration: - Add 045_add_api_key_quota.sql for new columns
-
JIA-ss authored
问题描述: 使用扩展思考功能时,偶现以下错误: "thinking or redacted_thinking blocks in the latest assistant message cannot be modified" 根因分析: 当代理服务修改请求体中的某些字段时(如 metadata.user_id、model), 使用 map[string]any 解析整个 JSON 后重新序列化,导致: 1. 字段顺序改变(Go map 序列化按字母排序) 2. 数字格式变化(如 1.0 → 1) 3. Unicode 转义变化 Claude API 对 thinking 块进行字节级验证,任何变化都会触发错误。 修复内容: 1. identity_service.go - RewriteUserID/RewriteUserIDWithMasking 使用 json.RawMessage 保留其他字段的原始字节 2. gateway_service.go - replaceModelInBody 使用 json.RawMessage 保留其他字段的原始字节 3. gateway_service.go - normalizeClaudeOAuthRequestBody 保留 messages 的原始字节,跳过包含 thinking 块的消息修改 4. gateway_service.go - isThinkingBlockSignatureError 添加 "cannot be modified" 错误检测,触发自动重试 5. antigravity_gateway_service.go - isSignatureRelatedError 添加 "cannot be modified" 错误检测 Co-Authored-By:Claude Opus 4.5 <noreply@anthropic.com>
-
- 02 Feb, 2026 4 commits
-
-
song authored
-
song authored
-
liuxiongfeng authored
- 新增 CalculateCostWithLongContext 方法支持阈值双倍计费 - 新增 RecordUsageWithLongContext 方法专用于 Gemini 计费 - Gemini 超过 200K token 的部分按 2 倍费率计算 - 其他平台(Claude/OpenAI)完全不受影响
-
liuxiongfeng authored
Gemini API Key 账户通常代理上游服务,模型支持由上游判断, 本地不需要预先配置模型映射。
-
- 30 Jan, 2026 3 commits
- 29 Jan, 2026 3 commits
- 28 Jan, 2026 4 commits