Commits · bde9dbc57a26fb98175bb8eb1ee8857c66286f48 · 陈曦 / sub2api

21 Feb, 2026 1 commit

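feat(anthropic): support automatic API Key passthrough and optimize passthrough path performance · bde9dbc5

yangjianbo authored Feb 21, 2026

- Add an automatic Anthropic API Key passthrough toggle and a backend passthrough branch (replaces authentication only)

- Add the auto-passthrough toggle to the account edit page, off by default

- Optimize passthrough performance: gjson fast path for SSE usage parsing, fewer repeated request body copies, optimized streaming write-back and non-streaming usage parsing

- Add unit tests and benchmarks to ensure the Claude OAuth path is unaffected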

bde9dbc5

19 Feb, 2026 2 commits
- feat(proxy,sora): enhance proxy quality detection and Sora stability, and fix review findings · 46d9aee6
  yangjianbo authored Feb 19, 2026
  
  46d9aee6
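- fix(sora): improve Cloudflare challenge detection and consolidate the Sora request path · 440b8709
  yangjianbo authored Feb 19, 2026
```
- Pass through upstream response headers in failover scenarios and detect Cloudflare challenge/cf-ray

- Use the same UA and proxy for all Sora task requests, keeping sentinel and business requests consistent

- Fix JSON escaping in streaming error events and add related unit tests
```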
  440b8709
18 Feb, 2026 1 commit
- fix: temporarily remove context-1m-2025-08-07 to keep sonnet1m from triggering 429 · 074bd0df
  shaw authored Feb 18, 2026
  
  074bd0df
17 Feb, 2026 1 commit

feat: add Cache TTL Override per account + bump VERSION to 0.1.83 · 3d1f03c2

John Doe authored Feb 17, 2026

- Account-level cache TTL override: rewrite Anthropic cache_creation
  token classification (5m↔1h) in streaming/non-streaming responses
- New DB field cache_ttl_overridden in usage_log for billing tracking
- Migration 055_add_cache_ttl_overridden
- Frontend: CacheTTL override toggle in account create/edit modals
- Ent schema regenerated for new usage_log fields
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

3d1f03c2

16 Feb, 2026 1 commit

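fix(gateway): prevent SSE delta from resetting the cache creation breakdown to 0 · 6577f2ef

yangjianbo authored Feb 16, 2026

- Only overwrite the usage breakdown when the 5m/1h values in the delta are greater than 0
- Add a regression test verifying that a delta defaulting to 0 does not overwrite non-zero values from message_start
- Migration 054 runs one more backfill before dropping the legacy fields, so upgraded instances do not lose historical writes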
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

6577f2ef
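
The guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: the `CacheCreation` type and the `mergeDelta` helper are hypothetical names, and only the "overwrite when greater than 0" rule comes from the commit message.

```go
package main

import "fmt"

// CacheCreation holds the per-TTL cache creation token counts.
// The type and its field names are illustrative assumptions.
type CacheCreation struct {
	Ephemeral5m int
	Ephemeral1h int
}

// mergeDelta applies the breakdown from a message_delta event onto the values
// captured from message_start. A field is overwritten only when the delta
// carries a value greater than 0, so a delta that defaults to 0 cannot wipe
// out earlier non-zero counts.
func mergeDelta(current, delta CacheCreation) CacheCreation {
	if delta.Ephemeral5m > 0 {
		current.Ephemeral5m = delta.Ephemeral5m
	}
	if delta.Ephemeral1h > 0 {
		current.Ephemeral1h = delta.Ephemeral1h
	}
	return current
}

func main() {
	start := CacheCreation{Ephemeral5m: 120, Ephemeral1h: 40}
	fmt.Println(mergeDelta(start, CacheCreation{}))                // zero delta: values from message_start are kept
	fmt.Println(mergeDelta(start, CacheCreation{Ephemeral1h: 64})) // only the 1h count is replaced
}
```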

14 Feb, 2026 2 commits

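feat: bill Anthropic 5m and 1h cache creation tokens at separate rates · a817cafe

shaw authored Feb 14, 2026

The Anthropic API's cache_creation object distinguishes two kinds of cache
creation tokens, ephemeral_5m and ephemeral_1h, and the 1h unit price is far
higher than the 5m price (e.g. claude-3-5-haiku: 5m=$1/MTok, 1h=$6/MTok).
Previously the system billed everything at the 5m rate, so charges were too low.

Backend:
- pricing_service: load LiteLLM's cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing enables tiered billing (safety guard: 1h>5m),
  CalculateCost bills 5m and 1h separately and falls back to the 5m rate when
  no breakdown is available
- gateway_service: parseSSEUsage/handleNonStreamingResponse use gjson to
  extract ephemeral_5m/1h_input_tokens from the nested cache_creation object
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage extract the same fields
- usage_log: fix GORM column tags so values are written to the correct database columns
- Add migration 054: drop the duplicate columns auto-generated by GORM

Frontend:
- Usage record tooltip shows the 5m/1h cache creation breakdown (with colored badges)
- Table cells show a 1h marker next to the cache write value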

a817cafe
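
A rough sketch of the tiered calculation this commit describes, under stated assumptions: the `Pricing` and `Usage` types, their fields, and `cacheCreationCost` are hypothetical, and prices are expressed in USD per million tokens for readability. The commit message confirms only the 1h>5m guard and the fallback to the 5m rate; applying that same fallback when the guard fails is a choice made for this sketch.

```go
package main

import "fmt"

// Pricing holds cache creation prices in USD per million tokens (assumed unit).
type Pricing struct {
	CacheCreation5mPerMTok float64
	CacheCreation1hPerMTok float64
}

// Usage carries the total cache creation tokens plus the optional 5m/1h breakdown.
type Usage struct {
	CacheCreationTokens int
	Ephemeral5mTokens   int
	Ephemeral1hTokens   int
}

// cacheCreationCost bills the 5m and 1h tokens separately when tiered pricing
// is usable, and otherwise bills the total at the 5m rate.
func cacheCreationCost(u Usage, p Pricing) float64 {
	// Safety guard: tiered pricing only applies if the 1h price exceeds the 5m price.
	tiered := p.CacheCreation1hPerMTok > p.CacheCreation5mPerMTok
	hasBreakdown := u.Ephemeral5mTokens > 0 || u.Ephemeral1hTokens > 0
	if !tiered || !hasBreakdown {
		return float64(u.CacheCreationTokens) * p.CacheCreation5mPerMTok / 1e6
	}
	return (float64(u.Ephemeral5mTokens)*p.CacheCreation5mPerMTok +
		float64(u.Ephemeral1hTokens)*p.CacheCreation1hPerMTok) / 1e6
}

func main() {
	// Prices taken from the example in the commit message: 5m=$1/MTok, 1h=$6/MTok.
	p := Pricing{CacheCreation5mPerMTok: 1, CacheCreation1hPerMTok: 6}

	withBreakdown := Usage{CacheCreationTokens: 1_500_000, Ephemeral5mTokens: 1_000_000, Ephemeral1hTokens: 500_000}
	fmt.Println(cacheCreationCost(withBreakdown, p)) // 1 MTok at $1 plus 0.5 MTok at $6

	noBreakdown := Usage{CacheCreationTokens: 2_000_000}
	fmt.Println(cacheCreationCost(noBreakdown, p)) // falls back to the 5m rate for all tokens
}
```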

feat(backend): commit backend audit fixes and accompanying test changes · d04b47b3
yangjianbo authored Feb 14, 2026

d04b47b3

12 Feb, 2026 1 commit

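chore(logging): complete backend logging audit and structured migration · 584cfc3d

yangjianbo authored Feb 12, 2026

- Migrate high-volume service and handler logs to the new logging system (LegacyPrintf/structured logging)
- Add a stdlog bridge and compatibility tests, preserving the legacy log capture capability
- Change the OpenAI stream-interruption alert to a structured Warn and rework the corresponding tests to capture via a sink
- Add the missing logger references in backend files; the full go test suite passes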

584cfc3d

11 Feb, 2026 1 commit

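[UPDATE] Enhance Claude Thinking mode support and adapt to Opus 4.6 dynamic budgets · 19cca11e

SilentFlower authored Feb 11, 2026

✨ feat(antigravity): support the thinking adaptive type and adapt to Opus 4.6 dynamic budgets
🧪 test(gateway): add edge-case tests for thinking mode parsing and signature block filtering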

19cca11e

10 Feb, 2026 2 commits

fix: fix CI check failures · 378e476e

Edric Li authored Feb 10, 2026

- gofmt: fix formatting in error_passthrough_service.go
- errcheck: fix unchecked type assertion in error_passthrough_runtime_test.go
- staticcheck: convert if-else to switch (gateway_service.go)
- test: fix two test cases that incorrectly used MODEL_CAPACITY_EXHAUSTED and therefore took the wrong path

378e476e

fix: remove specific system content to work around the cache invalidation bug in the new cc client · 5dd83d3c
shaw authored Feb 10, 2026

5dd83d3c

09 Feb, 2026 4 commits

feat: same-account retry before failover for transient errors · d6c2921f

Edric Li authored Feb 10, 2026

For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.

- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
  all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute

d6c2921f
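
The retry-before-failover flow described above might look roughly like this. It is a sketch under assumptions: `forwardWithRetry` and the `forward` callback are hypothetical, and `UpstreamFailoverError` is reduced to the one field the commit names. The limit of 2 retries comes from the commit message; the delay is a parameter here (the commit uses 500ms).

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// UpstreamFailoverError is a simplified stand-in for the project's error type;
// only RetryableOnSameAccount is taken from the commit message.
type UpstreamFailoverError struct {
	StatusCode             int
	RetryableOnSameAccount bool
}

func (e *UpstreamFailoverError) Error() string {
	return fmt.Sprintf("upstream error %d", e.StatusCode)
}

const maxSameAccountRetries = 2

// forwardWithRetry calls forward against one account. A transient error marked
// RetryableOnSameAccount is retried on that account up to maxSameAccountRetries
// times. Any other outcome, or exhausting the retries, returns to the caller,
// which can then temp-unschedule the account and fail over to another one.
func forwardWithRetry(forward func() error, delay time.Duration) (attempts int, err error) {
	for attempts = 1; ; attempts++ {
		err = forward()
		var fe *UpstreamFailoverError
		if err == nil || !errors.As(err, &fe) || !fe.RetryableOnSameAccount || attempts > maxSameAccountRetries {
			return attempts, err
		}
		time.Sleep(delay)
	}
}

func main() {
	calls := 0
	attempts, err := forwardWithRetry(func() error {
		calls++
		if calls < 3 {
			return &UpstreamFailoverError{StatusCode: 400, RetryableOnSameAccount: true}
		}
		return nil
	}, 10*time.Millisecond)
	fmt.Println(attempts, err) // the third attempt succeeds on the same account
}
```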

fix(unit): fix unit-tag test compilation and account selection test cases · 2bfb1629
yangjianbo authored Feb 09, 2026

2bfb1629

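fix: do not set the model rate-limit marker on the first 503 in single-account groups, avoiding a cascade of failed requests · 021abfca

Rose Ding authored Feb 09, 2026

When a single-account antigravity group received a 503 (MODEL_CAPACITY_EXHAUSTED),
the original logic set a ~29s model rate-limit marker. With only one account there
is nothing to switch to, so every subsequent new request hit the rate limit in the
pre-check and returned 503 within a few milliseconds, creating an avalanche window
of about 30 seconds.

Fix: at the Handler entry point, check whether the group has only a single
antigravity account. If so, set the SingleAccountRetry context marker up front so
that the Service layer retries in place on the first 503 (without setting the
rate-limit marker), which keeps it from affecting subsequent requests.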

021abfca

refactor: replace scope-level rate limiting with model-level rate limiting · fc095bf0

erio authored Feb 09, 2026

Merge functional changes from develop branch:
- Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image)
- Replace with per-model rate limiting using resolveAntigravityModelKey
- Remove model load statistics (IncrModelCallCount/GetModelLoadBatch)
- Simplify account selection to unified priority→load→LRU algorithm
- Remove SetAntigravityQuotaScopeLimit from AccountRepository
- Clean up scope-related UI indicators and API fields

fc095bf0

08 Feb, 2026 4 commits

feat: shuffle accounts within same sort group to prevent thundering herd · 1af06aed

erio authored Feb 09, 2026

Add post-sort shuffle for accounts with identical (priority, loadRate,
lastUsedAt) to break deterministic ordering when concurrent requests
read the same scheduler snapshot. Applies to both Antigravity and
OpenAI scheduling paths, plus the sortAccountsByPriorityAndLastUsed
helper.

Keeps upstream CallCount/ModelLoadInfo scheduling intact; shuffle is
additive and only randomises within equivalent-rank groups.

1af06aed
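
One way to picture the post-sort shuffle is the sketch below. The `Account` struct, its field types, and `sortAndShuffle` are illustrative assumptions; only the idea of randomising accounts that share identical (priority, loadRate, lastUsedAt) comes from the commit message.

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// Account is a simplified, hypothetical view of a schedulable account.
type Account struct {
	ID         int
	Priority   int
	LoadRate   int
	LastUsedAt int64
}

// sameRank reports whether two accounts are equivalent for scheduling purposes.
func sameRank(a, b Account) bool {
	return a.Priority == b.Priority && a.LoadRate == b.LoadRate && a.LastUsedAt == b.LastUsedAt
}

// sortAndShuffle sorts by priority, then load, then least recently used, and
// afterwards shuffles each run of equal-rank accounts so concurrent requests
// reading the same snapshot do not all pick the same first account.
func sortAndShuffle(accounts []Account) {
	sort.SliceStable(accounts, func(i, j int) bool {
		a, b := accounts[i], accounts[j]
		if a.Priority != b.Priority {
			return a.Priority < b.Priority
		}
		if a.LoadRate != b.LoadRate {
			return a.LoadRate < b.LoadRate
		}
		return a.LastUsedAt < b.LastUsedAt
	})
	for start := 0; start < len(accounts); {
		end := start + 1
		for end < len(accounts) && sameRank(accounts[start], accounts[end]) {
			end++
		}
		group := accounts[start:end]
		rand.Shuffle(len(group), func(i, j int) { group[i], group[j] = group[j], group[i] })
		start = end
	}
}

func main() {
	accounts := []Account{
		{ID: 1, Priority: 2},
		{ID: 2, Priority: 1, LoadRate: 5},
		{ID: 3, Priority: 1, LoadRate: 5},
		{ID: 4, Priority: 1},
	}
	sortAndShuffle(accounts)
	// Account 4 is always first and account 1 last; accounts 2 and 3 may swap.
	fmt.Println(accounts)
}
```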

refactor: replace Trie-based digest session store with flat cache · b889d501
erio authored Feb 09, 2026

b889d501

fix: parse Gemini native request format in ParseGatewayRequest for correct session hash generation · 35598d56

erio authored Feb 09, 2026

ParseGatewayRequest only parsed Anthropic format (system/messages),
ignoring Gemini native format (systemInstruction/contents). This caused
GenerateSessionHash to produce identical hashes for all Gemini sessions.

Add protocol parameter to ParseGatewayRequest to branch between
Anthropic and Gemini parsing. Update GenerateSessionHash message
traversal to extract text from both formats.

35598d56

fix: prevent sessionHash collision for different users with same messages · 5c76b9e4

erio authored Feb 09, 2026

Mix SessionContext (ClientIP, UserAgent, APIKeyID) into
GenerateSessionHash 3rd-level fallback to differentiate requests
from different users sending identical content.

Also switch hashContent from SHA256-truncated to XXHash64 for
better performance, and optimize Trie Lua script to match from
longest prefix first.

5c76b9e4

07 Feb, 2026 10 commits

fix(lint): handle errcheck for strings.Builder.WriteString · e3748da8
erio authored Feb 07, 2026

e3748da8

feat: add Anthropic sticky session digest chain matching via Trie · 50a783ff

erio authored Feb 07, 2026

The previous fallback (step 3) in GenerateSessionHash hashed system +
all messages together, producing a different hash each round as the
conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky
sessions ineffective for multi-turn conversations.

Implement per-message Trie digest chain matching (reusing Gemini's Trie
infrastructure) so that the previous round's chain is always a prefix
of the current round's chain, enabling reliable session affinity.

50a783ff
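
The prefix property that makes the digest chain work can be illustrated with a small sketch. The helpers `digestChain` and `isPrefix` are hypothetical, and the truncated SHA-256 digest is a stand-in chosen for the example. The project matches chains through a Trie, which this sketch does not reproduce.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// digestChain computes one short digest per message, in order. Hashing each
// message on its own (instead of hashing system + all messages together)
// means earlier digests do not change as the conversation grows.
func digestChain(messages []string) []string {
	chain := make([]string, len(messages))
	for i, m := range messages {
		sum := sha256.Sum256([]byte(m))
		chain[i] = hex.EncodeToString(sum[:4]) // truncated for brevity (assumption)
	}
	return chain
}

// isPrefix reports whether prev is a prefix of cur, which is the condition a
// Trie lookup would use to route the new round to the previous round's session.
func isPrefix(prev, cur []string) bool {
	if len(prev) > len(cur) {
		return false
	}
	for i := range prev {
		if prev[i] != cur[i] {
			return false
		}
	}
	return true
}

func main() {
	round1 := digestChain([]string{"a"})
	round2 := digestChain([]string{"a", "b"})
	round3 := digestChain([]string{"a", "b", "c"})
	other := digestChain([]string{"x", "b"})

	fmt.Println(isPrefix(round1, round2), isPrefix(round2, round3)) // each round extends the previous chain
	fmt.Println(isPrefix(other, round3))                            // a different conversation does not match
}
```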

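perf(service): optimize model replacement functions, using gjson/sjson instead of full JSON serialization · 8226a4ce

yangjianbo authored Feb 07, 2026

On the SSE hot path, replaceModelInSSELine and replaceModelInResponseBody
previously used json.Unmarshal/Marshal to fully deserialize and re-serialize
every event. They now use gjson.Get/sjson.Set for targeted field operations,
eliminating the O(n) intermediate map allocations and preserving JSON field
order. Affects both OpenAIGatewayService and GatewayService.

Add 23 unit tests covering top-level/nested model replacement, skipping on
mismatch, and edge cases such as empty lines, [DONE], and invalid JSON.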

Fixes: P1-08
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

8226a4ce

refactor: simplify sticky session rate limit handling — switch immediately on any rate limit · e1a68497

erio authored Feb 07, 2026

Remove threshold-based waiting in both sticky session and antigravity
pre-check paths. When a model is rate-limited, immediately clear the
sticky session and switch accounts instead of waiting for short durations.

e1a68497

style: fix gofmt formatting in gateway_service.go · b4f6c4f9
erio authored Feb 07, 2026
```
Remove extra blank line that caused golangci-lint gofmt check to fail.
```
b4f6c4f9
refactor: remove unused IsAntigravityModelSupported function and its tests · 14c6c932
erio authored Feb 07, 2026

14c6c932

fix(antigravity): support upstream accounts and custom model_mapping in scheduling · de092728

erio authored Feb 07, 2026

- GetAccessToken: add upstream branch to read api_key from credentials
- shouldTriggerAntigravitySmartRetry: relax check from IsOAuth to Platform-based
- isModelSupportedByAccount/WithContext: replace IsAntigravityModelSupported
  whitelist with mapAntigravityModel for unified scheduling/forwarding logic
- mapAntigravityModel: fix edge case where wildcard target equals request model
- Update tests for new behavior and add custom model_mapping test cases

de092728

fix: restore non-failover error passthrough from 7b156489 · edb09370
erio authored Feb 07, 2026

edb09370

feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops · 5e98445b

erio authored Feb 07, 2026

Key changes:
- Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
- Unified rate limiting: scope-level → model-level with Redis snapshot sync
- Load-balanced scheduling by call count with smart retry mechanism
- Force cache billing support
- Model identity injection in prompts with leak prevention
- Thinking mode auto-handling (max_tokens/budget_tokens fix)
- Frontend: whitelist mode toggle, model mapping validation, status indicators
- Gemini session fallback with Redis Trie O(L) matching
- Ops: enhanced concurrency monitoring, account availability, retry logic
- Migration scripts: 049-051 for model mapping unification

5e98445b

fix: make error passthrough effective for non-failover upstream errors · 7b156489
shaw authored Feb 07, 2026

7b156489

06 Feb, 2026 4 commits

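perf(service): reuse the SSE Scanner buffer via sync.Pool to reduce GC pressure under high concurrency · d71537d4

yangjianbo authored Feb 06, 2026

Replace the per-call make allocation of the 64KB bufio.Scanner buffer used for
streaming responses with sync.Pool reuse, standardize the slice expression
as [:0] and the variable name as scanBuf, and add corresponding unit tests.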
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

d71537d4
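
A minimal sketch of pooling a scanner buffer in this way, assuming a 64KB initial buffer as stated in the commit. The pool variable, the `maxLineSize` limit, and the `countDataLines` helpers are hypothetical names made up for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
	"sync"
)

const (
	scanBufSize = 64 * 1024 // initial buffer size named in the commit
	maxLineSize = 1 << 20   // assumed upper bound for a single SSE line
)

// scanBufPool hands out reusable 64KB buffers. A pointer to the slice is
// stored so that Put does not allocate when boxing the value.
var scanBufPool = sync.Pool{
	New: func() any {
		buf := make([]byte, scanBufSize)
		return &buf
	},
}

// countDataLines scans an SSE stream with a pooled buffer and counts the
// lines that start with "data: ".
func countDataLines(r io.Reader) int {
	bufPtr := scanBufPool.Get().(*[]byte)
	defer scanBufPool.Put(bufPtr)

	scanBuf := *bufPtr
	scanner := bufio.NewScanner(r)
	// [:0] keeps the 64KB capacity while presenting an empty buffer to the scanner.
	scanner.Buffer(scanBuf[:0], maxLineSize)

	n := 0
	for scanner.Scan() {
		if strings.HasPrefix(scanner.Text(), "data: ") {
			n++
		}
	}
	return n
}

// countDataLinesInString is a small convenience wrapper for in-memory input.
func countDataLinesInString(s string) int {
	return countDataLines(strings.NewReader(s))
}

func main() {
	stream := "event: message_start\ndata: {\"type\":\"message_start\"}\n\ndata: [DONE]\n"
	fmt.Println(countDataLinesInString(stream)) // two lines carry a data payload
}
```

If a line outgrows the pooled buffer, the scanner allocates a larger one internally, and the original 64KB slice is what goes back into the pool.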

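fix(gateway): remove the tool name conversion logic introduced in PR #316 · d182ef03

shaw authored Feb 06, 2026

Remove the tool name/schema/description conversion logic from the response phase,
fixing incorrect tool name conversion for third-party tool calls (e.g. Task → task).

Removed:
- Tool-name regex variables (toolPrefixRe, toolNameBoundaryRe, etc.)
- The openCodeToolOverrides and claudeToolNameOverrides mapping tables
- Tool name conversion functions (normalizeToolNameForClaude, normalizeToolNameForOpenCode, etc.)
- Response body tool name replacement functions (replaceToolNamesInText, replaceToolNamesInResponseBody, etc.)
- Parameter name conversion functions (normalizeParamNameForOpenCode, rewriteParamKeysInValue)
- Tool description sanitization function (sanitizeToolDescription)
- Input schema conversion function (normalizeToolInputSchema)
- Model ID regex replacement function (replaceModelIDInText)

Kept:
- System prompt sanitization (sanitizeSystemText)
- Claude Code fingerprint headers handling
- Model ID mapping (via JSON object operations)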

d182ef03

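fix(compat): map Kimi cached_tokens to the Claude-standard cache_read_input_tokens · f33a9501

yangjianbo authored Feb 06, 2026

Claude-compatible APIs such as Kimi report cache information in the OpenAI-style
cached_tokens field rather than the Claude-standard cache_read_input_tokens, so
clients received no cache hit information and the internal billing cache discount was 0.

Add a reconcileCachedTokens helper that fills in cache_read_input_tokens when
cache_read_input_tokens == 0 and cached_tokens > 0, covering both the streaming
(message_start/message_delta) and non-streaming response paths. Native Claude
upstreams are unaffected.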
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

f33a9501
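
The reconciliation rule can be sketched as below. The function name comes from the commit message, but its signature, the flat `Usage` struct, and the JSON layout of the sample payload are assumptions for illustration. The commit does not say how the project reads these fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Usage is a simplified usage object. Placing cached_tokens directly next to
// cache_read_input_tokens is an assumption made for this example.
type Usage struct {
	InputTokens          int `json:"input_tokens"`
	CacheReadInputTokens int `json:"cache_read_input_tokens"`
	CachedTokens         int `json:"cached_tokens"`
}

// reconcileCachedTokens copies the OpenAI-style cached_tokens value into the
// Claude-standard field, but only when the standard field is still 0 and
// cached_tokens is positive. It reports whether a value was filled in.
func reconcileCachedTokens(u *Usage) bool {
	if u.CacheReadInputTokens == 0 && u.CachedTokens > 0 {
		u.CacheReadInputTokens = u.CachedTokens
		return true
	}
	return false
}

func main() {
	var compat Usage
	if err := json.Unmarshal([]byte(`{"input_tokens":900,"cached_tokens":512}`), &compat); err != nil {
		panic(err)
	}
	fmt.Println(reconcileCachedTokens(&compat), compat.CacheReadInputTokens) // filled from cached_tokens

	native := Usage{InputTokens: 900, CacheReadInputTokens: 256}
	fmt.Println(reconcileCachedTokens(&native), native.CacheReadInputTokens) // native value left untouched
}
```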

fix(compat): map Kimi cached_tokens to the Claude-standard cache_read_input_tokens · c6a456c7

yangjianbo authored Feb 06, 2026

c6a456c7

05 Feb, 2026 3 commits

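feat: add global error passthrough rules · 39e05a2d

shaw authored Feb 05, 2026

Allow administrators to configure how upstream errors are returned to clients:
- Add the ErrorPassthroughRule data model and Ent Schema
- Implement a CRUD API for rules (/admin/error-passthrough-rules)
- Support matching by error code and keyword, with any/all match modes
- Support filtering by platform (anthropic/openai/gemini/antigravity)
- Support passing through or customizing the response status code and error message
- Implement two-level caching (Redis + local memory) with multi-instance sync
- Integrate into the gateway_handler error handling flow
- Add frontend management UI components
- Add unit tests covering the core matching logic

Optimizations:
- Remove the redundant sort in refreshLocalCache (the database already returns sorted results)
- Backend Validate() now checks that match conditions are non-empty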

39e05a2d
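
As a rough illustration of the any/all match modes, consider the sketch below. Everything in it is an assumption: the `Rule` fields, the case-insensitive keyword comparison, and in particular the interpretation that "any" means either condition may hit while "all" requires every configured condition to hit. The commit message does not spell out these semantics.

```go
package main

import (
	"fmt"
	"strings"
)

// Rule is a hypothetical, trimmed-down passthrough rule.
type Rule struct {
	ErrorCodes []int
	Keywords   []string
	MatchMode  string // "any" or "all"
}

// Matches checks an upstream status code and error body against the rule.
func (r Rule) Matches(status int, body string) bool {
	codeSet, kwSet := len(r.ErrorCodes) > 0, len(r.Keywords) > 0

	codeHit := false
	for _, c := range r.ErrorCodes {
		if c == status {
			codeHit = true
			break
		}
	}

	kwHit := false
	lower := strings.ToLower(body)
	for _, k := range r.Keywords {
		if strings.Contains(lower, strings.ToLower(k)) {
			kwHit = true
			break
		}
	}

	if r.MatchMode == "all" {
		// Every configured condition must hit; a rule with no conditions never matches.
		return (codeSet || kwSet) && (!codeSet || codeHit) && (!kwSet || kwHit)
	}
	return codeHit || kwHit
}

func main() {
	rule := Rule{ErrorCodes: []int{429}, Keywords: []string{"overloaded"}, MatchMode: "all"}
	fmt.Println(rule.Matches(429, "Upstream is Overloaded")) // code and keyword both hit
	fmt.Println(rule.Matches(429, "quota exceeded"))         // keyword misses under "all"

	rule.MatchMode = "any"
	fmt.Println(rule.Matches(429, "quota exceeded")) // the code alone is enough under "any"
}
```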

feat: support per-user group rate multiplier configuration · 2b192f7d
shaw authored Feb 05, 2026

2b192f7d

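fix(gateway): fix tool name conversion breaking Anthropic special tools · 05af95da

shaw authored Feb 05, 2026

Unknown tool names are no longer converted to PascalCase/snake_case and are passed through unchanged.
Fixes Anthropic special tools such as text_editor_20250728 being converted incorrectly.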

05af95da

04 Feb, 2026 1 commit

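fix(gateway): fix model prefix mapping logic error · 8f397548

shaw authored Feb 04, 2026

Problem: the normalizeClaudeModelForAnthropic function incorrectly truncated long
model IDs to short IDs, so model names for APIKey accounts were modified incorrectly.

Fix:
- Remove the faulty normalizeClaudeModelForAnthropic function and the anthropicPrefixMappings variable
- Use claude.NormalizeModelID directly (the correct short ID -> long ID expansion)
- Pass the original model name through for APIKey accounts without an explicit mapping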

8f397548

03 Feb, 2026 2 commits

fix(lint): format gateway_service.go with gofmt · 3fed478e
bayma888 authored Feb 03, 2026

3fed478e

feat(api-key): add independent quota and expiration support · 6146be14

bayma888 authored Feb 03, 2026

This feature allows API Keys to have their own quota limits and expiration
times, independent of the user's balance.

Backend:
- Add quota, quota_used, expires_at fields to api_key schema
- Implement IsExpired() and IsQuotaExhausted() checks in middleware
- Add ResetQuota and ClearExpiration API endpoints
- Integrate quota billing in gateway handlers (OpenAI, Anthropic, Gemini)
- Include quota/expiration fields in auth cache for performance
- Expiration check returns 403, quota exhausted returns 429

Frontend:
- Add quota and expiration inputs to key create/edit dialog
- Add quick-select buttons for expiration (+7, +30, +90 days)
- Add reset quota confirmation dialog
- Add expires_at column to keys list
- Add i18n translations for new features (en/zh)

Migration:
- Add 045_add_api_key_quota.sql for new columns

6146be14
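
The two middleware checks could be sketched along these lines. The `APIKey` struct and `checkKey` are hypothetical. Treating a zero `ExpiresAt` as "never expires" and a zero `Quota` as "unlimited" are assumptions, and unlike the `IsExpired()` named in the commit, this version takes the current time as a parameter to keep the example deterministic. The 403 and 429 status codes come from the commit message.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// APIKey is a simplified model of the new per-key fields.
type APIKey struct {
	Quota     float64   // 0 is treated as unlimited (assumption)
	QuotaUsed float64
	ExpiresAt time.Time // zero value is treated as no expiration (assumption)
}

// IsExpired reports whether the key has an expiration time that has passed.
func (k APIKey) IsExpired(now time.Time) bool {
	return !k.ExpiresAt.IsZero() && !now.Before(k.ExpiresAt)
}

// IsQuotaExhausted reports whether a limited key has used up its quota.
func (k APIKey) IsQuotaExhausted() bool {
	return k.Quota > 0 && k.QuotaUsed >= k.Quota
}

// checkKey returns the HTTP status a middleware would respond with:
// 403 for an expired key, 429 for an exhausted quota, 200 otherwise.
func checkKey(k APIKey, now time.Time) int {
	switch {
	case k.IsExpired(now):
		return http.StatusForbidden
	case k.IsQuotaExhausted():
		return http.StatusTooManyRequests
	default:
		return http.StatusOK
	}
}

func main() {
	now := time.Now()
	fmt.Println(checkKey(APIKey{ExpiresAt: now.Add(-time.Hour)}, now)) // expired key
	fmt.Println(checkKey(APIKey{Quota: 10, QuotaUsed: 10}, now))       // quota used up
	fmt.Println(checkKey(APIKey{Quota: 10, QuotaUsed: 2.5}, now))      // still usable
}
```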