Commits · 91ef085d7dd5507a27b32f89344a1e2a2071a8ec · 陈曦 / sub2api

09 Mar, 2026 1 commit

fix: increase SSE scanner max line size from 40MB to 500MB · 91ef085d

erio authored Mar 09, 2026

4K image base64 data can exceed 40MB limit, causing "bufio.Scanner:
token too long" errors. Scanner is adaptive (starts at 64KB, grows
as needed), so increasing the cap has no impact on normal responses.
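
A minimal sketch of the adaptive-buffer behaviour described above, using the standard `bufio.Scanner.Buffer` call. The helper names (`newSSEScanner`, `makeLine`, `longestLine`) are hypothetical and not taken from the repository; only the 64KB starting size and the 500MB cap come from the commit message.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"strings"
)

const (
	initialBufSize = 64 * 1024         // scanner starts with a 64KB buffer
	maxLineSize    = 500 * 1024 * 1024 // upper bound for a single SSE line
)

// newSSEScanner (hypothetical helper) returns a line scanner whose buffer
// starts small and may grow up to maxLine bytes.
func newSSEScanner(r io.Reader, maxLine int) *bufio.Scanner {
	s := bufio.NewScanner(r)
	s.Buffer(make([]byte, 0, initialBufSize), maxLine)
	return s
}

// makeLine builds one "data: ..." line carrying a payload of n bytes.
func makeLine(n int) string {
	return "data: " + strings.Repeat("A", n) + "\n"
}

// longestLine scans input and returns the longest line length, or the
// scanner error (bufio.ErrTooLong when a line exceeds maxLine).
func longestLine(input string, maxLine int) (int, error) {
	s := newSSEScanner(strings.NewReader(input), maxLine)
	longest := 0
	for s.Scan() {
		if n := len(s.Bytes()); n > longest {
			longest = n
		}
	}
	return longest, s.Err()
}

func main() {
	line := makeLine(200 * 1024)

	// A cap below the line length reproduces the "token too long" error.
	_, err := longestLine(line, 128*1024)
	fmt.Println("small cap:", err)

	// With the large cap the buffer only grows as far as the line needs.
	n, err := longestLine(line, maxLineSize)
	fmt.Println("large cap:", n, err)
}
```

The cap is only an upper bound: the buffer is enlarged on demand, so short lines never allocate anything close to the maximum.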

91ef085d

08 Mar, 2026 1 commit
- feat: support configurable same-account retry count and custom error policy for API Key upstream pool mode · e643fc38
  kyx236 authored Mar 08, 2026
  
  e643fc38
07 Mar, 2026 2 commits

feat: add an admin setting to enable or disable the rectifier · a3791104
shaw authored Mar 07, 2026

a3791104

feat(account): add daily/weekly periodic quota limits for API Key accounts · 1ee17383

erio authored Mar 07, 2026



Extend the existing total quota limit with daily and weekly periodic
dimensions. Each dimension is independently configurable and uses lazy
reset — when the period expires, usage is automatically reset to zero on
the next increment. Any dimension exceeding its limit will pause the
account from scheduling.
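
The lazy reset can be pictured with a small in-memory sketch. This is not the repository's implementation (the commit does the update in a single atomic SQL statement); the types `quotaDim` and `accountQuota`, the use of Unix seconds, and the choice to treat an expired period as empty when checking the limit are all assumptions made for illustration.

```go
package main

import "fmt"

const daySecs = 24 * 60 * 60

// quotaDim is one quota dimension. PeriodSecs == 0 means it never resets
// (the total quota); Limit <= 0 means unlimited. Both are assumptions.
type quotaDim struct {
	Limit      float64
	Used       float64
	PeriodSecs int64
	ResetAt    int64 // Unix seconds at which the current period ends
}

func (q *quotaDim) expired(now int64) bool {
	return q.PeriodSecs > 0 && now >= q.ResetAt
}

// add applies the lazy reset first, then records the cost.
func (q *quotaDim) add(cost float64, now int64) {
	if q.expired(now) {
		q.Used = 0
		q.ResetAt = now + q.PeriodSecs
	}
	q.Used += cost
}

// exceeded ignores stale usage from an expired period (assumption: the
// commit message does not say how reads treat an expired period).
func (q *quotaDim) exceeded(now int64) bool {
	if q.Limit <= 0 || q.expired(now) {
		return false
	}
	return q.Used >= q.Limit
}

type accountQuota struct{ Total, Daily, Weekly quotaDim }

func newAccountQuota(total, daily, weekly float64) *accountQuota {
	return &accountQuota{
		Total:  quotaDim{Limit: total},
		Daily:  quotaDim{Limit: daily, PeriodSecs: daySecs},
		Weekly: quotaDim{Limit: weekly, PeriodSecs: 7 * daySecs},
	}
}

// increment updates all three dimensions for one billed request.
func (a *accountQuota) increment(cost float64, now int64) {
	a.Total.add(cost, now)
	a.Daily.add(cost, now)
	a.Weekly.add(cost, now)
}

// paused reports whether any dimension is over its limit.
func (a *accountQuota) paused(now int64) bool {
	return a.Total.exceeded(now) || a.Daily.exceeded(now) || a.Weekly.exceeded(now)
}

func main() {
	q := newAccountQuota(100, 10, 50)
	q.increment(6, 1000)
	q.increment(5, 2000)
	fmt.Println("paused on day 1:", q.paused(3000))

	q.increment(1, 1000+daySecs) // daily period has expired: usage restarts
	fmt.Println("daily used after reset:", q.Daily.Used)
	fmt.Println("paused on day 2:", q.paused(1000+daySecs+1))
}
```

No background job is needed in this model: the reset happens as a side effect of the first increment after the period ends.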

Backend:
- Add GetQuotaDailyLimit/Used, GetQuotaWeeklyLimit/Used, HasAnyQuotaLimit
- Rewrite IncrementQuotaUsed with atomic CTE SQL for 3-dimension update
- Rewrite ResetQuotaUsed to clear all dimensions and period timestamps
- Update postUsageBilling to use HasAnyQuotaLimit()
- Preserve daily/weekly used values on account edit

Frontend:
- Refactor QuotaLimitCard from single v-model to 3-dimension props
- Add QuotaBadge component for compact D/W/$ display
- Update AccountCapacityCell with per-dimension badges
- Update Create/Edit modals with daily/weekly quota fields
- Update AccountActionMenu hasQuotaLimit to check all dimensions
- Add i18n strings for daily/weekly/total quota labels
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

1ee17383

06 Mar, 2026 3 commits

fix(openai): unify the dedicated rate multiplier billing path and add regression tests · a18bbb5f

yangjianbo authored Mar 06, 2026

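Extract a shared resolver for user-group dedicated rate multipliers, unifying the cache, singleflight, and fallback logic.

Have the standalone OpenAI billing path reuse the dedicated multiplier resolver, fixing usage records and actual charges that did not apply the user's dedicated multiplier.

Add unit tests for OpenAI billing and the resolver, and fix the lint blockers exposed by the full regression run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>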

a18bbb5f

feat(openai): add /v1/messages endpoint and API compatibility layer · ff1f1149

alfadb authored Mar 06, 2026

Add Anthropic Messages API support for OpenAI platform groups, enabling
clients using Claude-style /v1/messages format to access OpenAI accounts
through automatic protocol conversion.

- Add apicompat package with type definitions and bidirectional converters
  (Anthropic ↔ Chat, Chat ↔ Responses, Anthropic ↔ Responses)
- Implement /v1/messages endpoint for OpenAI gateway with streaming support
- Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes)
- Support prompt caching fields and codex OAuth transforms
- Fix tool call ID conversion for Responses API (fc_ prefix)
- Ensure function_call_output has non-empty output field
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

ff1f1149

fix: OpenAI passthrough accounts bypass model mapping check · 79ae15d5

erio authored Mar 06, 2026

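Passthrough-mode accounts only replace authentication and should let every
model through. Previously, isModelSupportedByAccount in the scheduling stage
was unaware of passthrough mode, so new models not configured in
model_mapping (such as gpt-5.4) were rejected with a 503.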
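
A minimal sketch of the check, assuming a simplified `account` type and a hypothetical `isModelSupported` function (the real isModelSupportedByAccount signature is not shown on this page). Treating an empty mapping as "no restriction" is also an assumption.

```go
package main

import "fmt"

// account is a simplified stand-in for the real account model.
type account struct {
	Passthrough  bool
	ModelMapping map[string]string // requested model -> upstream model
}

// isModelSupported lets passthrough accounts accept any model and applies
// the model_mapping filter to everything else.
func isModelSupported(a account, model string) bool {
	if a.Passthrough {
		return true // only authentication is replaced; no model filtering
	}
	if len(a.ModelMapping) == 0 {
		return true // assumption: no mapping means no restriction
	}
	_, ok := a.ModelMapping[model]
	return ok
}

func main() {
	mapping := map[string]string{"gpt-4o": "gpt-4o"}
	mapped := account{ModelMapping: mapping}
	passthrough := account{Passthrough: true, ModelMapping: mapping}

	fmt.Println(isModelSupported(mapped, "gpt-5.4"))      // filtered by the mapping
	fmt.Println(isModelSupported(passthrough, "gpt-5.4")) // bypasses the mapping
}
```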

79ae15d5

05 Mar, 2026 5 commits

feat: add independent load_factor field for scheduling load calculation · 0d6c1c77
erio authored Mar 06, 2026

0d6c1c77

refactor: unify post-usage billing logic and fix account quota calculation · 02dea7b0

erio authored Mar 06, 2026

- Extract postUsageBilling() to consolidate billing logic across
  GatewayService.RecordUsage, RecordUsageWithLongContext, and
  OpenAIGatewayService.RecordUsage, eliminating ~120 lines of
  duplicated code
- Fix account quota to use TotalCost × accountRateMultiplier
  (was using raw TotalCost, inconsistent with account cost stats)
- Fix RecordUsageWithLongContext API Key quota only updating in
  balance mode (now updates regardless of billing type)
- Fix WebSocket client disconnect detection on Windows by adding
  "an established connection was aborted" to known disconnect errors

02dea7b0

feat: add quota limit for API key accounts · 05527b13

erio authored Mar 05, 2026

- Add configurable spending limit (quota_limit) for apikey-type accounts
- Atomic quota accumulation via PostgreSQL JSONB operations on TotalCost
- Scheduler filters out over-quota accounts with outbox-triggered snapshot refresh
- Display quota usage ($used / $limit) in account capacity column
- Add "Reset Quota" action in account menu to reset usage to zero
- Editing account settings preserves quota_used (no accidental reset)
- Covers all 3 billing paths: Anthropic, Gemini, OpenAI RecordUsage

chore: bump version to 0.1.90.4

05527b13

fix: Claude API Key account requests were missing the beta=true query parameter · 9d70c385
shaw authored Mar 05, 2026

9d70c385
feat: apply model mapping to the /v1/messages/count_tokens endpoint · aeb464f3
shaw authored Mar 05, 2026

aeb464f3

03 Mar, 2026 2 commits

feat: support 5h/1d/7d rate limiting for API keys · a80ec5d8
shaw authored Mar 03, 2026

a80ec5d8

fix(gateway): group isolation: prevent ungrouped accounts from being scheduled across groups · 530a1629

QTom authored Mar 03, 2026

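When an API Key has no group, scheduling selects only from the pool of
ungrouped accounts. Fix the isAccountInGroup logic for groupID == nil, and
add the missing SimpleMode guards in scheduler_snapshot_service and
gemini_compat_service so that group isolation holds on every scheduling path.

Add the ListSchedulableUngroupedByPlatform/s methods, which use Ent's
Not(HasAccountGroups()) predicate to isolate ungrouped accounts.
Add 17 unit and end-to-end isolation tests covering all branches and edge cases.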
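
The isolation rule can be sketched as follows. The function name matches the one mentioned above, but the signature and the slice-of-IDs representation are assumptions made for illustration.

```go
package main

import "fmt"

// isAccountInGroup reports whether an account may serve a key bound to
// groupID. A nil groupID (ungrouped key) matches only ungrouped accounts.
func isAccountInGroup(accountGroupIDs []int64, groupID *int64) bool {
	if groupID == nil {
		return len(accountGroupIDs) == 0
	}
	for _, id := range accountGroupIDs {
		if id == *groupID {
			return true
		}
	}
	return false
}

func main() {
	group := int64(7)
	fmt.Println(isAccountInGroup(nil, nil))              // ungrouped key, ungrouped account
	fmt.Println(isAccountInGroup([]int64{7}, nil))       // ungrouped key, grouped account
	fmt.Println(isAccountInGroup([]int64{7, 9}, &group)) // key and account share group 7
}
```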

530a1629

02 Mar, 2026 1 commit

feat(gateway): dual-mode user message queue (serial queue + soft throttling) · a9285b8a

QTom authored Mar 03, 2026

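Add dual-mode support for the UMQ (User Message Queue):
- serialize: account-level distributed serial lock + RPM-adaptive delay (strict rate limiting)
- throttle: RPM-adaptive pre-request delay only, without blocking concurrency (soft throttling)

Backend:
- config: add a Mode field; keep Enabled for backward compatibility
- service: add UserMessageQueueService (Lua lock / delay algorithm / cleanup worker)
- repository: add UserMsgQueueCache (Redis Lua acquire/release/force-release)
- handler: add UserMsgQueueHelper (SSE ping + wait loop + throttle)
- gateway: integrate the serialize/throttle logic, branching on mode
- lint: fix gofmt rewrite rules, errcheck type assertions, staticcheck QF1012

Frontend:
- Three-state selector UI (off / soft throttling / serial queue) replaces the toggle switch
- BulkEdit supports null semantics (leave unchanged)
- i18n strings in Chinese and English

Passed 6 rounds of expert review (42 reviews), golangci-lint, unit tests, and integration tests.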

a9285b8a

28 Feb, 2026 8 commits

fix: round-3 review fixes for RPM limiting · 2491e9b5

QTom authored Feb 28, 2026

- Add sanitizeExtraBaseRPM to BulkUpdate handler (was missing)
- Add WindowCost scheduling checks to legacy non-sticky selection
  paths (4 sites), matching existing sticky + load-aware coverage
- Export ParseExtraInt from service package, remove duplicate
  parseExtraIntForValidation from admin handler

2491e9b5

fix: address deep code review issues for RPM limiting · e63c8395

QTom authored Feb 28, 2026

- Move IncrementRPM after Forward success to prevent phantom RPM
  consumption during account switch retries
- Add base_rpm input sanitization (clamp to 0-10000) in Create/Update
- Add WindowCost scheduling checks to legacy path sticky sessions
  (4 check sites + 4 prefetch sites), fixing pre-existing gap
- Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in
  BulkEditModal (JSONB merge cannot delete keys, use empty values)
- Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer
- Document TOCTOU race as accepted soft-limit design trade-off
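
As an illustration of the base_rpm sanitization listed above, a clamp to the 0-10000 range might look like the sketch below. `clampBaseRPM` is a hypothetical name; the repository's own sanitizer is not shown on this page.

```go
package main

import "fmt"

const maxBaseRPM = 10000

// clampBaseRPM forces a user-supplied base_rpm into [0, maxBaseRPM].
func clampBaseRPM(v int) int {
	if v < 0 {
		return 0
	}
	if v > maxBaseRPM {
		return maxBaseRPM
	}
	return v
}

func main() {
	for _, v := range []int{-5, 60, 250000} {
		fmt.Println(v, "->", clampBaseRPM(v))
	}
}
```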

e63c8395

fix: move RPM prefetch before routing segment in legacy/mixed paths · ff9683b0

QTom authored Feb 28, 2026

Ensures isAccountSchedulableForRPM calls within the routing segment
hit the prefetch cache instead of querying Redis individually.

ff9683b0

fix: address code review issues for RPM limiting feature · 60723757

QTom authored Feb 28, 2026

- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
- Filter negative values in GetBaseRPM(), update test expectation
- Add RPM batch query (GetRPMBatch) to account List API
- Add warn logs for RPM increment failures in gateway handler
- Reset enableRpmLimit on BulkEditAccountModal close
- Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
- Add design decision comments for rdb.Time() RTT trade-off

60723757

feat: increment RPM counter before request forwarding · f648b8e0
QTom authored Feb 28, 2026

f648b8e0
feat: integrate RPM scheduling checks into account selection flow · 678c3ae1
QTom authored Feb 28, 2026

678c3ae1
feat: wire RPMCache into GatewayService and AccountHandler · c1c31ed9
QTom authored Feb 28, 2026

c1c31ed9
feat(sync): full code sync from release · bb664d9b
yangjianbo authored Feb 28, 2026

bb664d9b

27 Feb, 2026 1 commit

feat: replace gemini-3-pro-image with gemini-3.1-flash-image · a6f9f9f9

erio authored Feb 27, 2026

- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model

a6f9f9f9

26 Feb, 2026 4 commits

fix: address review - fix log wording and add response body assertion in test · e6969acb
alfadb authored Feb 26, 2026

e6969acb

fix(gateway): return 404 instead of fake 200 for unsupported count_tokens endpoint · 94895314

alfadb authored Feb 26, 2026

PR #635 returned HTTP 200 with {"input_tokens": 0} when upstream doesn't
support count_tokens (404). This caused Claude Code CLI to trust the zero
value, believing context uses 0 tokens, so auto-compression never triggers.

Fix: return 404 with proper error body so CLI falls back to its local
tokenizer for accurate estimation. Return nil (not error) to avoid
polluting ops error metrics with expected 404s.

Affected paths:
- Passthrough APIKey accounts: upstream 404 now passed through as 404
- Antigravity accounts: same fix (was also returning fake 200)
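
A minimal sketch of the behaviour described above, built on `net/http` and `httptest`. The function names and the exact error body are assumptions (the body follows the general Anthropic error shape, not necessarily what the gateway emits).

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// notFoundBody is an assumed error payload for the unsupported endpoint.
const notFoundBody = `{"type":"error","error":{"type":"not_found_error","message":"count_tokens is not supported by the upstream"}}`

// writeCountTokens relays the upstream result to the client. An upstream
// 404 is passed on as a 404 instead of a fake 200 with zero tokens, and nil
// is returned so the expected 404 is not counted as an ops error.
func writeCountTokens(w http.ResponseWriter, upstreamStatus int, upstreamBody string) error {
	w.Header().Set("Content-Type", "application/json")
	if upstreamStatus == http.StatusNotFound {
		w.WriteHeader(http.StatusNotFound)
		_, _ = w.Write([]byte(notFoundBody))
		return nil
	}
	w.WriteHeader(upstreamStatus)
	_, err := w.Write([]byte(upstreamBody))
	return err
}

// respond runs writeCountTokens against an in-memory recorder.
func respond(upstreamStatus int, upstreamBody string) (int, string, error) {
	rec := httptest.NewRecorder()
	err := writeCountTokens(rec, upstreamStatus, upstreamBody)
	return rec.Code, rec.Body.String(), err
}

func main() {
	code, body, err := respond(http.StatusNotFound, "404 page not found")
	fmt.Println(code, body, err)

	code, body, err = respond(http.StatusOK, `{"input_tokens":12}`)
	fmt.Println(code, body, err)
}
```

Returning a real 404 lets a client distinguish "endpoint missing" from "zero tokens", which is what allows the CLI to fall back to its local tokenizer.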

94895314

fix: temporarily remove fast-mode-2026-02-01 to avoid 429 errors · 4ac57b4e
shaw authored Feb 26, 2026

4ac57b4e

fix: fall back to an empty value when the count_tokens endpoint is unsupported (404 only) · 03bcd94a

alfadb authored Feb 26, 2026

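Third-party Anthropic relay services usually do not support the
/v1/messages/count_tokens endpoint. When the upstream returns 404, fall back
to returning {input_tokens: 0} so the client falls back to local estimation.

- Match only the 404 status code, whose meaning is unambiguous: the endpoint does not exist
- Other errors (400/429/500) keep the original handling chain and ops telemetry
- No need to parse error message content; no reliance on string matching
- Add table-driven tests covering the fallback and non-fallback paths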

03bcd94a

24 Feb, 2026 1 commit

fix(gemini): enable model_mapping filtering for Gemini API Key accounts · 64405817

erio authored Feb 24, 2026

Remove the special case that bypassed model-supported checks for Gemini
API Key accounts, allowing model_mapping to filter requests properly.
Add tests for multiplatform model filtering behavior.

64405817

22 Feb, 2026 2 commits
- fix(gateway): fix sticky-session prefetch group mismatch and optimize the concurrency-wait hot path · 2ee6c266
  yangjianbo authored Feb 22, 2026
  
  2ee6c266
- perf(gateway): optimize hot paths and add high-coverage tests · a89477dd
  yangjianbo authored Feb 22, 2026
  
  a89477dd
21 Feb, 2026 2 commits

fix(gateway): restore the data-interval timeout protection for Anthropic passthrough streams and add regression tests · 1985be26
yangjianbo authored Feb 21, 2026

1985be26

feat(anthropic): support automatic API Key passthrough and optimize passthrough path performance · bde9dbc5

yangjianbo authored Feb 21, 2026

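- Add an automatic passthrough toggle for Anthropic API Key accounts and a backend passthrough branch (replaces authentication only)
- Add the automatic passthrough toggle to the account edit page, off by default
- Optimize passthrough performance: gjson fast path for SSE usage parsing, fewer redundant request-body copies, and optimized streaming write-back and non-streaming usage parsing
- Add unit tests and benchmarks to ensure the Claude OAuth path is unaffected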

bde9dbc5

19 Feb, 2026 2 commits
- feat(proxy,sora): improve proxy quality detection and Sora stability, and fix review findings · 46d9aee6
  yangjianbo authored Feb 19, 2026
  
  46d9aee6
- fix(sora): improve Cloudflare challenge detection and consolidate the Sora request path · 440b8709
  yangjianbo authored Feb 19, 2026
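
  - Pass through upstream response headers in failover scenarios and detect Cloudflare challenge/cf-ray
  - Use the same UA and proxy for all Sora task requests, keeping sentinel and business requests consistent
  - Fix JSON escaping in streaming error events and add related unit tests
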
  440b8709
18 Feb, 2026 1 commit
- fix: temporarily remove context-1m-2025-08-07 to keep sonnet1m from triggering 429 · 074bd0df
  shaw authored Feb 18, 2026
  
  074bd0df
17 Feb, 2026 1 commit

feat: add Cache TTL Override per account + bump VERSION to 0.1.83 · 3d1f03c2

John Doe authored Feb 17, 2026

- Account-level cache TTL override: rewrite Anthropic cache_creation
  token classification (5m ↔ 1h) in streaming/non-streaming responses
- New DB field cache_ttl_overridden in usage_log for billing tracking
- Migration 055_add_cache_ttl_overridden
- Frontend: CacheTTL override toggle in account create/edit modals
- Ent schema regenerated for new usage_log fields
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

3d1f03c2

16 Feb, 2026 1 commit

fix(gateway): prevent SSE delta from resetting cache creation details to 0 · 6577f2ef

yangjianbo authored Feb 16, 2026



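- Override the usage breakdown only when the 5m/1h values in the delta are greater than 0
- Add a regression test ensuring that a delta defaulting to 0 does not overwrite non-zero message_start values
- Migration 054 runs one more backfill before dropping the legacy fields, so upgraded instances do not lose historical writes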
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
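
The delta-merge rule from this commit can be sketched as below. The `cacheCreation` type and `mergeDelta` function are hypothetical, and applying the "greater than 0" check per field is an assumption about how the rule is implemented.

```go
package main

import "fmt"

// cacheCreation holds the 5m/1h cache creation token breakdown.
type cacheCreation struct {
	Ephemeral5m int
	Ephemeral1h int
}

// mergeDelta applies a later SSE delta onto the values captured from
// message_start. Zero values in the delta are treated as "not reported"
// and never overwrite an existing non-zero value.
func mergeDelta(current, delta cacheCreation) cacheCreation {
	if delta.Ephemeral5m > 0 {
		current.Ephemeral5m = delta.Ephemeral5m
	}
	if delta.Ephemeral1h > 0 {
		current.Ephemeral1h = delta.Ephemeral1h
	}
	return current
}

func main() {
	start := cacheCreation{Ephemeral5m: 1200, Ephemeral1h: 300}
	fmt.Println(mergeDelta(start, cacheCreation{}))                 // delta defaults to 0
	fmt.Println(mergeDelta(start, cacheCreation{Ephemeral1h: 450})) // delta reports 1h only
}
```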

6577f2ef

14 Feb, 2026 2 commits

feat: bill Anthropic 5m/1h cache creation tokens at differentiated rates · a817cafe

shaw authored Feb 14, 2026

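The cache_creation object in the Anthropic API distinguishes two kinds of
cache creation tokens, ephemeral_5m and ephemeral_1h, and the 1h unit price is
far higher than the 5m price (e.g. claude-3-5-haiku: 5m=$1/MTok,
1h=$6/MTok). Previously the system billed everything at the 5m price, so
charges were too low.

Backend:
- pricing_service: load LiteLLM's cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing enables tiered billing (safety guard: 1h > 5m);
  CalculateCost bills 5m and 1h separately, falling back to the 5m price when no breakdown is available
- gateway_service: parseSSEUsage/handleNonStreamingResponse use gjson to
  extract ephemeral_5m/1h_input_tokens from the nested cache_creation object
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage extract the same fields
- usage_log: fix the GORM column tags so values are written to the correct database columns
- Add migration 054: drop the duplicate columns auto-generated by GORM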
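
A worked sketch of the tiered calculation, under stated assumptions: the `pricing` and `usage` types and `cacheCreationCost` are hypothetical, prices are expressed per million tokens, and the "1h > 5m" safety guard is interpreted as "bill everything at the 5m price unless the 1h price is strictly higher". The prices in `main` are the example figures quoted in the commit message.

```go
package main

import "fmt"

// pricing holds cache creation prices in USD per million tokens.
type pricing struct {
	CacheCreate5mPerMTok float64
	CacheCreate1hPerMTok float64 // 0 when no 1h price is available
}

// usage holds cache creation token counts for one response.
type usage struct {
	CacheCreationTokens int // total cache creation tokens
	Ephemeral5mTokens   int // breakdown; both may be 0 when not reported
	Ephemeral1hTokens   int
}

// cacheCreationCost bills 5m and 1h tokens separately when a breakdown is
// present and the 1h price passes the guard; otherwise it bills the total
// at the 5m price.
func cacheCreationCost(u usage, p pricing) float64 {
	const perMTok = 1_000_000.0
	hasBreakdown := u.Ephemeral5mTokens > 0 || u.Ephemeral1hTokens > 0
	tiered := p.CacheCreate1hPerMTok > p.CacheCreate5mPerMTok
	if !hasBreakdown || !tiered {
		return float64(u.CacheCreationTokens) * p.CacheCreate5mPerMTok / perMTok
	}
	cost5m := float64(u.Ephemeral5mTokens) * p.CacheCreate5mPerMTok
	cost1h := float64(u.Ephemeral1hTokens) * p.CacheCreate1hPerMTok
	return (cost5m + cost1h) / perMTok
}

func main() {
	p := pricing{CacheCreate5mPerMTok: 1, CacheCreate1hPerMTok: 6}
	withBreakdown := usage{CacheCreationTokens: 1_500_000, Ephemeral5mTokens: 1_000_000, Ephemeral1hTokens: 500_000}
	noBreakdown := usage{CacheCreationTokens: 2_000_000}

	fmt.Println(cacheCreationCost(withBreakdown, p)) // 1M at the 5m price + 0.5M at the 1h price
	fmt.Println(cacheCreationCost(noBreakdown, p))   // falls back to the 5m price
}
```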

Frontend:
- Usage record tooltip shows the 5m/1h cache creation breakdown (distinguished by colored badges)
- Table cells show a 1h marker next to the cache write value

a817cafe

feat(backend): commit backend audit fixes and accompanying test changes · d04b47b3
yangjianbo authored Feb 14, 2026

d04b47b3