1. 02 Mar, 2026 1 commit
    • QTom's avatar
      feat(gateway): 双模式用户消息队列 — 串行队列 + 软性限速 · a9285b8a
      QTom authored
      新增 UMQ (User Message Queue) 双模式支持:
      - serialize: 账号级分布式串行锁 + RPM 自适应延迟(严格限流)
      - throttle: 仅 RPM 自适应前置延迟,不阻塞并发(软性限速)
      
      后端:
      - config: 新增 Mode 字段,保留 Enabled 向后兼容
      - service: 新增 UserMessageQueueService(Lua 锁/延迟算法/清理 worker)
      - repository: 新增 UserMsgQueueCache(Redis Lua acquire/release/force-release)
      - handler: 新增 UserMsgQueueHelper(SSE ping + 等待循环 + throttle)
      - gateway: 按 mode 分支集成 serialize/throttle 逻辑
      - lint: 修复 gofmt rewrite rules、errcheck 类型断言、staticcheck QF1012
      
      前端:
      - 三态选择器 UI(关闭/软性限速/串行队列)替代 toggle 开关
      - BulkEdit 支持 null 语义(不修改)
      - i18n 中英文文案
      
      通过 6 轮专家评审(42 次 review)、golangci-lint、单元测试、集成测试。
      a9285b8a
  2. 01 Mar, 2026 1 commit
    • QTom's avatar
      feat(gateway): 添加 Claude Code 客户端最低版本检查功能 · 4280aca8
      QTom authored
      - 通过 User-Agent 识别 Claude Code 客户端并提取版本号
      - 在网关层验证客户端版本是否满足管理员配置的最低要求
      - 在管理后台提供版本要求配置选项(英文/中文双语)
      - 实现原子缓存 + singleflight 防止并发问题和 thundering herd
      - 使用 context.WithoutCancel 隔离 DB 查询,避免客户端断连影响缓存
      - 双 TTL 策略:60s 正常、5s 错误恢复,保证性能与可用性
      - 仅检查 Claude Code 客户端,其他客户端不受影响
      - 添加完整单元测试覆盖版本提取、比对、上下文操作
      4280aca8
  3. 28 Feb, 2026 4 commits
    • QTom's avatar
      fix: address deep code review issues for RPM limiting · e63c8395
      QTom authored
      - Move IncrementRPM after Forward success to prevent phantom RPM
        consumption during account switch retries
      - Add base_rpm input sanitization (clamp to 0-10000) in Create/Update
      - Add WindowCost scheduling checks to legacy path sticky sessions
        (4 check sites + 4 prefetch sites), fixing pre-existing gap
      - Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in
        BulkEditModal (JSONB merge cannot delete keys, use empty values)
      - Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer
      - Document TOCTOU race as accepted soft-limit design trade-off
      e63c8395
    • QTom's avatar
      fix: address code review issues for RPM limiting feature · 60723757
      QTom authored
      - Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
      - Filter negative values in GetBaseRPM(), update test expectation
      - Add RPM batch query (GetRPMBatch) to account List API
      - Add warn logs for RPM increment failures in gateway handler
      - Reset enableRpmLimit on BulkEditAccountModal close
      - Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
      - Add design decision comments for rdb.Time() RTT trade-off
      60723757
    • QTom's avatar
      f648b8e0
    • yangjianbo's avatar
      feat(sync): full code sync from release · bb664d9b
      yangjianbo authored
      bb664d9b
  4. 24 Feb, 2026 2 commits
    • erio's avatar
      fix: enable Gemini model_mapping UI and extend warmup to Antigravity · d8d4b0c0
      erio authored
      - Remove Gemini platform exclusion from model restriction UI in
        Create/Edit account modals (Gemini now supports model_mapping)
      - Remove outdated Gemini model passthrough info cards
      - Add model_mapping field to GeminiCredentials type
      - Extend warmup request interception toggle to Antigravity platform
      - Remove redundant try/catch in API key account creation
      - Remove noisy gateway.request_completed debug log
      - Reorganize Gemini model mapping sections in constants.go
      d8d4b0c0
    • erio's avatar
      refactor: extract failover error handling into FailoverState · 09166a52
      erio authored
      - Extract duplicated failover logic from gateway_handler.go (3 places)
        and gemini_v1beta_handler.go into shared failover_loop.go
      - Introduce FailoverState with HandleFailoverError and HandleSelectionExhausted
      - Move helper functions (needForceCacheBilling, sleepWithContext) into failover_loop.go
      - Add comprehensive unit tests (32+ test cases)
      - Delete redundant gateway_handler_single_account_retry_test.go
      09166a52
  5. 22 Feb, 2026 3 commits
  6. 14 Feb, 2026 1 commit
  7. 12 Feb, 2026 2 commits
  8. 10 Feb, 2026 1 commit
    • Edric Li's avatar
      fix: 修复错误透传规则 skip_monitoring 未生效的问题 · 2d4236f7
      Edric Li authored
      - ops_error_logger: status < 400 分支增加 OpsSkipPassthroughKey 检查
      - ops_upstream_context: 新增 checkSkipMonitoringForUpstreamEvent,中间重试/故障转移事件也能触发跳过标记
      - gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted 匹配规则后设置 OpsSkipPassthroughKey
      - antigravity_gateway_service: writeMappedClaudeError 增加 applyErrorPassthroughRule 调用
      2d4236f7
  9. 09 Feb, 2026 3 commits
    • Edric Li's avatar
      feat: same-account retry before failover for transient errors · d6c2921f
      Edric Li authored
      For retryable transient errors (Google 400 "invalid project resource name"
      and empty stream responses), retry on the same account up to 2 times
      (with 500ms delay) before switching to another account.
      
      - Add RetryableOnSameAccount field to UpstreamFailoverError
      - Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
      - Move temp-unschedule from service layer to handler layer (only after
        all same-account retries exhausted)
      - Reduce temp-unschedule cooldown from 30 minutes to 1 minute
      d6c2921f
    • Rose Ding's avatar
      fix: 单账号分组首次 503 不设模型限流标记,避免后续请求雪崩 · 021abfca
      Rose Ding authored
      单账号 antigravity 分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时,
      原逻辑会设置 ~29s 模型限流标记。由于只有一个账号无法切换,
      后续所有新请求在预检查时命中限流 → 几毫秒内直接返回 503,
      导致约 30 秒的雪崩窗口。
      
      修复:在 Handler 入口处检查分组是否只有单个 antigravity 账号,
      如果是则提前设置 SingleAccountRetry context 标记,让 Service 层
      首次 503 就走原地重试逻辑(不设限流标记),避免污染后续请求。
      021abfca
    • Rose Ding's avatar
      feat: 添加 Antigravity 单账号 503 退避重试机制 · f6cfab99
      Rose Ding authored
      当分组内只有一个可用账号且上游返回 503 (MODEL_CAPACITY_EXHAUSTED) 时,
      不再设置模型限流+切换账号(因为切换回来还是同一个账号),而是在 Service 层
      原地等待+重试,避免双重等待问题。
      
      主要变更:
      - Handler 层:检测单账号 503 场景,清除排除列表并设置 SingleAccountRetry 标记
      - Service 层:新增 handleSingleAccountRetryInPlace 原地重试逻辑
      - Service 层:预检查跳过单账号模式下的限流检查
      - 新增 ctxkey.SingleAccountRetry 上下文标记
      f6cfab99
  10. 08 Feb, 2026 5 commits
  11. 07 Feb, 2026 7 commits
    • yangjianbo's avatar
      fix(audit): 第二批审计修复 — P0 生产 Bug、安全加固、性能优化、缓存一致性、代码质量 · 2588fa6a
      yangjianbo authored
      
      
      基于 backend-code-audit 审计报告,修复剩余 P0/P1/P2 共 34 项问题:
      
      P0 生产 Bug:
      - 修复 time.Since(time.Now()) 计时逻辑错误 (P0-03)
      - generateRandomID 改用 crypto/rand 替代固定索引 (P0-04)
      - IncrementQuotaUsed 重写为 Ent 原子操作消除 TOCTOU 竞态 (P0-05)
      
      安全加固:
      - gateway/openai handler 错误响应替换为泛化消息,防止内部信息泄露 (P1-14)
      - usage_log_repo dateFormat 参数改用白名单映射,防止 SQL 注入 (P1-16)
      - 默认配置安全加固:sslmode=prefer、response_headers=true、mode=release (P1-18/19, P2-15)
      
      性能优化:
      - gateway handler 循环内 defer 替换为显式 releaseWait 闭包 (P1-02)
      - group_repo/promo_code_repo Count 前 Clone 查询避免状态污染 (P1-03)
      - usage_log_repo 四个查询添加 LIMIT 10000 防止 OOM (P1-07)
      - GetBatchUsageStats 添加时间范围参数,默认最近 30 天 (P1-10)
      - ip.go CIDR 预编译为包级变量 (P1-11)
      - BatchUpdateCredentials 重构为先验证后更新 (P1-13)
      
      缓存一致性:
      - billing_cache 添加 jitteredTTL 防止缓存雪崩 (P2-10)
      - DeductUserBalance/UpdateSubscriptionUsage 错误传播修复 (P2-12)
      - UserService.UpdateBalance 成功后异步失效 billingCache (P2-13)
      
      代码质量:
      - search 截断改为按 rune 处理,支持多字节字符 (P2-01)
      - TLS Handshake 改为 HandshakeContext 支持 context 取消 (P2-07)
      - CORS 预检添加 Access-Control-Max-Age: 86400 (P2-16)
      
      测试覆盖:
      - 新增 user_service_test.go(UpdateBalance 缓存失效 6 个用例)
      - 新增 batch_update_credentials_test.go(fail-fast + 类型验证 7 个用例)
      - 新增 response_transformer_test.go、ip_test.go、usage_log_repo_unit_test.go、search_truncate_test.go
      - 集成测试:IncrementQuotaUsed 并发测试、billing_cache 错误传播测试
      - config_test.go 补充 server.mode/sslmode 默认值断言
      Co-Authored-By: default avatarClaude Opus 4.6 <noreply@anthropic.com>
      2588fa6a
    • shaw's avatar
      6aaa4aee
    • erio's avatar
      refactor: remove Anthropic digest chain from Messages handler · 86b503f8
      erio authored
      The digest chain fallback is only needed for Gemini endpoints, not
      for the Anthropic Messages API path. Remove the handler integration
      while keeping the reusable service/repository layer for future use.
      86b503f8
    • erio's avatar
      feat: add Anthropic sticky session digest chain matching via Trie · 50a783ff
      erio authored
      The previous fallback (step 3) in GenerateSessionHash hashed system +
      all messages together, producing a different hash each round as the
      conversation grew ([a] -> [a,b] -> [a,b,c]). This made fallback sticky
      sessions ineffective for multi-turn conversations.
      
      Implement per-message Trie digest chain matching (reusing Gemini's Trie
      infrastructure) so that the previous round's chain is always a prefix
      of the current round's chain, enabling reliable session affinity.
      50a783ff
    • erio's avatar
      edb09370
    • erio's avatar
      feat(antigravity): comprehensive enhancements - model mapping, rate limiting, scheduling & ops · 5e98445b
      erio authored
      Key changes:
      - Upgrade model mapping: Opus 4.5 → Opus 4.6-thinking with precise matching
      - Unified rate limiting: scope-level → model-level with Redis snapshot sync
      - Load-balanced scheduling by call count with smart retry mechanism
      - Force cache billing support
      - Model identity injection in prompts with leak prevention
      - Thinking mode auto-handling (max_tokens/budget_tokens fix)
      - Frontend: whitelist mode toggle, model mapping validation, status indicators
      - Gemini session fallback with Redis Trie O(L) matching
      - Ops: enhanced concurrency monitoring, account availability, retry logic
      - Migration scripts: 049-051 for model mapping unification
      5e98445b
    • shaw's avatar
  12. 05 Feb, 2026 2 commits
    • shaw's avatar
      feat: 新增全局错误透传规则功能 · 39e05a2d
      shaw authored
      支持管理员配置上游错误如何返回给客户端:
      - 新增 ErrorPassthroughRule 数据模型和 Ent Schema
      - 实现规则的 CRUD API(/admin/error-passthrough-rules)
      - 支持按错误码、关键词匹配,支持 any/all 匹配模式
      - 支持按平台过滤(anthropic/openai/gemini/antigravity)
      - 支持透传或自定义响应状态码和错误消息
      - 实现两级缓存(Redis + 本地内存)和多实例同步
      - 集成到 gateway_handler 的错误处理流程
      - 新增前端管理界面组件
      - 新增单元测试覆盖核心匹配逻辑
      
      优化:
      - 移除 refreshLocalCache 中的冗余排序(数据库已排序)
      - 后端 Validate() 增加匹配条件非空校验
      39e05a2d
    • JIA-ss's avatar
      feat(gateway): filter /v1/usage stats by API Key instead of UserID · fa3ea5ee
      JIA-ss authored
      
      
      Previously the /v1/usage endpoint aggregated usage stats (today/total
      tokens, cost, RPM/TPM) across all API Keys belonging to the user.
      This made it impossible to distinguish usage from different API Keys
      (e.g. balance vs subscription keys).
      
      Now the usage stats are filtered by the current request's API Key ID,
      so each key only sees its own usage data. The balance/remaining fields
      are unaffected and still reflect the user-level wallet balance.
      
      Changes:
      - Add GetAPIKeyDashboardStats to repository interface and implementation
      - Add getPerformanceStatsByAPIKey helper (also fixes TPM to include
        cache_creation_tokens and cache_read_tokens)
      - Add GetAPIKeyDashboardStats to UsageService
      - Update Usage handler to call GetAPIKeyDashboardStats(apiKey.ID)
      Co-Authored-By: default avatarClaude Opus 4.5 <noreply@anthropic.com>
      fa3ea5ee
  13. 03 Feb, 2026 3 commits
    • shaw's avatar
      chore: fix gofmt formatting · df1c2383
      shaw authored
      df1c2383
    • bayma888's avatar
      feat(api-key): add independent quota and expiration support · 6146be14
      bayma888 authored
      This feature allows API Keys to have their own quota limits and expiration
      times, independent of the user's balance.
      
      Backend:
      - Add quota, quota_used, expires_at fields to api_key schema
      - Implement IsExpired() and IsQuotaExhausted() checks in middleware
      - Add ResetQuota and ClearExpiration API endpoints
      - Integrate quota billing in gateway handlers (OpenAI, Anthropic, Gemini)
      - Include quota/expiration fields in auth cache for performance
      - Expiration check returns 403, quota exhausted returns 429
      
      Frontend:
      - Add quota and expiration inputs to key create/edit dialog
      - Add quick-select buttons for expiration (+7, +30, +90 days)
      - Add reset quota confirmation dialog
      - Add expires_at column to keys list
      - Add i18n translations for new features (en/zh)
      
      Migration:
      - Add 045_add_api_key_quota.sql for new columns
      6146be14
    • song's avatar
      chore: gofmt · f1aafbc0
      song authored
      f1aafbc0
  14. 02 Feb, 2026 2 commits
    • song's avatar
      merge upstream main · 0170d19f
      song authored
      0170d19f
    • JIA-ss's avatar
      feat(gateway): 增强 /v1/usage 端点返回完整用量统计 · c441638f
      JIA-ss authored
      
      
      为 CC Switch 集成增强 /v1/usage 网关端点,在保持原有 4 字段
      (isValid, planName, remaining, unit) 向后兼容的基础上,新增:
      
      - usage 对象:今日/累计的请求数、token 用量、费用,以及 RPM/TPM
      - subscription 对象(订阅模式):日/周/月用量和限额、过期时间
      - balance 字段(余额模式):当前钱包余额
      
      用量数据获取采用 best-effort 策略,失败不影响基础响应。
      Co-Authored-By: default avatarClaude Opus 4.5 <noreply@anthropic.com>
      c441638f
  15. 01 Feb, 2026 1 commit
    • yangjianbo's avatar
      feat(Sora): 直连生成并移除sora2api依赖 · 399dd78b
      yangjianbo authored
      实现直连 Sora 客户端、媒体落地与清理策略\n更新网关与前端配置以支持 Sora 平台\n补齐单元测试与契约测试,新增 curl 测试脚本\n\n测试: go test ./... -tags=unit
      399dd78b
  16. 31 Jan, 2026 1 commit
  17. 29 Jan, 2026 1 commit
    • yangjianbo's avatar
      feat(sora): 新增 Sora 平台支持并修复高危安全和性能问题 · 13262a56
      yangjianbo authored
      
      
      新增功能:
      - 新增 Sora 账号管理和 OAuth 认证
      - 新增 Sora 视频/图片生成 API 网关
      - 新增 Sora 任务调度和缓存机制
      - 新增 Sora 使用统计和计费支持
      - 前端增加 Sora 平台配置界面
      
      安全修复(代码审核):
      - [SEC-001] 限制媒体下载响应体大小(图片 20MB、视频 200MB),防止 DoS 攻击
      - [SEC-002] 限制 SDK API 响应大小(1MB),防止内存耗尽
      - [SEC-003] 修复 SSRF 风险,添加 URL 验证并强制使用代理配置
      
      BUG 修复(代码审核):
      - [BUG-001] 修复 for 循环内 defer 累积导致的资源泄漏
      - [BUG-002] 修复图片并发槽位获取失败时已持有锁未释放的永久泄漏
      
      性能优化(代码审核):
      - [PERF-001] 添加 Sentinel Token 缓存(3 分钟有效期),减少 PoW 计算开销
      
      技术细节:
      - 使用 io.LimitReader 限制所有外部输入的大小
      - 添加 urlvalidator 验证防止 SSRF 攻击
      - 使用 sync.Map 实现线程安全的包级缓存
      - 优化并发槽位管理,添加 releaseAll 模式防止泄漏
      
      影响范围:
      - 后端:新增 Sora 相关数据模型、服务、网关和管理接口
      - 前端:新增 Sora 平台配置、账号管理和监控界面
      - 配置:新增 Sora 相关配置项和环境变量
      Co-Authored-By: default avatarClaude Sonnet 4.5 <noreply@anthropic.com>
      13262a56