1. 19 Mar, 2026 1 commit
  2. 18 Mar, 2026 8 commits
    • shaw's avatar
      feat: add 529 overload cooldown toggle and duration settings in admin gateway page · bf3d6c0e
      shaw authored
      Move 529 overload cooldown configuration from config file to admin
      settings UI. Adds an enable/disable toggle and configurable cooldown
      duration (1-120 min) under /admin/settings gateway tab, stored as
      JSON in the settings table.
      
      When disabled, 529 errors are logged but accounts are no longer
      paused from scheduling. Falls back to config file value when DB
      is unreachable or settingService is nil.
      bf3d6c0e
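The fallback behaviour described in this commit can be sketched roughly as follows. This is a minimal illustration, not the project's code: `settingsReader`, `overloadSetting`, `staticSettings`, `resolveCooldown`, and `defaultCooldown` are hypothetical names, and clamping to the 1-120 minute range is an assumption about how out-of-range values are handled.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// defaultCooldown stands in for the config-file value (hypothetical).
const defaultCooldown = 10 * time.Minute

// errDBUnreachable simulates a failed settings lookup.
var errDBUnreachable = errors.New("settings table unreachable")

// overloadSetting is an assumed shape for the JSON stored in the settings table.
type overloadSetting struct {
	Enabled bool
	Minutes int
}

// settingsReader is a hypothetical stand-in for settingService.
type settingsReader interface {
	OverloadCooldown() (overloadSetting, error)
}

// staticSettings is a test double that returns a fixed value or error.
type staticSettings struct {
	value overloadSetting
	err   error
}

func (s staticSettings) OverloadCooldown() (overloadSetting, error) { return s.value, s.err }

// resolveCooldown returns how long to pause an account after a 529.
// Zero means "log only, do not pause".
func resolveCooldown(s settingsReader, fallback time.Duration) time.Duration {
	if s == nil {
		return fallback // service missing: use the config-file value
	}
	v, err := s.OverloadCooldown()
	if err != nil {
		return fallback // DB unreachable: use the config-file value
	}
	if !v.Enabled {
		return 0
	}
	minutes := v.Minutes
	if minutes < 1 {
		minutes = 1
	}
	if minutes > 120 {
		minutes = 120
	}
	return time.Duration(minutes) * time.Minute
}

func main() {
	fmt.Println(resolveCooldown(nil, defaultCooldown))
	fmt.Println(resolveCooldown(staticSettings{err: errDBUnreachable}, defaultCooldown))
	fmt.Println(resolveCooldown(staticSettings{value: overloadSetting{Enabled: true, Minutes: 500}}, defaultCooldown))
}
```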
    • erio's avatar
      feat: map claude-haiku-4-5 variants to claude-sonnet-4-6 · af96c8ea
      erio authored
      Update model mapping target for claude-haiku-4-5 and
      claude-haiku-4-5-20251001 from claude-sonnet-4-5 to claude-sonnet-4-6.
      Includes migration script, default constants, and test updates.
      af96c8ea
    • alfadb's avatar
    • alfadb's avatar
      fix: strip empty text blocks in retry filter and fix error pattern matching · b8ada63a
      alfadb authored
      
      
      Empty text blocks ({"type":"text","text":""}) cause Anthropic upstream to
      return 400: "text content blocks must be non-empty". This was not caught
      by the existing error detection pattern in isThinkingBlockSignatureError,
      nor handled by FilterThinkingBlocksForRetry.
      
      - Add empty text block stripping to FilterThinkingBlocksForRetry
      - Fix isThinkingBlockSignatureError to match new Anthropic error format
      - Add fast-path byte patterns to avoid unnecessary JSON parsing
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
      b8ada63a
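As a rough illustration of the empty-text-block stripping and the byte-pattern fast path mentioned above, the sketch below filters a content array. `stripEmptyTextBlocks` is a hypothetical helper, not the real FilterThinkingBlocksForRetry, and the fast path only recognises the compact form `"text":""` (no whitespace), which is a simplification.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// stripEmptyTextBlocks removes {"type":"text","text":""} entries from a JSON
// array of content blocks. Other fields and block types pass through as-is.
func stripEmptyTextBlocks(content []byte) ([]byte, error) {
	// Fast path: skip JSON parsing when the pattern cannot be present.
	if !bytes.Contains(content, []byte(`"text":""`)) {
		return content, nil
	}
	var blocks []map[string]json.RawMessage
	if err := json.Unmarshal(content, &blocks); err != nil {
		return nil, err
	}
	kept := make([]map[string]json.RawMessage, 0, len(blocks))
	for _, b := range blocks {
		if string(b["type"]) == `"text"` && string(b["text"]) == `""` {
			continue // upstream rejects empty text blocks with a 400
		}
		kept = append(kept, b)
	}
	return json.Marshal(kept)
}

func main() {
	in := []byte(`[{"type":"text","text":""},{"type":"text","text":"hi"}]`)
	out, err := stripEmptyTextBlocks(in)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(out))
}
```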
    • shaw's avatar
      fix: support the new JSON metadata.user_id format in Claude Code v2.1.78+ · a14babdc
      shaw authored
      Starting with Claude Code v2.1.78, metadata.user_id changes from a concatenated string to JSON:
      Old: user_{hex}_account_{uuid}_session_{uuid}
      New: {"device_id":"...","account_uuid":"...","session_id":"..."}
      
      Add a centralized parsing/formatting module, metadata_userid.go:
      - ParseMetadataUserID: auto-detects both formats and extracts DeviceID/AccountUUID/SessionID
      - FormatMetadataUserID: emits the format matching the UA version (JSON for >= 2.1.78)
      - ExtractCLIVersion: extracts the version from the UA, removing the duplication with ClaudeCodeValidator.ExtractVersion
      
      Update consumers to use the new module:
      - claude_code_validator: use ParseMetadataUserID instead of userIDPattern, which matched only the old format
      - identity_service: RewriteUserID/WithMasking take a new fingerprintUA parameter,
        parse with ParseMetadataUserID, and emit with FormatMetadataUserID (version-aware)
      - gateway_service: GenerateSessionHash extracts session_id via ParseMetadataUserID,
        buildOAuthMetadataUserID emits the version-matched format via FormatMetadataUserID,
        and both RewriteUserIDWithMasking call sites pass fp.UserAgent
      - account_test_service: generateSessionString now uses FormatMetadataUserID,
        automatically following the DefaultHeaders UA version
      
      Remove three old regexes: userIDPattern, userIDRegex, sessionIDRegex
      Unify hex matching as [a-fA-F0-9], fixing the inconsistency where the old userIDRegex matched only lowercase
      a14babdc
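A dual-format parser along the lines of this commit might look like the sketch below. It is an assumption-laden illustration: `parseUserID`, `parsedUserID`, and `legacyUserID` are hypothetical names, the legacy regex is a guess that accepts hex digits and dashes for the UUID parts, and the real ParseMetadataUserID may validate differently.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
	"strings"
)

// parsedUserID holds the three fields named in the commit message.
type parsedUserID struct {
	DeviceID    string
	AccountUUID string
	SessionID   string
}

// legacyUserID matches the old concatenated form (assumed pattern).
var legacyUserID = regexp.MustCompile(
	`^user_([a-fA-F0-9]+)_account_([a-fA-F0-9-]*)_session_([a-fA-F0-9-]+)$`)

// parseUserID accepts either the JSON form or the legacy string form.
func parseUserID(raw string) (parsedUserID, bool) {
	raw = strings.TrimSpace(raw)
	if strings.HasPrefix(raw, "{") {
		var v struct {
			DeviceID    string `json:"device_id"`
			AccountUUID string `json:"account_uuid"`
			SessionID   string `json:"session_id"`
		}
		if err := json.Unmarshal([]byte(raw), &v); err != nil || v.SessionID == "" {
			return parsedUserID{}, false
		}
		return parsedUserID{DeviceID: v.DeviceID, AccountUUID: v.AccountUUID, SessionID: v.SessionID}, true
	}
	m := legacyUserID.FindStringSubmatch(raw)
	if m == nil {
		return parsedUserID{}, false
	}
	return parsedUserID{DeviceID: m[1], AccountUUID: m[2], SessionID: m[3]}, true
}

func main() {
	fmt.Println(parseUserID("user_ABcd12_account_1111-22_session_3333-44"))
	fmt.Println(parseUserID(`{"device_id":"ABcd12","account_uuid":"1111-22","session_id":"3333-44"}`))
}
```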
    • QTom's avatar
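      feat(admin): add capacity column to group management (real-time concurrency/sessions/RPM aggregation) · d4cc9871
      QTom authored
      
      
      Reuse GroupCapacityService to add a capacity column to the admin group list,
      showing each group's real-time concurrency/session/RPM usage and limits.
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>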
      d4cc9871
    • QTom's avatar
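      feat(admin): add usage columns and account-count breakdown to the group management list · 961c30e7
      QTom authored
      
      
      Group management list enhancements:
      
      1. Today/cumulative usage columns:
         - Add a dedicated endpoint, GET /admin/groups/usage-summary
         - A single query returns today's cost and cumulative cost (actual_cost) for all groups
         - The frontend loads this asynchronously and merges it into the group list
      
      2. Account counts split into available/rate-limited/total:
         - Change the account-count column from a single total to a multi-line display inside the badge
         - Available: accounts that are active + schedulable (green)
         - Rate-limited: accounts in rate_limit/overload/temp_unschedulable (orange; hidden when none are rate-limited)
         - Total: all associated accounts
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>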
      961c30e7
    • Gemini Wen's avatar
      feat: add platform type filter to subscription management page · 50a3c7fa
      Gemini Wen authored
      
      
      Add a platform filter dropdown to the admin subscriptions view, allowing
      filtering subscriptions by platform (Anthropic, OpenAI, Gemini, etc.)
      through the group association.
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
      50a3c7fa
  3. 17 Mar, 2026 5 commits
    • Ethan0x0000's avatar
      test(backend): add tests for upstream model tracking and model source filtering · eeff451b
      Ethan0x0000 authored
      Cover IsValidModelSource/NormalizeModelSource, resolveModelDimensionExpression SQL expressions, invalid model_source 400 responses on both GetModelStats and GetUserBreakdown, upstream_model in scan/insert SQL mock expectations, and updated passthrough/billing test signatures.
      eeff451b
    • Ethan0x0000's avatar
      feat(dashboard): add model source dimension to stats queries · 7134266a
      Ethan0x0000 authored
      Support querying model statistics by 'requested', 'upstream', or 'mapping' dimension. Add resolveModelDimensionExpression for safe SQL expression generation, IsValidModelSource whitelist validator, and NormalizeModelSource fallback. Repository persists and scans upstream_model in all insert/select paths.
      7134266a
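The whitelist-plus-fallback idea behind IsValidModelSource, NormalizeModelSource, and resolveModelDimensionExpression can be sketched as below. The function names are lowercased stand-ins, and the SQL fragments and the `model` column name are purely illustrative assumptions; only the three dimension names and the `upstream_model` column come from the commit messages.

```go
package main

import "fmt"

const (
	sourceRequested = "requested"
	sourceUpstream  = "upstream"
	sourceMapping   = "mapping"
)

// isValidModelSource is a whitelist check: only known dimensions pass.
func isValidModelSource(s string) bool {
	switch s {
	case sourceRequested, sourceUpstream, sourceMapping:
		return true
	}
	return false
}

// normalizeModelSource falls back to "requested" for anything unknown.
func normalizeModelSource(s string) string {
	if isValidModelSource(s) {
		return s
	}
	return sourceRequested
}

// modelDimensionExpr maps a dimension to a fixed SQL expression. User input
// never reaches the SQL text, only one of these constants does.
// The expressions themselves are assumptions for illustration.
func modelDimensionExpr(source string) string {
	switch normalizeModelSource(source) {
	case sourceUpstream:
		return "COALESCE(upstream_model, model)"
	case sourceMapping:
		return "model || ' -> ' || COALESCE(upstream_model, model)"
	default:
		return "model"
	}
}

func main() {
	fmt.Println(modelDimensionExpr("upstream"))
	fmt.Println(modelDimensionExpr("upstream; DROP TABLE usage_logs"))
}
```

Because the caller-supplied string only selects among constants, an unexpected value cannot alter the generated SQL.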
    • Ethan0x0000's avatar
      feat(service): record upstream model across all gateway paths · 2e4ac88a
      Ethan0x0000 authored
      Propagate UpstreamModel through ForwardResult and OpenAIForwardResult in Anthropic direct, API-key passthrough, Bedrock, and OpenAI gateway flows. Extract optionalNonEqualStringPtr and optionalTrimmedStringPtr into usage_log_helpers.go. Store upstream_model only when it differs from the requested model.
      
      Also introduces anthropicPassthroughForwardInput struct to reduce parameter count.
      2e4ac88a
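The rule "store upstream_model only when it differs from the requested model" can be expressed with a small pointer helper. The sketch below guesses at the behaviour of optionalNonEqualStringPtr; the name `nonEqualStringPtr` and the trimming are assumptions, since the real signature is not shown in the log.

```go
package main

import (
	"fmt"
	"strings"
)

// nonEqualStringPtr returns nil when value is empty or equal to compare,
// so the column stays NULL unless a mapping actually happened.
func nonEqualStringPtr(value, compare string) *string {
	value = strings.TrimSpace(value)
	if value == "" || value == strings.TrimSpace(compare) {
		return nil
	}
	return &value
}

func main() {
	fmt.Println(nonEqualStringPtr("claude-sonnet-4-6", "claude-sonnet-4-6") == nil)
	if p := nonEqualStringPtr("claude-sonnet-4-6", "claude-haiku-4-5"); p != nil {
		fmt.Println(*p)
	}
}
```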
    • haruka's avatar
      fix(review): address Copilot PR feedback · 869952d1
      haruka authored
      
      
      - Add compile-time interface assertion for sessionWindowMockRepo
      - Fix flaky fallback test by capturing time.Now() before calling UpdateSessionWindow
      - Replace stale hardcoded timestamps with dynamic future values
      - Add millisecond detection and bounds validation for reset header timestamp
      - Use pause/resume pattern for interval in UsageProgressBar to avoid idle timers on large lists
      - Fix gofmt comment alignment
      Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
      869952d1
    • luxiang's avatar
      7e34bb94
  4. 16 Mar, 2026 11 commits
    • erio's avatar
      refactor(antigravity): unify TestConnection with dispatch retry loop · a6f99cf5
      erio authored
      TestConnection now reuses antigravityRetryLoop instead of a standalone
      HTTP loop, gaining credits overages, smart retry, and 429/503 backoff
      for free. AccountSwitchError is caught and surfaced as a friendly
      message. Also populates RateLimitedModel in TempUnscheduled switch error.
      
      Test fixes:
      - Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout
      - Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout
      - Enhance mockSmartRetryUpstream with repeatLast and body caching
      a6f99cf5
    • erio's avatar
      feat(dashboard): add per-user drill-down for group, model, and endpoint distributions · 4b41e898
      erio authored
      Click on a group name, model name, or endpoint name in the distribution
      tables to expand and show per-user usage breakdown (requests, tokens,
      actual cost, standard cost).
      
      Backend: new GET /admin/dashboard/user-breakdown API with group_id,
      model, endpoint, endpoint_type filters.
      Frontend: clickable rows with expand/collapse sub-table in all three
      distribution charts.
      4b41e898
    • Elysia's avatar
      fix(usage): use real reset header for session window instead of prediction · 668e1647
      Elysia authored
      
      
      The 5h window reset time displayed for Setup Token accounts was inaccurate
      because UpdateSessionWindow predicted the window end as "current hour + 5h"
      instead of reading the actual `anthropic-ratelimit-unified-5h-reset` response
      header. This caused the countdown to differ from the official Claude page.
      
      Backend: parse the reset header (Unix timestamp) and use it as the real
      window end, falling back to the hour-truncated prediction only when the
      header is absent. Also correct stale predictions when a subsequent request
      provides the real reset time.
      
      Frontend: add a reactive 60s timer so the reset countdown in
      UsageProgressBar ticks down in real-time instead of freezing at the
      initial value.
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
      668e1647
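The header-first, prediction-as-fallback logic described above could be sketched as follows. `windowEnd` and `sampleNow` are hypothetical names; the millisecond threshold and the "must lie within the next 5 hours" bound are assumptions, loosely modelled on the millisecond detection and bounds validation mentioned in the follow-up review commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"time"
)

const sessionWindow = 5 * time.Hour

// sampleNow is a fixed clock value used for the demonstration.
var sampleNow = time.Date(2026, time.March, 16, 10, 42, 0, 0, time.UTC)

// windowEnd prefers the reset header (a Unix timestamp) and falls back to
// the hour-truncated prediction when the header is absent or implausible.
func windowEnd(header string, now time.Time) time.Time {
	predicted := now.Truncate(time.Hour).Add(sessionWindow)
	v, err := strconv.ParseInt(strings.TrimSpace(header), 10, 64)
	if err != nil {
		return predicted // header missing or not a number
	}
	if v > 1_000_000_000_000 {
		v /= 1000 // assumed heuristic: value looks like milliseconds
	}
	reset := time.Unix(v, 0)
	// Assumed bounds: the reset must be in the future and within one window.
	if !reset.After(now) || reset.After(now.Add(sessionWindow)) {
		return predicted
	}
	return reset
}

func main() {
	fmt.Println(windowEnd("", sampleNow).UTC())
	fmt.Println(windowEnd("1773670000", sampleNow).UTC())
	fmt.Println(windowEnd("1773670000000", sampleNow).UTC())
}
```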
    • Elysia's avatar
      fix(oauth): extract system-role input items into instructions field · fa2e6188
      Elysia authored
      
      
      OAuth upstreams (ChatGPT) reject requests containing role:"system" in
      the input array with HTTP 400 "System messages are not allowed". Extract
      such items before forwarding and merge their content into the top-level
      instructions field, prepending to any existing value.
      Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
      fa2e6188
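A simplified version of the extraction step might look like this. The `request` and `inputItem` types are hypothetical and model content as a plain string (the real payload may carry structured content), and the blank-line separator used when merging is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// inputItem is a simplified input entry; real items may hold structured content.
type inputItem struct {
	Role    string
	Content string
}

// request is a hypothetical, trimmed-down request body.
type request struct {
	Instructions string
	Input        []inputItem
}

// liftSystemItems removes role:"system" items from Input and prepends their
// content to Instructions, keeping any existing instructions at the end.
func liftSystemItems(req *request) {
	var system []string
	kept := make([]inputItem, 0, len(req.Input))
	for _, it := range req.Input {
		if it.Role == "system" {
			system = append(system, it.Content)
			continue
		}
		kept = append(kept, it)
	}
	if len(system) == 0 {
		return // nothing to extract, leave the request untouched
	}
	if req.Instructions != "" {
		system = append(system, req.Instructions)
	}
	req.Instructions = strings.Join(system, "\n\n")
	req.Input = kept
}

func main() {
	req := &request{
		Instructions: "be brief",
		Input: []inputItem{
			{Role: "system", Content: "You are a helper."},
			{Role: "user", Content: "hi"},
		},
	}
	liftSystemItems(req)
	fmt.Printf("%q %d\n", req.Instructions, len(req.Input))
}
```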
    • Ethan0x0000's avatar
    • QTom's avatar
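      feat(backup): make backup/restore asynchronous to resolve 504 timeouts · c1fab7f8
      QTom authored
      
      
      POST /backups and POST /backups/:id/restore are now asynchronous: they return HTTP 202 immediately,
      a background goroutine independently runs pg_dump → gzip → S3 upload, and the frontend polls the status every 2s.
      
      Backend:
      - Add StartBackup/StartRestore methods; the background goroutine does not depend on the HTTP connection
      - Graceful shutdown waits for active operations to finish; orphaned running records are cleaned up at startup
      - BackupRecord gains progress/restore_status fields to track progress and restore status
      
      Frontend:
      - After creating a backup/restore, poll GET /backups/:id until it completes or fails
      - Pause/resume polling on tab switches; clear timers when the component unmounts
      - Correctly handle 409 (backup in progress) and polling timeouts
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>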
      c1fab7f8
    • kunish's avatar
      fix(antigravity): add stream keepalive to prevent connection drops · d7957343
      kunish authored
      Antigravity streaming handlers were missing the keepalive mechanism
      that exists in the standard gateway, causing proxy/CDN idle timeouts
      to break connections during long thinking phases (e.g. claude-opus-4-6).
      This resulted in truncated responses with missing tool calls.
      
      Add StreamKeepaliveInterval support to all three Antigravity streaming
      paths: Claude SSE, Gemini SSE, and upstream passthrough.
      d7957343
    • Ethan0x0000's avatar
      fix: always attach OpenAI 5h/7d window stats regardless of zero values · fa782e70
      Ethan0x0000 authored
      Removes hasMeaningfulWindowStats guard so the /usage endpoint consistently
      returns WindowStats for both time windows. The frontend now controls
      zero-value display filtering at the component level.
      fa782e70
    • Ethan0x0000's avatar
      fix: allow empty extra payload to clear account quota limits · afd72abc
      Ethan0x0000 authored
      UpdateAccount previously required len(input.Extra) > 0, causing explicit
      empty payloads (extra:{}) to be silently skipped. Change condition to
      input.Extra != nil so clearing quota keys actually persists.
      afd72abc
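The difference between the two conditions comes down to how Go decodes JSON: an explicit `{}` yields a non-nil empty map, while an absent key leaves the map nil. The sketch below demonstrates this with a hypothetical `updateInput` type; only the `extra` field name comes from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// updateInput is a hypothetical, trimmed-down request payload.
type updateInput struct {
	Extra map[string]any `json:"extra"`
}

// decodeInput parses a JSON request body.
func decodeInput(body string) (updateInput, error) {
	var in updateInput
	err := json.Unmarshal([]byte(body), &in)
	return in, err
}

// shouldUpdateExtraOld mirrors the previous condition: empty maps are skipped.
func shouldUpdateExtraOld(in updateInput) bool { return len(in.Extra) > 0 }

// shouldUpdateExtra mirrors the fixed condition: only an absent field is skipped.
func shouldUpdateExtra(in updateInput) bool { return in.Extra != nil }

func main() {
	for _, body := range []string{`{"extra":{}}`, `{}`} {
		in, err := decodeInput(body)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(body, shouldUpdateExtraOld(in), shouldUpdateExtra(in))
	}
}
```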
    • QTom's avatar
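      fix(gateway): conditionally MarkBroken in the WS connection pool to prevent cross-request stream leakage · 3741617e
      QTom authored
      After exiting on a normal terminal event (response.completed, etc.), the connection is returned for reuse;
      only abnormal paths (read/write errors, error events, client disconnects) call MarkBroken to destroy it.
      
      Generate mode:
      - Introduce a cleanExit flag, set to true only on the isTerminalEvent break
      - The defer decides whether to MarkBroken based on cleanExit
      - All abnormal paths already call MarkBroken early in their own branches
      
      Ingress mode:
      - Introduce a lastTurnClean flag, set to true when sendAndRelay completes normally
      - releaseSessionLease decides whether to MarkBroken based on lastTurnClean
      - Error paths reset lastTurnClean = false
      - Draining after a client disconnect still conservatively calls MarkBroken (L2916)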
      3741617e
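The cleanExit pattern from the Generate mode can be illustrated with a toy relay loop. Everything here is a stand-in: `pooledConn` and `relay` are hypothetical, events are plain strings, and only `response.completed` is treated as terminal because the other terminal events are not listed in the log.

```go
package main

import "fmt"

// pooledConn is a toy connection that records whether it was marked broken.
type pooledConn struct{ broken bool }

// MarkBroken flags the connection so the pool destroys it instead of reusing it.
func (c *pooledConn) MarkBroken() { c.broken = true }

// isTerminalEvent reports whether the stream ended normally.
// Only response.completed is listed here; other terminal events are elided.
func isTerminalEvent(ev string) bool { return ev == "response.completed" }

// relay consumes events and leaves the connection reusable only after a
// clean terminal event. Any other way out of the loop marks it broken.
func relay(c *pooledConn, events []string) {
	cleanExit := false
	defer func() {
		if !cleanExit {
			c.MarkBroken()
		}
	}()
	for _, ev := range events {
		if ev == "error" {
			return // abnormal path: the deferred check marks the connection broken
		}
		if isTerminalEvent(ev) {
			cleanExit = true
			break
		}
	}
}

func main() {
	ok := &pooledConn{}
	relay(ok, []string{"response.created", "response.completed"})
	cut := &pooledConn{}
	relay(cut, []string{"response.created"})
	fmt.Println(ok.broken, cut.broken)
}
```

Defaulting to "broken" and opting in to reuse means a forgotten code path errs on the side of discarding the connection rather than leaking leftover frames into the next request.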
    • QTom's avatar
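      fix(gateway): prevent cross-user stream leakage with OpenAI Codex · ab4e8b2c
      QTom authored
      Root cause: when multiple users share one OAuth account, the conversation_id/session_id headers
      were not isolated per user, so the upstream chatgpt.com associated different users' requests with the same session.
      
      HTTP SSE fixes:
      - Add isolateOpenAISessionID(apiKeyID, raw), which mixes the API Key ID into
        the session identifier (xxhash) so that users with different keys get different upstream sessions
      - buildUpstreamRequest: the OAuth branch first Dels the client-supplied session headers,
        then overwrites them with the isolated values
      - buildUpstreamRequestOpenAIPassthrough: the passthrough path is isolated the same way
      - ForwardAsAnthropic: the Anthropic Messages compatibility path gets the same fix
      - buildOpenAIWSHeaders: the OAuth session headers on the WS path are isolated as well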
      ab4e8b2c
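The isolation idea (mixing the API key ID into the session identifier) can be sketched with the standard library. Note the substitution: the commit uses xxhash, whereas this example uses FNV-1a from `hash/fnv` to stay dependency-free. `isolateSessionID`, its output format, and the empty-input behaviour are assumptions.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// isolateSessionID derives a per-key session identifier so that two API keys
// sharing one upstream account never present the same session ID.
// FNV-1a is a stand-in here; the commit itself mentions xxhash.
func isolateSessionID(apiKeyID int64, raw string) string {
	if raw == "" {
		return "" // nothing to isolate
	}
	h := fnv.New64a()
	fmt.Fprintf(h, "%d:%s", apiKeyID, raw)
	return fmt.Sprintf("%016x", h.Sum64())
}

func main() {
	fmt.Println(isolateSessionID(1, "sess-abc"))
	fmt.Println(isolateSessionID(2, "sess-abc"))
}
```

The mapping is deterministic, so one user's requests keep landing in the same upstream session while different keys are kept apart.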
  5. 15 Mar, 2026 15 commits
    • erio's avatar
      fix: resolve golangci-lint issues (gofmt, errcheck) · 552a4b99
      erio authored
      - Fix gofmt alignment in admin_service.go and trailing newline in
        antigravity_credits_overages.go
      - Suppress errcheck for fmt.Sscanf in client.go GetMinimumAmount
      552a4b99
    • erio's avatar
      fix: remove ClaudeMax references not yet in upstream/main · 0d2061b2
      erio authored
      Remove SimulateClaudeMaxEnabled field and related logic from
      admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
      applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
      from antigravity_gateway_service.go. These symbols are not yet
      available in upstream/main.
      0d2061b2
    • erio's avatar
      refactor: replace sync.Map credits state with AICredits rate limit key · 8a260def
      erio authored
      Replace process-memory sync.Map + per-model runtime state with a single
      "AICredits" key in model_rate_limits, making credits exhaustion fully
      isomorphic with model-level rate limiting.
      
      Scheduler: rate-limited accounts with overages enabled + credits available
      are now scheduled instead of excluded.
      
      Forwarding: when model is rate-limited + credits available, inject credits
      proactively without waiting for a 429 round trip.
      
      Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
      5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
      
      Frontend: show credits_active (yellow) when model rate-limited but
      credits available, credits_exhausted (red) when AICredits key active.
      
      Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
      clearCreditsExhausted, and update existing overages tests.
      8a260def
    • SilentFlower's avatar
    • SilentFlower's avatar
    • SilentFlower's avatar
      feat: implement resolveCreditsOveragesModelKey function to stabilize model key... · 17e40333
      SilentFlower authored
      feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages
      17e40333
    • erio's avatar
      044d3a01
    • erio's avatar
      feat: unified OAuth token refresh API with distributed locking · 1fc9dd7b
      erio authored
      Introduce OAuthRefreshAPI as the single entry point for all OAuth token
      refresh operations, eliminating the race condition where background
      refresh and inline refresh could simultaneously use the same
      refresh_token (fixes #1035).
      
      Key changes:
      - Add OAuthRefreshExecutor interface extending TokenRefresher with CacheKey
      - Add OAuthRefreshAPI.RefreshIfNeeded with lock → DB re-read → double-check flow
      - Add ProviderRefreshPolicy / BackgroundRefreshPolicy strategy types
      - Simplify all 4 TokenProviders to delegate to OAuthRefreshAPI
      - Rewrite TokenRefreshService.refreshWithRetry to use unified API path
      - Add MergeCredentials and BuildClaudeAccountCredentials helpers
      - Add 40 unit tests covering all new and modified code paths
      1fc9dd7b
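The lock → re-read → double-check flow can be shown in miniature with an in-process mutex. This is only an analogy: `tokenStore` is hypothetical, `sync.Mutex` stands in for the distributed lock, and a struct field stands in for the database row.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// token is a simplified credential record.
type token struct {
	Access    string
	ExpiresAt time.Time
}

// tokenStore is a hypothetical in-memory stand-in for lock plus database.
type tokenStore struct {
	mu        sync.Mutex // stands in for the distributed lock
	tok       token      // stands in for the persisted credentials
	refreshes int        // counts how often the upstream refresh ran
}

// refreshIfNeeded takes the lock, re-reads the stored token, and refreshes
// only if the token is still about to expire after the re-read.
func (s *tokenStore) refreshIfNeeded(now time.Time, skew time.Duration, refresh func() token) token {
	s.mu.Lock()
	defer s.mu.Unlock()
	current := s.tok // re-read under the lock
	if current.ExpiresAt.Sub(now) > skew {
		return current // another caller refreshed while we waited
	}
	s.tok = refresh()
	s.refreshes++
	return s.tok
}

// countRefreshes runs concurrent callers against one expired token and
// reports how many upstream refreshes actually happened.
func countRefreshes(callers int) int {
	now := time.Now()
	s := &tokenStore{tok: token{Access: "old", ExpiresAt: now.Add(-time.Minute)}}
	refresh := func() token { return token{Access: "new", ExpiresAt: now.Add(time.Hour)} }
	var wg sync.WaitGroup
	for i := 0; i < callers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.refreshIfNeeded(now, 5*time.Minute, refresh)
		}()
	}
	wg.Wait()
	return s.refreshes
}

func main() {
	fmt.Println(countRefreshes(8))
}
```

Only the first caller to obtain the lock uses the refresh token; the rest observe the fresh value on re-read, which is the race the commit describes eliminating.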
    • Ethan0x0000's avatar
      feat: add InboundEndpoint/UpstreamEndpoint fields to non-OpenAI usage records · 1b79b0f3
      Ethan0x0000 authored
      Extend RecordUsageInput and RecordUsageLongContextInput structs with InboundEndpoint and UpstreamEndpoint so that Claude, Gemini, and Sora handlers can record endpoint info alongside OpenAI handlers.
      
      Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-opencode)
      Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
      1b79b0f3
    • shaw's avatar
      ae44a943
    • IanShaw027's avatar
    • erio's avatar
      feat(ops): add ignore insufficient balance errors toggle and extract error constants · cfe72159
      erio authored
      - Add 5th error filter switch IgnoreInsufficientBalanceErrors to suppress
        upstream insufficient balance / insufficient_quota errors from ops log
      - Extract hardcoded error strings into package-level constants for
        shouldSkipOpsErrorLog, normalizeOpsErrorType, classifyOpsPhase, and
        classifyOpsIsBusinessLimited
      - Define ErrNoAvailableAccounts sentinel error and replace all
        errors.New("no available accounts") call sites
      - Update tests to use require.ErrorIs with the sentinel error
      cfe72159
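The sentinel-error change can be illustrated as follows. `ErrNoAvailableAccounts` is named in the commit; `selectAccount` and `isNoAccounts` are hypothetical helpers added to show why a sentinel survives wrapping where a fresh `errors.New` value would not.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrNoAvailableAccounts is the sentinel named in the commit message.
var ErrNoAvailableAccounts = errors.New("no available accounts")

// selectAccount is a hypothetical scheduler step that wraps the sentinel
// with extra context when nothing can be scheduled.
func selectAccount(groupID int64, ids []int64) (int64, error) {
	if len(ids) == 0 {
		return 0, fmt.Errorf("group %d: %w", groupID, ErrNoAvailableAccounts)
	}
	return ids[0], nil
}

// isNoAccounts matches the sentinel even through wrapping.
func isNoAccounts(err error) bool {
	return errors.Is(err, ErrNoAvailableAccounts)
}

func main() {
	_, err := selectAccount(42, nil)
	fmt.Println(err, isNoAccounts(err))
	// A freshly created error with the same text does not match.
	fmt.Println(isNoAccounts(errors.New("no available accounts")))
}
```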
    • erio's avatar
      fix(billing): allow clearing group quota limits and treat 0 as zero-limit · 5899784a
      erio authored
      Previously, v-model.number produced "" when input was cleared, causing
      JSON decode errors on the backend. Also, normalizeLimit treated 0 as
      "unlimited" which prevented setting a zero quota. Now "" is converted
      to null (unlimited) in frontend, and 0 is preserved as a valid limit.
      
      Closes Wei-Shaw/sub2api#1021
      5899784a
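The distinction between "no limit" and "a limit of zero" maps naturally onto a nil pointer versus a pointer to 0. The sketch below is an illustration only: this `normalizeLimit` is a guess at the fixed behaviour, `withinLimit` is hypothetical, and treating negative values as unlimited is an assumption.

```go
package main

import "fmt"

// normalizeLimit keeps 0 as a real limit; only nil (JSON null) means unlimited.
// Negative values are treated as unlimited here, which is an assumption.
func normalizeLimit(v *float64) *float64 {
	if v == nil || *v < 0 {
		return nil
	}
	return v
}

// withinLimit reports whether more usage is allowed under the given limit.
func withinLimit(limit *float64, used float64) bool {
	if limit == nil {
		return true // unlimited
	}
	return used < *limit
}

func main() {
	zero := 0.0
	fmt.Println(withinLimit(normalizeLimit(&zero), 0)) // zero quota blocks usage
	fmt.Println(withinLimit(normalizeLimit(nil), 1e9)) // null means unlimited
}
```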
    • erio's avatar
      fix(billing): treat nil rate limit window as expired to prevent usage accumulation · 9e8959c5
      erio authored
      When Redis cache is populated from DB with a NULL window_1d_start, the
      Lua increment script only updates usage counters without setting window
      timestamps. IsWindowExpired(nil) previously returned false, so the
      accumulated usage was never reset across time windows, effectively
      turning usage_1d into a lifetime counter. Once this exceeded
      rate_limit_1d the key was incorrectly blocked with "日限额已用完" ("daily limit exhausted").
      
      Fixes Wei-Shaw/sub2api#1022
      9e8959c5
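The nil-handling fix can be sketched like this. The lowercase `isWindowExpired` and its signature are assumptions modelled on the IsWindowExpired behaviour described above; `demoNow` and `dayWindow` exist only for the demonstration.

```go
package main

import (
	"fmt"
	"time"
)

const dayWindow = 24 * time.Hour

// demoNow is a fixed clock value used for the demonstration.
var demoNow = time.Date(2026, time.March, 15, 12, 0, 0, 0, time.UTC)

// isWindowExpired treats a missing window start as expired, so the next
// request resets the counters instead of accumulating usage forever.
func isWindowExpired(start *time.Time, window time.Duration, now time.Time) bool {
	if start == nil {
		return true
	}
	return !now.Before(start.Add(window))
}

func main() {
	recent := demoNow.Add(-dayWindow / 2)
	old := demoNow.Add(-2 * dayWindow)
	fmt.Println(isWindowExpired(nil, dayWindow, demoNow))
	fmt.Println(isWindowExpired(&recent, dayWindow, demoNow))
	fmt.Println(isWindowExpired(&old, dayWindow, demoNow))
}
```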
    • YanzheL's avatar
      fix: extract and log Claude output_config.effort in usage records · 1bff2292
      YanzheL authored
      Claude's output_config.effort parameter (low/medium/high/max) was not
      being extracted from requests or logged in the reasoning_effort column
      of usage logs. Only the OpenAI path populated this field.
      
      Changes:
      - Extract output_config.effort in ParseGatewayRequest
      - Add ReasoningEffort field to ForwardResult
      - Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext
      - Guard against overwriting service-set effort values in handler
      - Update stale comments that described reasoning_effort as OpenAI-only
      - Add unit tests for extraction, normalization, and persistence
      1bff2292
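Extraction of `output_config.effort` might look roughly like the sketch below. `extractEffort` is a hypothetical helper, not ParseGatewayRequest itself, and the lowercasing plus "unknown values become empty" normalization is an assumption; only the field path and the four effort levels come from the commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// extractEffort reads output_config.effort from a request body and returns
// it normalized, or "" when it is absent or not a recognised level.
func extractEffort(body []byte) string {
	var req struct {
		OutputConfig struct {
			Effort string `json:"effort"`
		} `json:"output_config"`
	}
	if err := json.Unmarshal(body, &req); err != nil {
		return ""
	}
	effort := strings.ToLower(strings.TrimSpace(req.OutputConfig.Effort))
	switch effort {
	case "low", "medium", "high", "max":
		return effort
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", extractEffort([]byte(`{"output_config":{"effort":"High"}}`)))
	fmt.Printf("%q\n", extractEffort([]byte(`{"model":"claude-opus-4-6"}`)))
}
```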