1. 15 Mar, 2026 3 commits
    • fix: remove ClaudeMax references not yet in upstream/main · 0d2061b2
      erio authored
      Remove SimulateClaudeMaxEnabled field and related logic from
      admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
      applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
      from antigravity_gateway_service.go. These symbols are not yet
      available in upstream/main.
    • refactor: replace sync.Map credits state with AICredits rate limit key · 8a260def
      erio authored
      Replace process-memory sync.Map + per-model runtime state with a single
      "AICredits" key in model_rate_limits, making credits exhaustion fully
      isomorphic with model-level rate limiting.
      
      Scheduler: rate-limited accounts with overages enabled + credits available
      are now scheduled instead of excluded.
      
      Forwarding: when model is rate-limited + credits available, inject credits
      proactively without waiting for a 429 round trip.
      
      Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
      5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
      
      Frontend: show credits_active (yellow) when model rate-limited but
      credits available, credits_exhausted (red) when AICredits key active.
      
      Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
      clearCreditsExhausted, and update existing overages tests.
    • feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages · 17e40333
      SilentFlower authored
  2. 12 Mar, 2026 2 commits
  3. 07 Mar, 2026 1 commit
  4. 06 Mar, 2026 1 commit
  5. 28 Feb, 2026 1 commit
  6. 27 Feb, 2026 1 commit
    • feat: replace gemini-3-pro-image with gemini-3.1-flash-image · a6f9f9f9
      erio authored
      - Add migration 060 to update model_mapping for all antigravity accounts
      - Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
      - Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
      - Update frontend usage window to show GImage for new model
      - Update isImageGenerationModel to support new model
  7. 24 Feb, 2026 2 commits
    • fix(antigravity): bill with mapped model and use final model key for rate limiting · 4573868c
      erio authored
      - Use mapped model (billingModel) instead of original request model for billing
      - Use resolveFinalAntigravityModelKey for 429 rate limit model key,
        ensuring rate limit records match the actual upstream model
      - Add regression tests for both fixes
    • fix: distinguish client disconnection from upstream retry failure · 0dacdf48
      erio authored
      Before this change, when a client disconnected mid-request, the error
      message was "Upstream request failed after retries", which is misleading
      and pollutes error logs. Now we check context.Err() to return a more
      accurate "Client disconnected" message for both Claude and Gemini
      forward paths.
  8. 14 Feb, 2026 1 commit
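    • feat: bill Anthropic 5m and 1h cache-creation tokens at separate rates · a817cafe
      shaw authored
      The cache_creation object in the Anthropic API distinguishes two kinds of
      cache-creation tokens, ephemeral_5m and ephemeral_1h, and the 1h unit price
      is much higher than the 5m price (e.g. claude-3-5-haiku: 5m=$1/MTok,
      1h=$6/MTok). The system previously billed everything at the 5m price,
      which undercharged.
      
      Backend:
      - pricing_service: load LiteLLM's cache_creation_input_token_cost_above_1hr
      - billing_service: GetModelPricing enables split billing (with a safety guard
        requiring 1h>5m); CalculateCost bills 5m and 1h separately and falls back
        to the 5m price when no breakdown is available
      - gateway_service: parseSSEUsage/handleNonStreamingResponse use gjson to
        extract ephemeral_5m/1h_input_tokens from the nested cache_creation object
      - antigravity_gateway_service: extractSSEUsage/extractClaudeUsage extract the same fields
      - usage_log: fix the GORM column tag so values are written to the correct database column
      - Add migration 054: drop the duplicate columns auto-generated by GORM
      
      Frontend:
      - The usage-record tooltip shows the 5m/1h cache-creation breakdown (distinguished by colored badges)
      - Table cells show a 1h marker next to the cache-write value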
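A minimal sketch of the split billing rule described above, under stated assumptions: the cachePricing type, its field names, and cacheCreationCost are hypothetical, prices are expressed in USD per million tokens, and the safety guard is interpreted as "use the 1h price only when it exceeds the 5m price". The real CalculateCost may differ.

```go
package main

import "fmt"

const tokensPerMTok = 1_000_000.0

// cachePricing holds cache-creation prices in USD per million tokens.
// The type and field names are hypothetical.
type cachePricing struct {
	PerMTok5m float64
	PerMTok1h float64
}

// cacheCreationCost bills 5m and 1h cache-creation tokens separately.
// When the upstream reported no breakdown, the whole total is billed at
// the 5m price, which is the fallback the commit message describes.
func cacheCreationCost(p cachePricing, total, tokens5m, tokens1h int64) float64 {
	if tokens5m == 0 && tokens1h == 0 {
		return float64(total) / tokensPerMTok * p.PerMTok5m
	}
	price1h := p.PerMTok1h
	// Assumed form of the safety guard: ignore a 1h price that is not
	// higher than the 5m price (for example, missing pricing data).
	if price1h <= p.PerMTok5m {
		price1h = p.PerMTok5m
	}
	return float64(tokens5m)/tokensPerMTok*p.PerMTok5m +
		float64(tokens1h)/tokensPerMTok*price1h
}

func main() {
	// Prices as quoted in the commit message for claude-3-5-haiku.
	p := cachePricing{PerMTok5m: 1, PerMTok1h: 6}
	fmt.Printf("%.2f\n", cacheCreationCost(p, 1_500_000, 1_000_000, 500_000))
	fmt.Printf("%.2f\n", cacheCreationCost(p, 1_500_000, 0, 0)) // no breakdown
}
```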
  9. 12 Feb, 2026 1 commit
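    • chore(logging): complete the backend logging audit and migration to structured logging · 584cfc3d
      yangjianbo authored
      - Migrate log-heavy service and handler logging to the new logging system (LegacyPrintf/structured logs)
      - Add a stdlog bridge and compatibility tests, preserving the ability to capture legacy logs
      - Change the OpenAI stream-interruption alert to a structured Warn and rework its tests to capture via a sink
      - Add the missing logger references in the affected backend files; the full go test suite passes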
  10. 11 Feb, 2026 1 commit
  11. 10 Feb, 2026 5 commits
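    • perf: optimize error-handling performance · a54b81cf
      Edric Li authored
      - MatchRule defers and bounds the body ToLower: short-circuit on statusCode first, lowercase only when keyword matching is needed, and cap the converted body at 8KB
      - Precompute each rule's lowercased keywords/platforms and error code set, eliminating repeated ToLower calls and linear scans at runtime
      - Deduplicate MODEL_CAPACITY_EXHAUSTED globally so concurrent requests do not retry the same model redundantly
      - Lower the body read limit for 503 retries from 2MB to 8KB
      - Replace time.After with time.NewTimer to prevent timer leaks when the context is canceled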
    • fix: error passthrough rule skip_monitoring did not take effect · 2d4236f7
      Edric Li authored
      - ops_error_logger: add an OpsSkipPassthroughKey check to the status < 400 branch
      - ops_upstream_context: add checkSkipMonitoringForUpstreamEvent so intermediate retry/failover events can also set the skip flag
      - gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted sets OpsSkipPassthroughKey after a rule matches
      - antigravity_gateway_service: writeMappedClaudeError now calls applyErrorPassthroughRule
    • song's avatar
      1f647b12
    • shaw's avatar
    • perf(backend): optimize hot-path JSON handling with gjson/sjson · 58912d4a
      yangjianbo authored
      
      Replace json.Unmarshal+json.Marshal on the API gateway hot path with gjson zero-copy queries and sjson targeted writes:
      - unwrapV1InternalResponse is 22x faster (4009ns→182ns), with 28.5x fewer memory allocations
      - unwrapGeminiResponse, extractGeminiUsage, estimateGeminiCountTokens, and ParseGeminiRateLimitResetTime now take []byte and extract with gjson
      - ParseGatewayRequest extracts model/stream/metadata/thinking/max_tokens with gjson's type-safe accessors
      - Handler layer (sora/openai) uses gjson to read fields and sjson to inject/modify fields, removing the intermediate map[string]any variables
      - Sora Client response parsing iterates with gjson ForEach, reducing memory allocations
      - Add about 100 unit test cases; coverage of every changed function is >85%
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
  12. 09 Feb, 2026 8 commits
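    • feat: retry MODEL_CAPACITY_EXHAUSTED 60 times at a fixed 1s interval without switching accounts · 6114f69c
      Edric Li authored
      MODEL_CAPACITY_EXHAUSTED (503) means the model is out of capacity. All accounts
      share the same capacity pool, so switching accounts is pointless. Retry at a fixed
      1s interval up to 60 times instead, and return the upstream error directly once
      retries are exhausted.
      
      - Add antigravityModelCapacityRetryMaxAttempts=60 and antigravityModelCapacityRetryWait=1s
      - shouldTriggerAntigravitySmartRetry gains an isModelCapacityExhausted return value
      - handleSmartRetry uses a dedicated retry strategy for MODEL_CAPACITY_EXHAUSTED
      - handleModelRateLimit only marks MODEL_CAPACITY_EXHAUSTED as Handled and sets no rate limit
      - After retries are exhausted: no model rate limit is set, the sticky session is not cleared, and the account is not switched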
    • feat: same-account retry before failover for transient errors · d6c2921f
      Edric Li authored
      For retryable transient errors (Google 400 "invalid project resource name"
      and empty stream responses), retry on the same account up to 2 times
      (with 500ms delay) before switching to another account.
      
      - Add RetryableOnSameAccount field to UpstreamFailoverError
      - Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
      - Move temp-unschedule from service layer to handler layer (only after
        all same-account retries exhausted)
      - Reduce temp-unschedule cooldown from 30 minutes to 1 minute
    • feat: failover and temp-unschedule on empty stream response · 61c73287
      Edric Li authored
      - Empty stream responses now return UpstreamFailoverError instead of
        plain 502, triggering automatic account switching (up to 10 retries)
      - Add tempUnscheduleEmptyResponse: accounts returning empty responses
        are temp-unscheduled for 30 minutes
      - Apply to both Claude and Gemini non-streaming paths
      - Align googleConfigErrorCooldown from 60m to 30m for consistency
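    • feat: failover and temp-unschedule on Google "Invalid project resource name" 400 · 89905ec4
      Edric Li authored
      The Google backend intermittently returns a 400 "Invalid project resource name" error.
      Previously this error was passed straight through to the client without triggering
      an account switch, so the request failed.
      
      - On every forwarding path of both the Antigravity and Gemini platforms,
        an exact match on this error message now triggers failover and a retry on another account
      - On a match, the account is temporarily unscheduled for 1 hour so the scheduler does not keep picking the same faulty account
      - Extract the shared functions isGoogleProjectConfigError / tempUnscheduleGoogleConfigError
        to remove code duplicated across services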
    • fix: skip rate limiting when custom error codes don't match upstream status · 6892e84a
      erio authored
      Add ShouldHandleErrorCode guard at the entry of handleGeminiUpstreamError
      and AntigravityGatewayService.handleUpstreamError so that accounts with
      custom error codes (e.g. [599]) are not rate-limited when the upstream
      returns a non-matching status (e.g. 429).
    • feat: ErrorPolicySkipped returns 500 instead of upstream status code · 73f45574
      erio authored
      When custom error codes are enabled and the upstream error code is NOT
      in the configured list, return HTTP 500 to the client instead of
      transparently forwarding the original status code.
      
      Also adds integration test TestCustomErrorCode599 verifying that 429,
      500, 503, 401, 403 all return 500 without triggering SetRateLimited
      or SetError.
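    • feat: add 503 backoff retry for single-account Antigravity groups · f6cfab99
      Rose Ding authored
      When a group has only one available account and the upstream returns 503
      (MODEL_CAPACITY_EXHAUSTED), no longer set a model rate limit and switch accounts
      (switching would land on the same account anyway). Instead, wait and retry in place
      in the service layer, avoiding the double-wait problem.
      
      Main changes:
      - Handler layer: detect the single-account 503 case, clear the exclusion list, and set the SingleAccountRetry flag
      - Service layer: add handleSingleAccountRetryInPlace for in-place retry
      - Service layer: the pre-check skips the rate-limit check in single-account mode
      - Add the ctxkey.SingleAccountRetry context flag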
    • refactor: replace scope-level rate limiting with model-level rate limiting · fc095bf0
      erio authored
      Merge functional changes from develop branch:
      - Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image)
      - Replace with per-model rate limiting using resolveAntigravityModelKey
      - Remove model load statistics (IncrModelCallCount/GetModelLoadBatch)
      - Simplify account selection to unified priority→load→LRU algorithm
      - Remove SetAntigravityQuotaScopeLimit from AccountRepository
      - Clean up scope-related UI indicators and API fields
  13. 08 Feb, 2026 9 commits
  14. 07 Feb, 2026 4 commits
    • fix(gateway): restore upstream account forwarding with dedicated methods · 77b66653
      erio authored
      v0.1.74 merged upstream accounts into the OAuth path, causing requests
      to hit the wrong protocol and endpoint. Add three upstream-specific
      methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini)
      that use base_url + apiKey auth and pass through the original body, while
      reusing the existing response handling and error/retry logic.
    • fix: call arguments did not match after function signature changes · 836ba14b
      yangjianbo authored
      
      - handleUpstreamError: pass the three newly added parameters (0, "", false)
      - handleStreamingResponse: remove the deleted nil argument
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
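    • fix(audit): second batch of audit fixes: P0 production bugs, security hardening, performance, cache consistency, code quality · 2588fa6a
      yangjianbo authored
      
      Based on the backend-code-audit report, fix the remaining 34 P0/P1/P2 issues:
      
      P0 production bugs:
      - Fix the time.Since(time.Now()) timing logic error (P0-03)
      - generateRandomID uses crypto/rand instead of a fixed index (P0-04)
      - Rewrite IncrementQuotaUsed as an Ent atomic operation to eliminate a TOCTOU race (P0-05)
      
      Security hardening:
      - gateway/openai handler error responses use generic messages to avoid leaking internal information (P1-14)
      - usage_log_repo dateFormat parameter uses a whitelist mapping to prevent SQL injection (P1-16)
      - Harden the default configuration: sslmode=prefer, response_headers=true, mode=release (P1-18/19, P2-15)
      
      Performance:
      - Replace defer inside the gateway handler loop with an explicit releaseWait closure (P1-02)
      - group_repo/promo_code_repo Clone the query before Count to avoid state pollution (P1-03)
      - Add LIMIT 10000 to four usage_log_repo queries to prevent OOM (P1-07)
      - GetBatchUsageStats takes a time-range parameter, defaulting to the last 30 days (P1-10)
      - ip.go precompiles CIDRs into package-level variables (P1-11)
      - Refactor BatchUpdateCredentials to validate first, then update (P1-13)
      
      Cache consistency:
      - billing_cache adds jitteredTTL to prevent cache avalanche (P2-10)
      - Fix error propagation in DeductUserBalance/UpdateSubscriptionUsage (P2-12)
      - UserService.UpdateBalance asynchronously invalidates billingCache after success (P2-13)
      
      Code quality:
      - search truncation operates on runes, supporting multi-byte characters (P2-01)
      - TLS Handshake becomes HandshakeContext to support context cancellation (P2-07)
      - CORS preflight adds Access-Control-Max-Age: 86400 (P2-16)
      
      Test coverage:
      - Add user_service_test.go (6 cases for UpdateBalance cache invalidation)
      - Add batch_update_credentials_test.go (7 cases for fail-fast + type validation)
      - Add response_transformer_test.go, ip_test.go, usage_log_repo_unit_test.go, search_truncate_test.go
      - Integration tests: IncrementQuotaUsed concurrency test, billing_cache error propagation test
      - config_test.go adds assertions for the server.mode/sslmode defaults
      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>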
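As an illustration of one item in this batch, the rune-based search truncation (P2-01), here is a minimal sketch. truncateRunes is a hypothetical name and the repository's actual function may look different. The point is that slicing a Go string by byte index can cut a multi-byte UTF-8 character in half, while slicing a []rune cannot.

```go
package main

import "fmt"

// truncateRunes keeps at most limit characters (runes) of s. Slicing the
// string by bytes instead (s[:limit]) could split a multi-byte UTF-8
// character and leave invalid text at the end.
func truncateRunes(s string, limit int) string {
	if limit <= 0 {
		return ""
	}
	r := []rune(s)
	if len(r) <= limit {
		return s
	}
	return string(r[:limit])
}

func main() {
	query := "日志审计 backend"
	fmt.Println(truncateRunes(query, 4)) // keeps four whole characters
	fmt.Println(len(query[:4]))          // byte slicing keeps only 4 bytes
}
```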
    • feat: smart retry max 1 attempt + clear sticky session on failure · 3077fd27
      erio authored
      - Change antigravitySmartRetryMaxAttempts from 3 to 1 to prevent
        repeated rate limiting and long waits
      - Clear sticky session binding (DeleteSessionAccountID) after smart
        retry exhaustion, so subsequent requests don't hit the same
        rate-limited account
      - Add flow diagrams to Forward/ForwardGemini doc comments
      - Add comprehensive unit tests covering:
        - Sticky session cleared on retry failure (429, 503, network error)
        - Sticky session NOT cleared on retry success
        - Sticky session NOT cleared for non-sticky requests (empty hash)
        - Sticky session NOT cleared on long delay path (handled by handler)
        - Nil cache safety (no panic)
        - MaxAttempts constant verification
        - End-to-end retryLoop → switchError propagation with session clear