Commits · 528ff5d28c9b21cae4535129df8c6c158345ea4b · 陈曦 / sub2api

19 Mar, 2026 1 commit

fix(antigravity): fast-fail on proxy unavailable, temp-unschedule account · 528ff5d2

erio authored Mar 19, 2026

## Problem

When a proxy is unreachable, token refresh retries up to 4 times with
30s timeout each, causing requests to hang for ~2 minutes before
failing with a generic 502 error. The failed account is not marked,
so subsequent requests keep hitting it.

## Changes

### Proxy connection fast-fail
- Set TCP dial timeout to 5s and TLS handshake timeout to 5s on
  antigravity client, so proxy connectivity issues fail within 5s
  instead of 30s
- Reduce overall HTTP client timeout from 30s to 10s
- Export `IsConnectionError` for service-layer use
- Detect proxy connection errors in `RefreshToken` and return
  immediately with "proxy unavailable" error (no retries)
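
The timeout values listed above can be sketched with the standard library as follows. This is a minimal illustration, not the project's actual client: the function name `newFastFailClient` is hypothetical, and only the 5s/5s/10s values come from the commit message.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"time"
)

// newFastFailClient builds an HTTP client whose connection setup fails quickly.
// The name is hypothetical; the timeout values are the ones in the commit message.
func newFastFailClient() *http.Client {
	dialer := &net.Dialer{Timeout: 5 * time.Second} // TCP dial timeout
	transport := &http.Transport{
		Proxy:               http.ProxyFromEnvironment,
		DialContext:         dialer.DialContext,
		TLSHandshakeTimeout: 5 * time.Second,
	}
	// Timeout bounds the whole request, including reading the response body.
	return &http.Client{Transport: transport, Timeout: 10 * time.Second}
}

func main() {
	c := newFastFailClient()
	fmt.Println(c.Timeout) // prints "10s"
}
```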

### Token refresh temp-unschedulable
- Add 8s context timeout for token refresh on request path
- Mark account as temp-unschedulable for 10min when refresh fails
  (both background `TokenRefreshService` and request-path
  `GetAccessToken`)
- Sync temp-unschedulable state to Redis cache for immediate
  scheduler effect
- Inject `TempUnschedCache` into `AntigravityTokenProvider`
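
A rough sketch of the temp-unschedulable bookkeeping described above, assuming an in-memory map in place of the Redis cache. The type `tempUnschedStore` and its methods are hypothetical stand-ins, not the project's `TempUnschedCache` API.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cooldown mirrors the 10min window from the commit message.
const cooldown = 10 * time.Minute

// tempUnschedStore is a hypothetical in-memory stand-in for the Redis-backed cache.
type tempUnschedStore struct {
	mu    sync.Mutex
	until map[int64]time.Time
}

func newTempUnschedStore() *tempUnschedStore {
	return &tempUnschedStore{until: make(map[int64]time.Time)}
}

// Mark records that the account must not be scheduled before now+d.
func (s *tempUnschedStore) Mark(accountID int64, now time.Time, d time.Duration) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.until[accountID] = now.Add(d)
}

// Schedulable reports whether the account has no cooldown or its cooldown has ended.
func (s *tempUnschedStore) Schedulable(accountID int64, now time.Time) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	until, ok := s.until[accountID]
	return !ok || !now.Before(until)
}

// epoch is a fixed reference time so the example is deterministic.
var epoch = time.Date(2026, 3, 19, 0, 0, 0, 0, time.UTC)

func main() {
	s := newTempUnschedStore()
	s.Mark(42, epoch, cooldown) // token refresh for account 42 just failed
	fmt.Println(s.Schedulable(42, epoch.Add(5*time.Minute)))  // false
	fmt.Println(s.Schedulable(42, epoch.Add(10*time.Minute))) // true
}
```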

### Account failover
- Return `UpstreamFailoverError` on `GetAccessToken` failure in
  `Forward`/`ForwardGemini` to trigger handler-level account switch
  instead of returning 502 directly

### Proxy probe alignment
- Apply same 5s dial/TLS timeout to shared `httpclient` pool
- Reduce proxy probe timeout from 30s to 10s


16 Mar, 2026 2 commits

refactor(antigravity): unify TestConnection with dispatch retry loop · a6f99cf5

erio authored Mar 17, 2026

TestConnection now reuses antigravityRetryLoop instead of a standalone
HTTP loop, gaining credits overages, smart retry, and 429/503 backoff
for free. AccountSwitchError is caught and surfaced as a friendly
message. Also populates RateLimitedModel in TempUnscheduled switch error.

Test fixes:
- Use RATE_LIMIT_EXCEEDED in 503 short-delay test to avoid 60x1s timeout
- Clamp waitDuration=0 instead of 999s to avoid 15s max-wait timeout
- Enhance mockSmartRetryUpstream with repeatLast and body caching


fix(antigravity): add stream keepalive to prevent connection drops · d7957343

kunish authored Mar 16, 2026

Antigravity streaming handlers were missing the keepalive mechanism
that exists in the standard gateway, causing proxy/CDN idle timeouts
to break connections during long thinking phases (e.g. claude-opus-4-6).
This resulted in truncated responses with missing tool calls.

Add StreamKeepaliveInterval support to all three Antigravity streaming
paths: Claude SSE, Gemini SSE, and upstream passthrough.
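
One way such a keepalive can work is sketched below. It assumes the keepalive is sent as an SSE comment line and that upstream data arrives on a channel; `relayWithKeepalive` and `demo` are hypothetical names, and the real handlers may emit a different frame.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"time"
)

// relayWithKeepalive copies upstream chunks to w and writes an SSE comment
// whenever no chunk has arrived for a full interval.
// It returns the number of keepalives written.
func relayWithKeepalive(w io.Writer, chunks <-chan string, interval time.Duration) int {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	sent := 0
	for {
		select {
		case chunk, ok := <-chunks:
			if !ok {
				return sent // upstream finished
			}
			io.WriteString(w, chunk)
			ticker.Reset(interval) // real data restarts the idle window
		case <-ticker.C:
			io.WriteString(w, ": keepalive\n\n") // SSE comment, ignored by clients
			sent++
		}
	}
}

// demo simulates an upstream that stays silent for a while before answering.
func demo() (int, string) {
	chunks := make(chan string)
	go func() {
		time.Sleep(120 * time.Millisecond) // stand-in for a long thinking phase
		chunks <- "data: hello\n\n"
		close(chunks)
	}()
	var buf bytes.Buffer
	n := relayWithKeepalive(&buf, chunks, 50*time.Millisecond)
	return n, buf.String()
}

func main() {
	n, out := demo()
	fmt.Printf("keepalives=%d\n%s", n, out)
}
```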


15 Mar, 2026 3 commits

fix: remove ClaudeMax references not yet in upstream/main · 0d2061b2

erio authored Mar 16, 2026

Remove SimulateClaudeMaxEnabled field and related logic from
admin_service.go, and remove applyClaudeMaxCacheBillingPolicyToUsage,
applyClaudeMaxNonStreamingRewrite, setupClaudeMaxStreamingHook calls
from antigravity_gateway_service.go. These symbols are not yet
available in upstream/main.


refactor: replace sync.Map credits state with AICredits rate limit key · 8a260def

erio authored Mar 16, 2026

Replace process-memory sync.Map + per-model runtime state with a single
"AICredits" key in model_rate_limits, making credits exhaustion fully
isomorphic with model-level rate limiting.

Scheduler: rate-limited accounts with overages enabled + credits available
are now scheduled instead of excluded.

Forwarding: when model is rate-limited + credits available, inject credits
proactively without waiting for a 429 round trip.

Storage: credits exhaustion stored as model_rate_limits["AICredits"] with
5h duration, reusing SetModelRateLimit/isRateLimitActiveForKey.
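
The scheduling rule above can be illustrated with a small sketch. The `account` type, its methods, and the model name `model-x` are hypothetical; only the "AICredits" key and the 5h duration come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

const creditsKey = "AICredits"
const creditsExhaustedFor = 5 * time.Hour

// account is a hypothetical, trimmed-down view of an account record.
type account struct {
	overagesEnabled bool
	modelRateLimits map[string]time.Time // key -> time the limit expires
}

func newAccount(overagesEnabled bool) *account {
	return &account{overagesEnabled: overagesEnabled, modelRateLimits: map[string]time.Time{}}
}

// setModelRateLimit stores a limit for key that expires at now+d.
func (a *account) setModelRateLimit(key string, now time.Time, d time.Duration) {
	a.modelRateLimits[key] = now.Add(d)
}

// rateLimitActive reports whether a limit for key is still in force.
func (a *account) rateLimitActive(key string, now time.Time) bool {
	until, ok := a.modelRateLimits[key]
	return ok && now.Before(until)
}

// schedulable: a model-limited account is still usable when overages are
// enabled and the shared credits key is not rate-limited.
func (a *account) schedulable(model string, now time.Time) bool {
	if !a.rateLimitActive(model, now) {
		return true
	}
	return a.overagesEnabled && !a.rateLimitActive(creditsKey, now)
}

// epoch is a fixed reference time so the example is deterministic.
var epoch = time.Date(2026, 3, 16, 0, 0, 0, 0, time.UTC)

func main() {
	a := newAccount(true)
	a.setModelRateLimit("model-x", epoch, time.Minute)
	fmt.Println(a.schedulable("model-x", epoch)) // true: credits still available
	a.setModelRateLimit(creditsKey, epoch, creditsExhaustedFor)
	fmt.Println(a.schedulable("model-x", epoch)) // false: credits exhausted
}
```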

Frontend: show credits_active (yellow ⚡) when model rate-limited but
credits available, credits_exhausted (red) when AICredits key active.

Tests: add unit tests for shouldMarkCreditsExhausted, injectEnabledCreditTypes,
clearCreditsExhausted, and update existing overages tests.


feat: implement resolveCreditsOveragesModelKey function to stabilize model key resolution for credit overages · 17e40333

SilentFlower authored Mar 15, 2026

12 Mar, 2026 2 commits

fix: first 400, second one triggers the account-switch signal · 25cb5e75

haruka authored Mar 12, 2026

add test for fix #935 · f44927b9

haruka authored Mar 12, 2026

07 Mar, 2026 1 commit

feat: support an admin setting to enable or disable the rectifier switch · a3791104

shaw authored Mar 07, 2026

06 Mar, 2026 1 commit

fix issue #791 · 65a10679

Elysia authored Mar 06, 2026

28 Feb, 2026 1 commit

feat(sync): full code sync from release · bb664d9b

yangjianbo authored Feb 28, 2026

27 Feb, 2026 1 commit

feat: replace gemini-3-pro-image with gemini-3.1-flash-image · a6f9f9f9

erio authored Feb 27, 2026

- Add migration 060 to update model_mapping for all antigravity accounts
- Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings
- Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings
- Update frontend usage window to show GImage for new model
- Update isImageGenerationModel to support new model


24 Feb, 2026 2 commits

fix(antigravity): bill with mapped model and use final model key for rate limiting · 4573868c

erio authored Feb 24, 2026

- Use mapped model (billingModel) instead of original request model for billing
- Use resolveFinalAntigravityModelKey for 429 rate limit model key,
  ensuring rate limit records match the actual upstream model
- Add regression tests for both fixes


fix: distinguish client disconnection from upstream retry failure · 0dacdf48

erio authored Feb 24, 2026

Before this change, when a client disconnected mid-request, the error
message was "Upstream request failed after retries", which is misleading
and pollutes error logs. Now we check context.Err() to return a more
accurate "Client disconnected" message for both Claude and Gemini
forward paths.
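
A minimal sketch of that check, assuming the client's request context is canceled when the client goes away (as it is with Go's `net/http` server). The function name `failureMessage` is hypothetical, and testing specifically for `context.Canceled` is an assumption about how the check is written.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// failureMessage picks the error message for a failed forward.
// ctx is the client's request context.
func failureMessage(ctx context.Context) string {
	if errors.Is(ctx.Err(), context.Canceled) {
		return "Client disconnected"
	}
	return "Upstream request failed after retries"
}

// demo returns the message for a live context and for a canceled one.
func demo() (string, string) {
	live := failureMessage(context.Background())
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the client closing the connection
	return live, failureMessage(ctx)
}

func main() {
	live, gone := demo()
	fmt.Println(live) // Upstream request failed after retries
	fmt.Println(gone) // Client disconnected
}
```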


14 Feb, 2026 1 commit

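feat: bill Anthropic 5m and 1h cache-creation tokens at separate rates · a817cafe

shaw authored Feb 14, 2026

The Anthropic API's cache_creation object distinguishes two kinds of
cache-creation tokens, ephemeral_5m and ephemeral_1h, and the 1h unit price
is far higher than the 5m price (e.g. claude-3-5-haiku: 5m=$1/MTok,
1h=$6/MTok). Previously the system billed everything at the 5m price, so
charges were too low.

Backend:
- pricing_service: load LiteLLM's cache_creation_input_token_cost_above_1hr
- billing_service: GetModelPricing enables tiered billing (with a safety
  guard requiring 1h>5m); CalculateCost bills 5m and 1h separately and falls
  back to the 5m price when no breakdown is available
- gateway_service: parseSSEUsage/handleNonStreamingResponse use gjson to
  extract ephemeral_5m/1h_input_tokens from the nested cache_creation object
- antigravity_gateway_service: extractSSEUsage/extractClaudeUsage extract the same fields
- usage_log: fix the GORM column tag so values are written to the correct database column
- Add migration 054: drop the duplicate columns auto-generated by GORM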
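
The tiered billing rule and its fallback can be sketched as below. The types and field names are hypothetical, prices are expressed per million tokens for readability, and the example prices are the ones quoted in the commit message.

```go
package main

import "fmt"

// pricing holds cache-write prices in USD per million tokens (hypothetical shape).
type pricing struct {
	cacheWrite5m float64
	cacheWrite1h float64
}

// usage carries the total cache-creation tokens plus the optional 5m/1h breakdown.
type usage struct {
	cacheCreationTokens int
	ephemeral5mTokens   int
	ephemeral1hTokens   int
}

// cost converts a token count and a per-MTok price into USD.
func cost(tokens int, pricePerMTok float64) float64 {
	return float64(tokens) / 1e6 * pricePerMTok
}

// cacheCreationCost bills 5m and 1h tokens separately when a breakdown exists
// and the 1h price is higher than the 5m price; otherwise it bills the total
// at the 5m price.
func cacheCreationCost(p pricing, u usage) float64 {
	tiered := p.cacheWrite1h > p.cacheWrite5m // safety guard
	hasBreakdown := u.ephemeral5mTokens > 0 || u.ephemeral1hTokens > 0
	if !tiered || !hasBreakdown {
		return cost(u.cacheCreationTokens, p.cacheWrite5m)
	}
	return cost(u.ephemeral5mTokens, p.cacheWrite5m) + cost(u.ephemeral1hTokens, p.cacheWrite1h)
}

func main() {
	p := pricing{cacheWrite5m: 1, cacheWrite1h: 6}
	withBreakdown := usage{cacheCreationTokens: 1500000, ephemeral5mTokens: 1000000, ephemeral1hTokens: 500000}
	noBreakdown := usage{cacheCreationTokens: 1500000}
	fmt.Println(cacheCreationCost(p, withBreakdown)) // 4
	fmt.Println(cacheCreationCost(p, noBreakdown))   // 1.5
}
```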

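Frontend:
- Usage-record tooltip shows the 5m/1h cache-creation breakdown (with colored badges to tell them apart)
- Table cells show a 1h marker next to the cache-write value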


12 Feb, 2026 1 commit

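chore(logging): complete backend logging audit and structured-logging migration · 584cfc3d

yangjianbo authored Feb 12, 2026

- Migrate high-volume service and handler logs to the new logging system (LegacyPrintf/structured logging)
- Add a stdlog bridge and compatibility tests, preserving the ability to capture legacy logs
- Change the OpenAI stream-interruption alert to a structured Warn and rework its tests to capture via a sink
- Add missing logger references in related backend files and pass the full go test suite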


11 Feb, 2026 1 commit

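[UPDATE] Improve Claude Thinking mode support and adapt to Opus 4.6 dynamic budget · 19cca11e

SilentFlower authored Feb 11, 2026

✨ feat(antigravity): support the thinking adaptive type and adapt to Opus 4.6 dynamic budget
🧪 test(gateway): add edge-case tests for thinking mode parsing and signature block filtering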


10 Feb, 2026 5 commits

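perf: error handling performance optimizations · a54b81cf

Edric Li authored Feb 10, 2026

- MatchRule lowercases the body lazily and with a cap: short-circuit on statusCode first, convert only when keyword matching is needed, and limit the conversion to 8KB
- Precompute each rule's lowercase keywords/platforms and error code set, eliminating repeated ToLower calls and linear scans at runtime
- Deduplicate MODEL_CAPACITY_EXHAUSTED globally so concurrent requests do not retry the same model redundantly
- Lower the 503 retry body read limit from 2MB to 8KB
- Replace time.After with time.NewTimer to prevent timer leaks when the context is canceled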
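
The timer change in the last item follows a common Go pattern, sketched here with a hypothetical helper `sleepCtx`. Before Go 1.23, a timer created by `time.After` was not garbage-collected until it fired, so stopping an explicit timer on the cancel path avoids keeping it alive.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// sleepCtx waits for d or until ctx is done, whichever comes first.
// Unlike a bare time.After, the timer is stopped when the function returns.
func sleepCtx(ctx context.Context, d time.Duration) error {
	timer := time.NewTimer(d)
	defer timer.Stop()
	select {
	case <-timer.C:
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// demo runs one wait that completes and one that is canceled up front.
func demo() (error, error) {
	completed := sleepCtx(context.Background(), 10*time.Millisecond)
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	canceled := sleepCtx(ctx, time.Hour) // returns immediately, timer is stopped
	return completed, canceled
}

func main() {
	completed, canceled := demo()
	fmt.Println(completed) // <nil>
	fmt.Println(canceled)  // context canceled
}
```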


fix: error passthrough rule skip_monitoring not taking effect · 2d4236f7

Edric Li authored Feb 10, 2026

- ops_error_logger: add an OpsSkipPassthroughKey check to the status < 400 branch
- ops_upstream_context: add checkSkipMonitoringForUpstreamEvent so intermediate retry/failover events can also set the skip flag
- gateway_handler/openai_gateway_handler/gemini_v1beta_handler: handleFailoverExhausted sets OpsSkipPassthroughKey after a rule matches
- antigravity_gateway_service: writeMappedClaudeError now calls applyErrorPassthroughRule


feat(antigravity): support daily/prod single-URL switching for forwarding and testing · 1f647b12

song authored Feb 10, 2026

fix: remove a specific system entry to work around the cache-invalidation bug in the new cc client · 5dd83d3c

shaw authored Feb 10, 2026

perf(backend): optimize hot-path JSON handling with gjson/sjson · 58912d4a

yangjianbo authored Feb 10, 2026

Replace json.Unmarshal+json.Marshal on the API gateway hot path with gjson zero-copy queries and sjson targeted writes:
- unwrapV1InternalResponse is 22x faster (4009ns→182ns) with 28.5x fewer memory allocations
- unwrapGeminiResponse, extractGeminiUsage, estimateGeminiCountTokens, and ParseGeminiRateLimitResetTime now accept []byte and extract with gjson
- ParseGatewayRequest extracts model/stream/metadata/thinking/max_tokens with gjson type-safe accessors
- Handler layer (sora/openai) extracts fields with gjson and injects/modifies fields with sjson, removing the map[string]any intermediates
- Sora Client response parsing iterates with gjson ForEach, reducing memory allocations
- Add about 100 unit test cases; coverage of all changed functions is >85%

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>


09 Feb, 2026 8 commits

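feat: retry MODEL_CAPACITY_EXHAUSTED 60 times at a fixed 1s interval without switching accounts · 6114f69c

Edric Li authored Feb 10, 2026

MODEL_CAPACITY_EXHAUSTED (503) means the model is out of capacity. All
accounts share the same capacity pool, so switching accounts is pointless.
Retry at a fixed 1s interval up to 60 times instead, and return the upstream
error directly once retries are exhausted.

- Add antigravityModelCapacityRetryMaxAttempts=60 and antigravityModelCapacityRetryWait=1s
- shouldTriggerAntigravitySmartRetry gains an isModelCapacityExhausted return value
- handleSmartRetry uses a separate retry strategy for MODEL_CAPACITY_EXHAUSTED
- handleModelRateLimit only marks MODEL_CAPACITY_EXHAUSTED as Handled and sets no rate limit
- After retries are exhausted, no model rate limit is set, the sticky session is not cleared, and the account is not switched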
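
A sketch of a fixed-interval retry loop of this kind. The names `retryCapacity`, `errCapacity`, and `demo` are hypothetical; the 60-attempt and 1s constants mirror the values in the commit message, and the demo uses a 1ms wait so it finishes quickly.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errCapacity stands in for an upstream MODEL_CAPACITY_EXHAUSTED (503) response.
var errCapacity = errors.New("MODEL_CAPACITY_EXHAUSTED")

const (
	capacityRetryMaxAttempts = 60
	capacityRetryWait        = time.Second // value from the commit; demo uses a shorter wait
)

// retryCapacity calls attempt up to maxAttempts times, waiting a fixed interval
// between tries. It returns the number of calls made and the final error.
func retryCapacity(ctx context.Context, maxAttempts int, wait time.Duration, attempt func() error) (int, error) {
	var err error
	for i := 1; i <= maxAttempts; i++ {
		if err = attempt(); err == nil || !errors.Is(err, errCapacity) {
			return i, err // success, or an error this strategy does not cover
		}
		if i == maxAttempts {
			break
		}
		timer := time.NewTimer(wait)
		select {
		case <-timer.C:
		case <-ctx.Done():
			timer.Stop()
			return i, ctx.Err()
		}
	}
	return maxAttempts, err // exhausted: surface the upstream error as-is
}

// demo simulates an upstream that is out of capacity for the first `failures` calls.
func demo(failures int) (int, error) {
	calls := 0
	return retryCapacity(context.Background(), capacityRetryMaxAttempts, time.Millisecond, func() error {
		calls++
		if calls <= failures {
			return errCapacity
		}
		return nil
	})
}

func main() {
	n, err := demo(2)
	fmt.Println(n, err) // 3 <nil>
}
```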


feat: same-account retry before failover for transient errors · d6c2921f

Edric Li authored Feb 10, 2026

For retryable transient errors (Google 400 "invalid project resource name"
and empty stream responses), retry on the same account up to 2 times
(with 500ms delay) before switching to another account.

- Add RetryableOnSameAccount field to UpstreamFailoverError
- Add same-account retry loop in both Gemini and Claude/OpenAI handler paths
- Move temp-unschedule from service layer to handler layer (only after
  all same-account retries exhausted)
- Reduce temp-unschedule cooldown from 30 minutes to 1 minute
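
The handler-level flow above might look roughly like this. It is a simplified sketch: `upstreamFailoverError` here is a local stand-in for the project's `UpstreamFailoverError`, and `forwardWithFailover` and `demo` are hypothetical. The retry count follows the commit message; the demo uses a 1ms delay instead of 500ms.

```go
package main

import (
	"fmt"
	"time"
)

// upstreamFailoverError is a local stand-in carrying only the new field.
type upstreamFailoverError struct {
	RetryableOnSameAccount bool
}

const maxSameAccountRetries = 2

// forwardWithFailover tries accounts in order. A retryable failure is retried
// on the same account first; only after those retries are exhausted is the
// account recorded for temp-unscheduling and the next account tried.
func forwardWithFailover(accounts []string, delay time.Duration, call func(string) *upstreamFailoverError) (string, []string) {
	var unscheduled []string
	for _, acc := range accounts {
		for retries := 0; ; retries++ {
			ferr := call(acc)
			if ferr == nil {
				return acc, unscheduled
			}
			if ferr.RetryableOnSameAccount && retries < maxSameAccountRetries {
				time.Sleep(delay)
				continue
			}
			if ferr.RetryableOnSameAccount {
				unscheduled = append(unscheduled, acc)
			}
			break // switch to the next account
		}
	}
	return "", unscheduled
}

// demo: account "a" always fails with a retryable error, "b" succeeds.
func demo() (string, int, []string) {
	callsA := 0
	winner, unscheduled := forwardWithFailover([]string{"a", "b"}, time.Millisecond,
		func(acc string) *upstreamFailoverError {
			if acc == "a" {
				callsA++
				return &upstreamFailoverError{RetryableOnSameAccount: true}
			}
			return nil
		})
	return winner, callsA, unscheduled
}

func main() {
	winner, callsA, unscheduled := demo()
	fmt.Println(winner, callsA, unscheduled) // b 3 [a]
}
```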


feat: failover and temp-unschedule on empty stream response · 61c73287

Edric Li authored Feb 09, 2026

- Empty stream responses now return UpstreamFailoverError instead of
  plain 502, triggering automatic account switching (up to 10 retries)
- Add tempUnscheduleEmptyResponse: accounts returning empty responses
  are temp-unscheduled for 30 minutes
- Apply to both Claude and Gemini non-streaming paths
- Align googleConfigErrorCooldown from 60m to 30m for consistency


feat: failover and temp-unschedule on Google "Invalid project resource name" 400 · 89905ec4

Edric Li authored Feb 09, 2026

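The Google backend intermittently returns a 400 "Invalid project resource name"
error. Previously this error was passed straight through to the client without
triggering an account switch, so the request failed.

- On every forwarding path of both the Antigravity and Gemini platforms,
  an exact match on this error message triggers failover and a retry on another account
- On a match, the account is temporarily suspended for 1 hour to avoid repeatedly scheduling the same faulty account
- Extract shared functions isGoogleProjectConfigError / tempUnscheduleGoogleConfigError
  to remove code duplication across services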


fix: skip rate limiting when custom error codes don't match upstream status · 6892e84a

erio authored Feb 09, 2026

Add ShouldHandleErrorCode guard at the entry of handleGeminiUpstreamError
and AntigravityGatewayService.handleUpstreamError so that accounts with
custom error codes (e.g. [599]) are not rate-limited when the upstream
returns a non-matching status (e.g. 429).


feat: ErrorPolicySkipped returns 500 instead of upstream status code · 73f45574

erio authored Feb 09, 2026

When custom error codes are enabled and the upstream error code is NOT
in the configured list, return HTTP 500 to the client instead of
transparently forwarding the original status code.

Also adds integration test TestCustomErrorCode599 verifying that 429,
500, 503, 401, 403 all return 500 without triggering SetRateLimited
or SetError.
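
The policy can be pictured with the sketch below. The function names are hypothetical lower-case stand-ins, treating an empty list as "custom error codes disabled" is an assumption, and returning the upstream status for handled codes is a simplification of whatever the normal error path does.

```go
package main

import "fmt"

// shouldHandleErrorCode reports whether the account's error policy covers status.
// Assumption: an empty list means custom error codes are disabled, so every
// status is handled by the normal path.
func shouldHandleErrorCode(customCodes []int, status int) bool {
	if len(customCodes) == 0 {
		return true
	}
	for _, code := range customCodes {
		if code == status {
			return true
		}
	}
	return false
}

// clientStatus returns the status sent to the client: 500 when the policy
// skips the upstream status, otherwise the upstream status itself.
func clientStatus(customCodes []int, upstreamStatus int) int {
	if !shouldHandleErrorCode(customCodes, upstreamStatus) {
		return 500
	}
	return upstreamStatus
}

func main() {
	custom := []int{599}
	for _, status := range []int{429, 500, 503, 401, 403} {
		fmt.Println(status, "->", clientStatus(custom, status)) // each maps to 500
	}
}
```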


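feat: add Antigravity single-account 503 backoff retry · f6cfab99

Rose Ding authored Feb 09, 2026

When the group has only one available account and upstream returns 503
(MODEL_CAPACITY_EXHAUSTED), no longer set a model rate limit and switch
accounts (switching would land on the same account anyway). Instead, wait
and retry in place in the Service layer, avoiding the double-wait problem.

Main changes:
- Handler layer: detect the single-account 503 case, clear the exclusion list, and set the SingleAccountRetry flag
- Service layer: add handleSingleAccountRetryInPlace for in-place retry
- Service layer: the pre-check skips the rate-limit check in single-account mode
- Add the ctxkey.SingleAccountRetry context flag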


refactor: replace scope-level rate limiting with model-level rate limiting · fc095bf0

erio authored Feb 09, 2026

Merge functional changes from develop branch:
- Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image)
- Replace with per-model rate limiting using resolveAntigravityModelKey
- Remove model load statistics (IncrModelCallCount/GetModelLoadBatch)
- Simplify account selection to unified priority→load→LRU algorithm
- Remove SetAntigravityQuotaScopeLimit from AccountRepository
- Clean up scope-related UI indicators and API fields


08 Feb, 2026 9 commits

feat: route AccountTypeUpstream to ForwardUpstream in Forward() entry · 9236936a

erio authored Feb 09, 2026

Without this routing guard, ForwardUpstream is never called because
Forward() always proceeds with the standard OAuth/cookie flow.


fix: use upstream retryDelay for rate limit duration instead of fixed default · 12515246

erio authored Feb 09, 2026

- In handleSmartRetry, use the actual upstream retryDelay to set model
  rate limit duration instead of always using the 30s default
- Return info.RetryDelay from shouldTriggerAntigravitySmartRetry when
  shouldRateLimitModel=true, so callers know the actual delay
- Extract getDefaultRateLimitDuration() and resolveResetTime() helpers
  to reduce duplication in handleUpstreamError 429 handling
- Improve debug logging with upstream_retry_delay and response body
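
The fallback rule from the first item can be sketched as follows. It assumes the upstream delay arrives as a duration string such as "7s" that `time.ParseDuration` accepts; `resolveRateLimitDuration` is a hypothetical name and the real helpers may differ.

```go
package main

import (
	"fmt"
	"time"
)

// defaultRateLimitDuration is the 30s default mentioned in the commit message.
const defaultRateLimitDuration = 30 * time.Second

// resolveRateLimitDuration prefers the upstream retry delay and falls back to
// the default when the delay is missing, malformed, or not positive.
func resolveRateLimitDuration(upstreamRetryDelay string) time.Duration {
	d, err := time.ParseDuration(upstreamRetryDelay)
	if err != nil || d <= 0 {
		return defaultRateLimitDuration
	}
	return d
}

func main() {
	fmt.Println(resolveRateLimitDuration("7s")) // 7s
	fmt.Println(resolveRateLimitDuration(""))   // 30s
}
```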


feat: detect client disconnect during streaming and continue draining upstream for billing · 6d90fb0b

erio authored Feb 09, 2026

feat: unified error policy for Antigravity + enable custom error codes for Gemini accounts · 2f1182e8

erio authored Feb 09, 2026

fix: remove unused upstreamHopByHopHeaders variable to pass golangci-lint · 69816f86

erio authored Feb 08, 2026

refactor(upstream): replace upstream account type with apikey, auto-append /antigravity · fb58560d

erio authored Feb 08, 2026

Upstream accounts now use the standard APIKey type instead of a dedicated
upstream type. GetBaseURL() and new GetGeminiBaseURL() automatically append
/antigravity for Antigravity platform APIKey accounts, eliminating the need
for separate upstream forwarding methods.

- Remove ForwardUpstream, ForwardUpstreamGemini, testUpstreamConnection
- Remove upstream branch guards in Forward/ForwardGemini/TestConnection
- Add migration 052 to convert existing upstream accounts to apikey
- Update frontend CreateAccountModal to create apikey type
- Add unit tests for GetBaseURL and GetGeminiBaseURL
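
The suffix handling could be as simple as the sketch below. `antigravityBaseURL` is a hypothetical helper, and both the trailing-slash trimming and the guard against appending twice are assumptions rather than confirmed behavior of GetBaseURL.

```go
package main

import (
	"fmt"
	"strings"
)

// antigravityBaseURL appends /antigravity to a configured base URL.
// Trailing slashes are trimmed first, and an existing suffix is left alone.
func antigravityBaseURL(baseURL string) string {
	trimmed := strings.TrimRight(baseURL, "/")
	if strings.HasSuffix(trimmed, "/antigravity") {
		return trimmed
	}
	return trimmed + "/antigravity"
}

func main() {
	fmt.Println(antigravityBaseURL("https://example.com/"))            // https://example.com/antigravity
	fmt.Println(antigravityBaseURL("https://example.com/antigravity")) // https://example.com/antigravity
}
```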


fix(upstream): passthrough response body directly instead of parsing SSE · 6ab77f5e

erio authored Feb 08, 2026

ForwardUpstream/ForwardUpstreamGemini should pipe the upstream response
directly to the client (headers + body), not parse it as SSE stream.


fix: add nil guard for gin.Context in header passthrough to satisfy staticcheck SA5011 · 4f57d7f7

erio authored Feb 08, 2026

feat(upstream): passthrough all client headers instead of manual header setting · 1563bd3d

erio authored Feb 08, 2026

Replace manual header setting (Content-Type, anthropic-version, anthropic-beta)
with full client header passthrough in ForwardUpstream/ForwardUpstreamGemini.
Only authentication headers (Authorization, x-api-key) are overridden with
upstream account credentials. Hop-by-hop headers are excluded.

Add unit tests covering header passthrough, auth override, and hop-by-hop filtering.
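
A sketch of this passthrough rule using `net/http` types. `buildUpstreamHeaders` and `demo` are hypothetical, the hop-by-hop set is the conventional list rather than the project's own, and setting both `Authorization` (as a Bearer token) and `x-api-key` from the account key is an assumption.

```go
package main

import (
	"fmt"
	"net/http"
)

// hopByHop lists connection-level headers that must not be forwarded.
var hopByHop = map[string]bool{
	"Connection": true, "Keep-Alive": true, "Proxy-Authenticate": true,
	"Proxy-Authorization": true, "Te": true, "Trailer": true,
	"Transfer-Encoding": true, "Upgrade": true,
}

// buildUpstreamHeaders copies client headers, drops hop-by-hop headers, and
// replaces the client's credentials with the upstream account's key.
func buildUpstreamHeaders(client http.Header, apiKey string) http.Header {
	out := make(http.Header)
	for name, values := range client {
		key := http.CanonicalHeaderKey(name)
		if hopByHop[key] || key == "Authorization" || key == "X-Api-Key" {
			continue
		}
		out[key] = append([]string(nil), values...) // copy to avoid aliasing
	}
	out.Set("Authorization", "Bearer "+apiKey)
	out.Set("X-Api-Key", apiKey)
	return out
}

// demo builds upstream headers from a typical client request.
func demo() http.Header {
	client := http.Header{}
	client.Set("Content-Type", "application/json")
	client.Set("anthropic-version", "2023-06-01")
	client.Set("Connection", "keep-alive")
	client.Set("Authorization", "Bearer client-token")
	client.Set("x-api-key", "client-key")
	return buildUpstreamHeaders(client, "upstream-key")
}

func main() {
	h := demo()
	fmt.Println(h.Get("Content-Type"))  // application/json
	fmt.Println(h.Get("Authorization")) // Bearer upstream-key
	fmt.Println(h.Get("Connection") == "")
}
```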


07 Feb, 2026 1 commit

fix(gateway): restore upstream account forwarding with dedicated methods · 77b66653

erio authored Feb 08, 2026

v0.1.74 merged upstream accounts into the OAuth path, causing requests
to hit the wrong protocol and endpoint. Add three upstream-specific
methods (testUpstreamConnection, ForwardUpstream, ForwardUpstreamGemini)
that use base_url + apiKey auth and passthrough the original body, while
reusing the existing response handling and error/retry logic.
