- 05 Mar, 2026 3 commits
- 03 Mar, 2026 2 commits
-
-
shaw authored
-
QTom authored
当 API Key 无分组时,调度仅从未分组账号池中选取。 修复 isAccountInGroup 在 groupID==nil 时的逻辑, 同时补全 scheduler_snapshot_service 和 gemini_compat_service 中的 SimpleMode 保护,确保分组隔离在所有调度路径生效。 新增 ListSchedulableUngroupedByPlatform/s 方法, 使用 Ent 的 Not(HasAccountGroups()) 谓词实现未分组账号隔离。 新增 17 个单元和端到端隔离测试,覆盖所有分支和边界条件。
-
- 02 Mar, 2026 1 commit
-
-
QTom authored
新增 UMQ (User Message Queue) 双模式支持: - serialize: 账号级分布式串行锁 + RPM 自适应延迟(严格限流) - throttle: 仅 RPM 自适应前置延迟,不阻塞并发(软性限速) 后端: - config: 新增 Mode 字段,保留 Enabled 向后兼容 - service: 新增 UserMessageQueueService(Lua 锁/延迟算法/清理 worker) - repository: 新增 UserMsgQueueCache(Redis Lua acquire/release/force-release) - handler: 新增 UserMsgQueueHelper(SSE ping + 等待循环 + throttle) - gateway: 按 mode 分支集成 serialize/throttle 逻辑 - lint: 修复 gofmt rewrite rules、errcheck 类型断言、staticcheck QF1012 前端: - 三态选择器 UI(关闭/软性限速/串行队列)替代 toggle 开关 - BulkEdit 支持 null 语义(不修改) - i18n 中英文文案 通过 6 轮专家评审(42 次 review)、golangci-lint、单元测试、集成测试。
-
- 28 Feb, 2026 8 commits
-
-
QTom authored
- Add sanitizeExtraBaseRPM to BulkUpdate handler (was missing) - Add WindowCost scheduling checks to legacy non-sticky selection paths (4 sites), matching existing sticky + load-aware coverage - Export ParseExtraInt from service package, remove duplicate parseExtraIntForValidation from admin handler
-
QTom authored
- Move IncrementRPM after Forward success to prevent phantom RPM consumption during account switch retries - Add base_rpm input sanitization (clamp to 0-10000) in Create/Update - Add WindowCost scheduling checks to legacy path sticky sessions (4 check sites + 4 prefetch sites), fixing pre-existing gap - Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in BulkEditModal (JSONB merge cannot delete keys, use empty values) - Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer - Document TOCTOU race as accepted soft-limit design trade-off
-
QTom authored
Ensures isAccountSchedulableForRPM calls within the routing segment hit the prefetch cache instead of querying Redis individually.
-
QTom authored
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE - Filter negative values in GetBaseRPM(), update test expectation - Add RPM batch query (GetRPMBatch) to account List API - Add warn logs for RPM increment failures in gateway handler - Reset enableRpmLimit on BulkEditAccountModal close - Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs - Add design decision comments for rdb.Time() RTT trade-off
-
QTom authored
-
QTom authored
-
QTom authored
-
yangjianbo authored
-
- 27 Feb, 2026 1 commit
-
-
erio authored
- Add migration 060 to update model_mapping for all antigravity accounts - Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings - Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings - Update frontend usage window to show GImage for new model - Update isImageGenerationModel to support new model
-
- 26 Feb, 2026 4 commits
-
-
alfadb authored
-
alfadb authored
PR #635 returned HTTP 200 with {"input_tokens": 0} when upstream doesn't support count_tokens (404). This caused Claude Code CLI to trust the zero value, believing context uses 0 tokens, so auto-compression never triggers. Fix: return 404 with proper error body so CLI falls back to its local tokenizer for accurate estimation. Return nil (not error) to avoid polluting ops error metrics with expected 404s. Affected paths: - Passthrough APIKey accounts: upstream 404 now passed through as 404 - Antigravity accounts: same fix (was also returning fake 200) -
shaw authored
-
alfadb authored
第三方 Anthropic 中转站通常不支持 /v1/messages/count_tokens 端点, 上游返回 404 时降级返回 {input_tokens: 0},客户端 fallback 到本地估算。 - 仅匹配 404 状态码,语义明确:端点不存在 - 其他错误 (400/429/500) 保留原始处理链和 ops 遥测 - 无需解析错误消息内容,不依赖字符串匹配 - 新增 table-driven 测试覆盖 fallback 和 non-fallback 路径
-
- 24 Feb, 2026 1 commit
-
-
erio authored
Remove the special case that bypassed model-supported checks for Gemini API Key accounts, allowing model_mapping to filter requests properly. Add tests for multiplatform model filtering behavior.
-
- 22 Feb, 2026 2 commits
-
-
yangjianbo authored
-
yangjianbo authored
-
- 21 Feb, 2026 2 commits
-
-
yangjianbo authored
-
yangjianbo authored
- 新增 Anthropic API Key 自动透传开关与后端透传分支(仅替换认证) - 账号编辑页新增自动透传开关,默认关闭 - 优化透传性能:SSE usage 解析 gjson 快路径、减少请求体重复拷贝、优化流式写回与非流式 usage 解析 - 补充单元测试与 benchmark,确保 Claude OAuth 路径不受影响
-
- 19 Feb, 2026 2 commits
-
-
yangjianbo authored
-
yangjianbo authored
- 在 failover 场景透传上游响应头并识别 Cloudflare challenge/cf-ray - 统一 Sora 任务请求的 UA 与代理使用,sentinel 与业务请求保持一致 - 修复流式错误事件 JSON 转义问题并补充相关单元测试
-
- 18 Feb, 2026 1 commit
-
-
shaw authored
-
- 17 Feb, 2026 1 commit
-
-
John Doe authored
- Account-level cache TTL override: rewrite Anthropic cache_creation token classification (5m
↔ 1h) in streaming/non-streaming responses - New DB field cache_ttl_overridden in usage_log for billing tracking - Migration 055_add_cache_ttl_overridden - Frontend: CacheTTL override toggle in account create/edit modals - Ent schema regenerated for new usage_log fields Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 16 Feb, 2026 1 commit
-
-
yangjianbo authored
- 仅在 delta 中 5m/1h 值大于0时覆盖 usage 明细 - 新增回归测试覆盖 delta 默认 0 不应覆盖 message_start 非零值 - 迁移 054 在删除 legacy 字段前追加一次回填,避免升级实例丢失历史写入 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 14 Feb, 2026 2 commits
-
-
shaw authored
Anthropic API 的 cache_creation 对象区分了 ephemeral_5m 和 ephemeral_1h 两种缓存创建 token,1h 单价远高于 5m(如 claude-3-5-haiku: 5m=$1/MTok, 1h=$6/MTok)。此前系统统一按 5m 单价计费,导致计费偏低。 后端: - pricing_service: 加载 LiteLLM 的 cache_creation_input_token_cost_above_1hr - billing_service: GetModelPricing 启用分类计费(安全守卫 1h>5m), CalculateCost 按 5m/1h 分别计费,无明细时回退到 5m 单价 - gateway_service: parseSSEUsage/handleNonStreamingResponse 用 gjson 提取嵌套 cache_creation 对象的 ephemeral_5m/1h_input_tokens - antigravity_gateway_service: extractSSEUsage/extractClaudeUsage 同步提取 - usage_log: 修复 GORM column tag 确保写入正确的数据库列 - 新增迁移 054: 删除 GORM 自动生成的重复列 前端: - 使用记录 tooltip 展示 5m/1h 缓存创建明细(带彩色 badge 区分) - 表格单元格缓存写入数值旁显示 1h 标识
-
yangjianbo authored
-
- 12 Feb, 2026 1 commit
-
-
yangjianbo authored
- 将高密度服务与处理器日志迁移到新日志系统(LegacyPrintf/结构化日志) - 增加 stdlog bridge 与兼容测试,保留旧日志捕获能力 - 将 OpenAI 断流告警改为结构化 Warn 并改造对应测试为 sink 捕获 - 补齐后端相关文件 logger 引用并通过全量 go test
-
- 11 Feb, 2026 1 commit
-
-
SilentFlower authored
✨ feat(antigravity): 支持 thinking adaptive 类型并适配 Opus 4.6 动态预算 🧪 test(gateway): 增加 thinking 模式解析与签名块过滤的边界用例测试
-
- 10 Feb, 2026 2 commits
- 09 Feb, 2026 4 commits
-
-
Edric Li authored
For retryable transient errors (Google 400 "invalid project resource name" and empty stream responses), retry on the same account up to 2 times (with 500ms delay) before switching to another account. - Add RetryableOnSameAccount field to UpstreamFailoverError - Add same-account retry loop in both Gemini and Claude/OpenAI handler paths - Move temp-unschedule from service layer to handler layer (only after all same-account retries exhausted) - Reduce temp-unschedule cooldown from 30 minutes to 1 minute
-
yangjianbo authored
-
Rose Ding authored
单账号 antigravity 分组收到 503 (MODEL_CAPACITY_EXHAUSTED) 时, 原逻辑会设置 ~29s 模型限流标记。由于只有一个账号无法切换, 后续所有新请求在预检查时命中限流 → 几毫秒内直接返回 503, 导致约 30 秒的雪崩窗口。 修复:在 Handler 入口处检查分组是否只有单个 antigravity 账号, 如果是则提前设置 SingleAccountRetry context 标记,让 Service 层 首次 503 就走原地重试逻辑(不设限流标记),避免污染后续请求。
-
erio authored
Merge functional changes from develop branch: - Remove AntigravityQuotaScope system (claude/gemini_text/gemini_image) - Replace with per-model rate limiting using resolveAntigravityModelKey - Remove model load statistics (IncrModelCallCount/GetModelLoadBatch) - Simplify account selection to unified priority→load→LRU algorithm - Remove SetAntigravityQuotaScopeLimit from AccountRepository - Clean up scope-related UI indicators and API fields
-
- 08 Feb, 2026 1 commit
-
-
erio authored
Add post-sort shuffle for accounts with identical (priority, loadRate, lastUsedAt) to break deterministic ordering when concurrent requests read the same scheduler snapshot. Applies to both Antigravity and OpenAI scheduling paths, plus the sortAccountsByPriorityAndLastUsed helper. Keeps upstream CallCount/ModelLoadInfo scheduling intact; shuffle is additive and only randomises within equivalent-rank groups.
-