- 15 Mar, 2026 1 commit
-
-
YanzheL authored
Claude's output_config.effort parameter (low/medium/high/max) was not being extracted from requests or logged in the reasoning_effort column of usage logs. Only the OpenAI path populated this field. Changes: - Extract output_config.effort in ParseGatewayRequest - Add ReasoningEffort field to ForwardResult - Populate reasoning_effort in both RecordUsage and RecordUsageWithLongContext - Guard against overwriting service-set effort values in handler - Update stale comments that described reasoning_effort as OpenAI-only - Add unit tests for extraction, normalization, and persistence
-
- 14 Mar, 2026 2 commits
-
-
SsageParuders authored
Consolidate two separate channel types (bedrock + bedrock-apikey) into a single "AWS Bedrock" channel. Authentication mode is now distinguished by credentials.auth_mode ("sigv4" | "apikey") instead of separate types. Backend: - Remove AccountTypeBedrockAPIKey constant - IsBedrock() simplified; IsBedrockAPIKey() checks auth_mode - Add IsAPIKeyOrBedrock() helper to eliminate repeated type checks - Extend pool mode, quota scheduling, and billing to bedrock - Add RetryableOnSameAccount to handleBedrockUpstreamErrors - Add "bedrock" scope to Beta Policy for independent control Frontend: - Merge two buttons into one "AWS Bedrock" with auth mode radio - Badge displays "Anthropic | AWS" - Pool mode and quota limit UI available for bedrock - Quota display in account list (usage bars, capacity badges, reset) - Remove all bedrock-apikey type references -
InCerry authored
-
- 13 Mar, 2026 2 commits
- 12 Mar, 2026 3 commits
- 11 Mar, 2026 1 commit
-
-
amberwarden authored
Anthropic Messages API 的流式转发路径(gateway_service.go)在上游长时间 无数据时(如 Opus extended thinking 阶段)不会向下游发送任何内容,导致 Cloudflare Tunnel 等代理因连接空闲而断开。 复用已有的 StreamKeepaliveInterval 配置(默认 10 秒),在 select 循环中 添加 keepalive 分支,定时发送 Anthropic 原生格式的 ping 事件保活,与 OpenAI 兼容路径的实现模式保持一致。 Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 10 Mar, 2026 1 commit
-
-
shaw authored
-
- 09 Mar, 2026 3 commits
- 08 Mar, 2026 1 commit
-
-
kyx236 authored
-
- 07 Mar, 2026 2 commits
-
-
shaw authored
-
erio authored
Extend the existing total quota limit with daily and weekly periodic dimensions. Each dimension is independently configurable and uses lazy reset — when the period expires, usage is automatically reset to zero on the next increment. Any dimension exceeding its limit will pause the account from scheduling. Backend: - Add GetQuotaDailyLimit/Used, GetQuotaWeeklyLimit/Used, HasAnyQuotaLimit - Rewrite IncrementQuotaUsed with atomic CTE SQL for 3-dimension update - Rewrite ResetQuotaUsed to clear all dimensions and period timestamps - Update postUsageBilling to use HasAnyQuotaLimit() - Preserve daily/weekly used values on account edit Frontend: - Refactor QuotaLimitCard from single v-model to 3-dimension props - Add QuotaBadge component for compact D/W/$ display - Update AccountCapacityCell with per-dimension badges - Update Create/Edit modals with daily/weekly quota fields - Update AccountActionMenu hasQuotaLimit to check all dimensions - Add i18n strings for daily/weekly/total quota labels Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
- 06 Mar, 2026 4 commits
-
-
yangjianbo authored
抽取共享的用户分组专属倍率解析器,统一缓存、singleflight 与回退逻辑。\n\n让 OpenAI 独立计费链路复用专属倍率解析,修复 usage 记录与实际扣费未命中用户专属倍率的问题。\n\n补齐 OpenAI 计费与解析器单元测试,并修复全量回归中暴露的 lint 阻塞项。\n\nCo-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
-
erio authored
isModelSupportedByAccount 不被 OpenAI 调度路径调用, OpenAI /responses 和 /chat/completions 走的是 openai_account_scheduler.go,透传短路已在 PR #806 的 第二个 commit 中正确添加到该文件。 此处的检查是多余的死代码,因为 OpenAI 账号不会走到 isModelSupportedByAccount 的这个分支。
-
alfadb authored
Add Anthropic Messages API support for OpenAI platform groups, enabling clients using Claude-style /v1/messages format to access OpenAI accounts through automatic protocol conversion. - Add apicompat package with type definitions and bidirectional converters (Anthropic
↔ Chat, Chat↔ Responses, Anthropic↔ Responses) - Implement /v1/messages endpoint for OpenAI gateway with streaming support - Add model mapping UI for OpenAI OAuth accounts (whitelist + mapping modes) - Support prompt caching fields and codex OAuth transforms - Fix tool call ID conversion for Responses API (fc_ prefix) - Ensure function_call_output has non-empty output field Co-Authored-By:Claude Opus 4.6 <noreply@anthropic.com>
-
erio authored
透传模式账号仅替换认证,应允许所有模型通过。之前调度阶段的 isModelSupportedByAccount 不感知透传模式,导致 model_mapping 中未配置的新模型(如 gpt-5.4)被拒绝返回 503。
-
- 05 Mar, 2026 5 commits
-
-
erio authored
-
erio authored
- Extract postUsageBilling() to consolidate billing logic across GatewayService.RecordUsage, RecordUsageWithLongContext, and OpenAIGatewayService.RecordUsage, eliminating ~120 lines of duplicated code - Fix account quota to use TotalCost × accountRateMultiplier (was using raw TotalCost, inconsistent with account cost stats) - Fix RecordUsageWithLongContext API Key quota only updating in balance mode (now updates regardless of billing type) - Fix WebSocket client disconnect detection on Windows by adding "an established connection was aborted" to known disconnect errors
-
erio authored
- Add configurable spending limit (quota_limit) for apikey-type accounts - Atomic quota accumulation via PostgreSQL JSONB operations on TotalCost - Scheduler filters out over-quota accounts with outbox-triggered snapshot refresh - Display quota usage ($used / $limit) in account capacity column - Add "Reset Quota" action in account menu to reset usage to zero - Editing account settings preserves quota_used (no accidental reset) - Covers all 3 billing paths: Anthropic, Gemini, OpenAI RecordUsage chore: bump version to 0.1.90.4
-
shaw authored
-
shaw authored
-
- 03 Mar, 2026 2 commits
-
-
shaw authored
-
QTom authored
当 API Key 无分组时,调度仅从未分组账号池中选取。 修复 isAccountInGroup 在 groupID==nil 时的逻辑, 同时补全 scheduler_snapshot_service 和 gemini_compat_service 中的 SimpleMode 保护,确保分组隔离在所有调度路径生效。 新增 ListSchedulableUngroupedByPlatform/s 方法, 使用 Ent 的 Not(HasAccountGroups()) 谓词实现未分组账号隔离。 新增 17 个单元和端到端隔离测试,覆盖所有分支和边界条件。
-
- 02 Mar, 2026 1 commit
-
-
QTom authored
新增 UMQ (User Message Queue) 双模式支持: - serialize: 账号级分布式串行锁 + RPM 自适应延迟(严格限流) - throttle: 仅 RPM 自适应前置延迟,不阻塞并发(软性限速) 后端: - config: 新增 Mode 字段,保留 Enabled 向后兼容 - service: 新增 UserMessageQueueService(Lua 锁/延迟算法/清理 worker) - repository: 新增 UserMsgQueueCache(Redis Lua acquire/release/force-release) - handler: 新增 UserMsgQueueHelper(SSE ping + 等待循环 + throttle) - gateway: 按 mode 分支集成 serialize/throttle 逻辑 - lint: 修复 gofmt rewrite rules、errcheck 类型断言、staticcheck QF1012 前端: - 三态选择器 UI(关闭/软性限速/串行队列)替代 toggle 开关 - BulkEdit 支持 null 语义(不修改) - i18n 中英文文案 通过 6 轮专家评审(42 次 review)、golangci-lint、单元测试、集成测试。
-
- 28 Feb, 2026 8 commits
-
-
QTom authored
- Add sanitizeExtraBaseRPM to BulkUpdate handler (was missing) - Add WindowCost scheduling checks to legacy non-sticky selection paths (4 sites), matching existing sticky + load-aware coverage - Export ParseExtraInt from service package, remove duplicate parseExtraIntForValidation from admin handler
-
QTom authored
- Move IncrementRPM after Forward success to prevent phantom RPM consumption during account switch retries - Add base_rpm input sanitization (clamp to 0-10000) in Create/Update - Add WindowCost scheduling checks to legacy path sticky sessions (4 check sites + 4 prefetch sites), fixing pre-existing gap - Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in BulkEditModal (JSONB merge cannot delete keys, use empty values) - Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer - Document TOCTOU race as accepted soft-limit design trade-off
-
QTom authored
Ensures isAccountSchedulableForRPM calls within the routing segment hit the prefetch cache instead of querying Redis individually.
-
QTom authored
- Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE - Filter negative values in GetBaseRPM(), update test expectation - Add RPM batch query (GetRPMBatch) to account List API - Add warn logs for RPM increment failures in gateway handler - Reset enableRpmLimit on BulkEditAccountModal close - Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs - Add design decision comments for rdb.Time() RTT trade-off
-
QTom authored
-
QTom authored
-
QTom authored
-
yangjianbo authored
-
- 27 Feb, 2026 1 commit
-
-
erio authored
- Add migration 060 to update model_mapping for all antigravity accounts - Remove gemini-3-pro-image and gemini-3-pro-image-preview mappings - Add gemini-3.1-flash-image and gemini-3.1-flash-image-preview mappings - Update frontend usage window to show GImage for new model - Update isImageGenerationModel to support new model
-
- 26 Feb, 2026 3 commits
-
-
alfadb authored
-
alfadb authored
PR #635 returned HTTP 200 with {"input_tokens": 0} when upstream doesn't support count_tokens (404). This caused Claude Code CLI to trust the zero value, believing context uses 0 tokens, so auto-compression never triggers. Fix: return 404 with proper error body so CLI falls back to its local tokenizer for accurate estimation. Return nil (not error) to avoid polluting ops error metrics with expected 404s. Affected paths: - Passthrough APIKey accounts: upstream 404 now passed through as 404 - Antigravity accounts: same fix (was also returning fake 200) -
shaw authored
-