- 24 Apr, 2026 4 commits
-
-
keh4l authored
Real Claude Code CLI always sends a 2-block system array:
  [0] {"type":"text", "text":"x-anthropic-billing-header: cc_version=X.Y.Z.{fp}; cc_entrypoint=cli; cch=00000;"}
  [1] {"type":"text", "text":"You are Claude Code...", "cache_control":{...}}
Before this commit, sub2api's mimicry path only produced block [1]. The missing billing block is one of the primary third-party detection signals Anthropic uses for Claude-Code-scoped OAuth tokens.
New file gateway_billing_block.go ports the fingerprint algorithm (byte-for-byte from Parrot cc_mimicry.py:compute_fingerprint): pick chars at positions [4,7,20] of the first user text, then `sha256(SALT + chars + cc_version)[:3]`.
  - claude/constants.go: CLICurrentVersion = "2.1.92" (must match UA)
  - gateway_billing_block.go: computeClaudeCodeFingerprint + buildBillingAttributionBlockJSON + extractFirstUserText
  - gateway_service.go: rewriteSystemForNonClaudeCode now emits both blocks in order; cch=00000 is filled in later by signBillingHeaderCCH in buildUpstreamRequest.
Downstream compat note: syncBillingHeaderVersion's regex `cc_version=\d+\.\d+\.\d+` only matches the semver triple, leaving the `.{fp}` suffix intact when rewriting in buildUpstreamRequest.
-
keh4l authored
Real Claude CLI traffic sends cache_control as `{"type":"ephemeral","ttl":"1h"}`. Our previous payload only sent `{"type":"ephemeral"}`, which is a bytewise mismatch with the official CLI and one more third-party detection signal.
Policy: client-provided ttl is always passed through unchanged. Proxy-generated cache_control blocks default to 5m (vs Parrot's 1h) to avoid burning the 1h cache budget on automatic breakpoints while still aligning with the `ttl` field being present.
  - claude/constants.go: DefaultCacheControlTTL = "5m"
  - apicompat/types.go: new AnthropicCacheControl type with TTL field; AnthropicTool gains optional CacheControl pointer so the mimicry path can attach a cache breakpoint to tools[-1] later.
  - service/gateway_service.go: anthropicCacheControlPayload gains TTL; marshalAnthropicSystemTextBlock and rewriteSystemForNonClaudeCode emit ttl=5m by default.
-
keh4l authored
The previous commit added FullClaudeCodeMimicryBetas() but the two call sites in buildUpstreamRequest still hardcoded the old 3-token subset. Anthropic now checks the complete set of beta tokens to decide if a request qualifies as Claude Code. Wire them up: - /v1/messages mimic path: requiredBetas = FullClaudeCodeMimicryBetas() - /v1/messages/count_tokens mimic path: same + BetaTokenCounting Haiku models keep the 2-token exemption (BetaOAuth + InterleaveThinking).
-
keh4l authored
Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt, which prepends the Claude Code banner but leaves the rest of the body in its original non-Claude-Code shape. The codebase already admits this is insufficient (see the comment on rewriteSystemForNonClaudeCode in gateway_service.go: "prepending the Claude Code prompt alone cannot pass detection").
Effect: OAuth accounts served through /v1/chat/completions or /v1/responses were detected as third-party apps and bled plan quota with:
  Third-party apps now draw from your extra usage, not your plan limits.
Fix:
  - apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata survives the OpenAI->Anthropic->Marshal round trip; without it the downstream rewrite has no user_id to work with.
  - service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free variant of the /v1/messages mimicry pipeline (rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody + metadata.user_id injection) so the OpenAI-compat forwarders can reuse it.
  - service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed for the same reason (no ParsedRequest at the call site).
  - ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line prompt-prepend with the full mimicry pipeline.
  - applyClaudeCodeMimicHeaders: set x-client-request-id per-request (real Claude CLI always does); missing/duplicated values are one more third-party fingerprint signal.
No change to the native /v1/messages path: it already called the full pipeline; we only lift those helpers into a reusable function.
Tests:
  - go build ./... passes
  - go test ./internal/service/... ./internal/pkg/apicompat/... passes
  - lsp_diagnostics clean on all touched files
  - pre-existing failures in internal/config are unrelated (env-sensitive tests that also fail on upstream main)
-
- 19 Apr, 2026 1 commit
-
-
erio authored
Add quota exceeded check to IsSchedulable() and refactor shouldClearStickySession to delegate to IsSchedulable(), eliminating duplicated logic and fixing missed overload/rate-limit/expired checks. Frontend displays quota exceeded status independently via quota fields.
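The delegation described here can be pictured with a minimal sketch. Only the names IsSchedulable and shouldClearStickySession come from the commit message; the `account` struct, its boolean fields, and the nil handling are hypothetical stand-ins for whatever state the real service tracks.

```go
package main

import "fmt"

// account is a hypothetical, simplified model of a scheduled upstream account.
type account struct {
	Active        bool
	QuotaExceeded bool
	Overloaded    bool
	RateLimited   bool
	Expired       bool
}

// IsSchedulable is the single place where every "can this account take
// traffic" condition is evaluated, including the quota check.
func (a account) IsSchedulable() bool {
	return a.Active && !a.QuotaExceeded && !a.Overloaded && !a.RateLimited && !a.Expired
}

// shouldClearStickySession delegates instead of re-implementing the checks,
// so a new condition added to IsSchedulable is picked up automatically.
// Treating a nil account as "clear the session" is an assumption.
func shouldClearStickySession(a *account) bool {
	return a == nil || !a.IsSchedulable()
}

func main() {
	healthy := account{Active: true}
	overQuota := account{Active: true, QuotaExceeded: true}
	fmt.Println(shouldClearStickySession(&healthy))
	fmt.Println(shouldClearStickySession(&overQuota))
}
```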
-
- 17 Apr, 2026 1 commit
-
-
erio authored
Subscription-mode billing was consuming quota at TotalCost (raw) instead of ActualCost (TotalCost * RateMultiplier), so per-group rate multipliers — including free subscriptions (multiplier = 0) — were silently ignored. Switch the three subscription cost writes in buildUsageBillingCommand, finalizePostUsageBilling, and the legacy postUsageBilling fallback to ActualCost, and add a table-driven test covering 2x / 0.5x / free multipliers plus a balance-mode regression check.
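As a rough illustration of the fix, the sketch below computes the charged amount from the raw cost and the per-group multiplier. The `usageCost` type and its method are hypothetical; only the relation ActualCost = TotalCost * RateMultiplier and the 2x / 0.5x / free cases are taken from the commit message.

```go
package main

import "fmt"

// usageCost is a hypothetical stand-in for the gateway's cost record.
type usageCost struct {
	TotalCost      float64 // raw cost before any multiplier
	RateMultiplier float64 // per-group multiplier; 0 means a free subscription
}

// ActualCost is the amount that should be consumed from subscription quota.
func (c usageCost) ActualCost() float64 {
	return c.TotalCost * c.RateMultiplier
}

func main() {
	// Table-driven walk over the multipliers named in the commit message.
	for _, m := range []float64{2, 0.5, 0} {
		c := usageCost{TotalCost: 1.0, RateMultiplier: m}
		fmt.Printf("multiplier=%.1f raw=%.2f charged=%.2f\n", m, c.TotalCost, c.ActualCost())
	}
}
```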
-
- 15 Apr, 2026 1 commit
-
-
erio authored
refactor: extract ReadUpstreamResponseBody to deduplicate upstream response read + too-large error handling
Consolidates 9 call sites of resolveUpstreamResponseReadLimit + readUpstreamResponseBodyLimited + ErrUpstreamResponseBodyTooLarge error handling into a single ReadUpstreamResponseBody function with TooLargeWriter callback for API-format-specific error responses (Anthropic, OpenAI, countTokens).
-
- 14 Apr, 2026 13 commits
-
-
erio authored
Security (HIGH):
  - Normalize all Redis cache keys to lowercase (verifyCode, passwordReset)
  - Fix verify code TTL renewal on failed attempts: use remaining TTL via ExpiresAt field instead of resetting to full 15-minute window
  - Add 3 missing fields to diffSettings audit log (promo_code, invitation_code, custom_endpoints)
Code quality (MEDIUM):
  - Extract filterVerifiedEmails shared helper (balance_notify_service.go)
  - Add Pricing array non-empty validation for channel pricing rules
  - Add platform token semantics comment in gateway_service.go
  - Complete validatePlanPatch test coverage (+10 test cases)
  - Replace string types with QuotaThresholdType/QuotaResetMode across frontend
  - Remove duplicate getPlatformTextColor/getRateBadgeClass in ChannelsView
  - Return EMAIL_NOT_FOUND error on RemoveNotifyEmail miss
UI improvements:
  - Reorder cost tooltip: user billing above separator, account billing below
  - Add NaN guard to accountBilled function
  - Move timezone selector inline into reset-mode row (no longer standalone)
-
erio authored
Priority was wrong:
  - Before: custom rules → LiteLLM (when ApplyPricingToAccountStats) → nil
  - After: custom rules → totalCost (when ApplyPricingToAccountStats) → LiteLLM → nil
When ApplyPricingToAccountStats is enabled, use the request's actual client billing cost (before multiplier) as account_stats_cost, instead of recalculating from LiteLLM per-token prices which produced incorrect values for per-request billing mode.
LiteLLM model pricing is now the final fallback (priority 3), used only when neither custom rules nor ApplyPricingToAccountStats apply.
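The corrected priority chain can be written as a short fallthrough. This is a sketch under assumptions: `statsInput` and `resolveStatsCost` are hypothetical names, and the custom-rule and LiteLLM lookups are represented as already-computed optional values rather than real lookups.

```go
package main

import "fmt"

// statsInput is a hypothetical bundle of everything the resolver needs.
// A nil pointer means "this source produced no price".
type statsInput struct {
	CustomRuleCost             *float64 // priority 1: matched custom rule
	TotalCost                  float64  // client billing cost before multiplier
	ApplyPricingToAccountStats bool     // channel toggle
	LiteLLMCost                *float64 // priority 3: LiteLLM model pricing
}

// resolveStatsCost returns the account_stats_cost, or nil when nothing applies.
func resolveStatsCost(in statsInput) *float64 {
	if in.CustomRuleCost != nil {
		return in.CustomRuleCost
	}
	if in.ApplyPricingToAccountStats {
		c := in.TotalCost
		return &c
	}
	if in.LiteLLMCost != nil {
		return in.LiteLLMCost
	}
	return nil
}

func main() {
	litellm := 0.42
	got := resolveStatsCost(statsInput{
		TotalCost:                  1.5,
		ApplyPricingToAccountStats: true,
		LiteLLMCost:                &litellm,
	})
	// With the toggle on and no custom rule, totalCost wins over LiteLLM.
	fmt.Println(*got)
}
```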
-
erio authored
WebSearch tri-state switch:
  - Account-level web_search_emulation changed from bool to tri-state string: "default" (follow channel) / "enabled" / "disabled"
  - shouldEmulateWebSearch checks channel config when account is "default"
  - SQL migration converts old bool values
  - Frontend select replaces toggle in Edit/CreateAccountModal
Account stats pricing:
  - resolveAccountStatsCost uses upstream model (post-mapping) for matching
  - Priority: custom rules → model pricing file (when toggle on) → default
  - Custom rules always configurable, independent of toggle
  - Account ID field changed to searchable selector filtered by platform
  - Description updated to reflect new behavior
Quota notification cache fix:
  - CheckAccountQuotaAfterIncrement fetches real-time account from DB
  - Reconstructs pre-increment usage for accurate threshold crossing detection
  - New AccountQuotaReader interface (minimal: GetByID only)
Usage tooltip:
  - Per-request/image billing shows per-request price instead of $0 token price
  - Token billing continues to show input/output price per million tokens
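The tri-state resolution could be sketched like this. The three string values are from the commit message; the function name `emulateWebSearch`, the constant names, and the choice to treat unknown values as "default" are assumptions, and any additional global gating is left out.

```go
package main

import "fmt"

// Account-level values named in the commit message.
const (
	modeDefault  = "default"  // follow the channel setting
	modeEnabled  = "enabled"  // force on for this account
	modeDisabled = "disabled" // force off for this account
)

// emulateWebSearch resolves the account setting against the channel setting.
// Unknown or empty values fall back to the channel (an assumption).
func emulateWebSearch(accountMode string, channelEnabled bool) bool {
	switch accountMode {
	case modeEnabled:
		return true
	case modeDisabled:
		return false
	default:
		return channelEnabled
	}
}

func main() {
	fmt.Println(emulateWebSearch(modeDefault, true))
	fmt.Println(emulateWebSearch(modeDisabled, true))
	fmt.Println(emulateWebSearch(modeEnabled, false))
}
```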
-
erio authored
- resolveAccountStatsCost now uses the final upstream model (after account-level mapping) to match custom pricing rules, fixing the issue where requested model (e.g. claude-sonnet-4-5) didn't match rules configured for upstream model (e.g. claude-opus-4-6) - Remove tryChannelPricing fallback: only custom rules are applied, unmatched requests use default formula (total_cost × rate) - Remove unused billingService and serviceTier parameters - Update description: "When enabled, custom model prices for account statistics are supported"
-
erio authored
- Fix cached balance causing threshold crossing to never trigger: read real-time balance from billingCacheService instead of stale API key auth snapshot - Remove email="" placeholder concept; all emails are user-managed - Only send notifications to verified && non-disabled emails - Frontend: pre-fill user's email in add input when list is empty - Remove FilterEnabledEmails/IsPrimaryDisabled helpers (no longer needed)
-
erio authored
- Fix accountCost calculation in finalizePostUsageBilling to match postUsageBilling (always multiply by AccountRateMultiplier) - Use strings.EqualFold for email dedup in collectBalanceNotifyRecipients - Extract CheckAccountQuotaAfterIncrement into smaller functions: buildQuotaDims + asyncSendQuotaAlert (< 30 lines each) - Add "not splittable" comments for HTML template functions - Extract QuotaNotifyToggle.vue sub-component to reduce QuotaLimitCard.vue from 404 to 339 lines
-
erio authored
- User balance low notification: email alert when balance drops below configurable threshold (user email + verified extra emails) - Account quota notification: broadcast email to admin-configured recipients when daily/weekly/total quota usage exceeds alert threshold - Admin settings: global enable/disable, default threshold, quota notification email list (Email Settings tab) - User profile: enable/disable, custom threshold, add/remove extra notification emails with verification code flow - Account quota: per-dimension alert toggle and threshold in quota control card - Trigger logic: first-crossing only (old >= threshold && new < threshold for balance; old < threshold && new >= threshold for quota), naturally prevents duplicate notifications without Redis dedup
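The first-crossing trigger in the last bullet translates directly into two predicates. The function names are hypothetical; the comparisons are the ones stated in the commit message.

```go
package main

import "fmt"

// balanceCrossedBelow fires only on the update that takes the balance from
// at-or-above the threshold to below it.
func balanceCrossedBelow(oldBalance, newBalance, threshold float64) bool {
	return oldBalance >= threshold && newBalance < threshold
}

// quotaCrossedAbove fires only on the update that takes usage from below
// the threshold to at-or-above it.
func quotaCrossedAbove(oldUsage, newUsage, threshold float64) bool {
	return oldUsage < threshold && newUsage >= threshold
}

func main() {
	fmt.Println(balanceCrossedBelow(12, 9, 10)) // true: crossed on this update
	fmt.Println(balanceCrossedBelow(9, 8, 10))  // false: already below
	fmt.Println(quotaCrossedAbove(79, 80, 80))  // true: crossed on this update
	fmt.Println(quotaCrossedAbove(80, 85, 80))  // false: already above
}
```

Because each predicate is true for exactly one update per crossing, no separate deduplication store is needed as long as the old value passed in is accurate.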
-
erio authored
Allow channels to configure independent model pricing for account statistics cost calculation, decoupled from user billing.
Backend:
  - Migration 101: channels.apply_pricing_to_account_stats toggle, channel_account_stats_pricing_rules/model_pricing tables, usage_logs.account_stats_cost column
  - resolveAccountStatsCost: match rules by group/account, then channel pricing, fallback to original formula when unconfigured
  - Integrate into both GatewayService.recordUsageCore and OpenAIGatewayService.RecordUsage
  - Update 8 account stats SQL queries to use COALESCE(account_stats_cost, total_cost) * account_rate_multiplier
  - 23 unit tests for matching, pricing lookup, and cost calculation
Frontend:
  - Channel edit dialog: toggle + custom rules UI with group/account multi-select and pricing entry cards
  - API types and i18n (zh/en)
-
erio authored
Inject web search capability for Claude Console (API Key) accounts that don't natively support Anthropic's web_search tool. When a pure web_search request is detected, the gateway calls Brave Search or Tavily API directly and constructs an Anthropic-protocol-compliant SSE/JSON response without forwarding to upstream.
Backend:
  - New `pkg/websearch/` SDK: Brave and Tavily provider implementations with io.LimitReader, proxy support, and Redis-based quota tracking (Lua atomic INCR + TTL, DECR rollback on failure)
  - Global config via `settings.web_search_emulation_config` (JSON) with in-process cache + singleflight, input validation, API key merge on save, and sanitized API responses
  - Channel-level toggle via `channels.features_config` JSONB column (DB migration 101)
  - Account-level toggle via `accounts.extra.web_search_emulation`
  - Request interception in `Forward()` with SSE streaming response construction using json.Marshal (no manual string concatenation)
  - Manager hot-reload: `RebuildWebSearchManager()` called on config save and startup via `SetWebSearchRedisClient()`
  - 70 unit tests covering providers, manager, config validation, sanitization, tool detection, query extraction, and response building
Frontend:
  - Settings → Gateway tab: Web Search Emulation config card with global toggle, provider list (add/remove, API key, priority, quota, proxy)
  - Channels → Anthropic tab: web search emulation toggle with global state linkage (disabled when global off)
  - Account Create/Edit modals: web search emulation toggle for API Key type with Toggle component
  - Full i18n coverage (zh + en)
-
erio authored
-
erio authored
- Change channel cache TTL from 60s to 10min (reduce unnecessary DB queries) - Actively rebuild cache after CRUD instead of lazy invalidation - Add slog.Warn logging for channel pricing restriction blocks (4 places)
-
erio authored
- Fix 7 stale comments still mentioning "限制检查" ("restriction check") in handlers/services - Make billingModelForRestriction explicitly list channel_mapped case - Add slog.Warn for error swallowing in ResolveChannelMapping and needsUpstreamChannelRestrictionCheck - Document sticky session upstream check exemption
-
erio authored
Move the model pricing restriction check from 8 handler entry points to the account scheduling phase (SelectAccountForModelWithExclusions / SelectAccountWithLoadAwareness), aligning restriction with billing: - requested: check original request model against pricing list - channel_mapped: check channel-mapped model against pricing list - upstream: per-account check using account-mapped model Handler layer now only resolves channel mapping (no restriction). Scheduling layer performs pre-check for requested/channel_mapped, and per-account filtering for upstream billing source.
-
- 08 Apr, 2026 3 commits
-
-
ius authored
-
shaw authored
- Sync cc_version in x-anthropic-billing-header with the fingerprint User-Agent version, preserving the message-derived suffix - Implement xxHash64-based CCH signing to replace the cch=00000 placeholder with a computed hash - Add admin toggle (enable_cch_signing) under gateway forwarding settings, disabled by default
-
shaw authored
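The rewriteSystemForNonClaudeCode change from commit f3aa54b7 failed to pass Anthropic's third-party detection. The root cause is that two key signals differed from real Claude Code:
1. The anthropic-beta header was missing claude-code-20250219: the mimicry path actively added this beta to the drop set and removed it, but Anthropic relies on this beta to identify Claude Code requests. Fix: mimicry requests for non-haiku models now always include the claude-code beta.
2. The system field used the string format instead of array+cache_control: real Claude Code always sends system as [{type,text,cache_control:{type:"ephemeral"}}], so the string format became a third-party detection signal. Fix: rewriteSystemForNonClaudeCode now injects the array format.
Related adjustment: stripSystemCacheControl is now decided dynamically by whether system was rewritten. When it was rewritten, the CC prompt's cache_control is kept; when it was not (haiku, or the CC prefix is already present), the previous stripping behavior is kept.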
-
- 07 Apr, 2026 2 commits
-
-
shaw authored
-
shaw authored
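Anthropic recently introduced a third-party app detection mechanism based on the content of the system parameter. The previous strategy of prepending the Claude Code prompt no longer passes detection (the content that follows is still in a non-Claude-Code format and triggers a 429).
New strategy: for OAuth/SetupToken account requests from non-Claude-Code clients, replace the system field entirely with the Claude Code identity prompt and inject the original system content at the start of messages as a user/assistant message pair, so the model still receives the full instructions.
Only the /v1/messages path is affected; the chat_completions and responses paths keep their existing logic. Requests from genuine Claude Code clients are completely unaffected (passed through as-is).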
-
- 05 Apr, 2026 3 commits
-
-
erio authored
-
erio authored
Restore gateway_service.go, setting_handler.go, routes/admin.go, dto/settings.go, group_repo.go, api_key_repo.go, wire_gen.go to upstream/main versions and surgically remove only Sora references. This preserves upstream-only features (RequireOauthOnly, RequirePrivacySet, GroupResolution, etc.) that were missing when using release branch versions.
-
erio authored
-
- 04 Apr, 2026 12 commits
-
-
erio authored
- applyRequestTierOverrides now uses filterValidIntervals consistently with applyTokenOverrides (per_request/image modes were not filtering) - CostInput accepts optional pre-resolved pricing via Resolved field, eliminating duplicate Resolver.Resolve() calls in gateway billing paths
-
erio authored
- Remove unused claudeMax*Tokens constants (Claude Max feature not included) - Remove unused UsageMapHook type, SetUsageMapHook method, and usageToMap function - Fix gofmt formatting in channel_service.go, openai_model_mapping_test.go, chatcompletions_to_responses.go
-
erio authored
- Add int64(0) param to SelectAccountWithLoadAwareness callers (signature change from channel scheduling refactor) - Add UsageMapHook type and struct field to StreamingProcessor - Revert Claude Max cache billing code to upstream/main (not part of channel feature) - Revert credits overages logic to upstream/main (non-channel change) - Remove Instructions field reference (non-channel OpenAI feature) - Restore sora_client_handler_test.go from upstream + add channel service nil params
-
erio authored
- Change channel cache TTL from 60s to 10min (reduce unnecessary DB queries) - Actively rebuild cache after CRUD instead of lazy invalidation - Add slog.Warn logging for channel pricing restriction blocks (4 places)
-
erio authored
P0-1: Credits degraded response retry + fail-open
  - Add isAntigravityDegradedResponse() to detect transient API failures
  - Retry up to 3 times with exponential backoff (500ms/1s/2s)
  - Invalidate singleflight cache between retries
  - Fail-open after exhausting retries instead of 5h circuit break
P1-1: Fix channel restriction pre-check timing conflict
  - Swap checkClaudeCodeRestriction before checkChannelPricingRestriction
  - Ensures channel restriction is checked against final fallback groupID
P1-2: Add interval pricing validation (frontend + backend)
  - Backend: ValidateIntervals() with boundary, price, overlap checks
  - Frontend: validateIntervals() with Chinese error messages
  - Rules: MinTokens>=0, MaxTokens>MinTokens, prices>=0, no overlap
P2: Fix cross-platform same-model pricing/mapping override
  - Store cache keys using original platform instead of group platform
  - Lookup across matching platforms (antigravity→anthropic→gemini)
  - Prevents anthropic/gemini same-name models from overwriting each other
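The interval rules listed under P1-2 can be checked with a sort followed by one pass. This is a sketch, not the project's ValidateIntervals(): the `interval` type is hypothetical, a single Price field stands in for whatever prices a real interval carries, and intervals are assumed to be half-open [MinTokens, MaxTokens) so that two intervals sharing a boundary do not count as overlapping. An unbounded final interval is not modeled.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// interval is a hypothetical pricing tier covering [MinTokens, MaxTokens).
type interval struct {
	MinTokens int
	MaxTokens int
	Price     float64
}

// validateIntervals applies the four rules: MinTokens>=0,
// MaxTokens>MinTokens, price>=0, and no overlap between tiers.
func validateIntervals(in []interval) error {
	// Work on a copy so the caller's ordering is untouched.
	sorted := append([]interval(nil), in...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].MinTokens < sorted[j].MinTokens })

	for i, iv := range sorted {
		if iv.MinTokens < 0 {
			return errors.New("MinTokens must be >= 0")
		}
		if iv.MaxTokens <= iv.MinTokens {
			return errors.New("MaxTokens must be > MinTokens")
		}
		if iv.Price < 0 {
			return errors.New("price must be >= 0")
		}
		// After sorting, comparing with the previous tier is enough.
		if i > 0 && iv.MinTokens < sorted[i-1].MaxTokens {
			return fmt.Errorf("interval starting at %d overlaps the previous one", iv.MinTokens)
		}
	}
	return nil
}

func main() {
	ok := []interval{{0, 1000, 1.0}, {1000, 5000, 0.8}}
	bad := []interval{{0, 1000, 1.0}, {500, 2000, 0.8}}
	fmt.Println(validateIntervals(ok))
	fmt.Println(validateIntervals(bad))
}
```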
-
erio authored
- Fix 7 stale comments still mentioning "限制检查" ("restriction check") in handlers/services - Make billingModelForRestriction explicitly list channel_mapped case - Add slog.Warn for error swallowing in ResolveChannelMapping and needsUpstreamChannelRestrictionCheck - Document sticky session upstream check exemption
-
erio authored
Move the model pricing restriction check from 8 handler entry points to the account scheduling phase (SelectAccountForModelWithExclusions / SelectAccountWithLoadAwareness), aligning restriction with billing: - requested: check original request model against pricing list - channel_mapped: check channel-mapped model against pricing list - upstream: per-account check using account-mapped model Handler layer now only resolves channel mapping (no restriction). Scheduling layer performs pre-check for requested/channel_mapped, and per-account filtering for upstream billing source.
-
erio authored
- Extract resolveChannelPricing to DRY the resolver pattern shared by calculateImageCost/calculateTokenCost - Remove unnecessary IIFE wrapper and pass accountRateMultiplier as parameter - Extract resolveBillingMode, resolveMediaType, optionalSubscriptionID to simplify buildRecordUsageLog (104→65 lines) - Extract shouldDeductAPIKeyQuota/shouldUpdateRateLimits/shouldUpdateAccountQuota methods on postUsageBillingParams to unify duplicated billing conditions
-
erio authored
- Extract recordUsageCore with recordUsageOpts for parameterized differences - RecordUsage (276 lines) → thin wrapper (~40 lines) - RecordUsageWithLongContext (251 lines) → thin wrapper (~20 lines) - Split billing logic into calculateSoraMediaCost, calculateImageCost, calculateTokenCost sub-functions - Extract buildRecordUsageLog for usage log construction - Net reduction: -79 lines, eliminated ~170 lines of duplication
-
erio authored
- PricingSourceChannel/LiteLLM/Fallback for resolver source - MediaTypeImage/Video/Prompt for result.MediaType - Reuse BillingModeToken/BillingModeImage for billing mode - Reuse BillingModelSourceChannelMapped/PlatformAnthropic in handler
-
erio authored
Instead of hardcoding BillingMode="image" when ImageCount>0, let cost.BillingMode (set by CalculateCostUnified/CalculateImageCost) take priority. This ensures channel token pricing shows "token" mode.
-
erio authored
When ImageCount > 0, check if channel has token pricing configured: - YES (source=channel, mode=token) → use token billing with image_output_tokens - NO → fall back to CalculateImageCost (original per-image billing) This allows channels to configure $/MTok pricing for image generation models while maintaining backward compatibility for setups without channel pricing.
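The YES/NO branch above amounts to a single predicate. The `resolvedPricing` type, its string values, and the function name are illustrative assumptions; only the condition (source is the channel and mode is token) comes from the commit message.

```go
package main

import "fmt"

// resolvedPricing is a hypothetical result of the pricing resolver.
type resolvedPricing struct {
	Source string // e.g. "channel", "litellm", "fallback"
	Mode   string // e.g. "token", "image"
}

// useTokenBillingForImages reports whether an image request should be billed
// per token. A nil pricing or any other source/mode combination falls back
// to the original per-image billing.
func useTokenBillingForImages(imageCount int, p *resolvedPricing) bool {
	return imageCount > 0 && p != nil && p.Source == "channel" && p.Mode == "token"
}

func main() {
	channelToken := &resolvedPricing{Source: "channel", Mode: "token"}
	fmt.Println(useTokenBillingForImages(2, channelToken))
	fmt.Println(useTokenBillingForImages(2, nil))
}
```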
-