- 28 Apr, 2026 3 commits
-
-
alfadb authored
The Anthropic streaming path (gateway_service.go) returned a plain error on upstream SSE read failure, so the handler-level UpstreamFailoverError check never fired and the client received a bare `stream_read_error` event, breaking long-running tasks even when no bytes had been written yet.

The most common trigger is HTTP/2 GOAWAY from api.anthropic.com edge backends doing graceful rotation: Go's http.Transport surfaces this as `unexpected EOF` and never auto-retries.

Mirror what the OpenAI and antigravity gateways already do: when the read error happens before any byte has reached the client (`!c.Writer.Written()`), return `*UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}` so the handler can retry on the same or another account. After client output has begun, SSE has no resume protocol, so keep the existing passthrough behavior.

Tests cover both branches via streamReadCloser-based fixtures.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
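The branch described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in gateway_service.go: `classifyStreamReadError` and its `clientWritten` parameter are hypothetical stand-ins for the real read loop and `c.Writer.Written()`, and the error type carries only the two fields named in the message.

```go
package main

// UpstreamFailoverError carries only the two fields named in the commit
// message; the real type in the repository may have more.
type UpstreamFailoverError struct {
	StatusCode             int
	RetryableOnSameAccount bool
}

func (e *UpstreamFailoverError) Error() string {
	return "upstream stream read failed before any output"
}

// classifyStreamReadError is a hypothetical helper. clientWritten stands in
// for c.Writer.Written(): it is true once any byte has reached the client.
func classifyStreamReadError(readErr error, clientWritten bool) error {
	if readErr == nil {
		return nil
	}
	if !clientWritten {
		// Nothing was sent yet, so the handler can safely retry elsewhere.
		return &UpstreamFailoverError{StatusCode: 502, RetryableOnSameAccount: true}
	}
	// Output already started: SSE cannot resume, so pass the error through.
	return readErr
}
```

A handler would then check the returned error for `*UpstreamFailoverError` (for example with `errors.As`) before deciding whether to retry on the same or another account.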
-
Wesley Liddick authored
feat(openai): complete OpenAI Fast/Flex Policy implementation (HTTP + WebSocket + Admin)
-
DaydreamCoding authored
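Modeled symmetrically on the fast-mode filtering in Claude BetaPolicy, this adds a three-state pass / filter / block policy for the OpenAI upstream service_tier field (priority / flex, including normalization of the client alias "fast" → "priority"), covering every OpenAI entry point plus the admin configuration entry point.

Backend core
- Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and OpenAIFastPolicySettings configuration models; rules span the dimensions service_tier × action × scope × model whitelist × fallback action.
- SettingService.Get/SetOpenAIFastPolicySettings; when the setting is missing, return the built-in default policy (priority is filtered for all models, the whitelist is empty, fallback=pass). Design rationale: service_tier=fast is a user-level switch orthogonal to the model field, so locking the default to specific model slugs would leave a bypass path of "use gpt-4 + fast to pass priority through to the upstream". JSON parse failures no longer fall back silently; slog.Warn records the dirty data so operators can locate it.
- service_tier normalization (trim + ToLower + fast→priority + whitelist of priority/flex) and policy evaluation (evaluateOpenAIFastPolicy) are the single source of truth, shared by HTTP and WS. The pure function evaluateOpenAIFastPolicyWithSettings is extracted and paired with a ctx-bound settings snapshot (withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext); a long-lived WS session prefetches once at entry and reuses the snapshot for every frame, so frames do not each hit settingService.

HTTP entry points (4)
- Chat Completions, Anthropic-compatible (Messages, including the second hit via BetaFastMode→priority), native Responses, and Passthrough Responses all go through applyOpenAIFastPolicyToBody; filter deletes the top-level service_tier with sjson, and block returns a 403 forbidden_error JSON.
- All 4 entry points use the upstream view of the model (the slug after GetMappedModel + normalizeOpenAIModelForUpstream + Codex OAuth normalize), so that chat/messages/native /responses/passthrough do not differ in whitelist hits because their model dimensions differ.
- The pass path also normalizes the client alias "fast" to "priority" and writes it back to the body; otherwise the native /responses and passthrough entry points would forward "fast" verbatim to the upstream and get a 400/rejection (normalizeResponsesBodyServiceTier in the chat-completions entry point already had equivalent behavior).

WebSocket entry points
- Add applyOpenAIFastPolicyToWSResponseCreate: it strictly matches type="response.create" and handles only the top-level service_tier; filter deletes the field with sjson, and block returns a typed *OpenAIFastBlockedError.
- The ingress path calls it inside parseClientPayload; on a block hit it first writes a Realtime-style error event and then returns OpenAIWSClientCloseError (StatusPolicyViolation = 1008), relying on the synchronous flush of the underlying WebSocket Conn.Write to guarantee that the error precedes the close.
- The passthrough path applies the policy to firstClientMessage before RunEntry, and wraps ReadFrame with openAIWSPolicyEnforcingFrameConn to enforce the policy on every client→upstream frame; later frames without a model field fall back to capturedSessionModel. The filter closure also watches the session.model field of session.update / session.created frames and refreshes capturedSessionModel, closing the mid-session bypass "first frame model=gpt-4o (pass) → session.update switches to gpt-5.5 → a response.create without model falls back to gpt-4o".
- Passthrough billing: requestServiceTier is extracted from firstClientMessage after the policy filter has run; when the filter hits, OpenAIForwardResult.ServiceTier reports nil (default tier), consistent with the semantics of the HTTP entry points (reqBody comes from the post-filter map) and WS ingress (payload comes from the post-filter bytes).
- Error event schema: {event_id: "evt_<32hex>", type: "error", error: {type: "forbidden_error", code: "policy_violation", message}}, compatible with the error event parsing of the OpenAI codex client.

Admin / Frontend
- dto.SystemSettings / UpdateSettingsRequest gain an openai_fast_policy_settings field (omitempty), wired into bulk GET/PUT.
- The Gateway tab of the Settings page gains a Fast/Flex Policy form card that configures every field: service_tier × action × scope × model whitelist × fallback action.
- Frontend guard: the openaiFastPolicyLoaded flag permits write-back only when the GET actually returned the field, so a rollout or an error cannot overwrite the default rules with an empty set; the saveSettings write-back loop skips this field and leaves it to dedicated refresh logic; error_message is sent only when action=block, matching the backend's omitempty behavior.

Tests
- HTTP path: openai_fast_policy_test.go covers the default configuration (whitelist=[], priority filtered for all models), block with a custom error, scope distinctions, filter deleting the field, block leaving the body unchanged, block short-circuiting the upstream, Anthropic BetaFastMode triggering the OpenAI fast policy, and similar scenarios.
- WebSocket path: openai_fast_policy_ws_test.go covers helper units (filter / fast→priority normalization / flex passthrough / block typed error / bytes unchanged when service_tier is absent / non-response.create frames untouched / empty-type frames untouched / event_id + code field assertions / tolerance of a non-string service_tier) + a regression for fast-alias normalization on the pass path + ingress end to end (upstream has no service_tier after filter / after block the client receives the error event first and then close 1008, with 0 upstream writes) + passthrough capturedSessionModel fallback cases (established by the first frame under a whitelist policy, fallback hit when model is missing, documentation of the leak when no fallback exists) + a regression for the mid-session bypass in which passthrough session.update / session.created rotates capturedSessionModel + regressions for passthrough billing post-filter ServiceTier and the idempotent filter.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>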
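The normalization and three-state evaluation described above might look roughly like the sketch below. The `Rule` shape, the function names, and the whitelist semantics (an empty whitelist applies to every model; a non-empty whitelist applies the action to listed models and the fallback action to the rest) are assumptions made for illustration; the commit message does not spell out the real OpenAIFastPolicyRule fields. Treating unrecognized tiers as pass is also an assumption.

```go
package main

import "strings"

// Action is the three-state outcome named in the commit message.
type Action string

const (
	ActionPass   Action = "pass"
	ActionFilter Action = "filter"
	ActionBlock  Action = "block"
)

// Rule is a hypothetical, trimmed-down rule shape (scope is omitted).
type Rule struct {
	ServiceTier    string   // "priority" or "flex"
	Action         Action   // applied when the model matches
	ModelWhitelist []string // assumption: empty means "all models"
	FallbackAction Action   // assumption: applied when the whitelist misses
}

// normalizeServiceTier applies trim + ToLower + fast→priority and keeps only
// the two recognized tiers; anything else becomes "".
func normalizeServiceTier(raw string) string {
	tier := strings.ToLower(strings.TrimSpace(raw))
	if tier == "fast" {
		tier = "priority"
	}
	if tier == "priority" || tier == "flex" {
		return tier
	}
	return ""
}

// evaluate returns the action for a request plus the normalized tier.
func evaluate(rules []Rule, rawTier, model string) (Action, string) {
	tier := normalizeServiceTier(rawTier)
	if tier == "" {
		return ActionPass, "" // assumption: unknown tiers are not policed here
	}
	for _, rule := range rules {
		if rule.ServiceTier != tier {
			continue
		}
		if len(rule.ModelWhitelist) == 0 {
			return rule.Action, tier
		}
		for _, allowed := range rule.ModelWhitelist {
			if allowed == model {
				return rule.Action, tier
			}
		}
		return rule.FallbackAction, tier
	}
	return ActionPass, tier
}
```

Under these assumptions the built-in default described above corresponds to a single rule `{ServiceTier: "priority", Action: ActionFilter, FallbackAction: ActionPass}`, which filters priority for every model and lets flex through.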
-
- 27 Apr, 2026 2 commits
-
-
Wesley Liddick authored
fix(anthropic): drop empty Read.pages in responses-to-anthropic tool input
-
Wesley Liddick authored
fix(openai): avoid implicit image sticky sessions
-
- 26 Apr, 2026 7 commits
-
-
gaoren002 authored
-
Cloud370 authored
-
github-actions[bot] authored
-
Wesley Liddick authored
fix(payment): fix the Zpay refund API call
-
Wesley Liddick authored
fix(anthropic): correct Anthropic usage semantics for cache tokens
-
Nobody-Zhang authored
-
shaw authored
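- Fix the root cause of rebates not being credited: a PostgreSQL parameter type inference conflict in tryClaimAffiliateRebateAudit
- Complete invite-code binding for the OAuth registration paths (LinuxDo/OIDC/WeChat/Pending Flow)
- Pass the aff_code parameter from the frontend OAuth registration page
- Add a rebate freeze period: the freeze duration is configurable, and rebates unfreeze automatically on expiry (lazy unfreeze)
- Add a rebate validity period: rebates accrue for N days after binding and stop after that
- Add a per-user rebate cap: the portion above the cap is truncated exactly
- Enhance structured slog logging in the rebate flow to ease troubleshooting
- Add a rebate detail column to the invited-users list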
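The cap and validity-window rules in this list can be illustrated with a small sketch. Everything here is hypothetical: the function names, the use of integer minor units for amounts, Unix seconds for timestamps, and the convention that a non-positive cap or window means "unlimited" are assumptions, not details taken from the commit.

```go
package main

// clampRebate returns how much of a newly computed rebate can still be
// credited for one invitee. Amounts are integer minor units so that the
// truncation at the cap is exact.
func clampRebate(accrued, rebate, limit int64) int64 {
	if limit <= 0 {
		return rebate // assumption: non-positive limit means no cap
	}
	remaining := limit - accrued
	if remaining <= 0 {
		return 0
	}
	if rebate > remaining {
		return remaining // credit only the part that fits under the cap
	}
	return rebate
}

// rebateWindowOpen reports whether a recharge at nowUnix still falls inside
// the N-day validity window that starts when the invitee was bound.
func rebateWindowOpen(boundAtUnix, nowUnix int64, validDays int) bool {
	if validDays <= 0 {
		return true // assumption: non-positive means no expiry
	}
	const secondsPerDay = 24 * 60 * 60
	return nowUnix < boundAtUnix+int64(validDays)*secondsPerDay
}
```

For example, with 900 already accrued against a cap of 1000, a new rebate of 200 would be credited as 100.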
-
- 25 Apr, 2026 22 commits
-
-
deqiying authored
-
shaw authored
PR #1914 unconditionally applied the full mimicry pipeline to all OAuth accounts, including real Claude Code CLI clients. This replaced the client's long system prompt (~10K+ tokens with stable cache_control breakpoints) with a short ~45 token [billing, CC prompt] pair, which falls below Anthropic's 1024-token minimum cacheable prefix threshold. The result: every request created a new cache but never hit an existing one.

Fix: restore the Claude Code client detection gate so that real CC clients bypass body-level mimicry (system rewrite, message cache management, tool name obfuscation). Non-CC third-party clients (opencode, etc.) continue to receive full mimicry.

Also harden the detection logic:
- Make UA regex case-insensitive (align with claude_code_validator.go)
- Validate metadata.user_id format via ParseMetadataUserID() instead of just checking non-empty, preventing third-party tools from spoofing a claude-cli/* UA with an arbitrary user_id string to bypass mimicry
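A rough sketch of the hardened gate follows. The regular expression is an assumed pattern for a `claude-cli/*` User-Agent, not the one in claude_code_validator.go, and the `parseUserID` callback is a hypothetical stand-in for ParseMetadataUserID(), whose accepted format is not described in this log.

```go
package main

import "regexp"

// claudeCLIUserAgent is an assumed pattern; (?i) makes it case-insensitive,
// which is the hardening step named in the commit message.
var claudeCLIUserAgent = regexp.MustCompile(`(?i)^claude-cli/\S+`)

// looksLikeClaudeCode is a hypothetical gate: the UA must match AND the
// metadata.user_id must parse. parseUserID stands in for the repository's
// ParseMetadataUserID(); a non-empty check alone is deliberately not enough.
func looksLikeClaudeCode(userAgent, metadataUserID string, parseUserID func(string) bool) bool {
	if !claudeCLIUserAgent.MatchString(userAgent) {
		return false
	}
	return parseUserID != nil && parseUserID(metadataUserID)
}
```

Only when this gate returns true would body-level mimicry be skipped; any other client would continue through the full pipeline.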
-
shaw authored
Stripe payment routes (/payment/stripe, /payment/stripe-popup) are reached via hard navigation (window.location.href), which caused the router guard to block access before the page could load. Set requiresAuth and requiresPayment to false, consistent with /payment/result. Backend API still enforces authentication.
-
shaw authored
- gofmt: realign AffiliateDetail struct tags in affiliate_service.go
- ineffassign: remove dead seenCompleted assignment before return in account_test_service.go
-
Wesley Liddick authored
fix(openai): tighten responses stream account tests
-
Wesley Liddick authored
fix(openai): keep responses stream alive during pre-output failover
-
shaw authored
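- Add a master switch for invite rebates under "Feature Switches" in system settings, off by default; when off, the menu is hidden, registration ignores aff, and new recharges earn no rebate, but existing quota can still be transferred to balance
- Allow admins to set a dedicated invite code for a given user (overrides the random code, globally unique)
- Allow admins to set a dedicated rebate rate for a given user (overrides the global rate; adjustable one at a time or in bulk)
- Embed a dedicated-user management table (search/edit/bulk/delete) in the invite rebate card in system settings; deletion uses the project's shared ConfirmDialog and also clears the dedicated rate and resets the invite code to a system random code
- Add a "My rebate rate" card and dynamic usage instructions to the /affiliate user page so users can see how much they earn from sharing (computed by the same resolveRebateRatePercent, consistent with actual recharges)
- Add database migration 132, which adds the aff_rebate_rate_percent and aff_code_custom columns
- Add the admin route group /api/v1/admin/affiliates/users/* with 5 endpoints
- Change AffiliateService to depend only on *SettingService, removing the redundant SettingRepository
- Relax invite code format validation to [A-Z0-9_-]{4,32}, compatible with old 12-character system codes and new custom codes
- Add unit and integration tests covering the new methods, conflict paths, and boundary values
-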
gaoren002 authored
-
hungryboy1025 authored
-
github-actions[bot] authored
-
Wesley Liddick authored
fix(openai): fix Responses streaming where events emitted before a failure blocked failover
-
AyeSt0 authored
-
Wesley Liddick authored
fix(openai): bump codex CLI version from 0.104.0 to 0.125.0
-
shaw authored
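Manually port the OpenAI compact capability modeling from vansour/sub2api#1555 onto current main: account-level compact state with auto / force_on / force_off modes, compact-only model mapping, scheduler tier layering (supported > unknown > known unsupported), active compact probing in the admin console, and the corresponding i18n and status badges. Behavior of ordinary /responses traffic is unchanged, and there is no database migration.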
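The scheduler tier layering (supported > unknown > known unsupported) amounts to a stable sort on a three-valued capability. The sketch below only illustrates that ordering; the `Account` and `CompactSupport` types are hypothetical and the real scheduler presumably weighs other factors too.

```go
package main

import "sort"

// CompactSupport is a hypothetical three-valued capability; lower values are
// preferred by the scheduler.
type CompactSupport int

const (
	CompactSupported   CompactSupport = iota // probed and known to work
	CompactUnknown                           // never probed
	CompactUnsupported                       // probed and known to fail
)

// Account is a minimal stand-in for a schedulable upstream account.
type Account struct {
	ID      int64
	Compact CompactSupport
}

// orderForCompact returns a copy sorted by tier. The sort is stable, so the
// incoming order (for example, an existing priority order) is preserved
// inside each tier.
func orderForCompact(accounts []Account) []Account {
	ordered := append([]Account(nil), accounts...)
	sort.SliceStable(ordered, func(i, j int) bool {
		return ordered[i].Compact < ordered[j].Compact
	})
	return ordered
}
```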
-
4fuu authored
The hardcoded codex CLI version (0.104.0) causes upstream rejection when using gpt-5.5 with compact, as the server treats the request as an outdated client and returns 400/502. Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion to 0.125.0 to match the current Codex CLI release. Fixes #1933, #1887, #1865 Related: #1609, #1298, #849
-
Wesley Liddick authored
[codex] reconcile OpenAI admin test rate-limit state
-
shaw authored
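VISIBLE_METHOD_ALIASES was missing stripe, so getVisibleMethods filtered out the stripe method returned by the backend. Omit the method query parameter when the Stripe button is clicked so that the landing page renders the full Payment Element.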
-
shaw authored
-
Wesley Liddick authored
fix(apicompat): recognize web_search_20250305 / google_search in Responses→Anthropic tool conversion
-
shaw authored
- staticcheck QF1001: apply De Morgan's law to the OAuth-mimic header passthrough guard (`!(a && b)` → `a != ... || !b`).
- unused: drop `isClaudeCodeRequest`, which became dead after PR #1914 switched both `/v1/messages` and `/count_tokens` paths to unconditional `account.IsOAuth()` mimicry. The lowercase helper `isClaudeCodeClient` is kept (still referenced by `TestIsClaudeCodeClient`).
-
Wesley Liddick authored
fix(claude): align Claude Code OAuth mimicry with real CLI traffic
-
shaw authored
- Drop SetAffiliateService setters and ProvideAuthService / ProvidePaymentService / ProvideUserHandler wrappers in favor of direct Wire constructor injection. AffiliateService has no back-edge to Auth/Payment/User, so the indirection was never required.
- Change RegisterWithVerification's variadic affiliateCode to a fixed parameter; adjust all call sites.
- Validate aff_code length and charset in BindInviterByCode before any DB lookup, eliminating the timing side channel and useless DB roundtrips on malformed input.
- Make affiliate cache invalidation synchronous; surface Redis errors via the project logger instead of swallowing them in a detached goroutine.
- Add an integration test guarding cross-layer tx propagation in AccrueQuota and a unit test pinning the aff_code format rules.
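The "validate before any DB lookup" step can be sketched as below. This is an illustration only: `bindInviterByCode`, `inviterLookup`, and `ErrInvalidAffCode` are hypothetical names, and the pattern used is the `[A-Z0-9_-]{4,32}` rule quoted in a later commit in this log, which may differ from the rule in force when this commit landed.

```go
package main

import (
	"errors"
	"regexp"
)

// affCodePattern uses the format quoted elsewhere in this log; treat it as an
// assumption for this commit.
var affCodePattern = regexp.MustCompile(`^[A-Z0-9_-]{4,32}$`)

// ErrInvalidAffCode is a hypothetical sentinel for malformed codes.
var ErrInvalidAffCode = errors.New("invalid affiliate code format")

// inviterLookup stands in for the database query that resolves a code.
type inviterLookup func(code string) (inviterID int64, err error)

// bindInviterByCode rejects malformed input before touching the database, so
// malformed codes cost no roundtrip and all fail on the same cheap path.
func bindInviterByCode(code string, lookup inviterLookup) (int64, error) {
	if !affCodePattern.MatchString(code) {
		return 0, ErrInvalidAffCode
	}
	return lookup(code)
}
```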
-
- 24 Apr, 2026 6 commits
-
-
Wuxie233 authored
fix(apicompat): recognize web_search_20250305 / google_search in Responses to Anthropic tool conversion
-
keh4l authored
Root cause of persistent third-party detection: sub2api's buildUpstreamRequest transparently forwards client headers via allowedHeaders whitelist (addHeaderRaw) before applying mimicry overrides. When third-party clients (opencode, etc.) send their own anthropic-beta / user-agent / x-stainless-* / x-claude-code-session-id values, these get appended to the request alongside our injected headers, creating an inconsistent header set that Anthropic detects.

Parrot's build_upstream_headers constructs exactly 9 headers from scratch and never forwards anything from the client. This is why 'same opencode version, some users work some don't': different opencode configs/versions send different header combinations.

Fix: when tokenType=oauth and mimicClaudeCode=true, skip the client header passthrough loop entirely. The subsequent applyClaudeCodeMimicHeaders + ApplyFingerprint + beta merge pipeline constructs all necessary headers from our controlled values.

Also: remove systemIncludesClaudeCodePrompt gate. OAuth accounts now unconditionally rewrite system (even if client already sent a Claude Code-style prompt), ensuring billing attribution block is always present.
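The fix can be pictured with the following sketch, which skips the passthrough loop when OAuth mimicry is active and otherwise lets controlled values replace forwarded ones. `buildUpstreamHeaders` and its parameters are hypothetical; the real buildUpstreamRequest, allowedHeaders, and applyClaudeCodeMimicHeaders are not reproduced here.

```go
package main

import "net/http"

// buildUpstreamHeaders is a hypothetical helper. client holds the incoming
// request headers, allowed is the passthrough whitelist, and controlled holds
// the headers the gateway generates itself.
func buildUpstreamHeaders(client http.Header, allowed []string, controlled http.Header, oauthMimic bool) http.Header {
	out := http.Header{}
	if !oauthMimic {
		// Normal path: forward whitelisted client headers.
		for _, name := range allowed {
			for _, value := range client.Values(name) {
				out.Add(name, value)
			}
		}
	}
	// Controlled values replace (never append to) anything forwarded, so the
	// final set cannot mix client and gateway values for the same header.
	for name, values := range controlled {
		out[http.CanonicalHeaderKey(name)] = append([]string(nil), values...)
	}
	return out
}
```

With `oauthMimic` set, the result contains only gateway-built headers, which is the "construct from scratch, forward nothing" behavior the commit attributes to Parrot.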
-
keh4l authored
Before: isClaudeCodeRequest() checked whether the client looks like a real Claude Code CLI (UA, system prompt, X-App header, metadata format). If it looked like Claude Code, all mimicry was skipped, on the assumption that a real CLI needs no help.

Problem: third-party tools like opencode partially impersonate Claude Code (sending claude-cli UA + claude-code beta + CC system prompt) but miss critical details (billing attribution block, tool-name obfuscation, cache breakpoints, full beta set). Some users' opencode instances pass the isClaudeCodeRequest check, causing sub2api to skip mimicry entirely, while Anthropic still detects the request as third-party. This explains why 'same opencode version, some users work, some don't': it depends on which opencode features/config trigger the validator.

Fix: OAuth accounts now unconditionally run the full mimicry pipeline, matching Parrot's behavior (Parrot never checks client identity). This is safe because our mimicry is strictly more complete than any third-party client's partial impersonation.

Changed:
- /v1/messages path: remove isClaudeCode gate
- /v1/messages/count_tokens path: same
-
keh4l authored
The previous commit only wired stripMessageCacheControl, addMessageCacheBreakpoints, and tool-name obfuscation into applyClaudeCodeOAuthMimicryToBody (used by /chat/completions and /responses). The native /v1/messages path and count_tokens path have their own independent mimicry code blocks and were missed.

Now all three entry points share the same D/E/F pipeline:
- /v1/messages (gateway_service.go forwardAnthropic)
- /v1/messages/count_tokens (gateway_service.go countTokens)
- OpenAI compat (applyClaudeCodeOAuthMimicryToBody)
-
keh4l authored
Implements the remaining three parity items with Parrot cc_mimicry:

D) Tool-name obfuscation
- Dynamic mapping when tools.length > 5 (matches Parrot threshold). Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00'). Go port of random.Random(hash(tuple(names))) uses fnv64a seed + math/rand; byte-exact reproduction is impossible (Python hash vs Go hash), but the two invariants that matter are preserved:
  * same input tool_names yield identical mapping (cache hit)
  * prefix pool is shuffled (names look distributed)
- Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_) applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim.
- Server tools (web_search_20250305, computer_*, etc.) are NOT renamed; only type=='function' and type=='custom' tools are.
- tool_choice.name is rewritten in sync (only when type=='tool').
- Response side: bytes-level replace on every SSE chunk / JSON body at 6 injection points (standard stream/non-stream, passthrough stream/non-stream, chat_completions stream + non-stream, responses stream + non-stream). Reverse mapping applied longest-fake-name-first to prevent substring conflicts (parity with Parrot _restore_tool_names_in_chunk).
- tool_choice is no longer unconditionally deleted in normalizeClaudeOAuthRequestBody, since Parrot passes it through.

E) tools[-1] cache_control breakpoint
- Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when the last tool has no cache_control. Client-provided ttl is passed through unchanged (repo-wide policy).

F) messages cache_control strategy
- stripMessageCacheControl removes every client-provided messages[*].content[*].cache_control (multi-turn stability).
- addMessageCacheBreakpoints then injects two stable breakpoints: (1) last message, and (2) second-to-last user turn when messages.length >= 4.
- Combined with the system block breakpoint and tools[-1] breakpoint, this gives exactly the 4 breakpoints Anthropic allows per request.
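Item D can be illustrated with a minimal sketch of the two invariants and the longest-first restore. It is not the code in gateway_tool_rewrite.go: the prefix pool is made up, the static prefix fallback and server-tool filtering are omitted, and the functions here only share their intent with the real buildDynamicToolMap and restoreToolNamesInBytes.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
	"sort"
	"strings"
)

// prefixPool is a made-up pool for illustration only.
var prefixPool = []string{"manage_", "fetch_", "query_", "update_"}

// buildDynamicToolMap maps real tool names to fake ones. It returns nil at or
// below the 5-tool threshold. Seeding math/rand from an FNV-64a hash of the
// names makes the mapping a pure function of the input.
func buildDynamicToolMap(names []string) map[string]string {
	if len(names) <= 5 {
		return nil
	}
	hasher := fnv.New64a()
	for _, name := range names {
		hasher.Write([]byte(name))
		hasher.Write([]byte{0}) // separator so ["ab","c"] differs from ["a","bc"]
	}
	rng := rand.New(rand.NewSource(int64(hasher.Sum64())))
	prefixes := append([]string(nil), prefixPool...)
	rng.Shuffle(len(prefixes), func(i, j int) {
		prefixes[i], prefixes[j] = prefixes[j], prefixes[i]
	})
	mapping := make(map[string]string, len(names))
	for i, name := range names {
		head := name
		if len(head) > 3 {
			head = head[:3] // byte slice; assumes ASCII tool names
		}
		mapping[name] = fmt.Sprintf("%s%s%02d", prefixes[i%len(prefixes)], head, i)
	}
	return mapping
}

// restoreToolNames reverses the mapping in a response body, replacing the
// longest fake names first so a short fake name never clobbers part of a
// longer one.
func restoreToolNames(body string, mapping map[string]string) string {
	type pair struct{ fake, orig string }
	pairs := make([]pair, 0, len(mapping))
	for orig, fake := range mapping {
		pairs = append(pairs, pair{fake: fake, orig: orig})
	}
	sort.Slice(pairs, func(i, j int) bool {
		if len(pairs[i].fake) != len(pairs[j].fake) {
			return len(pairs[i].fake) > len(pairs[j].fake)
		}
		return pairs[i].fake < pairs[j].fake
	})
	for _, p := range pairs {
		body = strings.ReplaceAll(body, p.fake, p.orig)
	}
	return body
}
```

Determinism matters here because the obfuscated tool names are part of the cached prefix: if the same tool list produced different fake names on each request, the prompt cache would never hit.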
Non-trivial implementation details to be aware of when rebasing:
* Two new files, no upstream collision: gateway_tool_rewrite.go (D + E algorithms) and gateway_messages_cache.go (F strip + breakpoints)
* Two new feature calls bolted onto the tail of applyClaudeCodeOAuthMimicryToBody in gateway_service.go; rebase conflicts will be ~10 lines maximum.
* Response-side injection points all wrap their existing write with reverseToolNamesIfPresent(c, ...), preserving original behavior when no mapping is stored (static prefix rollback still runs).
* Non-stream chat/responses switched from c.JSON to json.Marshal + c.Data so bytes-level replace is possible.
* Retry bodies (FilterThinkingBlocksForRetry, FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget) only prune blocks; they preserve the already-obfuscated tool names, so no extra mapping re-application is needed.

Manual QA: end-to-end scenario verified with 6 tools (above threshold) and tool_choice.type=='tool'. Obfuscation + restore roundtrip shown in test logs; then removed the temp test file.

Tests (16 new):
- buildDynamicToolMap stability + below-threshold guard
- sanitizeToolName precedence (dynamic > static)
- restoreToolNamesInBytes longest-first + static rollback
- applyToolNameRewriteToBody skips server tools + syncs tool_choice
- applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl
- stripMessageCacheControl + addMessageCacheBreakpoints in the 1/4/string-content cases + second-to-last user turn selection
- buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length
- fake name shape follows Parrot {prefix}{head3}{i:02d}
-
keh4l authored
Three field-level alignments in normalizeClaudeOAuthRequestBody to match real Claude Code CLI traffic byte-for-byte:

1. temperature: previously deleted unconditionally; now passes through client value, defaults to 1 when absent (real CLI always sends temperature, default 1).
2. max_tokens: defaults to 128000 when absent (real CLI default).
3. context_management: when thinking.type is enabled/adaptive and the client did not provide context_management, inject {"edits":[{"type":"clear_thinking_20251015","keep":"all"}]} to mirror real CLI behavior.

tool_choice removal is unchanged (Claude Code OAuth credentials do not allow client-supplied tool_choice).

Tests updated:
- gateway_body_order_test.go: temperature/max_tokens are now expected in output; tool_choice still removed.
- gateway_prompt_test.go: system array is now 2 blocks (billing + cc prompt), assertions adjusted.
- gateway_anthropic_apikey_passthrough_test.go: same 2-block assertion.
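The three defaults plus the unchanged tool_choice removal can be sketched on a decoded request body as follows. `normalizeOAuthFields` is a hypothetical function operating on a `map[string]any`; the real normalizeClaudeOAuthRequestBody evidently preserves field order (there is a body-order test), which a plain map does not, so this shows only the field logic.

```go
package main

// normalizeOAuthFields applies the field rules from the commit message to a
// decoded JSON body, in place. Hypothetical helper; key order is not
// preserved by a Go map.
func normalizeOAuthFields(body map[string]any) {
	// 1. temperature: keep the client's value, default to 1 when absent.
	if _, ok := body["temperature"]; !ok {
		body["temperature"] = 1
	}
	// 2. max_tokens: default to 128000 when absent.
	if _, ok := body["max_tokens"]; !ok {
		body["max_tokens"] = 128000
	}
	// 3. context_management: inject only when thinking is enabled/adaptive
	// and the client did not send its own value.
	if thinking, ok := body["thinking"].(map[string]any); ok {
		kind, _ := thinking["type"].(string)
		_, provided := body["context_management"]
		if (kind == "enabled" || kind == "adaptive") && !provided {
			body["context_management"] = map[string]any{
				"edits": []any{
					map[string]any{"type": "clear_thinking_20251015", "keep": "all"},
				},
			}
		}
	}
	// tool_choice removal is unchanged as of this commit.
	delete(body, "tool_choice")
}
```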
-