1. 28 Apr, 2026 2 commits
    • Oganneson's avatar
      fix(openai): drop reasoning items from /v1/responses input on OAuth path · 7452fad8
      Oganneson authored
      Closes #1957
      
      The OAuth path forwards client requests to chatgpt.com/backend-api/codex/responses,
      where applyCodexOAuthTransform forces store=false (chatgpt.com's codex backend
      rejects store=true). Reasoning items emitted under store=false are NEVER
      persisted upstream, so any rs_* reference that a client carries forward in a
      subsequent input[] array triggers a guaranteed upstream 404:
      
          Item with id 'rs_...' not found. Items are not persisted when `store` is
          set to false. Try again with `store` set to true, or remove this item
          from your input.
      
      sub2api wraps this as 502 "Upstream request failed" and the conversation
      breaks on every multi-turn /v1/responses request that uses reasoning + tools
      (reproducible with gpt-5.5; gpt-5.4 happens to dodge it because the upstream
      does not emit reasoning items for that model).
      
      Affected clients include any that follow the OpenAI Responses API spec and
      replay prior assistant items verbatim — in practice this hit OpenClaw and
      similar agent harnesses on every turn ≥2 with tool use.
      
      The fix: in filterCodexInput, drop input items with type == "reasoning"
      entirely. The model never reads reasoning summary text from input (only
      encrypted_content can carry reasoning context across turns, and chatgpt.com
      under store=false does not emit it), so this is a no-op for the model itself
      and a clean removal of unreachable upstream lookups.
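
      The drop is a one-pass filter over the decoded input array. A minimal
      Go sketch (dropReasoningItems and the map-based decoding are
      illustrative; the repo's filterCodexInput signature differs):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// dropReasoningItems removes every input[] item whose type is "reasoning",
// so no rs_* lookup ever reaches the store=false upstream. Under store=false
// those items were never persisted, so dropping them is a no-op for the model.
func dropReasoningItems(input []map[string]any) []map[string]any {
	out := input[:0:0] // fresh slice; leaves the caller's slice untouched
	for _, item := range input {
		if t, _ := item["type"].(string); t == "reasoning" {
			continue
		}
		out = append(out, item)
	}
	return out
}

func main() {
	raw := `[{"type":"message","role":"user"},
	         {"type":"reasoning","id":"rs_abc"},
	         {"type":"function_call","call_id":"fc_1"}]`
	var input []map[string]any
	if err := json.Unmarshal([]byte(raw), &input); err != nil {
		panic(err)
	}
	fmt.Println(len(dropReasoningItems(input))) // 2
}
```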
      
      Scope is intentionally narrow:
        * Only OAuth account requests (account.Type == AccountTypeOAuth) reach
          applyCodexOAuthTransform / filterCodexInput.
        * API-key accounts going to api.openai.com/v1/responses are unaffected
          (store=true works there, rs_* persists, multi-turn already works).
        * Anthropic / Gemini platform groups go through different transforms and
          are unaffected.
        * /v1/chat/completions is unaffected (no reasoning items).
        * item_reference items (different type) are unaffected — only type ==
          "reasoning" is dropped.
      
      Verification:
        * Existing tests pass: go test ./internal/service/ -run 'Codex|Tool|OAuth'
        * New regression test asserts reasoning items are dropped under both
          preserveReferences=true and preserveReferences=false.
        * End-to-end repro on gpt-5.5 multi-turn + tools: pre-patch 502, post-patch
          200. Repro on gpt-5.4 unchanged. Three-turn deep loop on gpt-5.5 passes.
      7452fad8
    • DaydreamCoding's avatar
      feat(openai): full OpenAI Fast/Flex Policy implementation (HTTP + WebSocket + Admin) · 30f55a1f
      DaydreamCoding authored
      
      
      Mirroring the fast-mode filtering implemented for Claude BetaPolicy, add a
      three-state pass / filter / block policy for the OpenAI upstream
      service_tier field (priority / flex, including client-side "fast" →
      "priority" normalization), covering every OpenAI entry point plus the
      admin configuration entry.
      
      Backend core
      - Add the SettingKeyOpenAIFastPolicySettings, OpenAIFastPolicyRule, and
        OpenAIFastPolicySettings config models; rules span the service_tier ×
        action × scope × model whitelist × fallback action dimensions.
      - SettingService.Get/SetOpenAIFastPolicySettings; when the setting is
        absent, a built-in default policy is returned (priority filtered for
        all models, empty whitelist, fallback=pass). Rationale:
        service_tier=fast is a user-level switch orthogonal to the model
        field, so locking the default to specific model slugs would leave a
        bypass path of "use gpt-4 + fast to push priority through to the
        upstream". JSON parse failures no longer fall back silently;
        slog.Warn logs the dirty data so operators can track it down.
      - service_tier normalization (trim + ToLower + fast→priority + a
        priority/flex whitelist) and policy evaluation
        (evaluateOpenAIFastPolicy) are the single source of truth, shared by
        HTTP and WS. The pure function evaluateOpenAIFastPolicyWithSettings
        is extracted and paired with a ctx-bound settings snapshot
        (withOpenAIFastPolicyContext / openAIFastPolicySettingsFromContext):
        the WS long-session entry prefetches once and every frame reuses the
        snapshot, avoiding a settingService hit per frame.
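
      The shared normalization step can be sketched as a small pure function
      (normalizeServiceTier is an illustrative name; the repo folds this into
      evaluateOpenAIFastPolicy):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeServiceTier sketches the shared normalization: trim, lower-case,
// map the client "fast" alias to "priority", and only pass the whitelisted
// tiers (priority / flex). Anything else normalizes to "" (upstream default
// tier). The function name is illustrative.
func normalizeServiceTier(raw string) string {
	v := strings.ToLower(strings.TrimSpace(raw))
	if v == "fast" {
		v = "priority" // client-side alias accepted by the gateway
	}
	switch v {
	case "priority", "flex":
		return v
	}
	return "" // unknown / absent: fall back to the default tier
}

func main() {
	fmt.Println(normalizeServiceTier(" Fast "))     // priority
	fmt.Println(normalizeServiceTier("auto") == "") // true
}
```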
      
      HTTP entry points (4)
      - Chat Completions, Anthropic-compatible Messages (including the second
        hit via BetaFastMode→priority), native Responses, and Passthrough
        Responses all go through applyOpenAIFastPolicyToBody; filter deletes
        the top-level service_tier via sjson, block returns a 403
        forbidden_error JSON.
      - All 4 entry points evaluate the whitelist against the upstream-view
        model (the slug after GetMappedModel + normalizeOpenAIModelForUpstream
        + Codex OAuth normalize), so chat / messages / native /responses /
        passthrough cannot diverge on whitelist hits just because their model
        dimensions differ.
      - On the pass path, the client's "fast" alias is also normalized to
        "priority" and written back into the body; otherwise the native
        /responses and passthrough entries would forward "fast" verbatim to
        the upstream and get 400s/rejections (the chat-completions entry's
        normalizeResponsesBodyServiceTier already behaved this way).
      
      WebSocket entry points
      - New applyOpenAIFastPolicyToWSResponseCreate: strictly matches
        type="response.create" and only touches the top-level service_tier;
        filter deletes the field via sjson, block returns a typed
        *OpenAIFastBlockedError.
      - The ingress path calls it inside parseClientPayload; on a block hit
        it first writes a Realtime-style error event, then returns
        OpenAIWSClientCloseError (StatusPolicyViolation=1008), relying on the
        synchronous flush of the underlying WebSocket Conn.Write to guarantee
        the error arrives before the close.
      - The passthrough path applies the policy to firstClientMessage before
        RunEntry, and wraps ReadFrame in openAIWSPolicyEnforcingFrameConn to
        enforce the policy on every client→upstream frame; later frames
        without a model field fall back to capturedSessionModel. The filter
        closure also watches session.update / session.created frames and
        refreshes capturedSessionModel from session.model, closing the
        mid-session bypass "first frame model=gpt-4o (pass) → session.update
        switches to gpt-5.5 → a model-less response.create falls back to
        gpt-4o".
      - Passthrough billing: requestServiceTier is extracted from
        firstClientMessage only after the policy filter, so on a filter hit
        OpenAIForwardResult.ServiceTier reports nil (default tier),
        consistent with the HTTP entries (reqBody comes from the post-filter
        map) and WS ingress (payload comes from the post-filter bytes).
      - Error event schema: {event_id: "evt_<32hex>", type: "error",
        error: {type: "forbidden_error", code: "policy_violation", message}},
        compatible with the OpenAI codex client's error-event parsing.
      
      Admin / Frontend
      - dto.SystemSettings / UpdateSettingsRequest gain an
        openai_fast_policy_settings field (omitempty), wired into bulk
        GET/PUT.
      - The Settings page's Gateway tab gains a Fast/Flex Policy form card
        covering every field: service_tier × action × scope × model whitelist
        × fallback action.
      - Frontend guard: the openaiFastPolicyLoaded flag only allows
        write-back once a GET has actually returned the field, so a rollout
        or error cannot overwrite the default rules with an empty set; the
        saveSettings write-back loop skips this field, leaving it to
        dedicated refresh logic; error_message is only sent when
        action=block, matching the backend's omitempty behavior.
      
      Tests
      - HTTP path: openai_fast_policy_test.go covers the default config
        (whitelist=[], priority filtered for all models) / block with a
        custom error / scope distinctions / filter deleting the field /
        block leaving the body untouched / block short-circuiting the
        upstream / Anthropic BetaFastMode triggering the OpenAI fast policy.
      - WebSocket path: openai_fast_policy_ws_test.go covers
        helper units (filter / fast→priority normalization / flex
        passthrough / typed block error / bytes unchanged without
        service_tier / non-response.create frames untouched / empty-type
        frames untouched / event_id+code field assertions / non-string
        service_tier tolerance) +
        a pass-path fast-alias normalization regression +
        ingress end-to-end (no service_tier reaches the upstream after
        filter / on block the client receives the error event, then close
        1008, with zero upstream writes) +
        passthrough capturedSessionModel fallback cases (established on the
        first frame under a whitelist policy, fallback hit when model is
        missing, documented leak when no fallback exists) +
        the mid-session bypass regression where passthrough session.update /
        session.created rotates capturedSessionModel +
        passthrough billing post-filter ServiceTier and idempotent-filter
        regressions.
      Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
      30f55a1f
  2. 26 Apr, 2026 4 commits
  3. 25 Apr, 2026 11 commits
    • deqiying's avatar
      b17704d6
    • shaw's avatar
      fix(gateway): skip body mimicry for real Claude Code clients to restore prompt caching · 496469ac
      shaw authored
      PR #1914 unconditionally applied the full mimicry pipeline to all OAuth
      accounts, including real Claude Code CLI clients. This replaced the
      client's long system prompt (~10K+ tokens with stable cache_control
      breakpoints) with a short ~45 token [billing, CC prompt] pair, which
      falls below Anthropic's 1024-token minimum cacheable prefix threshold.
      The result: every request created a new cache but never hit an existing
      one.
      
      Fix: restore the Claude Code client detection gate so that real CC
      clients bypass body-level mimicry (system rewrite, message cache
      management, tool name obfuscation). Non-CC third-party clients
      (opencode, etc.) continue to receive full mimicry.
      
      Also harden the detection logic:
      - Make UA regex case-insensitive (align with claude_code_validator.go)
      - Validate metadata.user_id format via ParseMetadataUserID() instead of
        just checking non-empty, preventing third-party tools from spoofing
        a claude-cli/* UA with an arbitrary user_id string to bypass mimicry
      496469ac
    • shaw's avatar
      style: fix gofmt and ineffassign lint errors · 3af9940b
      shaw authored
      - gofmt: realign AffiliateDetail struct tags in affiliate_service.go
      - ineffassign: remove dead seenCompleted assignment before return in account_test_service.go
      3af9940b
    • shaw's avatar
      feat(affiliate): add feature toggle and per-user custom invite settings · 4e1bb2b4
      shaw authored
      - Add a global invite-rebate toggle under System Settings "Feature
        Toggles", off by default; while off, the menu is hidden, registration
        ignores aff, and new top-ups earn no rebate, but existing quota can
        still be converted to balance
      - Admins can assign a user a dedicated invite code (overrides the
        random code, globally unique)
      - Admins can assign a user a dedicated rebate rate (overrides the
        global rate, adjustable per row or in bulk)
      - Embed a dedicated-user management table (search/edit/bulk/delete)
        inside the System Settings invite-rebate card; deletion uses the
        project's shared ConfirmDialog and also clears the dedicated rate
        and resets the invite code to a system-random one
      - The /affiliate user page gains a "My rebate rate" card and dynamic
        usage notes so users can see exactly how much sharing earns them
        (computed via the same resolveRebateRatePercent used for actual
        top-ups)
      - Add database migration 132 with the aff_rebate_rate_percent and
        aff_code_custom columns
      - Add the admin route group /api/v1/admin/affiliates/users/* with 5
        endpoints
      - AffiliateService now depends only on *SettingService, dropping the
        redundant SettingRepository
      - Relax invite-code validation to [A-Z0-9_-]{4,32}, compatible with
        both legacy 12-char system codes and new custom codes
      - Add unit and integration tests covering the new methods, conflict
        paths, and boundary values
      4e1bb2b4
    • gaoren002's avatar
    • hungryboy1025's avatar
      8987e0ba
    • AyeSt0's avatar
      5b63a9b0
    • shaw's avatar
      feat(openai): port /responses/compact account support flow (PR #1555) · 095f457c
      shaw authored
      Manually port the OpenAI compact capability modeling from
      vansour/sub2api#1555 onto the current main: account-level compact
      state with auto / force_on / force_off modes, compact-only model
      mapping, scheduler tier layering (supported > unknown > known
      unsupported), active compact probing in the admin console, and the
      matching i18n/status badges. Regular /responses traffic is unchanged;
      no database migration.
      095f457c
    • 4fuu's avatar
      fix(openai): bump codex CLI version from 0.104.0 to 0.125.0 · 1e57e88e
      4fuu authored
      The hardcoded codex CLI version (0.104.0) causes upstream rejection
      when using gpt-5.5 with compact, as the server treats the request
      as an outdated client and returns 400/502.
      
      Update codexCLIVersion, codexCLIUserAgent, and openAICodexProbeVersion
      to 0.125.0 to match the current Codex CLI release.
      
      Fixes #1933, #1887, #1865
      Related: #1609, #1298, #849
      1e57e88e
    • shaw's avatar
      chore(gateway): fix lint issues from cc-mimicry-parity merge · 732d6495
      shaw authored
      - staticcheck QF1001: apply De Morgan's law to the OAuth-mimic header
        passthrough guard (`!(a && b)` → `a != ... || !b`).
      - unused: drop `isClaudeCodeRequest`, which became dead after PR #1914
        switched both `/v1/messages` and `/count_tokens` paths to unconditional
        `account.IsOAuth()` mimicry. The lowercase helper `isClaudeCodeClient`
        is kept (still referenced by `TestIsClaudeCodeClient`).
      732d6495
    • shaw's avatar
      refactor(affiliate): tighten DI and harden inviter code validation · aa8ee33b
      shaw authored
      - Drop SetAffiliateService setters and ProvideAuthService /
        ProvidePaymentService / ProvideUserHandler wrappers in favor of direct
        Wire constructor injection. AffiliateService has no back-edge to
        Auth/Payment/User, so the indirection was never required.
      - Change RegisterWithVerification's variadic affiliateCode to a fixed
        parameter; adjust all call sites.
      - Validate aff_code length and charset in BindInviterByCode before any
        DB lookup, eliminating timing-side-channel and useless DB roundtrips
        on malformed input.
      - Make affiliate cache invalidation synchronous; surface Redis errors
        via the project logger instead of swallowing them in a detached
        goroutine.
      - Add an integration test guarding cross-layer tx propagation in
        AccrueQuota and a unit test pinning the aff_code format rules.
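
      A hedged sketch of the pre-lookup guard (validateAffCode is an
      illustrative name; the [A-Z0-9_-]{4,32} format matches the relaxed
      rule introduced in 4e1bb2b4):

```go
package main

import (
	"fmt"
	"regexp"
)

// affCodeRE pins the invite-code format from the affiliate feature
// (uppercase alphanumerics plus _ and -, length 4..32). Rejecting
// malformed codes before any DB lookup removes the useless roundtrip
// and the timing side channel the commit describes. The function name
// is illustrative; the repo does this inside BindInviterByCode.
var affCodeRE = regexp.MustCompile(`^[A-Z0-9_-]{4,32}$`)

func validateAffCode(code string) bool {
	return affCodeRE.MatchString(code)
}

func main() {
	fmt.Println(validateAffCode("ABCD1234EFGH")) // true: legacy 12-char shape
	fmt.Println(validateAffCode("abc"))          // false: lowercase, too short
}
```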
      aa8ee33b
  4. 24 Apr, 2026 21 commits
    • Wuxie233's avatar
      fix(apicompat): recognize web_search_20250305 / google_search in Responses to... · 5f630fbb
      Wuxie233 authored
      fix(apicompat): recognize web_search_20250305 / google_search in Responses to Anthropic tool conversion
      5f630fbb
    • keh4l's avatar
      fix(gateway): skip client header passthrough on OAuth mimicry path · bdbd2916
      keh4l authored
      Root cause of persistent third-party detection: sub2api's
      buildUpstreamRequest transparently forwards client headers via
      allowedHeaders whitelist (addHeaderRaw) before applying mimicry
      overrides. When third-party clients (opencode, etc.) send their own
      anthropic-beta / user-agent / x-stainless-* / x-claude-code-session-id
      values, these get appended to the request alongside our injected
      headers, creating an inconsistent header set that Anthropic detects.
      
      Parrot's build_upstream_headers constructs exactly 9 headers from
      scratch and never forwards anything from the client. This is why
      'same opencode version, some users work some don't' — different
      opencode configs/versions send different header combinations.
      
      Fix: when tokenType=oauth and mimicClaudeCode=true, skip the
      client header passthrough loop entirely. The subsequent
      applyClaudeCodeMimicHeaders + ApplyFingerprint + beta merge
      pipeline constructs all necessary headers from our controlled values.
      
      Also: remove systemIncludesClaudeCodePrompt gate — OAuth accounts
      now unconditionally rewrite system (even if client already sent a
      Claude Code-style prompt), ensuring billing attribution block is
      always present.
      bdbd2916
    • keh4l's avatar
      fix(gateway): always apply full mimicry for OAuth accounts regardless of client identity · 6dc89765
      keh4l authored
      Before: isClaudeCodeRequest() checked whether the client looks like a
      real Claude Code CLI (UA, system prompt, X-App header, metadata format).
      If it looked like Claude Code, all mimicry was skipped — the assumption
      being that a real CLI needs no help.
      
      Problem: third-party tools like opencode partially impersonate Claude
      Code (sending claude-cli UA + claude-code beta + CC system prompt) but
      miss critical details (billing attribution block, tool-name obfuscation,
      cache breakpoints, full beta set). Some users' opencode instances pass
      the isClaudeCodeRequest check, causing sub2api to skip mimicry entirely,
      while Anthropic still detects the request as third-party.
      
      This explains why 'same opencode version, some users work, some don't'
      — it depends on which opencode features/config trigger the validator.
      
      Fix: OAuth accounts now unconditionally run the full mimicry pipeline,
      matching Parrot's behavior (Parrot never checks client identity).
      This is safe because our mimicry is strictly more complete than any
      third-party client's partial impersonation.
      
      Changed:
        - /v1/messages path: remove isClaudeCode gate
        - /v1/messages/count_tokens path: same
      6dc89765
    • keh4l's avatar
      fix(gateway): apply D/E/F mimicry to native /v1/messages and count_tokens paths · f3233db0
      keh4l authored
      The previous commit only wired stripMessageCacheControl,
      addMessageCacheBreakpoints, and tool-name obfuscation into
      applyClaudeCodeOAuthMimicryToBody (used by /chat/completions and
      /responses). The native /v1/messages path and count_tokens path
      have their own independent mimicry code blocks and were missed.
      
      Now all three entry points share the same D/E/F pipeline:
        - /v1/messages (gateway_service.go forwardAnthropic)
        - /v1/messages/count_tokens (gateway_service.go countTokens)
        - OpenAI compat (applyClaudeCodeOAuthMimicryToBody)
      f3233db0
    • keh4l's avatar
      feat(gateway): port Parrot tool-name obfuscation + message cache breakpoints · 6e12578b
      keh4l authored
      Implements the remaining three parity items with Parrot cc_mimicry:
      
        D) Tool-name obfuscation
           - Dynamic mapping when tools.length > 5 (matches Parrot threshold).
             Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00').
             Go port of random.Random(hash(tuple(names))) uses fnv64a seed +
             math/rand; byte-exact reproduction is impossible (Python hash vs
             Go hash), but the two invariants that matter are preserved:
               * same input tool_names yield identical mapping (cache hit)
               * prefix pool is shuffled (names look distributed)
           - Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_)
             applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim.
           - Server tools (web_search_20250305, computer_*, etc.) are NOT
             renamed; only type=='function' and type=='custom' tools are.
           - tool_choice.name is rewritten in sync (only when type=='tool').
           - Response side: bytes-level replace on every SSE chunk / JSON
             body at 6 injection points (standard stream/non-stream,
             passthrough stream/non-stream, chat_completions stream +
             non-stream, responses stream + non-stream). Reverse mapping
             applied longest-fake-name-first to prevent substring conflicts
             (parity with Parrot _restore_tool_names_in_chunk).
           - tool_choice is no longer unconditionally deleted in
             normalizeClaudeOAuthRequestBody — Parrot passes it through.
      
        E) tools[-1] cache_control breakpoint
           - Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when
             the last tool has no cache_control. Client-provided ttl is
             passed through unchanged (repo-wide policy).
      
        F) messages cache_control strategy
           - stripMessageCacheControl removes every client-provided
             messages[*].content[*].cache_control (multi-turn stability).
           - addMessageCacheBreakpoints then injects two stable breakpoints:
             (1) last message, and (2) second-to-last user turn when
             messages.length >= 4.
           - Combined with the system block breakpoint and tools[-1]
             breakpoint, this gives exactly the 4 breakpoints Anthropic
             allows per request.
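
      The deterministic mapping in (D) can be sketched as follows. The
      prefix pool and function name here are illustrative placeholders, and
      (as the commit notes) the shuffle is only reproducible within Go, not
      byte-exact with Parrot's Python:

```go
package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

// buildFakeToolNames sketches the deterministic mapping from (D). The
// prefix pool below is an illustrative placeholder, not the repo's list;
// what matters are the two preserved invariants: identical tool-name
// lists always produce identical mappings (cache hit), and the prefix
// pool is shuffled so fake names look distributed.
func buildFakeToolNames(toolNames []string) map[string]string {
	// Seed from the tool-name tuple, mirroring hash(tuple(names)).
	h := fnv.New64a()
	for _, n := range toolNames {
		h.Write([]byte(n))
		h.Write([]byte{0}) // separator so ["ab","c"] != ["a","bc"]
	}
	rng := rand.New(rand.NewSource(int64(h.Sum64())))

	prefixes := []string{"manage_", "handle_", "run_", "exec_", "proc_", "task_"}
	rng.Shuffle(len(prefixes), func(i, j int) {
		prefixes[i], prefixes[j] = prefixes[j], prefixes[i]
	})

	mapping := make(map[string]string, len(toolNames))
	for i, name := range toolNames {
		head := name
		if len(head) > 3 {
			head = head[:3]
		}
		// Shape: {prefix}{name[:3]}{i:02d}, e.g. "manage_bas00".
		mapping[name] = fmt.Sprintf("%s%s%02d", prefixes[i%len(prefixes)], head, i)
	}
	return mapping
}

func main() {
	names := []string{"bash", "read_file", "write_file", "grep", "list_dir", "edit"}
	a := buildFakeToolNames(names)
	b := buildFakeToolNames(names)
	fmt.Println(a["bash"] == b["bash"]) // true: same input, same mapping
}
```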
      
      Non-trivial implementation details to be aware of when rebasing:
      
        * Two new files, no upstream collision:
            gateway_tool_rewrite.go       (D + E algorithms)
            gateway_messages_cache.go     (F strip + breakpoints)
        * Two new feature calls bolted onto the tail of
          applyClaudeCodeOAuthMimicryToBody in gateway_service.go — rebase
          conflicts will be ~10 lines maximum.
        * Response-side injection points all wrap their existing write with
          reverseToolNamesIfPresent(c, ...), preserving original behavior
          when no mapping is stored (static prefix rollback still runs).
        * Non-stream chat/responses switched from c.JSON to
          json.Marshal + c.Data so bytes-level replace is possible.
        * Retry bodies (FilterThinkingBlocksForRetry,
          FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget)
          only prune blocks — they preserve the already-obfuscated tool
          names, so no extra mapping re-application is needed.
      
      Manual QA: end-to-end scenario verified with 6 tools (above threshold)
      and tool_choice.type=='tool'. Obfuscation + restore roundtrip shown
      in test logs; then removed the temp test file.
      
      Tests (16 new):
        - buildDynamicToolMap stability + below-threshold guard
        - sanitizeToolName precedence (dynamic > static)
        - restoreToolNamesInBytes longest-first + static rollback
        - applyToolNameRewriteToBody skips server tools + syncs tool_choice
        - applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl
        - stripMessageCacheControl + addMessageCacheBreakpoints in the
          1/4/string-content cases + second-to-last user turn selection
        - buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length
        - fake name shape follows Parrot {prefix}{head3}{i:02d}
      6e12578b
    • keh4l's avatar
      feat(gateway): align body shape with real Claude Code CLI defaults · a25faeca
      keh4l authored
      Three field-level alignments in normalizeClaudeOAuthRequestBody to
      match real Claude Code CLI traffic byte-for-byte:
      
        1. temperature: previously deleted unconditionally; now passes
           through client value, defaults to 1 when absent (real CLI
           always sends temperature, default 1).
      
        2. max_tokens: defaults to 128000 when absent (real CLI default).
      
        3. context_management: when thinking.type is enabled/adaptive
           and the client did not provide context_management, inject
           {"edits":[{"type":"clear_thinking_20251015","keep":"all"}]}
           to mirror real CLI behavior.
      
      tool_choice removal is unchanged (Claude Code OAuth credentials
      do not allow client-supplied tool_choice).
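
      The three alignments amount to filling defaults on the decoded body; a
      sketch with illustrative names (the repo does this inside
      normalizeClaudeOAuthRequestBody):

```go
package main

import "fmt"

// applyClaudeCLIDefaults sketches the three alignments on a decoded
// request body. Key names follow the Anthropic Messages API; the helper
// name is illustrative.
func applyClaudeCLIDefaults(body map[string]any) {
	if _, ok := body["temperature"]; !ok {
		body["temperature"] = 1 // real CLI always sends temperature, default 1
	}
	if _, ok := body["max_tokens"]; !ok {
		body["max_tokens"] = 128000 // real CLI default
	}
	// Inject context_management only when thinking is enabled/adaptive and
	// the client did not provide its own value.
	if th, ok := body["thinking"].(map[string]any); ok {
		if t, _ := th["type"].(string); t == "enabled" || t == "adaptive" {
			if _, ok := body["context_management"]; !ok {
				body["context_management"] = map[string]any{
					"edits": []any{map[string]any{
						"type": "clear_thinking_20251015",
						"keep": "all",
					}},
				}
			}
		}
	}
	delete(body, "tool_choice") // CC OAuth rejects client-supplied tool_choice
}

func main() {
	body := map[string]any{
		"thinking": map[string]any{"type": "enabled"},
	}
	applyClaudeCLIDefaults(body)
	fmt.Println(body["temperature"], body["max_tokens"]) // 1 128000
}
```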
      
      Tests updated:
        - gateway_body_order_test.go: temperature/max_tokens are now
          expected in output; tool_choice still removed.
        - gateway_prompt_test.go: system array is now 2 blocks
          (billing + cc prompt), assertions adjusted.
        - gateway_anthropic_apikey_passthrough_test.go: same 2-block
          assertion.
      a25faeca
    • keh4l's avatar
      feat(gateway): add billing attribution block with cc_version fingerprint · 5862e2d8
      keh4l authored
      Real Claude Code CLI always sends a 2-block system array:
      
        [0] {"type":"text", "text":"x-anthropic-billing-header: cc_version=X.Y.Z.{fp}; cc_entrypoint=cli; cch=00000;"}
        [1] {"type":"text", "text":"You are Claude Code...", "cache_control":{...}}
      
      Before this commit, sub2api's mimicry path only produced block [1].
      The missing billing block is one of the primary third-party detection
      signals Anthropic uses for Claude-Code-scoped OAuth tokens.
      
      New file gateway_billing_block.go ports the fingerprint algorithm
      (byte-for-byte from Parrot cc_mimicry.py:compute_fingerprint):
      pick chars at positions [4,7,20] of the first user text, then
      `sha256(SALT + chars + cc_version)[:3]`.
      
        - claude/constants.go: CLICurrentVersion = "2.1.92" (must match UA)
        - gateway_billing_block.go: computeClaudeCodeFingerprint +
          buildBillingAttributionBlockJSON + extractFirstUserText
        - gateway_service.go: rewriteSystemForNonClaudeCode now emits both
          blocks in order; cch=00000 is filled in later by
          signBillingHeaderCCH in buildUpstreamRequest.
      
      Downstream compat note: syncBillingHeaderVersion's regex
      `cc_version=\d+\.\d+\.\d+` only matches the semver triple,
      leaving the `.{fp}` suffix intact when rewriting in buildUpstreamRequest.
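
      The ported algorithm can be sketched as below; the salt here is a
      placeholder (the real SALT ships with the ported code), and taking the
      first 3 hex characters of the digest is this sketch's reading of
      `[:3]`:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// computeFingerprint sketches the ported algorithm: take the characters
// at positions [4, 7, 20] of the first user text (skipping positions past
// the end of the string, an assumption of this sketch), then the first
// 3 hex chars of sha256(salt + chars + ccVersion).
func computeFingerprint(firstUserText, ccVersion, salt string) string {
	var chars []byte
	for _, pos := range []int{4, 7, 20} {
		if pos < len(firstUserText) {
			chars = append(chars, firstUserText[pos])
		}
	}
	sum := sha256.Sum256([]byte(salt + string(chars) + ccVersion))
	return hex.EncodeToString(sum[:])[:3]
}

func main() {
	fp := computeFingerprint("please summarize the following document", "2.1.92", "example-salt")
	fmt.Printf("cc_version=2.1.92.%s\n", fp) // suffix varies with salt/input
}
```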
      5862e2d8
    • keh4l's avatar
      feat(claude): add ttl to cache_control with default 5m · 66d64545
      keh4l authored
      Real Claude CLI traffic sends cache_control as
      `{"type":"ephemeral","ttl":"1h"}`. Our previous payload only
      sent `{"type":"ephemeral"}`, which is a bytewise mismatch with
      the official CLI and one more third-party detection signal.
      
      Policy: client-provided ttl is always passed through unchanged.
      Proxy-generated cache_control blocks default to 5m (vs Parrot's 1h)
      to avoid burning the 1h cache budget on automatic breakpoints while
      still aligning with the `ttl` field being present.
      
        - claude/constants.go: DefaultCacheControlTTL = "5m"
        - apicompat/types.go: new AnthropicCacheControl type with TTL field;
          AnthropicTool gains optional CacheControl pointer so the mimicry
          path can attach a cache breakpoint to tools[-1] later.
        - service/gateway_service.go: anthropicCacheControlPayload gains TTL;
          marshalAnthropicSystemTextBlock and rewriteSystemForNonClaudeCode
          emit ttl=5m by default.
      66d64545
    • keh4l's avatar
      fix(gateway): use full beta list in buildUpstreamRequest mimicry path · 165553cf
      keh4l authored
      The previous commit added FullClaudeCodeMimicryBetas() but the two
      call sites in buildUpstreamRequest still hardcoded the old 3-token
      subset. Anthropic now checks the complete set of beta tokens to
      decide if a request qualifies as Claude Code. Wire them up:
      
        - /v1/messages mimic path: requiredBetas = FullClaudeCodeMimicryBetas()
        - /v1/messages/count_tokens mimic path: same + BetaTokenCounting
      
      Haiku models keep the 2-token exemption (BetaOAuth + InterleaveThinking).
      165553cf
    • keh4l's avatar
      fix(gateway): apply full Claude Code mimicry on /chat/completions and /responses · b5467d61
      keh4l authored
      Before: the OpenAI-compat forwarders only called injectClaudeCodePrompt,
      which prepends the Claude Code banner but leaves the rest of the body
      in its original non-Claude-Code shape. The codebase already admits this
      is insufficient (see the comment on rewriteSystemForNonClaudeCode in
      gateway_service.go: "仅前置追加 Claude Code 提示词无法通过检测", i.e.
      "merely prepending the Claude Code prompt cannot pass detection").
      
      Effect: OAuth accounts served through /v1/chat/completions or /v1/responses
      were detected as third-party apps and bled plan quota with:
      
          Third-party apps now draw from your extra usage, not your plan limits.
      
      Fix:
        - apicompat.AnthropicRequest: add Metadata json.RawMessage so metadata
          survives the OpenAI->Anthropic->Marshal round trip; without it the
          downstream rewrite has no user_id to work with.
        - service: extract applyClaudeCodeOAuthMimicryToBody, a ParsedRequest-free
          variant of the /v1/messages mimicry pipeline
          (rewriteSystemForNonClaudeCode + normalizeClaudeOAuthRequestBody +
          metadata.user_id injection) so the OpenAI-compat forwarders can reuse it.
        - service: add buildOAuthMetadataUserIDFromBody + hashBodyForSessionSeed
          for the same reason (no ParsedRequest at the call site).
        - ForwardAsChatCompletions / ForwardAsResponses: replace the 3-line
          prompt-prepend with the full mimicry pipeline.
        - applyClaudeCodeMimicHeaders: set x-client-request-id per-request
          (real Claude CLI always does); missing/duplicated values are one more
          third-party fingerprint signal.
      
      No change to the native /v1/messages path: it already called the full
      pipeline, we only lift those helpers into a reusable function.
      
      Tests:
        - go build ./... passes
        - go test ./internal/service/... ./internal/pkg/apicompat/... passes
        - lsp_diagnostics clean on all touched files
        - pre-existing failures in internal/config are unrelated (env-sensitive
          tests that also fail on upstream main)
      b5467d61
    • keh4l's avatar
      chore(claude): bump mimicked CLI to 2.1.92 and extend anthropic-beta list · 57ff9796
      keh4l authored
      Align Claude Code mimicry constants with the latest real CLI traffic
      (see Parrot's src/transform/cc_mimicry.py). Anthropic now uses the full
      set of anthropic-beta tokens to decide whether a request counts as
      "official Claude Code"; requests missing tokens that real CLI ships
      today are demoted to third-party usage:
      
        Third-party apps now draw from your extra usage, not your plan limits.
      
      Changes:
        - claude/constants.go: add new beta tokens (prompt-caching-scope,
          effort, redact-thinking, context-management, extended-cache-ttl) and
          expose FullClaudeCodeMimicryBetas() for the OAuth mimicry path.
        - claude/constants.go: bump default User-Agent to claude-cli/2.1.92.
        - identity_service.go: bump defaultFingerprint User-Agent accordingly.
      
      No behavioral change for clients that already send a newer UA (fingerprint
      merge still prefers the incoming value).
      57ff9796
    • VpSanta33's avatar
    • gaoren002's avatar
      fix(openai): preserve mcp tool call ids · 27ee141c
      gaoren002 authored
      27ee141c
    • gaoren002's avatar
      e65574de
    • song's avatar
      fix(openai): preserve codex tool call ids · 959af1c8
      song authored
      959af1c8
    • gaoren002's avatar
      c4d496da
    • KnowSky404's avatar
      d80469ea
    • KnowSky404's avatar
      5fc30ea9
    • KnowSky404's avatar
      f68909a6
    • shaw's avatar
      fix: add gpt5.5 to the openai default models · a4e329c1
      shaw authored
      a4e329c1
    • shaw's avatar
      fix(openai): preserve image outputs when text content serialization fails · ca204ddd
      shaw authored
      In reconstructResponseOutputFromSSE, text content Marshal/Unmarshal
      failure previously caused an early return that silently discarded
      already-extracted image_generation_call outputs. Now serialization
      errors are tolerated so image results still reach the client.
      ca204ddd
  5. 23 Apr, 2026 2 commits