1. 27 Apr, 2026 1 commit
    • keh4l's avatar
      feat(gateway): port Parrot tool-name obfuscation + message cache breakpoints · f507bae5
      keh4l authored and 陈曦's avatar 陈曦 committed
      Implements the remaining three parity items with Parrot cc_mimicry:
      
        D) Tool-name obfuscation
           - Dynamic mapping when tools.length > 5 (matches Parrot threshold).
             Fake names follow {prefix}{name[:3]}{i:02d} (e.g. 'manage_bas00').
             Go port of random.Random(hash(tuple(names))) uses fnv64a seed +
             math/rand; byte-exact reproduction is impossible (Python hash vs
             Go hash), but the two invariants that matter are preserved:
               * same input tool_names yield identical mapping (cache hit)
               * prefix pool is shuffled (names look distributed)
           - Static prefix map (sessions_ -> cc_sess_, session_ -> cc_ses_)
             applied as fallback, matching Parrot TOOL_NAME_REWRITES verbatim.
           - Server tools (web_search_20250305, computer_*, etc.) are NOT
             renamed; only type=='function' and type=='custom' tools are.
           - tool_choice.name is rewritten in sync (only when type=='tool').
           - Response side: bytes-level replace on every SSE chunk / JSON
             body at 6 injection points (standard stream/non-stream,
             passthrough stream/non-stream, chat_completions stream +
             non-stream, responses stream + non-stream). Reverse mapping
             applied longest-fake-name-first to prevent substring conflicts
             (parity with Parrot _restore_tool_names_in_chunk).
           - tool_choice is no longer unconditionally deleted in
             normalizeClaudeOAuthRequestBody — Parrot passes it through.
      
        E) tools[-1] cache_control breakpoint
           - Injected as {type:ephemeral, ttl:<DefaultCacheControlTTL>} when
             the last tool has no cache_control. Client-provided ttl is
             passed through unchanged (repo-wide policy).
      
        F) messages cache_control strategy
           - stripMessageCacheControl removes every client-provided
             messages[*].content[*].cache_control (multi-turn stability).
           - addMessageCacheBreakpoints then injects two stable breakpoints:
             (1) last message, and (2) second-to-last user turn when
             messages.length >= 4.
           - Combined with the system block breakpoint and tools[-1]
             breakpoint, this gives exactly the 4 breakpoints Anthropic
             allows per request.
      
      Non-trivial implementation details to be aware of when rebasing:
      
        * Two new files, no upstream collision:
            gateway_tool_rewrite.go       (D + E algorithms)
            gateway_messages_cache.go     (F strip + breakpoints)
        * Two new feature calls bolted onto the tail of
          applyClaudeCodeOAuthMimicryToBody in gateway_service.go — rebase
          conflicts will be ~10 lines maximum.
        * Response-side injection points all wrap their existing write with
          reverseToolNamesIfPresent(c, ...), preserving original behavior
          when no mapping is stored (static prefix rollback still runs).
        * Non-stream chat/responses switched from c.JSON to
          json.Marshal + c.Data so bytes-level replace is possible.
        * Retry bodies (FilterThinkingBlocksForRetry,
          FilterSignatureSensitiveBlocksForRetry, RectifyThinkingBudget)
          only prune blocks — they preserve the already-obfuscated tool
          names, so no extra mapping re-application is needed.
      
      Manual QA: end-to-end scenario verified with 6 tools (above threshold)
      and tool_choice.type=='tool'. Obfuscation + restore roundtrip shown
      in test logs; then removed the temp test file.
      
      Tests (16 new):
        - buildDynamicToolMap stability + below-threshold guard
        - sanitizeToolName precedence (dynamic > static)
        - restoreToolNamesInBytes longest-first + static rollback
        - applyToolNameRewriteToBody skips server tools + syncs tool_choice
        - applyToolsLastCacheBreakpoint defaults to 5m + passes client ttl
        - stripMessageCacheControl + addMessageCacheBreakpoints in the
          1/4/string-content cases + second-to-last user turn selection
        - buildToolNameRewriteFromBody ReverseOrdered is desc-by-fake-length
        - fake name shape follows Parrot {prefix}{head3}{i:02d}
      f507bae5