    fix(gateway): skip body mimicry for real Claude Code clients to restore prompt caching · 496469ac
    shaw authored
    PR #1914 unconditionally applied the full mimicry pipeline to all OAuth
    accounts, including real Claude Code CLI clients. This replaced the
    client's long system prompt (~10K+ tokens with stable cache_control
    breakpoints) with a short ~45 token [billing, CC prompt] pair, which
    falls below Anthropic's 1024-token minimum cacheable prefix threshold.
    The result: every request created a new cache but never hit an existing
    one.
    
    Fix: restore the Claude Code client detection gate so that real CC
    clients bypass body-level mimicry (system rewrite, message cache
    management, tool name obfuscation). Non-CC third-party clients
    (opencode, etc.) continue to receive full mimicry.
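
    A minimal sketch of the gate described above, not the code in
    gateway_service.go. The request type, its fields, and
    shouldApplyBodyMimicry are hypothetical names for illustration, and
    skipping mimicry for non-OAuth accounts is an assumption.

```go
package main

import "fmt"

// request is a hypothetical, simplified view of an inbound gateway request.
type request struct {
	IsOAuth          bool // account uses OAuth credentials
	IsRealClaudeCode bool // result of the Claude Code client detection check
}

// shouldApplyBodyMimicry reports whether body-level mimicry (system rewrite,
// message cache management, tool name obfuscation) should run.
// Assumption: non-OAuth accounts never receive mimicry.
func shouldApplyBodyMimicry(r request) bool {
	if !r.IsOAuth {
		return false
	}
	// Real Claude Code clients keep their own long system prompt and
	// cache_control breakpoints, so the body is left untouched.
	return !r.IsRealClaudeCode
}

func main() {
	realCC := request{IsOAuth: true, IsRealClaudeCode: true}
	thirdParty := request{IsOAuth: true, IsRealClaudeCode: false}
	fmt.Println(shouldApplyBodyMimicry(realCC))     // false
	fmt.Println(shouldApplyBodyMimicry(thirdParty)) // true
}
```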
    
    Also harden the detection logic:
    - Make UA regex case-insensitive (align with claude_code_validator.go)
    - Validate the metadata.user_id format via ParseMetadataUserID() instead
      of only checking that it is non-empty, so third-party tools cannot
      pair a spoofed claude-cli/* UA with an arbitrary user_id string to
      bypass mimicry
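
    The two hardening steps could look roughly like the sketch below. Both
    regular expressions are assumptions for illustration: the actual UA
    pattern lives in claude_code_validator.go, and validUserID is a
    hypothetical stand-in for ParseMetadataUserID(), whose real signature
    and accepted format are not shown in this commit message.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// claudeCLIUA matches a claude-cli/<semver> User-Agent prefix.
// The (?i) flag makes the match case-insensitive. Pattern is assumed.
var claudeCLIUA = regexp.MustCompile(`(?i)^claude-cli/\d+\.\d+\.\d+`)

// userIDShape is an assumed user_id layout used only for this sketch.
var userIDShape = regexp.MustCompile(
	`^user_[0-9a-f]{64}_account_[0-9a-f-]*_session_[0-9a-f-]{36}$`)

// validUserID stands in for ParseMetadataUserID(): it checks the structure
// of the value rather than only checking that it is non-empty.
func validUserID(id string) bool {
	return userIDShape.MatchString(id)
}

// isRealClaudeCode requires both a matching UA and a well-formed user_id.
func isRealClaudeCode(userAgent, userID string) bool {
	return claudeCLIUA.MatchString(userAgent) && validUserID(userID)
}

func main() {
	good := "user_" + strings.Repeat("a", 64) +
		"_account__session_123e4567-e89b-12d3-a456-426614174000"

	fmt.Println(isRealClaudeCode("Claude-CLI/1.0.0 (external, cli)", good)) // true
	fmt.Println(isRealClaudeCode("claude-cli/1.0.0", "anything"))           // false
	fmt.Println(isRealClaudeCode("opencode/0.3.0", good))                   // false
}
```

    With a non-empty check alone, the second call would have been treated
    as a real Claude Code client.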
gateway_service.go 314 KB