    feat(gateway): Cache-Driven RPM Buffer · 72e5876c
    QTom authored
    
    
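    - Change the buffer formula from baseRPM/5 to concurrency + maxSessions,
      keeping baseRPM/5 as a floor for backward compatibility
    - Add [StickyCacheMiss] structured logging to the sticky-path fallback
      reason: rpm_red / gate_check / session_limit / wait_queue_full / account_cleared
    - Skip the wait-queue retry on the session_limit path (a RegisterSession rejection has no side effects)
    - Raise the buffer for a typical configuration from 3 to 13, substantially reducing Prompt Cache Misses during peak hours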
    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
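
The buffer rule described in the commit message can be sketched as follows. This is a minimal illustration, not the code in gateway_service.go: the function name `rpmBuffer` and the sample values (baseRPM 15, concurrency 3, maxSessions 10) are assumptions, chosen only because they reproduce the 3 to 13 change the message mentions.

```go
package main

import "fmt"

// rpmBuffer is a hypothetical helper illustrating the rule in the commit
// message: the buffer is concurrency + maxSessions, and the legacy value
// baseRPM/5 is kept as a lower bound for backward compatibility.
func rpmBuffer(baseRPM, concurrency, maxSessions int) int {
	floor := baseRPM / 5                // legacy formula, now only a floor
	buffer := concurrency + maxSessions // new cache-driven formula
	if buffer < floor {
		return floor
	}
	return buffer
}

func main() {
	// Assumed sample configuration: the legacy formula gives 15/5 = 3,
	// the new formula gives 3 + 10 = 13.
	fmt.Println(rpmBuffer(15, 3, 10))

	// With a large baseRPM the floor wins: 100/5 = 20 exceeds 3 + 10 = 13.
	fmt.Println(rpmBuffer(100, 3, 10))
}
```

Taking the larger of the two values means accounts whose baseRPM/5 already exceeds concurrency + maxSessions keep their previous buffer, which is one way to read "floor for backward compatibility".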
gateway_service.go 294 KB