1. 31 Mar, 2026 1 commit
    • feat(gateway): Cache-Driven RPM Buffer · 72e5876c
      QTom authored
      
      
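      - Change the buffer formula from baseRPM/5 to concurrency + maxSessions,
        keeping baseRPM/5 as a floor for backward compatibility
      - Add a [StickyCacheMiss] structured log on sticky-path fallback
        reason: rpm_red / gate_check / session_limit / wait_queue_full / account_cleared
      - Skip the wait queue retry on the session_limit path (a RegisterSession rejection has no side effects)
      - Raise the buffer from 3 to 13 for a typical configuration, greatly reducing Prompt Cache Misses at peak load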
      Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
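The buffer formula described in commit 72e5876c can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `stickyBuffer` and its integer parameters are hypothetical, and the sample values in the usage note (baseRPM 15, concurrency 3, maxSessions 10) are assumptions chosen only because they reproduce the "3 to 13" figure from the commit message.

```go
package main

// stickyBuffer is a hypothetical helper illustrating the new formula:
// the buffer is concurrency + maxSessions, but never lower than the
// legacy value baseRPM/5, which is kept as a floor.
func stickyBuffer(baseRPM, concurrency, maxSessions int) int {
	floor := baseRPM / 5             // legacy formula (integer division)
	buf := concurrency + maxSessions // new cache-driven formula
	if buf < floor {
		return floor
	}
	return buf
}
```

Under those assumed values, the legacy formula gives 15/5 = 3, while `stickyBuffer(15, 3, 10)` gives 13. With a large baseRPM such as 100, the floor of 20 would still apply.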
2. 28 Feb, 2026 3 commits
    • fix: address deep code review issues for RPM limiting · e63c8395
      QTom authored
      - Move IncrementRPM after Forward success to prevent phantom RPM
        consumption during account switch retries
      - Add base_rpm input sanitization (clamp to 0-10000) in Create/Update
      - Add WindowCost scheduling checks to legacy path sticky sessions
        (4 check sites + 4 prefetch sites), fixing pre-existing gap
      - Clean up rpm_strategy/rpm_sticky_buffer when disabling RPM in
        BulkEditModal (JSONB merge cannot delete keys, use empty values)
      - Add json.Number test cases to TestGetBaseRPM/TestGetRPMStickyBuffer
      - Document TOCTOU race as accepted soft-limit design trade-off
    • fix: address code review issues for RPM limiting feature · 60723757
      QTom authored
      - Use TxPipeline (MULTI/EXEC) instead of Pipeline for atomic INCR+EXPIRE
      - Filter negative values in GetBaseRPM(), update test expectation
      - Add RPM batch query (GetRPMBatch) to account List API
      - Add warn logs for RPM increment failures in gateway handler
      - Reset enableRpmLimit on BulkEditAccountModal close
      - Use union type 'tiered' | 'sticky_exempt' for rpmStrategy refs
      - Add design decision comments for rdb.Time() RTT trade-off
    • QTom's avatar