feat(providers): add context-compression proxy plugin with tokenSavingMode #271

Open
8nevil8 wants to merge 37 commits into main from feat/claude-prompt-caching-autocompact-defaults

8nevil8 (Collaborator) commented May 3, 2026

Summary

Adds an Intelligent Context Manager (ICM) proxy plugin that compresses older conversation messages before forwarding requests to the LLM API, reducing input token usage when tokenSavingMode is enabled in the user's profile.

Changes

  • Context compression plugin (src/providers/plugins/sso/proxy/plugins/context-compression/): Full plugin with SmartCrusher, content-aware routing (log/diff/search/generic compressors), provider adapters (Anthropic/OpenAI/OpenAI-compatible), ICM three-phase pipeline (proactive non-tail, emergency tail, message dropping), and savings tracker
  • tokenSavingMode profile feature: codemie profile set token-saving-mode on/off CLI command + features.tokenSavingMode config field
  • /codemie:token-saver slash command: One-command setup that enables tokenSavingMode in the active profile
  • Agent hooks: bash-ban-raw-tools, cbm-mcp-marker, cbm-code-discovery-gate, rtk-rewrite, handoff-precompact, handoff-session-resume
  • /handoff slash command: Session handoff with context preservation across compaction
  • Integration test: scripts/test-compression.mjs — runs codemie-claude --task, scans logs, verifies pipeline

Impact

Reduces input tokens in multi-turn conversations whose older messages contain logs, diffs, or repetitive content. ICM Phase 2 (proactive) compresses non-tail messages at a ~50% ratio; Phase 1 (emergency) triggers only when the request is over the context limit, preserving current-turn messages.
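
The proactive, emergency, and drop steps described in this PR can be sketched roughly as below. This is an illustration under stated assumptions, not the plugin's code: the `Msg` shape, `estimateTokens`, `compress`, the `tailSize` parameter, treating the last message as the current turn, and the oldest-first drop order are all hypothetical.

```typescript
// Hypothetical message shape; the real plugin works on provider-specific payloads.
interface Msg {
  role: "user" | "assistant" | "tool";
  text: string;
}

// Rough token estimate (about 4 characters per token), assumed for illustration.
function estimateTokens(msgs: Msg[]): number {
  return msgs.reduce((sum, m) => sum + Math.ceil(m.text.length / 4), 0);
}

// Stand-in compressor: keeps the leading `ratio` share of the text.
function compress(m: Msg, ratio: number): Msg {
  const keep = Math.max(1, Math.floor(m.text.length * ratio));
  return { role: m.role, text: m.text.slice(0, keep) };
}

function manageContext(msgs: Msg[], limit: number, tailSize: number): Msg[] {
  const split = Math.max(0, msgs.length - tailSize);
  // Proactive: compress everything outside the tail at ~50%.
  let head = msgs.slice(0, split).map((m) => compress(m, 0.5));
  let tail = msgs.slice(split);

  // Emergency: only when still over the limit, compress the tail too,
  // leaving the final (current-turn) message untouched.
  if (estimateTokens(head.concat(tail)) > limit) {
    const last = tail.length - 1;
    tail = tail.map((m, i) => (i === last ? m : compress(m, 0.5)));
  }

  // Last resort: drop the oldest non-tail messages until the request fits.
  while (head.length > 0 && estimateTokens(head.concat(tail)) > limit) {
    head = head.slice(1);
  }
  return head.concat(tail);
}
```

Ordering the steps this way keeps the cheapest, least lossy step first and only touches recent context when the request would otherwise not fit.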

Checklist

  • Self-reviewed
  • Manual testing performed
  • Documentation updated (if needed)
  • No breaking changes (or clearly documented)

8nevil8 and others added 11 commits May 3, 2026 08:25
…gMode

Implements an intelligent context manager (ICM) that compresses older
conversation messages before forwarding to the LLM API, reducing token
usage when tokenSavingMode is enabled in the profile.

- SmartCrusher: anchor-selection + token-budget compression (50% ratio)
- Content-aware routing: log/diff/search/generic compressors
- Adapters for Anthropic, OpenAI, and OpenAI-compatible formats
- ICM phases: proactive non-tail compression, emergency tail compression,
  and message dropping as last resort
- Savings tracker persists cumulative token savings to disk
- /codemie:token-saver slash command to enable the feature
- `codemie profile set token-saving-mode on/off` CLI support
- Integration test script: scripts/test-compression.mjs
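
As a rough illustration of "anchor-selection + token-budget compression", the sketch below keeps anchor lines first and then fills the remaining budget. SmartCrusher's actual rules are not shown in this PR text, so the anchor rule (first line, last line, error-like lines), the line-based budget, and the names `isAnchor` and `crushSketch` are assumptions.

```typescript
// Assumed anchor rule: first line, last line, or any line that mentions an error.
function isAnchor(line: string, index: number, total: number): boolean {
  return index === 0 || index === total - 1 || /error|fail/i.test(line);
}

// Keeps at most `ratio` of the lines: anchors first, then the earliest remaining lines.
// The budget is counted in lines here; a real implementation would count tokens.
function crushSketch(text: string, ratio: number): string {
  const lines = text.split("\n");
  const budget = Math.max(1, Math.floor(lines.length * ratio));
  const keep: boolean[] = lines.map(() => false);
  let kept = 0;

  // Pass 1: reserve budget for anchor lines, in document order.
  lines.forEach((line, i) => {
    if (kept < budget && isAnchor(line, i, lines.length)) {
      keep[i] = true;
      kept++;
    }
  });
  // Pass 2: spend whatever budget is left on the earliest unkept lines.
  for (let i = 0; i < lines.length && kept < budget; i++) {
    if (!keep[i]) {
      keep[i] = true;
      kept++;
    }
  }
  // Emit kept lines in their original order.
  return lines.filter((_, i) => keep[i] === true).join("\n");
}
```

With a 0.5 ratio, a six-line log keeps three lines, and the error line survives even though it sits in the middle.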

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
8nevil8 force-pushed the feat/claude-prompt-caching-autocompact-defaults branch from 4dbc809 to 24757d9 May 3, 2026 05:25
8nevil8 and others added 18 commits May 3, 2026 08:28
…perpowers routing

- Rewrites SKILL.md v0.1 → v0.2: full SDLC flow (requirements → worktree →
  complexity scoring → brainstorming/writing-plans → spec-reviewer →
  subagent-driven-development → code review → qa-lead)
- Replaces solution-architect references with superpowers:brainstorming +
  superpowers:writing-plans; 2-tier routing (5-7 simple, 8-15 brainstorming)
- Rewrites branch-workflow.md: removes manual git checkout -b sections,
  delegates branch/worktree creation to superpowers:using-git-worktrees,
  fixes Node.js commands (npm run lint / npm test)
- Updates complexity-assessment-guide.md: 3-tier routing → 2-tier routing
  aligned with SKILL.md; removes 11-12 borderline case

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Orchestrates automated-tests → ui-tests (conditional) → spec-refinement
(conditional) → /memory-refresh reminder. Invoked by tech-lead as Phase 5
after all implementation tasks complete and code review passes.

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
- automated-tests: lint → build → unit tests pipeline, stops at first failure
- ui-tests: conditional frontend test runner, three-state outcome (PASS/FAIL/SNAPSHOT)
- spec-refinement: aligns spec with implementation post-drift, commit deferred to user when invoked by qa-lead

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…lowed-tools

- Add YAML frontmatter (description + allowed-tools) to codemie-catchup, codemie-init, codemie-subagents, memory-refresh
- Remove hardcoded absolute path from handoff.md (dev artifact)
- Add allowed-tools to report-issue SKILL.md

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…ia skills CLI

- Add Phase 0: installs obra/superpowers project-level via `npx skills add obra/superpowers --all -a claude-code`
- Flatten phase headers (## not nested ###)
- Simplify Critical Rules section (remove emoji noise)
- Update Next Steps to reference tech-lead as SDLC entry point

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Centralize keyword sets and regex patterns into error-detection.ts,
mirroring Python headroom error_detection.py. Fix CCR ratio threshold
from 0.4 to 0.6 in diff/log/search compressors to match Python source.
Also allowlist docs/superpowers/plans/ in .gitleaks.toml to suppress
false-positive detection of well-known public JWT test vectors in plan docs.

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…ceptor

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…Router

Add computePressureMinRatio(), expand ContentRouterConfig with three fields, and add routeWithPressure() method. 6 tests passing.

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
- Add JSDoc to routeWithPressure clarifying compressionRatio=1.0 means threshold not met
- Add JSDoc to skipUserMessages config field (consumed by ICM layer, not ContentRouter)
- Replace stub routeWithPressure test with 3 behavioral tests covering fill=0, fill=1, and result shape
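
One way to read the pressure-based routing above is sketched here. The real signatures of `computePressureMinRatio()` and `routeWithPressure()` are not shown in this PR text, so the functions below use different, hypothetical names, and the linear interpolation and its bounds (0.5 and 0.9) are assumptions. The only behavior taken from the commit message is that `compressionRatio` is 1.0 when the threshold is not met.

```typescript
interface RouteResult {
  text: string;
  compressionRatio: number; // compressed length / original length; 1.0 = left unchanged
}

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

// Assumed policy: with an empty context (fill = 0) only strong compression is
// accepted; as the context fills up (fill = 1) weaker compression is accepted too.
function acceptableRatioSketch(fill: number, atEmpty = 0.5, atFull = 0.9): number {
  return atEmpty + (atFull - atEmpty) * clamp01(fill);
}

function routeWithPressureSketch(
  text: string,
  fill: number,
  compressor: (t: string) => string
): RouteResult {
  const out = compressor(text);
  const ratio = text.length === 0 ? 1 : out.length / text.length;
  // Threshold not met: return the original text and report a ratio of 1.0.
  if (ratio > acceptableRatioSketch(fill)) {
    return { text, compressionRatio: 1.0 };
  }
  return { text: out, compressionRatio: ratio };
}
```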

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Replace position/bias-based Phase 3 drop with score-based dropping
(recency + error keyword signals). Export ContextStrategy, scoreMessage,
buildDropCandidates. Add icm-strategy.test.ts (6 tests).
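
A minimal sketch of score-based dropping with recency and error-keyword signals follows. The exported `scoreMessage` and `buildDropCandidates` are not shown in this PR text, so the `Sketch` variants below, the keyword list, and the 0.5 error bonus are hypothetical; only the two signals and the ascending sort come from the commit messages.

```typescript
interface ChatMsg {
  role: string;
  text: string;
}

// Assumed keyword list; the PR centralizes the real one in error-detection.ts.
const ERROR_RE = /\b(error|exception|failed|traceback)\b/i;

// Higher score = more worth keeping. Recency runs from 0 (oldest) to 1 (newest).
function scoreMessageSketch(m: ChatMsg, index: number, total: number): number {
  const recency = total > 1 ? index / (total - 1) : 1;
  const errorBonus = ERROR_RE.test(m.text) ? 0.5 : 0;
  return recency + errorBonus;
}

// Returns candidates sorted by ascending score, so the first entry is dropped first.
function buildDropCandidatesSketch(msgs: ChatMsg[]): { index: number; score: number }[] {
  return msgs
    .map((m, i) => ({ index: i, score: scoreMessageSketch(m, i, msgs.length) }))
    .sort((a, b) => a.score - b.score);
}
```

In this sketch an old message that contains a traceback outranks a slightly newer message without one, which is the point of moving away from purely position-based dropping.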

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…gthen sort test

- Phase 3 drop loop now captures message references before mutation,
  preventing stale-index filtering on subsequent iterations
- Added JSDoc to ContextStrategy enum explaining it as preparatory
  infrastructure for future caller-controlled phase selection
- Strengthened buildDropCandidates sort assertion to full ascending-order
  check; fixed misleading comment on index 0 score
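
The stale-index problem named in the first bullet can be illustrated like this: once one message is removed, every later index shifts, so the sketch resolves indexes to object references before the loop starts. `dropUntilFits` and its parameters are hypothetical names, not the plugin's API.

```typescript
function dropUntilFits<T>(
  msgs: T[],
  dropOrder: number[],
  fits: (current: T[]) => boolean
): T[] {
  // Resolve indexes to object references before anything is removed.
  // This assumes each message is a distinct object.
  const victims = dropOrder.map((i) => msgs[i]);
  let current = msgs.slice();
  for (const victim of victims) {
    if (fits(current)) break;
    // Filtering by reference stays correct even though positions have shifted.
    current = current.filter((m) => m !== victim);
  }
  return current;
}
```

Had the loop removed by index instead, dropping index 0 and then index 1 from `[a, b, c, d]` would remove `a` and `c` rather than `a` and `b`.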

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
8nevil8 and others added 8 commits May 3, 2026 11:33
…n ICM

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
…room 3e.5 parity)

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
… modules

Core modules that were created but not yet committed:
- ccr/types.ts: CcrStore interface contract
- ccr/bm25.ts: BM25 scoring for semantic retrieval
- ccr/store.ts: CompressionStore with CCR get/set and BM25 indexing
- transforms/config.ts: CompressConfig interface and buildCompressConfig factory
- transforms/hooks.ts: CompressionHooks and CompressionEvent interfaces
- compressors/types.ts: add cacheKey field to CompressionResult
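
For readers unfamiliar with BM25, here is a compact sketch of the standard Okapi BM25 formula. The contents of `ccr/bm25.ts` are not shown in this PR text; the tokenizer, the function name `bm25Scores`, and the defaults `k1 = 1.5` and `b = 0.75` are assumptions (they are common defaults, not values taken from the branch).

```typescript
// Assumed tokenizer: lowercase, split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Scores every document against the query; higher means more relevant.
function bm25Scores(query: string, docs: string[], k1 = 1.5, b = 0.75): number[] {
  const tokenized = docs.map(tokenize);
  const n = tokenized.length;
  const avgLen = n === 0 ? 0 : tokenized.reduce((s, d) => s + d.length, 0) / n;
  const terms = tokenize(query);

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue; // term absent: contributes nothing
      // Document frequency: how many documents contain the term at all.
      const df = tokenized.filter((d) => d.indexOf(term) !== -1).length;
      const idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
      // Length normalization: long documents need more hits for the same score.
      const norm = tf + k1 * (1 - b + b * (doc.length / avgLen));
      score += idf * ((tf * (k1 + 1)) / norm);
    }
    return score;
  });
}
```

A store could use such scores to pick which previously compressed entries to retrieve for a given query, which is one plausible reading of "BM25 indexing" here.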

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
Remove skills added on this branch (automated-tests, qa-lead,
spec-refinement, ui-tests) and restore spec-reviewer, tech-lead,
and codemie-worktree to their main branch versions.

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>
- Add `codemie profile set features.<group>.<flag> <value>` command
- FEATURE_GROUPS registry with full contextCompression flag set
- Value coercion: booleans (true/false/on/off), numbers, nullable (none→null)
- FlagDef.min/max for generic range validation (targetRatio 0–1)
- Replace old token-saving-mode set command with typed feature-flag command
- Migrate interceptor gate from features.tokenSavingMode to features.contextCompression.enabled
- Migrate cacheAligner read to ccFeatures.cacheAligner
- buildCompressConfig now typed ProfileFeatures, reads features.contextCompression.*
- AgentCLI --enable-token-saving sets contextCompression.enabled instead of tokenSavingMode
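
The value coercion and range validation listed above could look roughly like the sketch below. `FlagDefSketch` and `coerceFlagValue` are hypothetical names; the real `FlagDef` and `FEATURE_GROUPS` definitions are not shown in this PR text. Only the accepted spellings (true/false/on/off, numbers, none) and the min/max idea come from the commit message.

```typescript
type FlagValue = boolean | number | string | null;

// Hypothetical stand-in for the FlagDef entries in the FEATURE_GROUPS registry.
interface FlagDefSketch {
  type: "boolean" | "number" | "string";
  nullable?: boolean;
  min?: number;
  max?: number;
}

function coerceFlagValue(raw: string, def: FlagDefSketch): FlagValue {
  const v = raw.trim().toLowerCase();
  // Nullable flags accept "none" and map it to null.
  if (def.nullable && v === "none") return null;

  if (def.type === "boolean") {
    if (v === "true" || v === "on") return true;
    if (v === "false" || v === "off") return false;
    throw new Error(`expected true/false/on/off, got "${raw}"`);
  }
  if (def.type === "number") {
    const n = Number(v);
    if (v === "" || isNaN(n)) throw new Error(`expected a number, got "${raw}"`);
    // Generic range validation driven by the flag definition.
    if (def.min !== undefined && n < def.min) throw new Error(`value must be >= ${def.min}`);
    if (def.max !== undefined && n > def.max) throw new Error(`value must be <= ${def.max}`);
    return n;
  }
  return raw;
}
```

Under these assumptions, a command such as `codemie profile set features.contextCompression.targetRatio 0.6` would coerce `"0.6"` to the number 0.6, while `1.5` would be rejected by the 0–1 range.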

Generated with AI

Co-Authored-By: codemie-ai <codemie.ai@gmail.com>