Every turn costs more than the last.
Coding agents re-send the full conversation on every API call. Tool results accumulate — file reads, JSON configs, CLI output — all re-sent as input tokens, every single turn.
- 200+ API calls per session, each re-sending the full history
- 60% of that input is tool_result bloat: JSON, files, CLI output
- $6–15 per coding session at $3/MTok pricing
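The compounding is easy to see in a toy model (illustrative numbers only, not Tamp benchmarks): every turn re-sends the entire history, so billed input tokens grow quadratically with turn count.

```python
# Toy model of why per-turn cost compounds: each API call re-sends the
# whole conversation so far as input tokens. Numbers are assumptions.
def cumulative_input_tokens(turns: int, new_tokens_per_turn: int) -> int:
    total = 0
    history = 0
    for _ in range(turns):
        history += new_tokens_per_turn  # conversation grows each turn
        total += history                # full history billed again
    return total

session = cumulative_input_tokens(200, 200)  # 200 calls, ~200 new tokens each
print(session)                                 # 4,020,000 input tokens
print(f"${session / 1e6 * 3:.2f} at $3/MTok")  # $12.06
```

With those assumed sizes, a 200-call session lands squarely in the $6–15 range quoted above.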
8 stages. Zero config.
Tamp sits between your agent and the API. It classifies each tool result and applies the right compression — automatically.
- minify (lossless): Strip JSON whitespace. `package.json` shrinks 22%.
- toon (lossless): Columnar encoding for arrays. File listings shrink 49%.
- prune (lossless): Strip lockfile hashes, registry URLs, npm metadata. −81% on lockfiles.
- dedup (lossless): Same file read twice? Send a reference, not the content.
- diff (lossless): Tiny edit? Send a patch, not the full file.
- strip-lines (lossless): Remove line-number prefixes from Read tool output.
- whitespace (lossless): Collapse blank lines, trim trailing spaces.
- llmlingua (neural): LLMLingua-2 token pruning for text. Source code shrinks 40%. Auto-starts its sidecar.
Opt-in stages (lossy, not enabled by default)
- strip-comments (opt-in): Remove `//`, `/* */`, and `#` comments. −35% on commented code.
- textpress (opt-in): LLM semantic compression via Ollama or OpenRouter. −73% on stacktraces.
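The classify-then-compress idea can be sketched in a few lines. This is a toy Python illustration, not Tamp's implementation: the stage names mirror the list above, but the detection rules are simplified assumptions.

```python
# Minimal sketch of classify-and-dispatch compression for tool results.
# Detection heuristics here are illustrative, not Tamp's actual classifier.
import json
import re

def minify(text: str) -> str:
    """Lossless: strip JSON whitespace by re-serializing compactly."""
    return json.dumps(json.loads(text), separators=(",", ":"))

def strip_lines(text: str) -> str:
    """Lossless: drop 'NN→' style line-number prefixes (Read-tool output)."""
    return re.sub(r"^\s*\d+[→:|]\s?", "", text, flags=re.M)

def whitespace(text: str) -> str:
    """Lossless: collapse runs of blank lines, trim trailing spaces."""
    out, blank = [], False
    for ln in (line.rstrip() for line in text.splitlines()):
        if ln == "":
            if not blank:
                out.append(ln)
            blank = True
        else:
            out.append(ln)
            blank = False
    return "\n".join(out)

def compress(tool_result: str) -> str:
    """Pick a stage based on what the payload looks like."""
    try:
        json.loads(tool_result)
        return minify(tool_result)        # valid JSON -> minify
    except ValueError:
        pass
    if re.match(r"^\s*\d+[→:|]", tool_result):
        return strip_lines(tool_result)   # numbered lines -> strip-lines
    return whitespace(tool_result)        # fallback -> whitespace cleanup

print(compress('{ "name": "demo",  "version": "1.0.0" }'))
# {"name":"demo","version":"1.0.0"}
```

Because every stage is a pure text-to-text function, adding a new one is just another branch in the dispatcher.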
Works with Claude Code, Aider, Cursor, Cline, and any OpenAI-compatible agent.
🦞 Works with OpenClaw
Route your AI gateway through Tamp. Every request gets compressed before it hits Anthropic — your agents work the same, your bill doesn't.
Setup in 2 minutes
- Run `npm i -g @sliday/tamp && tamp -y` on your server
- Add a provider in your OpenClaw config pointing to `http://localhost:7778`
- Set it as primary model. Done: all requests now flow through Tamp.
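A provider entry might look roughly like this (the field names here are hypothetical illustrations, not OpenClaw's actual schema; check the OpenClaw configuration reference for the exact shape):

```jsonc
{
  "providers": {
    // Hypothetical shape: field names are illustrative, not OpenClaw's schema.
    "tamp": {
      "baseUrl": "http://localhost:7778",
      "default": true
    }
  }
}
```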
- Chat sessions (Telegram, short turns): 3–5% savings (mostly text, few tool calls)
- Coding sessions (file reads, JSON): 30–50% savings (heavy tool_result compression)
70MB RAM. <5ms latency. No Python needed. If Tamp goes down, requests bypass it automatically.
Measured: 52.6% fewer tokens
A/B tested via OpenRouter with Claude Sonnet 4.6. Twelve scenarios, five runs each, 120 API calls.
Quality verified: in 8/8 A/B scenarios, compressed responses were identical to uncompressed. Sonnet 4.6, $3/MTok.
Claude Max? Last 2× longer.
Max subscribers have a fixed token budget. Tamp compresses input tokens before they count against your limit — same work, fewer tokens consumed.
- Max 5× ($100/mo): 5× → 10.6× effective
- Max 20× ($200/mo): 20× → 42.2× effective
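The tier multipliers follow directly from the measured reduction: with r = 0.526, a fixed budget stretches by 1 / (1 − r) ≈ 2.11×, which reproduces the figures above to within rounding.

```python
# Effective capacity when 52.6% of input tokens no longer count
# against a fixed budget: tier / (1 - r).
r = 0.526
stretch = 1 / (1 - r)          # ≈ 2.11x more requests per budget
print(round(5 * stretch, 1))   # ~10.5 (listed above as 10.6; rounding)
print(round(20 * stretch, 1))  # 42.2
```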
| Model | Input $/MTok | Saved/session | Per dev/month | Team/year |
|---|---|---|---|---|
| Sonnet 4.6 | $3 | $0.32 | $48 | $5,800 |
| Opus 4.6 | $5 | $0.54 | $80 | $9,600 |
| Opus 4.6 (extended) | $10 | $1.07 | $161 | $19,300 |
52.6% fewer input tokens → 2.11× more requests per budget. Monthly and yearly figures assume a 10-person team running 5 sessions/day.
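The table's totals can be rebuilt from its own assumptions (5 sessions/day, ~30 days/month, 10 devs, 12 months); the printed figures round each step, so computed values land within a dollar or two.

```python
# Rebuild the savings table from per-session savings and the stated
# team assumptions. Printed table values are rounded.
saved = {"Sonnet 4.6": 0.32, "Opus 4.6": 0.54, "Opus 4.6 (extended)": 1.07}

for model, per_session in saved.items():
    per_dev_month = per_session * 5 * 30   # 5 sessions/day, 30 days
    team_year = per_dev_month * 10 * 12    # 10 devs, 12 months
    print(f"{model}: ${per_dev_month:.0f}/dev/mo, ${team_year:,.0f}/team/yr")
# Sonnet 4.6: $48/dev/mo, $5,760/team/yr
```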