Tamp

Token compression proxy for Claude Code.

33.9% fewer input tokens. Zero code changes.

$ npx @sliday/tamp

Set ANTHROPIC_BASE_URL=http://localhost:7778 and go. That's it.

Other installation methods
# Shell installer
$ curl -fsSL tamp.dev/setup.sh | bash

# Manual
$ git clone https://github.com/sliday/tamp ~/.tamp
$ cd ~/.tamp && npm install
$ node index.js

Tokens pile up. Fast.

[Chart: input tokens per request by turn, growing from 5K at turn 1 through 25K and 80K to 100K+ by turn 30]

200+ API calls per session, each sending the full history

60% is tool_result bloat: pretty JSON, raw files, CLI output

$6–15 per coding session at $3/Mtok input pricing

Every turn re-sends everything. Tamp compresses before it leaves your machine.

3-Stage Compression Pipeline

Claude Code → tamp:7778 → Anthropic API

1. JSON Minify: strip whitespace, lossless

Before

{
  "name": "tamp",
  "version": "0.1.0",
  "type": "module",
  "dependencies": {
    "@toon-format/toon": "^2.1.0"
  }
}

After

{"name":"tamp","version":"0.1.0","type":"module","dependencies":{"@toon-format/toon":"^2.1.0"}}
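
The minify stage can be approximated with a standard JSON round trip. This is a minimal sketch, not Tamp's actual source: the helper name `minifyJson` is hypothetical, and passing unparseable input through unchanged is an assumption based on the page's note that parse errors fall through.

```javascript
// Hypothetical helper: re-serialize JSON without insignificant whitespace.
function minifyJson(text) {
  try {
    const minified = JSON.stringify(JSON.parse(text));
    // Keep the result only if it is actually shorter than the input.
    return minified.length < text.length ? minified : text;
  } catch {
    // Not valid JSON: pass the original text through untouched.
    return text;
  }
}

const pretty = `{
  "name": "tamp",
  "version": "0.1.0"
}`;
console.log(minifyJson(pretty)); // {"name":"tamp","version":"0.1.0"}
```

A parse-and-stringify round trip preserves the data but normalizes number formatting (for example, `1.0` becomes `1`), which is worth keeping in mind if exact byte-for-byte text matters.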
2. TOON Encoding: objects → columnar format

JSON (334 chars)

[{"name":"a.js","size":1024}, {"name":"b.js","size":2048}, {"name":"c.js","size":512}]

TOON (165 chars)

name[3]{a.js|b.js|c.js} size[3]{1024|2048|512}
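
To make the columnar idea concrete, here is a small sketch that reproduces the layout shown in the example above. It is an illustration only: `toColumnar` is a hypothetical function, not the `@toon-format/toon` API, and real TOON output may differ from this simplified form.

```javascript
// Hypothetical encoder for the columnar layout illustrated above.
// NOT the @toon-format/toon API; it only mirrors the page's example.
function toColumnar(rows) {
  const isPlainObject = (v) =>
    v !== null && typeof v === "object" && !Array.isArray(v);
  if (!Array.isArray(rows) || rows.length === 0 || !rows.every(isPlainObject)) {
    return null; // not an array of objects: caller should leave it as JSON
  }
  const keys = Object.keys(rows[0]);
  const isPrimitive = (v) => ["string", "number", "boolean"].includes(typeof v);
  // Only uniform rows qualify: identical keys and primitive values.
  const uniform = rows.every(
    (row) =>
      Object.keys(row).length === keys.length &&
      keys.every((k) => k in row && isPrimitive(row[k]))
  );
  if (!uniform) return null;
  // Emit one column per key: key[count]{v1|v2|...}
  return keys
    .map((k) => `${k}[${rows.length}]{${rows.map((r) => r[k]).join("|")}}`)
    .join(" ");
}

const files = [
  { name: "a.js", size: 1024 },
  { name: "b.js", size: 2048 },
  { name: "c.js", size: 512 },
];
console.log(toColumnar(files));
// name[3]{a.js|b.js|c.js} size[3]{1024|2048|512}
```

The saving comes from writing each key once instead of once per row. This sketch does not escape values that contain `|`, so a production encoder would need quoting rules.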
3. LLMLingua-2: ML token pruning for text content

compress.js source code: 4,630 chars → 2,214 chars (-52.2%)

Each stage runs only when effective. Error results are never compressed.

Measured: 33.9% fewer tokens

A/B tested via OpenRouter with Claude Sonnet 4. Seven scenarios, five runs each, 70 API calls total.

[Chart: token reduction by scenario]
Scenario                                          Control tokens  Compressed tokens  Reduction
Small JSON (package.json)                                    315                246     -21.9%
Large JSON (dependency tree)                               6,773              4,855     -28.3%
Tabular Data (file listing)                                3,574              1,835     -48.7%
Source Code (TypeScript)                                   1,069                644     -39.8%
Multi-turn (5-turn conversation)                           1,026                790     -23.0%
Line-Numbered (Read tool output)                             473                323     -31.7%
Error Result (is_error: true), skipped by design             167                167       0.0%
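
The headline figure can be checked directly from the table. The sketch below assumes the 33.9% is pooled over all seven scenarios (total tokens saved divided by total control tokens), which is the reading that matches the published numbers; the variable names are illustrative.

```javascript
// Token counts copied from the benchmark table.
const results = [
  { scenario: "Small JSON", control: 315, compressed: 246 },
  { scenario: "Large JSON", control: 6773, compressed: 4855 },
  { scenario: "Tabular Data", control: 3574, compressed: 1835 },
  { scenario: "Source Code", control: 1069, compressed: 644 },
  { scenario: "Multi-turn", control: 1026, compressed: 790 },
  { scenario: "Line-Numbered", control: 473, compressed: 323 },
  { scenario: "Error Result", control: 167, compressed: 167 },
];

// Percentage reduction from a control count to a compressed count.
const reduction = (control, compressed) =>
  ((control - compressed) / control) * 100;

const totalControl = results.reduce((sum, r) => sum + r.control, 0);
const totalCompressed = results.reduce((sum, r) => sum + r.compressed, 0);
const overall = reduction(totalControl, totalCompressed).toFixed(1);

console.log(totalControl, totalCompressed); // 13397 8860
console.log(`${overall}% fewer input tokens`); // 33.9% fewer input tokens
```

Because the figure is pooled, large scenarios such as the dependency tree weigh more heavily than small ones, and the uncompressed error result is included in the total.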

How we tested

Identical requests via OpenRouter. Raw vs. compressed. Exact input_tokens from the API. No estimation.

What compresses

Pretty JSON (minify), arrays (TOON columnar), line-numbered output (strip), text (LLMLingua-2 neural).
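
As an illustration of the line-number stripping step, the sketch below removes a numeric prefix from every line. The prefix format (optional spaces, digits, then a tab or `→`) is an assumption about what Read tool output looks like, and `stripLineNumbers` is a hypothetical name, not a function taken from Tamp.

```javascript
// Hypothetical helper: drop "   12→" or "   12<TAB>" prefixes from each line.
// The prefix format is an assumption about line-numbered tool output.
function stripLineNumbers(text) {
  const prefix = /^\s*\d+(?:\t|→)/;
  const lines = text.split("\n");
  // Only strip when every non-empty line carries a prefix;
  // otherwise the text is probably not line-numbered output.
  const allNumbered = lines.every((line) => line === "" || prefix.test(line));
  if (!allNumbered) return text;
  return lines.map((line) => line.replace(prefix, "")).join("\n");
}

const numbered = "     1→const a = 1;\n     2→const b = 2;";
console.log(stripLineNumbers(numbered));
```

The all-lines check is a conservative guard, but it can still misfire on data that happens to start every line with a number and a tab, so a real implementation would likely restrict this to known tool outputs.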

What doesn't

Error results, already-minified JSON, and TOON content pass through untouched. Safety first.


~$35/month saved per developer

At Sonnet 4 pricing ($3/Mtok input), Tamp saves ~$0.23 per 200-request session. Heavy users running 5 sessions/day see ~$35/month. For a 10-person team: ~$4,200/year.

Per Session      $0.23      200 requests
Per Developer    $35/mo     5 sessions/day
10-Person Team   $4,200/yr  free and open source

Based on 60% compressible traffic. Actual savings depend on usage. Opus/Haiku have different rates.
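
The savings arithmetic can be reproduced in a few lines. This sketch starts from the page's $0.23 per-session figure and assumes a 30-day month, which the page does not state; amounts are kept in cents to avoid floating-point drift.

```javascript
// Inputs from the page, plus one labeled assumption.
const savedPerSessionCents = 23; // $0.23 per 200-request session
const sessionsPerDay = 5; // "heavy user" figure from the page
const daysPerMonth = 30; // ASSUMPTION: not stated on the page
const teamSize = 10;

const monthlyDollars =
  (savedPerSessionCents * sessionsPerDay * daysPerMonth) / 100;
const roundedMonthly = Math.round(monthlyDollars);
// The team figure is built from the rounded monthly value.
const teamYearly = roundedMonthly * 12 * teamSize;

console.log(monthlyDollars); // 34.5
console.log(roundedMonthly); // 35
console.log(teamYearly); // 4200
```

Using the unrounded $34.50 instead gives $4,140 per year for the team, so the published $4,200 reflects rounding to $35 first.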

Transparent HTTP Proxy

Intercept

POST /v1/messages only. All other routes pass through untouched.

Compress

In the last user message, tool_result blocks are compressed according to their content type.

Forward

Rewrite Content-Length, forward to upstream API. Response streams back untouched.

Safety

Bodies over 256KB bypass compression. Parse errors fall through gracefully.
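
The four steps above can be sketched as a pure body-rewriting function. This is a rough illustration under stated assumptions, not Tamp's implementation: `shouldIntercept`, `rewriteBody`, and `compressText` are hypothetical names, `compressText` stands in for the full pipeline with JSON minification only, and "last user message" is read as the final message with role `user`.

```javascript
const MAX_BODY_BYTES = 256 * 1024; // bodies over 256KB bypass compression

// Only POST /v1/messages is rewritten; everything else passes through.
function shouldIntercept(method, path) {
  return method === "POST" && path === "/v1/messages";
}

// Hypothetical stand-in for the per-content-type pipeline (minify only).
function compressText(text) {
  try {
    const minified = JSON.stringify(JSON.parse(text));
    return minified.length < text.length ? minified : text;
  } catch {
    return text; // not JSON: leave untouched in this sketch
  }
}

// Returns the body to forward; falls back to the original on any problem.
function rewriteBody(rawBody) {
  if (Buffer.byteLength(rawBody) > MAX_BODY_BYTES) return rawBody;
  let body;
  try {
    body = JSON.parse(rawBody);
  } catch {
    return rawBody; // parse errors fall through
  }
  if (!body || !Array.isArray(body.messages)) return rawBody;
  // Find the last message sent by the user.
  const last = [...body.messages].reverse().find((m) => m && m.role === "user");
  if (!last || !Array.isArray(last.content)) return rawBody;
  for (const block of last.content) {
    // Error results are never compressed.
    if (!block || block.type !== "tool_result" || block.is_error) continue;
    if (typeof block.content === "string") {
      block.content = compressText(block.content);
    } else if (Array.isArray(block.content)) {
      for (const part of block.content) {
        if (part && part.type === "text" && typeof part.text === "string") {
          part.text = compressText(part.text);
        }
      }
    }
  }
  return JSON.stringify(body);
}

// Content-Length must match the rewritten body, measured in bytes.
const forwardedLength = (body) => Buffer.byteLength(body);
```

In a real proxy, `rewriteBody` would run after buffering the request, and the `Content-Length` header would be set from `forwardedLength` before forwarding upstream. Keeping the rewrite a pure function makes the bypass rules easy to test in isolation.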

What's Next

Extended thinking block compression
Response caching for repeated tool calls
Per-session dashboards with live stats
Configurable compression aggressiveness
Cloudflare Workers edge deployment

Try it now

$ npx @sliday/tamp