I tracked every token my AI coding agent consumed for a week. 70% was waste.

Source: DEV Community
Last week Anthropic announced tighter usage limits for Claude during peak hours. My timeline exploded with developers asking why they're hitting limits after 2-3 prompts. I'm the developer behind vexp, a local context engine for AI coding agents. Before building it, I did something nobody seems to do: I actually measured what's happening under the hood.

The experiment

I tracked token consumption on FastAPI v0.115.0, the real open-source framework, ~800 Python files. Not a toy project.

- 7 tasks (bug fixes, features, refactors, code understanding)
- 3 runs per task
- 42 total executions
- Claude Sonnet 4.6
- Full isolation between runs

What I found

Every single prompt, Claude Code did this:

1. Glob pattern * (found all files)
2. Glob pattern **/*.{py,js,ts,...} (found code files)
3. Read file 1
4. Read file 2
5. Read file 3
6. ...repeat 20+ times
7. Finally start thinking about my actual question

Average per prompt:

- 23 tool calls (Read/Grep/Glob)
- ~180,000 tokens consumed
- ~50,000 tokens actually relevant to the
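The headline number falls straight out of those two averages. A quick back-of-the-envelope check, using my rounded per-prompt figures:

```python
# Back-of-the-envelope check on the waste figure, using the
# rounded per-prompt averages from my measurements above.
consumed = 180_000   # avg tokens consumed per prompt
relevant = 50_000    # avg tokens actually relevant

wasted = consumed - relevant
waste_pct = wasted / consumed * 100
print(f"wasted: {wasted:,} tokens ({waste_pct:.0f}% of spend)")
# → wasted: 130,000 tokens (72% of spend)
```

Roughly 72% of every prompt's token spend never mattered to the answer, which is the "70% was waste" in the title.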