Related ToolsClaudeClaude CodeClaude For DesktopClaude Mobile

Caveman Prompting: One User Cut Claude's Token Usage by 75%

Claude by Anthropic
Image: Anthropic

75%. That's the token reduction one Claude user reported after a simple prompt trick: tell the model to respond like a caveman.

The technique is straightforward. You instruct Claude to strip out filler words, articles, and polite phrasing, responding in compressed shorthand instead of full sentences. "Me find bug in line 42. Fix: change variable name" instead of "I've identified a bug on line 42. The issue is that the variable name conflicts with a reserved keyword. To fix this, you should rename the variable."

Tokens are the chunks of text that language models process. Every token costs money on API calls and counts against your context window (the amount of text the model can "remember" during a conversation). Fewer tokens means lower costs and more room for actual content before the model starts forgetting earlier parts of your conversation.

A 75% reduction is significant if it holds up. On Anthropic's API pricing, that could mean spending $5 instead of $20 on a heavy coding session. For Claude Pro subscribers hitting usage limits, compressed responses mean more back-and-forth before you get throttled.

The tradeoff is readability. Caveman-style responses work fine when you're an experienced developer who just needs the answer. They're terrible for explanations, tutorials, or any situation where you need the model to walk you through reasoning step by step. You're also trusting that the compression doesn't cause the model to drop important nuance from its responses.

A more practical middle ground: tell Claude to be concise and skip pleasantries without going full caveman. Something like "respond in short, direct sentences with no filler" gets you a meaningful token reduction without sacrificing clarity. Most of Claude's verbosity comes from hedging phrases and unnecessary preambles, not from the actual content.