Track spending, reduce token usage, and set up enterprise-grade observability
Claude Code is powerful, but power costs money. This final lesson teaches you how to track, reduce, and monitor your spending — from individual developer habits to enterprise-wide observability.
| Metric | Value |
|---|---|
| Average daily cost per developer | ~$6 |
| 90th percentile daily cost | <$12 |
| Monthly average (Sonnet 4.5) | ~$100-200/developer |
Costs vary based on codebase size, query complexity, conversation length, model choice, and number of parallel instances.
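As a rough sanity check on the table above, the daily average can be scaled to a month. This is a minimal sketch; the 21 working days per month and the function name `monthly_cost` are assumptions for illustration, not figures from this lesson.

```shell
# Rough monthly estimate from a daily average (whole dollars).
# Assumption: 21 working days per month unless a second argument is given.
monthly_cost() {
  daily_usd=$1
  working_days=${2:-21}
  echo $(( daily_usd * working_days ))
}

monthly_cost 6    # average developer: prints 126
monthly_cost 12   # 90th percentile ceiling: prints 252
```

The average case lands inside the ~$100-200 monthly range quoted above, while a developer who sits at the 90th percentile every day would exceed it.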
/cost
Total cost: $0.55
Total duration (API): 6m 19.7s
Total duration (wall): 6h 33m 10.2s
Total code changes: 0 lines added, 0 lines removed
Configure your status line to show context usage continuously:
/config → status line → enable context window usage
Shows account info, current model, and usage details.
Q: What's the typical daily cost per developer using Claude Code with Sonnet?
About $6 per day on average, with the 90th percentile under $12. Monthly, that works out to roughly $100-200 per developer with Sonnet 4.5. Costs increase with Opus, larger codebases, longer sessions, and parallel instances like agent teams.
Token costs scale with context size — the more context Claude processes, the more tokens you use. Claude Code automatically optimizes via prompt caching and auto-compaction.
Here's how to reduce costs further:
/rename auth-work
/clear
Stale context wastes tokens on every subsequent message. Name your session first for easy resumption.
/compact Focus on code samples and API usage
| Task | Recommended Model | Why |
|---|---|---|
| Architecture decisions | Opus | Deep reasoning needed |
| Daily coding | Sonnet | Good balance |
| Simple lookups/formatting | Haiku | Fast and cheap |
| Complex plan → implement | opusplan | Best of both |
/context # See what's consuming space
/mcp # Disable unused servers
Prefer CLI tools when available:
| Instead of... | Use... |
|---|---|
| GitHub MCP server | gh CLI |
| AWS MCP server | aws CLI |
| Sentry MCP server | sentry-cli |
export ENABLE_TOOL_SEARCH=auto:5
LSP plugins give Claude precise navigation instead of text-based search. One "go to definition" call replaces grep + reading multiple files.
Instead of Claude reading a 10,000-line log, use a hook to filter first:
{
  "hooks": {
    "PreToolUse": [{
      "matcher": "Bash",
      "hooks": [{
        "type": "command",
        "command": "./scripts/filter-output.sh"
      }]
    }]
  }
}
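The lesson does not show the contents of `filter-output.sh`. The filtering idea itself can be sketched as a standalone function, assuming failure lines are the only part of the log Claude needs. The function name `filter_log`, the match patterns, and the 100-line cap are all hypothetical, and wiring the function into the hook's input and output is not covered here.

```shell
# Hypothetical filter: keep only lines that look like failures, capped at a
# fixed number of lines, so a huge log shrinks before it reaches the context.
filter_log() {
  max_lines=${1:-100}
  grep -E 'ERROR|FAIL|error:' | head -n "$max_lines"
}

# Usage: pipe a noisy log through the filter; only the two failure lines remain.
printf '%s\n' 'INFO start' 'ERROR db timeout' 'INFO retry' 'FAIL test_login' | filter_log
```

Capping the line count matters as much as the pattern: a log with thousands of matching errors would otherwise still flood the context.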
A codebase-overview skill gives Claude architecture knowledge instantly instead of spending tokens exploring.
# .claude/agents/formatter.md
---
model: haiku
---
Use Haiku for simple tasks, Sonnet for moderate ones, and Opus only for complex reasoning.
Q: Why are CLI tools often more cost-effective than MCP servers?
MCP servers add tool definitions to every message in the context, consuming tokens even when idle. 10 servers × 5 tools = 50 tool definitions on every turn. CLI tools like gh, aws, and sentry-cli run via Bash with zero persistent overhead — Claude just runs the command when needed.
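To make that overhead concrete, here is a back-of-the-envelope sketch. The 10 servers × 5 tools figure comes from the answer above; the 200 tokens per tool definition and the 40 turns per session are illustrative assumptions, not measured values, and `mcp_overhead_tokens` is a hypothetical helper.

```shell
# Idle MCP overhead: every tool definition is resent on every turn.
mcp_overhead_tokens() {
  servers=$1
  tools_per_server=$2
  tokens_per_tool=$3   # assumed average size of one tool definition
  turns=$4             # assumed number of turns in a session
  echo $(( servers * tools_per_server * tokens_per_tool * turns ))
}

# 10 servers x 5 tools (from the text) x 200 tokens x 40 turns (both assumed)
mcp_overhead_tokens 10 5 200 40   # prints 400000
```

Under these assumptions, a single session processes 400,000 tokens of tool definitions before any of those tools is actually called.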
Agent teams are the most expensive feature:
| Team Size | TPM per User | RPM per User |
|---|---|---|
| 1-5 | 200k-300k | 5-7 |
| 5-20 | 100k-150k | 2.5-3.5 |
| 20-50 | 50k-75k | 1.25-1.75 |
| 50-100 | 25k-35k | 0.62-0.87 |
TPM per user decreases as team size grows because a smaller share of users is active at the same time in larger teams. Set workspace spend limits in the Anthropic Console.
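If limits are provisioned for the team as a whole (an assumption about how your account is set up), the per-user figures in the table can be multiplied out. This sketch uses a hypothetical 10-person team in the 5-20 row, and `team_tpm` is an illustrative helper.

```shell
# Aggregate TPM to provision = users x per-user TPM (taken from the table).
team_tpm() {
  users=$1
  tpm_per_user=$2
  echo $(( users * tpm_per_user ))
}

team_tpm 10 100000   # low end of the 5-20 row: prints 1000000
team_tpm 10 150000   # high end of the 5-20 row: prints 1500000
```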
Claude Code supports OpenTelemetry (OTel) for enterprise-grade observability.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=otlp
export OTEL_LOGS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://your-collector:4317
Metrics: Token usage, API latency, tool usage frequency, session duration, error rates.
Events: Session start/end, tool calls, permission decisions, compaction events.
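Before pointing at a real collector, it can help to confirm that telemetry is emitted at all. This is a minimal sketch, assuming the `console` exporter value and the standard OpenTelemetry `OTEL_METRIC_EXPORT_INTERVAL` variable (in milliseconds) are honored by your version; neither appears in this lesson, so treat both as assumptions to verify.

```shell
# Local debugging sketch: print metrics to the terminal instead of a collector.
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console       # assumed exporter value
export OTEL_METRIC_EXPORT_INTERVAL=1000    # assumed: export every 1000 ms
```

Switch the exporter back to `otlp` and restore the endpoint settings once the output looks right.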
| Variable | Default | Effect |
|---|---|---|
| OTEL_LOG_USER_PROMPTS | Disabled | Include prompt content in logs |
| OTEL_LOG_TOOL_DETAILS | Disabled | Include MCP server/tool names |
export OTEL_RESOURCE_ATTRIBUTES="department=engineering,team.id=platform,cost_center=eng-123"
Filter dashboards by team, department, or cost center.
{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "http://your-collector:4317"
  }
}
Q: What are the two privacy-sensitive OTel variables you should consider before enabling?
OTEL_LOG_USER_PROMPTS: when enabled, your actual prompts are included in telemetry logs. Disabled by default.
OTEL_LOG_TOOL_DETAILS: when enabled, MCP server and tool names are logged. Also disabled by default.
Leave these disabled unless you have appropriate data handling policies in place.
# Hard spending limit
claude -p "Refactor auth" --max-budget-usd 5.00
# Turn limit
claude -p "Fix lint errors" --max-turns 10
| Strategy | Impact | Effort |
|---|---|---|
| Use Sonnet instead of Opus for routine tasks | High | Low |
| Clear context between unrelated tasks | High | Low |
| Use CLI tools instead of MCP servers | Medium | Low |
| Install code intelligence plugins | Medium | Low |
| Enable Tool Search for MCP | Medium | Low |
| Write focused CLAUDE.md (not verbose) | Medium | Low |
| Use Haiku for simple subagents | Medium | Low |
| Custom compaction instructions | Medium | Medium |
| Filter output via hooks | High | Medium |
| Use skills for domain knowledge | Medium | Medium |
| Set up OTel monitoring | High | High |
- Run /cost at the end of your next session and note the total
- Run /context to see what's consuming your context window
- gh issue create
- /cost after each: notice the difference?
- /model haiku → ask a basic question → /cost
- /model sonnet → ask the same question → /cost

Reflection: Where are your biggest token costs? Which optimization would have the most impact?
Scenario: Your 20-person engineering team just adopted Claude Code. After the first month, the bill is $8,000 — double what was budgeted. The CEO wants it under $4,000 next month without reducing usage.
Layered cost reduction strategy:
Switch default model to Sonnet (team settings):
{ "model": "sonnet" }
Biggest impact — Sonnet is significantly cheaper than Opus.
Enable Tool Search for everyone:
{ "env": { "ENABLE_TOOL_SEARCH": "auto:5" } }
Replace MCP servers with CLIs — audit which MCP servers are used vs. idle.
Install LSP plugins for your primary languages — reduces exploratory token usage.
Add compaction instructions to CLAUDE.md:
# Compact instructions
Focus on code changes and test results. Drop verbose exploration logs.
Set up OTel monitoring to identify heavy users and expensive patterns.
Add cost awareness to CLAUDE.md:
# Cost
- Use /clear between unrelated tasks
- Prefer Haiku for simple questions
Expected savings: 40-60% reduction from model switch alone. Combined with other optimizations, hitting $4,000 is very achievable.
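The scenario's numbers can be checked with simple arithmetic. This sketch applies the quoted 40-60% range to the $8,000 bill; `bill_after_reduction` is a hypothetical helper and works in whole dollars.

```shell
# Apply a percentage reduction to a monthly bill (whole dollars).
bill_after_reduction() {
  bill_usd=$1
  reduction_pct=$2
  echo $(( bill_usd * (100 - reduction_pct) / 100 ))
}

bill_after_reduction 8000 40   # prints 4800
bill_after_reduction 8000 60   # prints 3200
bill_after_reduction 8000 50   # prints 4000, the target
```

The model switch alone lands between $3,200 and $4,800, so reaching $4,000 needs either the middle of that range or a few extra points from the other steps. Per developer, the target is $4,000 / 20 = $200 per month, the top of the Sonnet range quoted earlier.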
| Concept | One-Liner |
|---|---|
| Average cost | ~$6/dev/day, ~$100-200/dev/month with Sonnet |
| Track usage | /cost for session; /context for what's consuming space |
| Biggest savings | Right model choice + clearing context between tasks |
| MCP overhead | CLI tools have zero persistent cost; MCP adds to every message |
| Tool Search | ENABLE_TOOL_SEARCH=auto:5 defers MCP tool definitions once they exceed 5% of the context window |
| Headless limits | --max-budget-usd and --max-turns |
| OTel monitoring | CLAUDE_CODE_ENABLE_TELEMETRY=1 + OTLP exporter |
| Privacy | OTEL_LOG_USER_PROMPTS and OTEL_LOG_TOOL_DETAILS disabled by default |
| Team scaling | TPM per user decreases as team size increases |
Congratulations — you've covered every major Claude Code feature across 14 lessons:
/cost and consider OTel for your team
Happy coding with Claude! 🚀