The Tokens I Deleted
Why the expert-persona preamble wastes tokens now and what to use instead.
Here's a prompting pattern that most people are still using, even though it wastes tokens, adds nothing to output quality, and might actually be making your results worse.
The "expert persona" preamble. You know the one.
The Pattern
"You are an expert financial analyst with 20 years of experience in equity research and corporate valuation. You have deep expertise in analyzing technology companies and SaaS business models. Your analysis is always thorough, data-driven, and insightful. Please analyze the following company's financials and provide a comprehensive assessment."
73 tokens before you've said anything useful.
Now compare:
"Analyze this company's financials. Include: revenue growth, margin trends, valuation vs sector, key risks."
26 tokens. The output quality is the same or better.
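If you want to check the counts yourself, here's a minimal sketch using OpenAI's tiktoken library. Claude uses a different tokenizer, so treat the exact numbers as approximate; the ratio between the two prompts is what matters:

```python
# pip install tiktoken
import tiktoken

# GPT-4o's tokenizer; Claude tokenizes differently, but the ratio is similar.
enc = tiktoken.encoding_for_model("gpt-4o")

persona = (
    "You are an expert financial analyst with 20 years of experience in "
    "equity research and corporate valuation. You have deep expertise in "
    "analyzing technology companies and SaaS business models. Your analysis "
    "is always thorough, data-driven, and insightful. Please analyze the "
    "following company's financials and provide a comprehensive assessment."
)
direct = (
    "Analyze this company's financials. Include: revenue growth, "
    "margin trends, valuation vs sector, key risks."
)

print(len(enc.encode(persona)))  # roughly 70 tokens
print(len(enc.encode(direct)))   # roughly 25 tokens
```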
Why the Preamble Doesn't Help
The "expert with 20 years of experience" prefix doesn't unlock hidden knowledge in the model. Claude doesn't have a secret expert mode that activates when you say the magic words. The model already knows financial analysis. You don't need to tell it to be knowledgeable.
What actually determines output quality:
• Specificity of the task — "analyze financials" is vague. "Include revenue growth, margin trends, valuation vs sector, key risks" gives the model a clear structure to follow.
• Output format requirements — "provide a comprehensive assessment" is shapeless. "Bullet points for each metric, one-paragraph summary, key risks section" gives structure that produces structured results.
• Constraints — "focus on the last 3 quarters" or "compare to these specific competitors" narrows the scope productively and prevents the model from generating a broad, unfocused response.
None of these require a persona. They require clear instructions.
Why This Worked in 2023 (And Doesn't Now)
The persona pattern genuinely made sense with earlier models.
GPT-3.5 and early GPT-4 benefited from role scaffolding. Without a persona anchor, those models would drift off topic, mix registers, and produce inconsistent output. Telling the model "you are an expert" helped it maintain focus and tone throughout a longer response.
Those days are over.
Current models — Claude Opus, Claude Sonnet, GPT-4o and beyond — respond to explicit instructions far better than implicit personas. Their instruction-following capabilities improved dramatically. They don't need to be told they're experts. They need to be told what you want, in what format, with what constraints.
The prompting world evolved. A lot of the advice circulating on LinkedIn and Twitter hasn't caught up. People are still sharing "ultimate prompt templates" optimized for models from two years ago.
More Examples Across Use Cases
Here's what the shift looks like in practice:
Legal analysis:
• Old: "You are a senior corporate attorney specializing in M&A with experience in cross-border transactions and regulatory compliance..."
• Better: "Review this term sheet. Flag non-standard clauses. Note risks for the buyer. Format: clause, issue, risk level."
Code review:
• Old: "You are a staff software engineer with expertise in distributed systems and 15 years of production experience..."
• Better: "Review this code for: performance issues, error handling gaps, security risks. Prioritize by severity. Include line numbers."
Marketing copy:
• Old: "You are a world-class copywriter who has written for Fortune 500 brands and understands the psychology of persuasion deeply..."
• Better: "Write a landing page headline and 3 supporting bullets. Product: [X]. Audience: [Y]. Tone: confident, not salesy. Max 50 words total."
Data analysis:
• Old: "You are a senior data scientist with a PhD in statistics and extensive experience in business intelligence and predictive modeling..."
• Better: "Analyze this dataset. Identify: top 3 trends, anomalies worth investigating, and one actionable recommendation. Use plain language."
In every case, the specific version tells the model exactly what to produce. The persona version tells the model who to pretend to be and hopes the right output follows.
The Hidden Cost You're Not Counting
Beyond wasted input tokens, there's a subtler problem: throat clearing.
When you prime a model with "you are an expert with deep expertise," the model often mirrors that energy in its response. You get a paragraph of preamble before the actual analysis starts. "As an experienced financial analyst, I can see several important trends in this data that warrant careful examination. Let me walk you through my comprehensive assessment..."
Nobody asked for that. It adds nothing. And it eats tokens on both sides — your input tokens for the persona setup AND extra output tokens for the performative warmup.
Specific instructions produce outputs that start with the answer, not with the introduction. You get to the substance faster. Less waste in, less waste out.
Over hundreds or thousands of API calls, those extra tokens add up to real cost and real latency. For anyone building AI into products, this isn't just a prompting preference — it's an engineering decision.
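As a rough sketch of the arithmetic: the call volume, token counts, and per-token prices below are illustrative assumptions, not any provider's current rates, so plug in your own numbers:

```python
# Back-of-the-envelope savings estimate. All figures are illustrative
# assumptions; substitute your real call volume and per-token pricing.
calls_per_day = 10_000
wasted_input_tokens = 50    # persona preamble you could delete
wasted_output_tokens = 40   # "throat clearing" the preamble invites

price_per_input_token = 3.00 / 1_000_000    # assumed $/token (input)
price_per_output_token = 15.00 / 1_000_000  # assumed $/token (output)

daily_waste = calls_per_day * (
    wasted_input_tokens * price_per_input_token
    + wasted_output_tokens * price_per_output_token
)
print(f"~${daily_waste:.2f}/day, ~${daily_waste * 365:,.0f}/year")
```

With these placeholder figures the waste is a few dollars a day, which compounds into thousands per year at higher volume, and the output-token savings also show up as lower latency on every call.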
When Roles Actually Still Help
There are real use cases where persona prompting adds genuine value:
• Creative writing where voice and tone are the product. If you want output in a specific style or register, a persona helps maintain consistency throughout. "Write this in a warm, conversational tone aimed at first-time parents" is useful because the HOW matters as much as the WHAT.
• Teaching and explanation contexts. "Explain this to someone with no programming background" is a form of audience-aware prompting that genuinely changes the output for the better. The model adjusts vocabulary, uses more analogies, and skips jargon.
• Roleplay and character consistency. If you're building a chatbot with a distinct personality, the persona IS the product. That's fundamentally different from tacking a persona onto a straightforward analytical task.
• Adversarial or contrarian analysis. "Argue against this position" or "find every reason this plan would fail" benefits from a perspective frame that focuses the model's reasoning in a specific direction.
The pattern: roles help when the HOW matters more than the WHAT. For most analytical, technical, and business tasks, the WHAT is all you need.
How to Test This Yourself
A simple experiment anyone can run:
- Find a prompt you use regularly, something you send to Claude or GPT at least weekly.
- Copy it. Strip the role preamble entirely.
- Replace it with explicit output requirements: what format, what sections, what constraints, what to include.
- Run both versions on the same input.
- Compare the outputs side by side (a minimal script for this comparison is sketched below).
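Here's one way to script the comparison using the Anthropic Python SDK. The model name, prompt text, and placeholder data are all assumptions to adapt to your own workflow:

```python
# pip install anthropic  (expects ANTHROPIC_API_KEY in your environment)
import anthropic

client = anthropic.Anthropic()

def run(prompt: str) -> None:
    # Model name is a placeholder; use whichever model you normally call.
    message = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    # The API reports token usage on every response, so you can compare
    # input AND output token counts, not just eyeball the text.
    print(f"in={message.usage.input_tokens} out={message.usage.output_tokens}")
    print(message.content[0].text[:500])  # first 500 chars for a quick skim

data = "<paste the same input into both prompts>"

persona = (
    "You are an expert financial analyst with 20 years of experience... "
    f"Please analyze the following company's financials: {data}"
)
direct = (
    "Analyze this company's financials. Include: revenue growth, margin "
    f"trends, valuation vs sector, key risks.\n\n{data}"
)

run(persona)
run(direct)
```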
If the role version is genuinely better, keep it. There's no virtue in deleting something that works.
In my experience, the specific-instructions version wins for the vast majority of professional tasks. And it does it with fewer tokens, which means faster responses and lower costs at scale.
The tokens you're spending on "You are an expert" could be spent on "Include these specific sections" — and that trade is worth making every time.
What's your most token-heavy prompt preamble? Try cutting it and see what happens.
Related Guides
- Brief AI Like a Pro Course — A complete system for writing better briefs
- AI Automation ROI Calculator — Measure what matters
- AI Workflow Templates 2026 — Efficient prompts built in
Related Stories
- MCP Explained in 10 Minutes — How tools change prompting
- My Claude Code Workflow (November 2024) — How prompting has evolved
Learn More
For a complete briefing system, join Brief AI Like a Pro.
Amir Brooks
Software Engineer & Designer