Decide when to optimize, what to cache, and how to scale without over-engineering
Most teams scale too early — or too late. The right decision is based on bottlenecks, not fear.
Start at the top. Don’t jump to sharding unless the bottleneck demands it.
If you can’t point to a specific bottleneck, you’re optimizing a guess.
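One way to point at a specific bottleneck is to time each stage of a request and compare the totals. The sketch below is a minimal illustration, not a profiler: the stage names (`app_logic`, `db_query`, `external_api`) and the sleep durations are hypothetical stand-ins for real work.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named stage (stage names are hypothetical).
timings = defaultdict(float)


@contextmanager
def timed(stage):
    """Add the elapsed time of the wrapped block to timings[stage]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[stage] += time.perf_counter() - start


def handle_request():
    # Stand-ins for real work; the sleep durations are made up for illustration.
    with timed("app_logic"):
        time.sleep(0.002)
    with timed("db_query"):
        time.sleep(0.020)
    with timed("external_api"):
        time.sleep(0.005)


def slowest_stage(samples=5):
    """Run a few requests and return the stage with the largest total time."""
    timings.clear()
    for _ in range(samples):
        handle_request()
    return max(timings, key=timings.get)


bottleneck = slowest_stage()
print(f"slowest stage: {bottleneck}")
for stage, seconds in sorted(timings.items(), key=lambda kv: -kv[1]):
    print(f"  {stage}: {seconds * 1000:.1f} ms total")
```

In a real system the same idea usually comes from existing tracing or APM data rather than hand-rolled timers. What matters is that the decision starts from a per-stage breakdown.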
Useful questions:
Q: Why is “just add more servers” often the wrong first step?
<details> <summary>💡 Reveal Answer</summary>Because scaling out adds cost and complexity. Many bottlenecks are in the database, the code path, or an external dependency. Without measurement, adding servers can increase contention on a shared resource (for example, more app servers opening connections to the same database) and make performance worse.
</details>
Pick a system you can observe and run a mini capacity review:
Scenario: A marketing campaign will spike traffic 10x for 3 days. Your database is the current bottleneck.
What do you do first?
Take read load off the database with caching or read replicas. Scaling app servers won’t fix a DB bottleneck. Cache popular reads, check that the hot queries are covered by indexes, and consider a temporary read replica for the campaign window.
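The "cache popular reads" step can be sketched as a cache-aside lookup with a time-to-live. This is a minimal in-process illustration under stated assumptions: `TTLCache`, `load_product_from_db`, and the 60-second TTL are hypothetical names and values, and the loader is a stand-in for a real database query.

```python
import time


class TTLCache:
    """Tiny in-process cache-aside helper with per-entry expiry."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._store = {}  # key -> (expires_at, value)
        self.hits = 0
        self.misses = 0

    def get_or_load(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Miss or expired: fall through to the database and remember the result.
        self.misses += 1
        value = loader(key)
        self._store[key] = (now + self.ttl, value)
        return value


db_calls = []


def load_product_from_db(product_id):
    # Hypothetical stand-in for a slow database read.
    db_calls.append(product_id)
    return {"id": product_id, "name": f"product-{product_id}"}


cache = TTLCache(ttl_seconds=60)
for _ in range(100):
    cache.get_or_load(42, load_product_from_db)

print(f"hits={cache.hits} misses={cache.misses} db_calls={len(db_calls)}")
```

Here 100 reads of the same popular item reach the database once. The sketch does not invalidate entries on writes, and an in-process cache is duplicated on every app server. For a campaign-sized spike, a shared cache (Redis or Memcached, for instance) is the more common choice.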
| Idea | Remember This |
|---|---|
| Scaling ladder | Start simple, move to complex only when needed |
| Measurement | Decisions without data are guesses |
| Caching | Often the fastest performance win for read-heavy paths |
| Bottlenecks | Fix the slowest link first |
Next: Reliability & Resilience Patterns