Choose how services talk based on latency, coupling, and failure modes
Most architecture failures show up between systems, not inside them. The way services communicate defines reliability, coupling, and latency.
Synchronous calls (request/response)
Pros: simple, predictable, easy to debug
Cons: tight coupling; the caller waits on the callee; cascading failures
Best for: user-facing reads, low-latency paths
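To make the "caller waits" cost concrete, here is a minimal sketch of a synchronous call bounded by a timeout with a fallback value. The names `get_profile` and `call_with_timeout` are hypothetical, and the downstream service is simulated with `time.sleep` rather than a real network call.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def get_profile(user_id, delay_s=0.0):
    """Hypothetical downstream service; delay_s simulates a slow dependency."""
    time.sleep(delay_s)
    return {"id": user_id, "name": "Ada"}


# Small shared pool: every in-flight downstream call occupies one worker.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(fn, *args, timeout_s, fallback):
    """Run fn synchronously, but stop waiting after timeout_s seconds."""
    future = _pool.submit(fn, *args)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        # The caller is released, but the abandoned call still holds a
        # worker thread until it finishes on its own.
        return fallback


fast = call_with_timeout(get_profile, 1, 0.0, timeout_s=0.5, fallback=None)
slow = call_with_timeout(
    get_profile, 1, 0.3, timeout_s=0.05, fallback={"id": 1, "name": "unknown"}
)
```

The timeout only limits how long the caller waits. Without it, enough slow calls would occupy all four workers and the caller would stall along with its dependency.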
Asynchronous messaging (queues)
Pros: decouples services, smooths traffic spikes
Cons: harder debugging; eventual consistency
Best for: background jobs, emails, non‑critical processing
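As a rough illustration of queue-based decoupling, the sketch below uses Python's in-process `queue.Queue` as a stand-in for a real message broker. The functions `place_order` and `send_email` are hypothetical; the point is that the request path returns before the email work happens.

```python
import queue
import threading

email_queue = queue.Queue()
sent = []


def send_email(job):
    """Stand-in for a slow or flaky email provider call."""
    sent.append(job["to"])


def worker():
    """Drain jobs in the background; None is the shutdown signal."""
    while True:
        job = email_queue.get()
        try:
            if job is None:
                return
            send_email(job)
        finally:
            email_queue.task_done()


def place_order(order_id, email):
    """Request path: enqueue the email and return without waiting for it."""
    email_queue.put({"order_id": order_id, "to": email})
    return {"order_id": order_id, "status": "accepted"}


threading.Thread(target=worker, daemon=True).start()
result = place_order(42, "ada@example.com")
email_queue.join()  # only so this demo can inspect `sent`; a real caller would not block
```

The order is accepted whether or not the email has gone out yet, which is the eventual consistency listed under the cons.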
Events (publish/subscribe)
Pros: scalable, multiple consumers, loose coupling
Cons: schema versioning, ordering complexity
Best for: audit trails, analytics, multi‑team integration
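The multi-consumer idea can be sketched with a tiny in-memory event bus. `EventBus`, `Event`, and the `order.placed` event type are illustrative names, not a real library; a production system would use a broker and would also have to handle ordering and delivery guarantees, which this sketch ignores.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    type: str
    version: int  # schema version, so consumers can detect payloads they don't understand
    payload: dict


class EventBus:
    """Minimal in-memory pub/sub: one published event fans out to every subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self._subscribers[event_type].append(handler)

    def publish(self, event):
        # The publisher does not know who is listening.
        for handler in self._subscribers.get(event.type, []):
            handler(event)


audit_log = []
revenue = {"total_cents": 0}


def record_audit(event):
    audit_log.append((event.type, event.payload["order_id"]))


def track_revenue(event):
    if event.version != 1:
        return  # unknown schema version: skip rather than misread the payload
    revenue["total_cents"] += event.payload["amount_cents"]


bus = EventBus()
bus.subscribe("order.placed", record_audit)
bus.subscribe("order.placed", track_revenue)
bus.publish(Event("order.placed", 1, {"order_id": 7, "amount_cents": 1999}))
```

Adding a third consumer requires no change to the publisher, which is the loose coupling listed above. The `version` check hints at why schema versioning appears under the cons.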
Ask: “What happens if this downstream system is slow or down?”
Don’t pick by popularity — pick by tradeoff.
Q: Why can synchronous calls cause cascading failures?
<details> <summary>💡 Reveal Answer</summary>If Service A depends on Service B synchronously and B slows down or fails, A's threads or pooled connections stay blocked waiting on B until the pool is saturated. At that point A can no longer serve any requests, including ones that don't need B, and the failure cascades to A's own callers.
</details>
Take a feature you know (e.g., “order confirmation”).
Scenario: Your checkout endpoint calls fraud detection synchronously. The fraud service is flaky. Sales are dropping.
How do you redesign the flow?
Move fraud checks to async with a fallback: approve orders initially, then hold or reverse suspicious ones later. This reduces latency and prevents total checkout failure while still managing risk.
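One possible shape for that redesign, as a sketch under stated assumptions: the `orders` dict stands in for a database, `queue.Queue` for a message broker, and `is_suspicious` is a made-up rule in place of the real fraud service. The status names are also hypothetical.

```python
import queue

orders = {}  # stand-in for an orders table
fraud_queue = queue.Queue()  # stand-in for a message broker


def checkout(order_id, amount_cents):
    """Request path: approve provisionally and defer the fraud check."""
    orders[order_id] = {"amount_cents": amount_cents, "status": "approved_pending_review"}
    fraud_queue.put(order_id)
    return orders[order_id]["status"]


def is_suspicious(order):
    """Hypothetical rule; the real fraud service would be called here."""
    return order["amount_cents"] > 100_000


def review_pending():
    """Background job: confirm clean orders, hold suspicious ones."""
    while True:
        try:
            order_id = fraud_queue.get_nowait()
        except queue.Empty:
            return
        try:
            flagged = is_suspicious(orders[order_id])
        except Exception:
            # Fraud service unavailable: requeue and retry on the next run.
            fraud_queue.put(order_id)
            return
        orders[order_id]["status"] = "held" if flagged else "confirmed"


first = checkout(1, 2_500)
second = checkout(2, 250_000)
review_pending()
```

Checkout no longer fails when the fraud service does. The cost is a window in which a fraudulent order is provisionally approved, so the hold or reversal step has to be reliable.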
| Idea | Remember This |
|---|---|
| Sync vs async | Choose based on failure impact |
| Events | Great for multi‑consumer workflows |
| API style | Pick the protocol that matches constraints |
| Propagation | Design around downstream failures |