Core Idea
“Latency is zero” is the second of the Fallacies of Distributed Computing—the false assumption that network calls between services happen instantaneously. In reality, every network call introduces measurable latency (typically 1-100ms or more), which accumulates across service boundaries and fundamentally changes how distributed systems must be designed compared to in-process method calls.
What Is the “Latency Is Zero” Fallacy?
The “latency is zero” fallacy is the assumption that when one service calls another over the network, the response is instantaneous:
- The call is treated just like a local method invocation within the same process
- The assumption is deeply intuitive because in-process method calls complete in nanoseconds, making the delay imperceptible
- Network calls, however, operate on a completely different timescale: milliseconds to seconds, not nanoseconds
The latency comparison:
- Local method call within a single process: 1-10 nanoseconds
- Network call to a service in the same data center: 1-10 milliseconds—roughly one million times slower
- Cross-region call (e.g., US East Coast to Europe): 50-150 milliseconds
- Congested or overloaded network/service: seconds or timeout entirely
- This difference in magnitude fundamentally changes how systems must be designed (see the timing sketch that follows this list)
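As a rough check, the following minimal Python sketch times both kinds of call directly. It is an illustration, not material from the sources: the remote endpoint (https://example.com/) is just a reachable stand-in, and interpreted-language overhead makes the local call slower than the raw nanosecond figures above, yet the remote call still comes out several orders of magnitude more expensive.

```python
import time
import urllib.request

def local_lookup(user_id: int) -> dict:
    # In-process work: constructing a small dictionary, far below a microsecond.
    return {"id": user_id, "name": "Ada"}

# Average one million local calls to get a stable per-call figure.
start = time.perf_counter()
for _ in range(1_000_000):
    local_lookup(42)
local_ns = (time.perf_counter() - start) / 1_000_000 * 1e9

# One HTTP round trip to a placeholder endpoint (any reachable URL will do).
start = time.perf_counter()
urllib.request.urlopen("https://example.com/", timeout=5).read()
remote_ms = (time.perf_counter() - start) * 1e3

print(f"local call:  ~{local_ns:,.0f} ns per call")
print(f"remote call: ~{remote_ms:,.1f} ms  (~{remote_ms * 1e6 / local_ns:,.0f}x slower)")
```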
The fallacy becomes catastrophic when architects design distributed systems by simply replacing in-process method calls with remote API calls without accounting for latency. Consider a monolithic application where retrieving user data, fetching their order history, and calculating recommendations involves three in-process method calls totaling perhaps 100 microseconds. If you split this into three microservices with three network calls, the same operation now takes 30-300 milliseconds, a 300x to 3,000x slowdown. Every additional service hop adds its own round trip, so latency keeps accumulating along the call chain.
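A back-of-the-envelope simulation makes the effect visible. This sketch is purely illustrative: the service functions and the 10 ms per-hop figure are assumptions, and `time.sleep` stands in for a real HTTP/RPC round trip.

```python
import time

PER_HOP_LATENCY_S = 0.010  # assumed 10 ms round trip per call (same data center)

def get_user(user_id):
    return {"id": user_id}

def get_order_history(user):
    return ["order-1", "order-2"]

def get_recommendations(orders):
    return ["widget-a", "widget-b"]

def remote(call, *args):
    # Stand-in for an HTTP/RPC client: same logic plus one simulated round trip.
    time.sleep(PER_HOP_LATENCY_S)
    return call(*args)

# Monolith: three in-process calls, microseconds in total.
t0 = time.perf_counter()
recs = get_recommendations(get_order_history(get_user(42)))
print(f"in-process chain:  {(time.perf_counter() - t0) * 1e3:.3f} ms")

# Same workflow split across three services: every hop pays the network tax.
t0 = time.perf_counter()
recs = remote(get_recommendations, remote(get_order_history, remote(get_user, 42)))
print(f"three remote hops: {(time.perf_counter() - t0) * 1e3:.3f} ms")
```

At the assumed 10 ms per hop the remote chain lands around 30 ms; at cross-region latencies the same three hops would approach a third of a second.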
This latency accumulation leads to serious performance degradation. A single web page that makes 20 sequential downstream service calls, each taking 50ms, spends a full 1000ms on network latency alone before any actual business logic executes. Users perceive the system as slow. Timeouts occur. Database connection pools are exhausted because threads sit blocked waiting for slow network responses. The system becomes unresponsive under load despite having adequate CPU and memory.
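When those downstream calls are independent of one another, part of the cost can be reclaimed by issuing them concurrently rather than one after another. The asyncio sketch below is again a simulation: `asyncio.sleep` stands in for awaiting real responses, and 50 ms is an assumed per-call latency.

```python
import asyncio
import time

CALL_LATENCY_S = 0.050  # assumed 50 ms per downstream service call

async def downstream_call(i):
    await asyncio.sleep(CALL_LATENCY_S)  # stands in for awaiting a real response
    return f"fragment-{i}"

async def render_sequentially():
    # One call after another: ~20 x 50 ms = ~1000 ms of pure network waiting.
    return [await downstream_call(i) for i in range(20)]

async def render_concurrently():
    # Independent calls issued together: total wait is roughly the slowest call.
    return await asyncio.gather(*(downstream_call(i) for i in range(20)))

for render in (render_sequentially, render_concurrently):
    t0 = time.perf_counter()
    asyncio.run(render())
    print(f"{render.__name__}: {(time.perf_counter() - t0) * 1e3:.0f} ms")
```

Concurrency only helps when the calls do not depend on one another's results; chained calls still pay the full serial cost.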
Addressing this fallacy requires architectural strategies to minimize latency impact. Caching reduces repeated calls. Asynchronous communication decouples services so they don’t wait for responses. Bulk APIs reduce round trips by batching multiple requests. Service mesh patterns optimize routing. Edge computing moves services closer to users. Data locality ensures related services are co-located in the same data center. These patterns add architectural complexity but are essential for acceptable performance in distributed systems.
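As a rough illustration of two of those strategies, the sketch below combines a small in-process TTL cache with a hypothetical bulk endpoint. The endpoint shape, the 10 ms round-trip cost, and the 30-second TTL are assumptions chosen for the example, not prescriptions.

```python
import time

ROUND_TRIP_S = 0.010  # assumed cost of one network round trip

def fetch_user(user_id):
    time.sleep(ROUND_TRIP_S)                  # one round trip per user
    return {"id": user_id}

def fetch_users_bulk(user_ids):
    time.sleep(ROUND_TRIP_S)                  # one round trip for the whole batch,
    return [{"id": uid} for uid in user_ids]  # e.g. GET /users?ids=1,2,3 (hypothetical)

class TTLCache:
    """Tiny in-process cache: repeated reads within `ttl` skip the network entirely."""
    def __init__(self, ttl=30.0):
        self.ttl = ttl
        self._store = {}

    def get_or_fetch(self, key, fetch):
        hit = self._store.get(key)
        if hit and time.monotonic() - hit[0] < self.ttl:
            return hit[1]                      # cache hit: zero round trips
        value = fetch(key)                     # cache miss: pay the latency once
        self._store[key] = (time.monotonic(), value)
        return value

cache = TTLCache()
cache.get_or_fetch(42, fetch_user)   # first lookup pays ~10 ms
cache.get_or_fetch(42, fetch_user)   # later lookups are in-process dictionary hits
fetch_users_bulk(range(20))          # one round trip instead of twenty
```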
Why This Matters
This fallacy directly impacts user experience and system scalability. Slow response times drive user abandonment; studies of web applications have repeatedly found that added delays beyond roughly 200-300ms measurably reduce conversion rates. Latency also limits throughput: if a request that used to occupy a worker thread for 10ms now occupies it for 110ms because the thread blocks on network calls, the same pool of threads serves roughly a tenth as many requests per second.
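The throughput effect follows directly from Little's law applied to a fixed pool of blocking worker threads; the worker count and per-request times below are assumptions chosen to make the arithmetic concrete.

```python
# throughput ~= workers / time each worker is occupied per request (Little's law)
WORKERS = 100

fast_request_s = 0.010          # 10 ms of in-process work
slow_request_s = 0.010 + 0.100  # same work plus 100 ms blocked on network calls

print(f"{WORKERS / fast_request_s:,.0f} requests/second")  # ~10,000
print(f"{WORKERS / slow_request_s:,.0f} requests/second")  # ~900, roughly a tenth
```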
Understanding this fallacy forces architects to make deliberate trade-offs when choosing distributed architectures. The benefits of distribution—independent deployability, scalability, fault isolation—come at the cost of latency. If your application requires sub-millisecond response times or makes hundreds of fine-grained calls, distribution may be inappropriate. If latency is acceptable and you can design around it with caching and async patterns, distribution becomes viable.
This fallacy also explains why microservices are not universally appropriate. Systems with tight coupling and frequent inter-service communication are hit hardest by latency accumulation, because every chatty interaction adds more round trips. Monolithic architectures, or coarser-grained services that minimize network hops, may deliver better performance despite giving up some benefits of distribution. The decision depends on whether you can tolerate the latency cost for the benefits you gain.
Related Concepts
- Fallacies-of-Distributed-Computing — The complete set of eight fallacies this belongs to
- Fallacy-The-Network-Is-Reliable — Related fallacy about network failure assumptions
- Fallacy-Bandwidth-Is-Infinite — Related fallacy about data transfer capacity
- Monolithic-vs-Distributed-Architectures — The architectural decision this fallacy impacts
- Operational-Characteristics — Performance and latency are key operational characteristics
- Trade-Offs-and-Least-Worst-Architecture — Latency costs exemplify architectural trade-offs
- Microservices-Architecture-Style — Style most affected by latency accumulation
Sources
- Richards, Mark and Neal Ford (2020). Fundamentals of Software Architecture: An Engineering Approach. O’Reilly Media. ISBN: 978-1-492-04345-4.
  - Chapter 9: Foundations
  - Discusses the Fallacies of Distributed Computing and their architectural implications
  - Available: https://www.oreilly.com/library/view/fundamentals-of-software/9781492043447/
- Deutsch, Peter (1994-1997). “The Eight Fallacies of Distributed Computing.” Originally articulated at Sun Microsystems.
  - Second fallacy in the original list
  - Identified through observing repeated performance issues in distributed systems
  - Widely referenced in distributed systems literature
Note
This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.