Core Idea
“The network is reliable” is the first and most critical of the Fallacies of Distributed Computing—the false assumption that network calls between services will always succeed. In reality, networks fail constantly due to hardware faults, misconfigurations, packet loss, timeouts, and infrastructure issues, making network unreliability the foundational challenge of distributed architecture.
What Is the “Network Is Reliable” Fallacy?
The “network is reliable” fallacy is the assumption that when a service makes a network call to another service, that call will always complete successfully:
- This assumption is natural when working within a single process (where method calls virtually never fail)
- But it becomes catastrophically wrong when applied to distributed systems where network boundaries exist
In monolithic applications:
- When one module calls another module’s function, the call either succeeds (returns a result) or fails immediately with a clear exception
- The failure modes are limited: out of memory, null pointer, logic error—all deterministic and usually reproducible
In distributed systems, a network call introduces an entirely new category of failures:
- Transient network errors
- Service unavailability
- Timeout ambiguity: did the request succeed but the response was lost? (see the sketch after this list)
- Partial failures
- Cascading failures across multiple services
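The timeout case is the trickiest of these: the caller cannot distinguish “the request never arrived” from “it was processed but the response was lost.” The following TypeScript sketch (assuming Node 18+’s built-in fetch and a hypothetical internal payments endpoint, neither of which comes from the source) makes that ambiguity explicit by reporting a timed-out call as “unknown” rather than “failed”:

```typescript
// Illustrative sketch only: the URL, timeout budget, and endpoint behaviour are assumptions.
type CallOutcome = "succeeded" | "failed" | "unknown";

async function chargePayment(orderId: string): Promise<CallOutcome> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 2000); // assumed 2 s budget

  try {
    const res = await fetch("https://payments.internal/charge", {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ orderId }),
      signal: controller.signal,
    });
    return res.ok ? "succeeded" : "failed";
  } catch (err) {
    // A refused connection is a clear failure, but a timeout is ambiguous:
    // the payment service may have charged the card and the response was lost.
    if ((err as { name?: string }).name === "AbortError") {
      return "unknown";
    }
    return "failed";
  } finally {
    clearTimeout(timer);
  }
}
```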
Network failures happen constantly in production systems:
- A hardware switch can fail
- A fiber optic cable can be accidentally cut during construction
- A misconfigured firewall rule can silently drop packets
- A congested network can delay packets past timeout thresholds
- A DNS lookup can fail
- A load balancer can route traffic to a dead instance
- Cloud provider infrastructure can experience outages
- Each of these scenarios breaks the “network is reliable” assumption
The fallacy is particularly insidious:
- Networks often appear reliable during development and testing—local networks are fast and stable, test environments have minimal load, and latency is low
- It’s only in production, under real-world conditions with scale, geographic distribution, and operational complexity, that network unreliability becomes apparent
- This leads to systems that work perfectly in testing but fail unpredictably in production
Addressing this fallacy requires explicit architectural patterns (a combined sketch follows this list):
- Retry logic handles transient failures
- Circuit breakers prevent cascading failures by stopping calls to failing services
- Timeouts prevent indefinite blocking
- Idempotency ensures retries don’t cause duplicate operations
- Health checks detect failed instances
- These patterns add complexity, operational overhead, and development effort—costs that architects must accept when choosing distributed architectures
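As a rough illustration of how several of these patterns combine, here is a hedged TypeScript sketch of a retry loop with a per-attempt timeout, exponential backoff with jitter, and a reused idempotency key. The “Idempotency-Key” header, the timeout, and the backoff numbers are illustrative assumptions, not prescriptions from the book:

```typescript
import { randomUUID } from "node:crypto";

// A single idempotency key is generated per logical operation and reused on every retry,
// so a downstream service that honours an "Idempotency-Key" header (an assumption here)
// can deduplicate requests it has already processed.
async function postWithRetry(url: string, body: unknown, maxAttempts = 3): Promise<Response> {
  const idempotencyKey = randomUUID();

  for (let attempt = 1; ; attempt++) {
    try {
      const res = await fetch(url, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          "Idempotency-Key": idempotencyKey,
        },
        body: JSON.stringify(body),
        signal: AbortSignal.timeout(2_000), // per-attempt timeout: never block indefinitely
      });
      if (res.ok) return res;
      if (res.status < 500) return res; // 4xx: a retry will not help
      throw new Error(`upstream returned ${res.status}`);
    } catch (err) {
      if (attempt >= maxAttempts) throw err; // give up after the final attempt
      // Exponential backoff with jitter so retries do not hammer a struggling service.
      const delayMs = 200 * 2 ** (attempt - 1) + Math.random() * 100;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

Reusing the same key across retries is what keeps the ambiguous-timeout case from turning into a duplicate charge, assuming the receiving service actually deduplicates on it.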
Why This Matters
This fallacy is foundational because it affects every other aspect of distributed system design:
- If you assume the network is reliable, you won’t build in retry logic, and your system will fail when transient network issues occur
- You won’t implement circuit breakers, so one failing service will cascade failures across your entire system (a minimal breaker sketch follows this list)
- You won’t design for idempotency, so retries will corrupt data by creating duplicate orders or duplicate charges
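To make the circuit-breaker idea concrete, here is a deliberately minimal sketch of one as a small state machine. The failure threshold and reset interval are illustrative assumptions, and real implementations (or libraries) also limit concurrent trial calls in the half-open state:

```typescript
// Minimal circuit-breaker sketch; thresholds are illustrative, not recommendations.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;

  constructor(
    private readonly failureThreshold = 5,
    private readonly resetAfterMs = 30_000,
  ) {}

  async call<T>(fn: () => Promise<T>): Promise<T> {
    if (this.openedAt !== null) {
      const elapsed = Date.now() - this.openedAt;
      if (elapsed < this.resetAfterMs) {
        // Open: fail fast instead of piling more load onto a broken dependency.
        throw new Error("circuit open: failing fast");
      }
      // Half-open: allow a trial call through to probe for recovery.
    }
    try {
      const result = await fn();
      this.failures = 0;      // success closes the breaker again
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.failureThreshold) {
        this.openedAt = Date.now(); // too many consecutive failures: open the breaker
      }
      throw err;
    }
  }
}
```

A caller wraps each outbound call, for example `breaker.call(() => fetch(url))`, so a dependency that keeps failing stops consuming threads and timeout budget upstream.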
Understanding this fallacy forces architects to make honest trade-off decisions:
- The benefits of distributed architectures (independent deployability, scalability, fault isolation) come at the cost of handling network unreliability
- If you cannot justify the effort and complexity of building resilient distributed systems, you should reconsider whether distribution is appropriate for your use case
This is why the decision to move from monolith to microservices is not a matter of following trends:
- It’s a deliberate decision to trade monolithic simplicity for distributed scalability, while accepting the engineering burden of handling network unreliability through:
- Retry policies
- Circuit breakers
- Timeouts
- Distributed tracing (see the propagation sketch after this list)
- Comprehensive monitoring
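Of these, distributed tracing is often the least familiar, but at its core it is just consistent propagation of a correlation identifier. The sketch below passes a W3C Trace Context “traceparent” header on an outbound call so a failure can be matched to logs in every service the request touched; the inventory URL, timeout, and logging style are illustrative assumptions:

```typescript
import { randomBytes } from "node:crypto";

// Build a W3C Trace Context traceparent value: version-traceId-spanId-flags.
function newTraceparent(): string {
  const traceId = randomBytes(16).toString("hex"); // 32 hex chars
  const spanId = randomBytes(8).toString("hex");   // 16 hex chars
  return `00-${traceId}-${spanId}-01`;
}

async function callInventory(sku: string, traceparent = newTraceparent()): Promise<Response> {
  const res = await fetch(`https://inventory.internal/items/${sku}`, {
    headers: { traceparent },          // downstream services log and forward this id
    signal: AbortSignal.timeout(2000), // never wait indefinitely
  });
  if (!res.ok) {
    // The trace id ties this failure to logs in every service the request touched.
    console.error(`inventory call failed (${res.status}) traceparent=${traceparent}`);
  }
  return res;
}
```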
Related Concepts
- Fallacies-of-Distributed-Computing — The complete set of eight fallacies this belongs to
- Monolithic-vs-Distributed-Architectures — The architectural decision this fallacy most directly impacts
- Fallacy-Latency-Is-Zero — Related fallacy about network performance assumptions
- Architecture-Characteristics-Categories — Reliability and availability characteristics affected by this fallacy
- Trade-Offs-and-Least-Worst-Architecture — This fallacy exemplifies why distributed architectures involve trade-offs
- Client-Server-Architecture — The simplest distributed pattern where this fallacy first appears
Sources
- Richards, Mark and Neal Ford (2020). Fundamentals of Software Architecture: An Engineering Approach. O’Reilly Media. ISBN: 978-1-492-04345-4.
- Chapter 9: Foundations
- Discusses the Fallacies of Distributed Computing and their architectural implications
- Available: https://www.oreilly.com/library/view/fundamentals-of-software/9781492043447/
- Deutsch, Peter (1994-1997). “The Eight Fallacies of Distributed Computing.” Originally articulated at Sun Microsystems.
- First fallacy in the original list
- Identified through observing repeated distributed system failures in production
- Widely referenced in distributed systems literature
Note
This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.