Core Idea

“The network is reliable” is the first and most critical of the Fallacies of Distributed Computing—the false assumption that network calls between services will always succeed. In reality, networks fail constantly due to hardware faults, misconfigurations, packet loss, timeouts, and infrastructure issues, making network unreliability the foundational challenge of distributed architecture.

What Is the “Network Is Reliable” Fallacy?

The fallacy assumes a service-to-service call will always complete successfully — natural when working within a single process (where method calls virtually never fail), but catastrophically wrong across network boundaries.

In monolithic applications, failure modes are limited and deterministic: out of memory, null pointer, logic error. In distributed systems, a network call introduces an entirely new failure category:

  • Transient network errors — a failed switch, a severed cable, a misconfigured firewall silently dropping packets
  • Timeout ambiguity — did the request succeed but the response was lost?
  • Partial failures — some requests succeed while others silently drop
  • Cascading failures — one failing service brings down dependent services

The fallacy is especially insidious because networks often appear reliable during development — local networks are fast, test environments have minimal load, and latency is low. Systems work perfectly in testing, then fail unpredictably in production.

Addressing this fallacy requires explicit architectural patterns:

  • Retry logic — handles transient failures
  • Circuit breakers — prevent cascading failures by stopping calls to failing services
  • Timeouts — prevent indefinite blocking
  • Idempotency — ensures retries don’t cause duplicate operations
  • Health checks — detect and remove failed instances from routing
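As a sketch of the first two items, retries with exponential backoff and jitter handle transient failures without creating synchronized retry storms. The function names and parameters below are illustrative, not from any particular library:

```python
import random
import time

def retry_with_backoff(call, max_attempts=4, base_delay=0.1):
    """Retry a flaky zero-argument network call with exponential backoff.

    Illustrative sketch: a real client would also pass a per-request
    timeout so a single attempt cannot block indefinitely.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # transient handling exhausted; surface the failure
            # Exponential backoff with jitter spreads retries out so that
            # many clients do not hammer a recovering service in lockstep.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.5)
            time.sleep(delay)

# Simulate a service that drops the first two requests, then succeeds.
attempts = {"n": 0}
def flaky_call():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("packet loss")
    return "ok"

result = retry_with_backoff(flaky_call)  # succeeds on the third attempt
```

Note that blind retries are only safe when the operation is idempotent — which is why the retry and idempotency patterns in the list above travel together.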

These patterns add complexity and development effort — costs architects must consciously accept when choosing distributed architectures.
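Some of that complexity is visible even in a minimal circuit breaker. The sketch below (class and threshold names are invented for illustration, not a library API) opens the circuit after consecutive failures so later calls fail fast instead of piling onto a struggling service:

```python
import time

class CircuitBreaker:
    """Minimal circuit-breaker sketch: closed -> open -> half-open.

    After `failure_threshold` consecutive failures the circuit opens and
    calls fail fast; after `reset_timeout` seconds one trial call is let
    through to probe whether the downstream service has recovered.
    """
    def __init__(self, failure_threshold=3, reset_timeout=30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn()
        except ConnectionError:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()  # trip the breaker
            raise
        self.failures = 0  # a success closes the circuit again
        return result

# Two consecutive failures trip a breaker with threshold 2.
breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
def failing_call():
    raise ConnectionError("service down")

for _ in range(2):
    try:
        breaker.call(failing_call)
    except ConnectionError:
        pass  # counted by the breaker

# The next call fails fast without touching the network at all.
```

Failing fast is the point: the breaker converts slow, resource-consuming failures into immediate errors, which is what stops one failing service from dragging its callers down with it.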

Why This Matters

This fallacy is foundational because it affects every other aspect of distributed system design. Without retry logic, transient failures become user-visible errors. Without circuit breakers, one failing service cascades failures across the entire system. Without idempotency, retries corrupt data by creating duplicate orders or charges.
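The duplicate-charge problem above is usually solved with idempotency keys: the client attaches one key per logical operation, and the server remembers which keys it has already processed. The `PaymentService` and `charge` names below are hypothetical, a sketch of the technique rather than any real API:

```python
import uuid

class PaymentService:
    """Server-side idempotency sketch: each key is processed at most once.

    A retried request with the same idempotency key gets the stored
    result back instead of triggering a second charge.
    """
    def __init__(self):
        self._processed = {}  # idempotency key -> prior result

    def charge(self, idempotency_key, amount):
        if idempotency_key in self._processed:
            # Replay of a request whose response was lost: return the
            # original receipt, do not charge again.
            return self._processed[idempotency_key]
        receipt = {"charged": amount, "receipt_id": str(uuid.uuid4())}
        self._processed[idempotency_key] = receipt
        return receipt

svc = PaymentService()
key = str(uuid.uuid4())  # client generates one key per logical operation
first = svc.charge(key, 100)
retry = svc.charge(key, 100)  # client retried after a lost response
assert first == retry  # the retry is absorbed; the customer pays once
```

This is exactly the timeout-ambiguity case from earlier: the client cannot know whether a timed-out request succeeded, but with an idempotency key it can retry safely either way.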

Understanding this fallacy forces honest trade-off decisions: the benefits of distributed architectures — independent deployability, scalability, fault isolation — come at the cost of handling network unreliability. Moving from monolith to microservices is not about following trends; it’s a deliberate decision to trade simplicity for scalability while accepting the engineering burden of resilience patterns, distributed tracing, and comprehensive monitoring.

Sources

  • Richards, Mark and Neal Ford (2020). Fundamentals of Software Architecture: An Engineering Approach. O’Reilly Media. ISBN: 978-1-492-04345-4.

  • Deutsch, L. Peter (1994–1997). “The Eight Fallacies of Distributed Computing.” Originally articulated at Sun Microsystems; the eighth fallacy was added by James Gosling in 1997.

    • First fallacy in the original list
    • Identified through observing repeated distributed system failures in production
    • Widely referenced in distributed systems literature

Note

This content was drafted with assistance from AI tools for research, organization, and initial content generation. All final content has been reviewed, fact-checked, and edited by the author to ensure accuracy and alignment with the author’s intentions and perspective.