Summary
I'm re-thinking how I use the words centralized, decentralized, and distributed to describe systems.
I teach a course at BYU every year called "Large Scale Distributed Systems." As I discuss distributed systems with the class, I always run into a terminology problem: how we think of distributed systems vs. decentralized systems. You often see this diagram floating around the net:
This always feels like an attempt to place the ideas of centralized, decentralized, and distributed computing on some kind of continuum.
In his PhD dissertation, Extending the REpresentational State Transfer (REST) Architectural Style for Decentralized Systems (PDF), Rohit Khare makes a distinction about decentralized systems that has always felt right to me. Rohit uses "decentralized" to distinguish systems that are under the control of different entities and thus can't be coordinated by fiat.
Plenty of systems are distributed and yet still under the control of a single entity. Almost any large Web 2.0 service will be hosted from different data centers, for example. What distinguishes the Internet, SMTP, and similar distributed systems is that they are also made to work across organizational boundaries. There's no central point that controls everything.
Consequently, I propose a new way of thinking about this that gives up on the linearity of graphics like the one above and resorts to that most powerful of all analytic tools, the 2x2 matrix:
In this conceptualization, we classify systems along two axes:
- Whether the components are co-located or distributed. This could be either physical or logical depending on the context and level of abstraction.
- Whether the components are under the control of a single entity or multiple entities. A central control point could be logical or abstract so long as it is able to effectively coordinate nodes in the system.
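To make the two axes concrete, here is a minimal sketch in Python. The names `Placement`, `Control`, and `System` are hypothetical, invented for illustration, and the quadrant labels are just my reading of the two axes above. The two examples come from the earlier discussion: a large web service hosted in several data centers but run by one entity, and SMTP, which works across organizational boundaries.

```python
from dataclasses import dataclass
from enum import Enum


class Placement(Enum):
    """First axis: where the components live (physically or logically)."""
    CO_LOCATED = "co-located"
    DISTRIBUTED = "distributed"


class Control(Enum):
    """Second axis: who is able to coordinate the nodes."""
    CENTRALIZED = "centralized"      # a single entity can coordinate by fiat
    DECENTRALIZED = "decentralized"  # multiple entities, no coordination by fiat


@dataclass(frozen=True)
class System:
    name: str
    placement: Placement
    control: Control

    def quadrant(self) -> str:
        """Label the cell of the 2x2 matrix this system falls into."""
        return f"{self.control.value}, {self.placement.value}"


# Hypothetical classifications based on the examples in the text.
web_service = System("large web service", Placement.DISTRIBUTED, Control.CENTRALIZED)
smtp = System("SMTP", Placement.DISTRIBUTED, Control.DECENTRALIZED)

for system in (web_service, smtp):
    print(f"{system.name}: {system.quadrant()}")
```

The point of keeping the two axes as separate types is that neither one implies the other: a system can be distributed without being decentralized.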
We could envision a third axis on the model that classifies systems according to whether they are hierarchical or heterarchical, like so:
If you're having trouble with the distinction, note that DNS is a decentralized, hierarchical system, whereas Facebook's OpenGraph is a centralized, heterarchical system.
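The DNS and OpenGraph examples can be sketched the same way. This is only an illustration: `Control`, `Structure`, and `Classified` are hypothetical names, and I've left out the co-located/distributed axis here because the examples above only speak to control and structure.

```python
from collections import defaultdict
from enum import Enum
from typing import NamedTuple


class Control(Enum):
    CENTRALIZED = "centralized"
    DECENTRALIZED = "decentralized"


class Structure(Enum):
    """Possible third axis: how the components relate to one another."""
    HIERARCHICAL = "hierarchical"
    HETERARCHICAL = "heterarchical"


class Classified(NamedTuple):
    name: str
    control: Control
    structure: Structure


# The two examples from the text.
examples = [
    Classified("DNS", Control.DECENTRALIZED, Structure.HIERARCHICAL),
    Classified("Facebook OpenGraph", Control.CENTRALIZED, Structure.HETERARCHICAL),
]

# Group system names by the (control, structure) cell they occupy.
by_cell = defaultdict(list)
for system in examples:
    by_cell[(system.control, system.structure)].append(system.name)

for (control, structure), names in by_cell.items():
    print(f"{control.value}, {structure.value}: {', '.join(names)}")
```

The two systems land in opposite cells on both axes, which is what makes them a useful pair for seeing that control and structure vary independently.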
I like this model and so, for now, I'm sticking with it and starting to think of and describe systems in this way. I've gotten some mental leverage out of it. I'd love to know what you think.