“There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies,
and the other is to make it so complicated that there are no obvious deficiencies.” — C. A. R. Hoare
How very witty, true and simple. Perhaps we could just follow this witty but actually very crucial piece of IT systems advice. Is IT really that simple? Are we done here, then?
Well … perhaps not. I think it is not that simple for an organization to arrive at a feasible Cloud Computing architecture, one also tailored for the enterprise. Especially if that organization is a large corporation or any kind of large institution. Any kind of IT enterprise which is currently severely exposed to a myriad of mission-critical but at the same time “deep legacy” systems. The kind of IT systems which are here to stay “forever” inside any given large organization today.
The really big issue hidden here is the one we all dread and all know is true:
Legacy systems are staying with us, “for good”.
Together with “legacy strategies”, vested interests and such.
When faced with Cloud marketing, the “defenders of the legacy” (people and organizations alike) will surely mix in an additional requirement and difficulty: one Cloud is not enough! Meaning: no single Cloud Computing vendor can produce an SLA that satisfies. Very often (or always) it is claimed that, due to compliance and security requirements (which are sometimes pure assumptions), a single Cloud provider cannot satisfy them.
All sorts of reasons will mushroom (some less true, some more), where Security and Compliance are unbreakable and unquestionable, and they will always come firmly at the top of the list of reasons for not actually going to the Cloud. Single or multiple Clouds, public or private or any combination of the two. SaaS? PaaS? IaaS? Should we just give up, then?
The right beginning is the core of success
I would like to think I am an IT Architect. My favorite saying is: “Where diplomacy stops, I do the Architecture”. Let me try to evolve these “legacy” assumptions. I shall try to transform them into valid requirements, and then finally into an inter-cloud IT architecture for the enterprise. And yes: safe and compliant at that.
We shall start by using an ancient piece of IT “lore” 🙂 We shall start from a very simple, almost primordial, but very scalable and robust architecture: a special variant of what was in those days called a client/server system, circa 1970 (gasp!):
A bit of a legacy, but still very nice, simple and resilient, is it not? Believe it or not, this is not what most legacy systems are based on. This is in essence one perhaps “ancient” but proven, tried and tested architecture. The key advantage here is the decoupling of the Client and the Server: the existence of that Message Queuing (MQ) infrastructure in the middle, which is in turn built on a paradigm of asynchronous communication. (Anybody remember middleware?) From that history, I do like the term “fire and forget” … which nicely depicts the desired behavior of a message sender and a message consumer.
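The “fire and forget” behavior can be sketched in a few lines. This is a minimal illustration using Python's standard-library `queue` and `threading` modules as a stand-in for a real MQ product; the message shape and names are assumptions for the example, not any product's API.

```python
import queue
import threading

# The queue stands in for the MQ infrastructure in the middle.
mq = queue.Queue()

def sender(order_id: int) -> None:
    # Fire: PUT the message on the queue ...
    mq.put({"type": "order", "id": order_id})
    # ... and forget: the sender does not wait for, or know about,
    # the consumer on the other side.

def consumer(results: list) -> None:
    # The consumer GETs messages whenever it is ready.
    while True:
        msg = mq.get()
        if msg is None:          # shutdown sentinel for this sketch
            break
        results.append(msg["id"])

processed = []
worker = threading.Thread(target=consumer, args=(processed,))
worker.start()
for i in range(3):
    sender(i)                    # all three sends return immediately
mq.put(None)                     # tell the consumer to stop
worker.join()
```

The point is that neither side blocks on the other: if the consumer is slow or briefly down, messages simply wait on the queue.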
FFwd to 2014
Today, and to some of you, this might look almost identical to the usual IT-marketing-produced “Slideware Architecture”. Perhaps, but please observe the key difference. A “Slideware Architecture” is one where client and server are connected with an innocent-looking single line on some PowerPoint diagram, and nothing else. I can assure you the diagram above is a very different architecture to that. Not having decoupled, asynchronous, event-driven communication was, and is, the key problem with every legacy client/server system. That is a true “ticking bomb” in most legacy IT systems, in all large organizations today. Here we are talking ancient IT systems from those hazy days of CORBA, DCOM, sockets, etc., when people imagined OOD, OOA, OOP and Objects were the “final solution”. Implemented usually as early C++ objects that would “just talk” to each other, in a “locationally transparent manner”: a popular term from the “Slideware Architectures” of the 1980s.
Alas, that “final solution” of course never worketh. First-generation client/server systems immediately met the reality of complex (16-bit) networks, even more complex network appliances, and above all the (very) complex personalities of network administrators. Thus, very quickly, every twentieth-century budding OO Architect realized:
Network is not a transparent resource.
It is impossible to keep things together physically based “just” on a LAN, however well managed that LAN is. And then enter: WAN. Short answer: WAN is, and always was, out of the question. The unquestionable and untouchable resource.
Out of pure necessity, and very quickly sinking into this muddle-ware, some “enterprise” client/server solutions equally quickly found a firm foothold in good old messaging concepts and infrastructure. And all of a sudden, all (seemingly) was good again. (IBM) MQ took the pedestal (again). Not OOA or OOD or OOP, but system resilience became (again) more important than anything else.
Slightly before that “heroic period”, a few other “clever people” designed and implemented the concept of a network of asynchronous senders/receivers, which also had the magical ability to re-route around failed nodes. This later became the ARPANET, which (rather quickly) became the foundation of what we today simply call “the Net” (of which the WWW is a part).
The outcome is that event-driven asynchronous communication is today recognised as the only way to keep an arbitrary set of nodes together as one heterogeneous distributed system. And message queuing (MQ) concepts pre-date all of that. This is the key: concept maturity, which does not stop me today from basing all of my resilient architectures on MQ as a concept. For example, whenever I need a non-trivial (but quick) solution, which is (of course) used by end users over the web, I do something (almost) like this:
After all these years and all these projects, I am still amazed how much such a simple concept as MQ gives me. The ultimate resilience, both in space and time. And of course the ultimate scalability, without the complex hardware needed for server clustering. I can (and will) simply add more web servers if more are needed. They will all PUT on the same queue. And on the other side I will simply add more servers if required. They will all GET from the same queue. First come, first served. And the opposite flow is handled the same way: all the back-room servers are PUT-ing on the same queue, and all the web servers are GET-ing from the second queue.
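That scaling story can be sketched with two shared queues, again using stdlib queues as a placeholder for real MQ infrastructure. The counts and the component names (“web server”, “back-room server”) are illustrative assumptions: several back-room workers GET from one shared request queue, first come first served, and PUT their replies on a second queue.

```python
import queue
import threading

request_q = queue.Queue()    # all web servers PUT here
response_q = queue.Queue()   # all back-room servers PUT replies here

def backend_server(name: str) -> None:
    # Each back-room server competes on the same request queue.
    while True:
        msg = request_q.get()
        if msg is None:                       # shutdown sentinel
            break
        # Real work would happen here; then answer on the second queue.
        response_q.put({"req": msg, "served_by": name})

N_BACK, N_REQ = 3, 10

backends = [threading.Thread(target=backend_server, args=(f"back-{i}",))
            for i in range(N_BACK)]
for b in backends:
    b.start()

# The web-server side: PUT requests on the shared queue.
for i in range(N_REQ):
    request_q.put(i)

# Collect the replies; arrival order is not guaranteed.
replies = [response_q.get() for _ in range(N_REQ)]

for _ in backends:                            # one sentinel per backend
    request_q.put(None)
for b in backends:
    b.join()
```

Adding capacity on either side is just starting more threads (servers) against the same two queues; nothing else changes.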
There you have it. In two evolutionary steps we have arrived at the resilient architecture of a modern “screen to cloud” IT solution. Here is how.
Now, instead of hiding the whole data-centre side in one cloud, we will hide it in two clouds. With our beloved MQ infrastructure (somewhere) in the middle. Why two clouds? Because of the legitimate security requirements. Compliance requirements. And yes, because of the “legacy people” trying to stop the Cloudification 🙂
Thus we need the part of the solution which is owned by the organization to stay inside that organization. Physically, in its data centre. Logically, in its private cloud.
And now that we have it, we can apply this architecture to address the requirements of every bank, or similar organization, which wants and needs to keep the core of its “hybrid” solution forever “inside”.
Here we have the public cloud (cyan colored, on the left) to which users (aka customers) connect from their “screens” (aka browsers). This is a cloud which is optimized (by the Cloud vendor) to handle all the internet issues: the huge number of users, scalability, security, HTTPS, and the rest. On the right we have a rather smaller, but specialized and rather important, private cloud (orange colored). And between the two, the MQ infrastructure, which again gives ultimately resilient, controllable and manageable traffic between the two clouds. By the way, this is also the core of a modern on-line banking solution.
One detail one devil, many details many devils
The devil is in the details, and this time I am not going to leave these devilish details as an exercise for the reader 🙂 How do I suggest this be implemented? Up until now we have, more or less, stayed in the realm of the logical, or conceptual. I have described a resilient hybrid-cloud architecture with a messaging infrastructure in the middle. I think it is simple and robust. But in reality this is just hardware and platform, installed and configured. What is missing are the key application components to make this work, and do something useful for customers using their web apps on the client side. From here onwards, I will describe the first key parts of the architecture I would put “on top” of this: the so-called Technical Architecture, a.k.a. TA.
The first one I will use is the concept of a Messaging Service (MQS), whereby we encapsulate the messaging infrastructure inside (yet another) little cloud which basically exposes two service end-points: Put and Get.
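A minimal sketch of that encapsulation, assuming hypothetical names throughout (the `MessagingService` class, its `Put`/`Get` methods and the in-process queue backing it are illustration only, not a real product API): callers see exactly two end-points, and the infrastructure behind them can be swapped without touching them.

```python
import queue

class MessagingService:
    """Facade hiding the messaging infrastructure behind Put and Get.

    The in-memory queues here are a stand-in; a real MQS would delegate
    to IBM MQ, Amazon SQS, or whatever sits in the little cloud.
    """

    def __init__(self) -> None:
        self._queues: dict[str, queue.Queue] = {}

    def _q(self, name: str) -> queue.Queue:
        # Create named queues lazily, on first use.
        return self._queues.setdefault(name, queue.Queue())

    def Put(self, queue_name: str, message: str) -> None:
        self._q(queue_name).put(message)

    def Get(self, queue_name: str, timeout: float = 1.0) -> str:
        # Blocks up to `timeout` seconds waiting for a message.
        return self._q(queue_name).get(timeout=timeout)

mqs = MessagingService()
mqs.Put("to-private-cloud", "payment-request-42")
msg = mqs.Get("to-private-cloud")
```

Because the rest of the solution only ever sees Put and Get, the choice of messaging product becomes an infrastructure decision, invisible at the solution level.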
This diagram mixes the logical and the physical, which I usually do not do, but it still nicely describes the role of the MQS. It decouples and keeps the Messaging Infrastructure at the centre of the inter-server, inter-cloud resilient communication infrastructure, and at the same time allows us to abstract away the existence of the particular infrastructure, and all other similar issues. Hint: think Amazon SQS.
Thanks to these (almost) complete encapsulation and decoupling foundations, we can focus on the actual service functionality at the logical level of the overall solution, the so-called Solution Architecture (SA). I have not mentioned the SA until now because I knew in advance we do not care how the MQ is implemented. In the SA in particular we are primarily interested in how messaging is effectively used in communication between multiple senders and multiple receivers.
But: how does this SA actually work?