Monitoring The Enchanted Cloud
A recurring theme in our discussions about cloud computing is that it is all about app and service delivery. We all think we’re providing a service to our users, but in a traditional data center our focus is on keeping the lights on, and the service we provide is a byproduct of that. With cloud computing (including private cloud), we take on the mindset of a service provider and focus on the services delivered to our users.
This brings us to the topic of today’s blog post by a new guest blogger here on the Microsoft Private Cloud Blog – Microsoft Cloud Guy. In this installment, Microsoft Cloud Guy answers a question about expectations for a cloud-based solution. He highlights the key concept of service delivery and explains why monitoring the right things is critical when you act as a service provider.
I hope you enjoy this article by Microsoft Cloud Guy. If you would like to ask him questions about private cloud, please write to him at the email address at the bottom of the post. We’ll answer the questions he receives here on the Microsoft Private Cloud blog. Thanks! – Tom.
Dear Microsoft Cloud Guy:
When I move my mission critical business application to Microsoft's dedicated cloud services, can I expect that everything will always work, or am I going to be trading one problem for another? Or put another way, why would I want to move my apps to anyone's cloud, and how can I make sure that they will work better than what I have today?
Miss (Dis) Enchanted
Dear Miss Enchanted,
Simple answer: No. Everything will NOT always work or be available. You will be disappointed if you set your expectations that high.
There are three parts to answering Miss Enchanted's question:
- Specialization,
- Cost, and
- Monitoring
The majority of today's post will be about monitoring, but I'll start with the other two first.
Everything eventually moves towards specialization, because specialization is how technology gets better. The Wright Brothers, who built the first glider out of bicycle parts, knew that they would have to get much smarter and better at making airplane parts. So they spent their days testing, researching and improving their parts and eventually their airplane.
Because of this hard work, we now have a huge industry of airplane specialists. The same is true of Microsoft's cloud technologies. Do you really care what the seven layers of the OSI protocol stack are, just so that you can send an email? It makes sense to let someone else worry about all of that.
The Cloud allows Microsoft to scale out and share needed technology as the demand for technology increases. And because it is shared, this means that the cost of the technology can be shared across a very large community.
Another way to say it is, specialization makes the cloud better. Sharing makes it cheaper.
The others are somewhat obvious, but why talk about monitoring? And what does this have to do with me moving my critical business application to the cloud?
When moving a mission-critical application into Microsoft's dedicated cloud services, it is critical that you understand, build, and deploy your solution with monitoring in mind. Just because something is running in "the cloud" doesn't mean that it will run better. Stuff breaks. It is going to happen. It might be because of the way your application was written or designed, the type of services you purchased, a broken underlying network, or even broken physical hardware. Stuff happens. People make mistakes. Things break.
While it is true that the cloud can add redundancy, resiliency, better performance, the ability to scale, etc., it doesn't mean that it won't break. Once everyone actually accepts that things will break, then honesty (eventually) prevails and the conversation quickly moves to, "how can I make sure that I will know when it is busted?"
At this point, I have to admit a very personal and long held bias of mine. When it comes to building monitoring solutions for key customer systems, and I have built many of them, I always build them from an end-user's perspective. What this means is that the monitoring tools I build primarily focus on monitoring how a typical end-user sees the system and NOT how technology geeks or the underlying components see the system.
Think of it this way. Which is better, knowing that the entire solution works, or that all of the individual pieces that make up an entire solution work? With the latter, you never really know that the entire solution actually works. It is only implied. Not guaranteed. And worse still, if you take the approach of monitoring everything, you end up building a bunch of "things" that may never be needed or used.
My preference is simple. Focus first on the end-user experience when monitoring any system. In the end, what else really matters? What customers want are systems that work. What we need to do is build probes, tools, scripts, whatever, to monitor what "a person sees." If you run tests from an "end-user perspective", pretty soon you figure out what components underneath are consistently "breaking" your systems and the customer experience. Debugging aside, only after that should you add additional monitoring systems that measure only the "most likely to break" and not the most likely to succeed (kind of like High School, but not really). In this particular case, top down monitoring and design really do matter.
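To make "monitor what a person sees" a little more concrete, here is a minimal sketch in Python of an end-user probe. It is only an illustration, not a tool used by Microsoft's cloud services: the names `probe`, `http_fetch`, and `ProbeResult`, the expected page text, and the time limit are all assumptions made up for this example. The idea is to load a page the way a user would, then check that the right content came back quickly enough.

```python
import time
import urllib.request
from dataclasses import dataclass
from typing import Callable


@dataclass
class ProbeResult:
    ok: bool        # did the user-visible check pass?
    seconds: float  # how long the user would have waited
    detail: str     # short human-readable reason


def http_fetch(url: str, timeout: float = 10.0) -> str:
    """Fetch a page roughly the way a browser would and return its body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def probe(fetch: Callable[[], str], expected_text: str, max_seconds: float) -> ProbeResult:
    """Run one end-user check: the page must load, contain the text a
    person expects to see, and arrive within max_seconds."""
    start = time.monotonic()
    try:
        body = fetch()
    except Exception as exc:  # any failure is a failure from the user's point of view
        return ProbeResult(False, time.monotonic() - start, f"request failed: {exc}")
    elapsed = time.monotonic() - start
    if expected_text not in body:
        return ProbeResult(False, elapsed, "page loaded but expected text missing")
    if elapsed > max_seconds:
        return ProbeResult(False, elapsed, "page loaded but too slowly")
    return ProbeResult(True, elapsed, "ok")
```

A scheduler could call something like `probe(lambda: http_fetch("https://example.com/login"), "Sign in", 3.0)` every few minutes, where the URL and the "Sign in" text are placeholders for whatever your users actually see first. Passing the fetch step in as a function keeps the check independent of any one transport, so the same probe can be exercised without touching the network.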
It is all about focus. And it took me a little while to figure it out. Many, many years ago. But I quickly learned one very important lesson. And I have never looked back.
If we focus on the customer first, everything else just falls into place.
Microsoft Cloud Guy (email@example.com)