Monitoring for workloads
Traditional application and infrastructure monitoring is based on the application running successfully and its response time. However, cloud-based monitoring offers many more opportunities you can leverage to give users the best experience. In a production environment, it's important to track the way customers use your system, trace resource utilization, and generally monitor the health and performance of your system. You can use this information as a diagnostic aid to detect and correct issues, and also to help spot potential problems, and prevent them from occurring.
Application monitoring
Application Insights is the Azure Service that allows you to:
- Verify that your application is running correctly.
- Makes application troubleshooting easier.
- Provides custom business telemetry to indicate whether your application is being used as intended.
Make sure you leverage all the rich information that Application Insights can provide about your application. This list is not exhaustive, but the following items provide visibility into what Application Insights can offer:
- Application Insights offers you a default dashboard with an educated guess of the most important metrics you will be interested in. You can then modify it and customize it to your own needs.
- By instrumenting your application correctly, Application Insights will give you performance statistics both from a client and a server perspective
- The Application Map will show you application dependencies in other services such as backend APIs or databases, allowing to determine visually where performance problems lie
- Smart Detection will warn you when anomalies in performance or utilization patterns happen
- Usage Analysis can give you telemetry on which features of your application are most frequently used, or whether all your application functionality is being used. This feature is especially useful after changes to the application functionality, to verify whether those changes were successful
- Release annotations are visual indicators in your Application Insights charts of new builds and other events, so that you can visually correlate changes in application performance to code releases, being able to quickly pinpoint performance problems.
- Cross-component transaction diagnostics allow you to follow failed transactions to find the point in the architecture where the fault was originated.
- Snapshot Debugger, to automatically collect a snapshot of a live application in case of an exception, to analyze it at a later stage.
To use Application Insights you have two options:
- Use codeless monitoring, where onboarding your app to Application Insights doesn't require any code change.
- Use code-based monitoring, where you instrument your code to send telemetry to Application Insights using the Software Development Kit for your programming language of choice.
You can certainly use other Application Performance Management tools to monitor your application on Azure, such as NewRelic or AppDynamics, but Application Insights will give you the most seamless and integrated experience.
Platform monitoring
Application Insights is actually one of the components of Azure Monitor, which gives you rich metrics and logs to verify the state of your complete Azure landscape. No matter whether your application is running on Virtual Machines, App Services, or Kubernetes, Azure Monitor will help you to follow the state of your infrastructure, and to react promptly if there are any issues.
Make sure not only to monitor your compute elements supporting your application code, but your data platform as well: databases, storage accounts, or data lakes should be closely monitored, since a low performance of the data tier of an application could have serious consequences.
Container insights
Should your application run on Azure Kubernetes Service, Azure Monitor allows you to easily monitor the state of your cluster, nodes, and pods. Easy to configure for AKS clusters, Container Insights delivers quick, visual, and actionable information: from the CPU and memory pressure of your nodes to the logs of individual Kubernetes pods.
Additionally, for operators that prefer using the open-source Kubernetes monitoring tool Prometheus but still like the ease of use of Azure Monitor Container Insights, both solutions can integrate with each other.
The Sidecar Pattern adds a separate container with responsibilities that are required by the main container. A common use case is for running logging utilities and monitoring agents.
Network monitoring
Regardless the form factor or programming language your application is based on, the network connecting your code to your users can make or break the experience that your application provides. As a consequence monitoring and troubleshooting the network can be decisive for an operations team. The component of Azure Monitor that manages the network components is called Network Watcher, a collection of network monitoring and troubleshooting tools. Some of these tools are:
- Traffic Analytics will give you an overview of the traffic in your Virtual Networks, as well as the percentage coming from malicious IP addresses, leveraging Microsoft Threat Intelligence databases. This tool will show you as well the systems in your virtual networks that generate most traffic, so that you can visually identify bottlenecks before they degenerate into problems.
- Network Performance Monitor can generate synthetic traffic to measure the performance of network connections over multiple links, giving you a perspective on the evolution of WAN and Internet connections over time, as well as offering valuable monitoring information about Microsoft ExpressRoute circuits.
- VPN diagnostics can help troubleshooting site-to-site VPN connections connecting your applications to users on-premises.
- Connection Monitor allows you to measure the network availability between sets of endpoints.
Other information sources
Not only your application components are producing data, but there are many other signals that you need to track to effectively operate a cloud environment:
- Activity Log: this is a trail audit that lets you see every change that has gone through Azure APIs. It can be critical to understand sudden performance changes or problems, that might have been due to a misconfiguration of the Azure platform.
- Azure Service Health: sometimes outages are provoked not by configuration changes, but by glitches in the Azure platform itself. You can find information about any Azure-related problem impacting your application in the Azure Service Health logs.
- Azure Advisor: find here recommendations about how to optimize your Azure platform to reduce costs, improve your security posture, or increase the availability of your environment.
- Microsoft Defender for Cloud: not a focus of this pillar, but to be included for completeness: Microsoft Defender for Cloud can help you to understand whether your Azure resources are configured according to security best practices
Monitor external dependencies
Just as important as monitoring your applications and Azure infrastructure, monitoring external dependencies is also critical. Ensure that you have solutions to gain insight into the health and availability of external dependencies such as third party services, package repositories, and other related applications.
Azure monitor can have a role in monitoring external dependencies. Azure Monitor can ingest data from several custom data sources, which provides some extensibility into what can be monitored. Utilizing custom data sources for external dependencies provides you with the capabilities of Azure Monitor on virtual any service, endpoint, or software solution.
To learn more about ingesting data from custom sources, see the following articles.
- Collect custom logs with Log Analytics agent in Azure Monitor
- Collecting custom JSON data sources with Log Analytics agent for Linux
- Send log data to Azure Monitor with the HTTP Data Collector API
Monitoring best practices
Event correlation
One critical advantage of Azure Monitor is that it is the monitoring tool for the whole Azure platform. As the previous sections have shown, Azure Monitor holds metrics and logs relevant to your application code, the platform where it is running, the data components, as well as the network connecting the application to its users. This enables operators to compare metrics of different application components to each other, and find out dependencies that might have been hidden otherwise.
Dashboards in Azure offer a great way of exposing the rich information contained in Azure Monitor to other users. Make sure to create shared dashboards in order to expose relevant information to the different groups involved in operating your application, including Developers and Operators. If more complex visualizations are required, Azure Monitor data can be exported to Power BI for advanced data analysis.
Notifications
Whether it is for application, network or platform monitoring, you should not expect operators to constantly look at dashboards. Instead, alerts should be used to send proactive notifications to the relevant individuals that will react on them. Action groups in Azure Monitor can be used to notify multiple recipients, to trigger automated actions, or even to automatically open tickets in IT Service Management Tools such as ServiceNow.
Automation around alerts is critical due to the highly collaborative nature of DevOps and the inherent speed needed for effective incident management. Earlier this year, a report from DevOps.com came out stating that 80% of IT teams are alerted to critical incidents via email. Email is an effective form of communication, but it shouldn't be the most common notification method for a critical issue. Instead, if you can define actions to be executed upon receiving certain alerts (such as scaling up or down) your system will be self-healing.
Other monitoring tasks
Beyond Azure Monitor, you will want to keep an eye on certain events to make sure that your application is running smoothly:
- Review Azure subscription limits for your resources, and make sure you are not coming too close.
- Understand Azure support plans. Refer to Azure support FAQs. Familiarize your team with Azure support.
- Make sure that you monitor expiration dates of digital certificates, or even better, configure automatic digital certificate renewal with Azure Key Vault.
Summary
You can use any monitoring platform to manage your Azure resources. Microsoft's first party offering is Azure Monitor, a comprehensive solution for metrics and logs from the infrastructure to the application code, including the ability to trigger alerts and automated actions as well as data visualization.