Monitor health of Log Analytics workspace in Azure Monitor
To maintain the performance and availability of your Log Analytics workspace in Azure Monitor, you need to be able to proactively detect any issues that arise. This article describes how to monitor the health of your Log Analytics workspace using data in the Operation table. This table is included in every Log Analytics workspace and contains errors and warnings that occur in your workspace. Review this data regularly and create alerts so that you're notified when an important incident occurs in your workspace.
_LogOperation function
Azure Monitor Logs sends details on any issues to the Operation table in the workspace where the issue occurred. The _LogOperation system function is based on the Operation table and provides a simplified set of information for analysis and alerting.
Columns
The _LogOperation function returns the columns in the following table.
Column | Description |
---|---|
TimeGenerated | Time that the incident occurred in UTC. |
Category | Operation category group. Can be used to filter on types of operations and help create more precise system auditing and alerts. See the section below for a list of categories. |
Operation | Description of the operation type. This can indicate one of the Log Analytics limits, type of operation, or part of a process. |
Level | Severity level of the issue: Info (no specific attention needed), Warning (process was not completed as expected, and attention is needed), or Error (process failed and urgent attention is needed). |
Detail | Detailed description of the operation, including the specific error message if one exists. |
_ResourceId | Resource ID of the Azure resource related to the operation. |
Computer | Computer name if the operation is related to an Azure Monitor agent. |
CorrelationId | Used to group consecutive related operations. |
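As a starting point for regular review, a query along the following lines lists recent warnings and errors using the columns described above. This is a minimal sketch: the one-day window and the choice of projected columns are assumptions for illustration, not recommended values.

```kusto
// List warnings and errors from the last day, newest first.
// The 1d lookback is an illustrative assumption; adjust it to your review cadence.
_LogOperation
| where TimeGenerated > ago(1d)
| where Level in ("Warning", "Error")
| project TimeGenerated, Category, Operation, Level, Detail, Computer, CorrelationId, _ResourceId
| sort by TimeGenerated desc
```

Because CorrelationId groups consecutive related operations, filtering on a single CorrelationId value from the results can help you follow one incident through its related records.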
Categories
The following table describes the categories from the _LogOperation function.
Category | Description |
---|---|
Ingestion | Operations that are part of the data ingestion process. See below for more details. |
Agent | Indicates an issue with agent installation. |
Data collection | Operations related to data collection processes. |
Solution targeting | Operation of type ConfigurationScope was processed. |
Assessment solution | An assessment process was executed. |
Ingestion
Ingestion operations are issues that occurred during data ingestion, including notifications about reaching the Azure Log Analytics workspace limits. Error conditions in this category might suggest data loss, so they are particularly important to monitor. The table below provides details on these operations. See Azure Monitor service limits for the limits that apply to Log Analytics workspaces.
Operation | Level | Detail | Related article |
---|---|---|---|
Custom log | Error | Custom fields column limit reached. | Azure Monitor service limits |
Custom log | Error | Custom logs ingestion failed. | |
Metadata | Error | Configuration error detected. | |
Data collection | Error | Data was dropped because the request was created earlier than the number of set days. | Manage usage and costs with Azure Monitor Logs |
Data collection | Info | Collection machine configuration is detected. | |
Data collection | Info | Data collection started due to new day. | Manage usage and costs with Azure Monitor Logs |
Data collection | Warning | Data collection stopped due to daily limit reached. | Manage usage and costs with Azure Monitor Logs |
Data processing | Error | Invalid JSON format. | Send log data to Azure Monitor with the HTTP Data Collector API (public preview) |
Data processing | Warning | Value has been trimmed to the max allowed size. | Azure Monitor service limits |
Data processing | Warning | Field value trimmed as size limit reached. | Azure Monitor service limits |
Ingestion rate | Info | Ingestion rate limit approaching 70%. | Azure Monitor service limits |
Ingestion rate | Warning | Ingestion rate limit approaching the limit. | Azure Monitor service limits |
Ingestion rate | Error | Rate limit reached. | Azure Monitor service limits |
Storage | Error | Cannot access the storage account as credentials used are invalid. | |
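Since errors in the Ingestion category might indicate data loss, it can be useful to see which ingestion errors occurred and how often. The following query is a sketch of one way to do that; the one-day window and the output column names Occurrences and LastSeen are assumptions chosen for this example.

```kusto
// Summarize ingestion errors from the last day by operation and detail message.
// Occurrences and LastSeen are illustrative column names, not part of _LogOperation.
_LogOperation
| where TimeGenerated > ago(1d)
| where Category == "Ingestion" and Level == "Error"
| summarize Occurrences = count(), LastSeen = max(TimeGenerated) by Operation, Detail
| sort by LastSeen desc
```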
Alert rules
Use log query alerts in Azure Monitor to be proactively notified when an issue is detected in your Log Analytics workspace. You should use a strategy that allows you to respond in a timely manner to issues while minimizing your costs. Your subscription is charged for each alert rule with a cost depending on the frequency that it's evaluated.
A recommended strategy is to start with two alert rules based on the level of the issue. Use a short frequency such as every 5 minutes for Errors and a longer frequency such as 24 hours for Warnings. Since Errors indicate potential data loss, you want to respond to them quickly to minimize any loss. Warnings typically indicate an issue that does not require immediate attention, so you can review them daily.
Use the process in Create, view, and manage log alerts using Azure Monitor to create the log alert rules. The following table describes the details for each rule.
Query | Threshold value | Period (minutes) | Frequency (minutes) |
---|---|---|---|
`_LogOperation \| where Level == "Error"` | 0 | 5 | 5 |
`_LogOperation \| where Level == "Warning"` | 0 | 1440 | 1440 |
These alert rules respond the same way to every operation with a level of Error or Warning. As you become more familiar with the operations that are generating alerts, you may want to respond differently for particular operations. For example, you may want to send notifications to different people for particular operations.
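To see which operations are actually generating alerts before deciding which ones deserve their own rule, a breakdown such as the following may help. It is a sketch only: the 30-day window and daily bin size are assumptions, and the lookback is limited by the retention of your workspace.

```kusto
// Count warnings and errors per day, broken down by category and operation.
// The 30d lookback and 1d bin are illustrative assumptions.
_LogOperation
| where TimeGenerated > ago(30d)
| where Level in ("Warning", "Error")
| summarize Count = count() by Category, Operation, Level, bin(TimeGenerated, 1d)
| sort by TimeGenerated desc, Count desc
```

Operations that recur frequently or that need a different audience are reasonable candidates for a dedicated alert rule.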
To create an alert rule for a specific operation, use a query that includes the Category and Operation columns.
The following example creates a warning alert when the ingestion volume rate has reached 80% of the limit.
- Target: Select your Log Analytics workspace
- Criteria:
  - Signal name: Custom log search
  - Search query:
    _LogOperation | where Category == "Ingestion" | where Operation == "Ingestion rate" | where Level == "Warning"
  - Based on: Number of results
  - Condition: Greater than
  - Threshold: 0
  - Period: 5 (minutes)
  - Frequency: 5 (minutes)
- Alert rule name: Ingestion rate limit approaching
- Severity: Warning (Sev 2)
The following example creates a warning alert when the data collection has reached the daily limit.
- Target: Select your Log Analytics workspace
- Criteria:
  - Signal name: Custom log search
  - Search query:
    _LogOperation | where Category == "Ingestion" | where Operation == "Data collection Status" | where Level == "Warning"
  - Based on: Number of results
  - Condition: Greater than
  - Threshold: 0
  - Period: 5 (minutes)
  - Frequency: 5 (minutes)
- Alert rule name: Daily data limit reached
- Severity: Warning (Sev 2)
Next steps
- Learn more about log alerts.
- Collect query audit data for your workspace.