Troubleshoot the sensor and on-premises management console
This article describes basic troubleshooting tools for the sensor and the on-premises management console. In addition to the items described here, you can check the health of your system in the following ways:
- Alerts: An alert is created when the sensor interface that monitors the traffic is down.
- SNMP: Sensor health is monitored through SNMP. Microsoft Defender for IoT responds to SNMP queries sent from an authorized monitoring server.
- System notifications: When a management console controls the sensor, you can forward alerts about failed sensor backups and disconnected sensors.
Check system health
Check your system health from the sensor or on-premises management console.
To access the system health tool:
Sign in to the sensor or on-premises management console with the Support user credentials.
Select System Statistics from the System Settings window.
System health data appears. Select an item on the left to view more details in the box.
System health checks include the following:
| Name | Description |
|---|---|
| Sanity | |
| - Appliance | Runs the appliance sanity check. You can perform the same check by using the CLI command system-sanity. |
| - Version | Displays the appliance version. |
| - Network Properties | Displays the sensor network parameters. |
| Redis | |
| - Memory | Provides the overall picture of memory usage, such as how much memory was used and how much remained. |
| - Longest Key | Displays the longest keys that might cause extensive memory usage. |
| System | |
| - Core Log | Provides the last 500 rows of the core log, so that you can view the recent log rows without exporting the entire system log. |
| - Task Manager | Translates the tasks that appear in the table of processes to the following layers: persistent layer (Redis) and cache layer (SQL). |
| - Network Statistics | Displays your network statistics. |
| - TOP | Shows the table of processes. It's a Linux command that provides a dynamic real-time view of the running system. |
| - Backup Memory Check | Provides the status of the backup memory, checking the following: the location of the backup folder, the size of the backup folder, the limitations of the backup folder, when the last backup happened, and how much space is available for extra backup files. |
| - ifconfig | Displays the parameters for the appliance's physical interfaces. |
| - CyberX nload | Displays network traffic and bandwidth by using the six-second tests. |
| - Errors from Core log | Displays errors from the core log file. |
Check system health by using the CLI
Verify that the system is up and running prior to testing the system's sanity.
To test the system's sanity:
Connect to the CLI with the Linux terminal (for example, PuTTY) and the user Support.
Enter `system sanity`.
Check that all the services are green (running).
Verify that System is UP! (prod) appears at the bottom.
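The final check above can also be run against saved CLI output. The following is a minimal sketch, assuming you have captured the output of `system sanity` as text; `sanity_ok` is a hypothetical helper name, and only the status line `System is UP! (prod)` is taken from this article.

```shell
#!/bin/sh
# Sketch only: sanity_ok is a hypothetical helper, not part of the appliance CLI.
# It reads captured `system sanity` output on stdin and looks for the
# status line that this article says should appear at the bottom.
sanity_ok() {
    if grep -qF 'System is UP! (prod)'; then
        echo "OK: system reports UP"
    else
        echo "FAIL: 'System is UP! (prod)' not found"
        return 1
    fi
}

# Example (sanity.txt is an assumed file holding the captured output):
#   sanity_ok < sanity.txt
```

The helper checks only the final status line. It doesn't parse the per-service status, because the exact layout of that output isn't described here.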
Verify that the correct version is used:
To check the system's version:
Connect to the CLI with the Linux terminal (for example, PuTTY) and the user Support.
Enter `system version`.
Check that the correct version appears.
Verify that all the input interfaces configured during the installation process are running:
To validate the system's network status:
Connect to the CLI with the Linux terminal (for example, PuTTY) and the Support user.
Enter `network list` (the equivalent of the Linux command `ifconfig`).
Validate that the required input interfaces appear. For example, if two quad Copper NICs are installed, there should be 10 interfaces in the list.
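Counting the listed interfaces by hand is error-prone on appliances with many ports. The sketch below assumes that the captured output resembles `ifconfig` output, where each interface block starts in the first column and continuation lines are indented. That layout is an assumption, and `count_interfaces` is a hypothetical helper.

```shell
#!/bin/sh
# Sketch only: count_interfaces is a hypothetical helper.
# Assumption: the input looks like ifconfig output, where every interface
# block begins with a non-indented line and detail lines are indented.
count_interfaces() {
    # Count lines whose first character is not whitespace.
    grep -c '^[^[:space:]]'
}

# Example (interfaces.txt is an assumed file holding the captured output):
#   count_interfaces < interfaces.txt
```

Compare the result with the number of interfaces you expect for the installed hardware, such as the 10 interfaces in the example above.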
Verify that you can access the console web GUI:
To check that management has access to the UI:
Connect a laptop with an Ethernet cable to the management port (Gb1).
Define the laptop NIC address to be in the same range as the appliance.
Ping the appliance's IP address from the laptop to verify connectivity (default: 10.100.10.1).
Open the Chrome browser in the laptop and enter the appliance's IP address.
In the Your connection is not private window, select Advanced and proceed.
The test is successful when the Defender for IoT sign-in screen appears.
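Before pinging, it can help to confirm that the laptop address really is in the same range as the appliance. The following is a minimal sketch: `ip_to_int` and `same_subnet` are hypothetical helpers, and the /24 prefix length in the example is an assumption, because this article doesn't state the appliance's subnet mask.

```shell
#!/bin/sh
# Sketch only: ip_to_int and same_subnet are hypothetical helpers.

# Convert a dotted-quad IPv4 address to a single integer.
ip_to_int() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
}

# Succeed when two addresses fall in the same network.
# $1 = laptop IP, $2 = appliance IP, $3 = prefix length (assumed, e.g. 24)
same_subnet() {
    mask=$(( (4294967295 << (32 - $3)) & 4294967295 ))
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    [ $(( $a & $mask )) -eq $(( $b & $mask )) ]
}

# Example, using the default appliance address from this article:
#   same_subnet 10.100.10.50 10.100.10.1 24 && ping -c 3 10.100.10.1
```

If the check fails, adjust the laptop NIC address before you test connectivity.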
Troubleshoot sensors
You can't connect by using a web interface
Verify that the computer that you're trying to connect is on the same network as the appliance.
Verify that the GUI network is connected to the management port.
Ping the appliance's IP address. If there is no response:
Connect a monitor and a keyboard to the appliance.
Use the Support user and password to sign in.
Use the command `network list` to see the current IP address.
If the network parameters are misconfigured, use the following procedure to change them:
Use the command `network edit-settings`.
To change the management network IP address, select Y.
To change the subnet mask, select Y.
To change the DNS, select Y.
To change the default gateway IP address, select Y.
For the input interface change (sensor only), select N.
To apply the settings, select Y.
After restart, connect with the Support user credentials and use the `network list` command to verify that the parameters were changed.
Try to ping and connect from the GUI again.
The appliance isn't responding
Connect a monitor and keyboard to the appliance, or use PuTTY to connect remotely to the CLI.
Use the Support user credentials to sign in.
Use the `system sanity` command and check that all processes are running.
For any other issues, contact Microsoft Support.
Investigate password failure at initial sign in
When signing into a preconfigured sensor for the first time, you'll need to perform password recovery as follows:
On the Defender for IoT sign-in screen, select Password recovery. The Password recovery screen opens.
Select either CyberX or Support, and copy the unique identifier.
Navigate to the Azure portal and select Sites and Sensors.
Select the More Actions drop-down menu and select Recover on-premises management console password.
Enter the unique identifier that you received on the Password recovery screen and select Recover. The `password_recovery.zip` file is downloaded. Do not extract or modify the zip file.
On the Password recovery screen, select Upload. The Upload Password Recovery File window opens.
Select Browse to locate your `password_recovery.zip` file, or drag the `password_recovery.zip` file to the window.
Select Next. Your user and the system-generated password for your management console then appear.
Note
When you sign in to a sensor or on-premises management console for the first time, it's linked to the subscription you connected it to. If you need to reset the password for the CyberX or Support user, you need to select that subscription. For more information on recovering a CyberX or Support user password, see Recover the password for the on-premises management console, or the sensor.
Investigate a lack of traffic
An indicator appears at the top of the console when the sensor recognizes that there's no traffic on one of the configured ports. This indicator is visible to all users. When this message appears, you can investigate where there's no traffic. Make sure that the SPAN cable is connected and that the SPAN architecture hasn't changed.
Check system performance
When a new sensor is deployed or a sensor is working slowly or not showing any alerts, you can check system performance.
- In the Defender for IoT dashboard > Overview, make sure that `PPS > 0`.
- In Devices, check that devices are being discovered.
- In Data Mining, generate a report.
- In the Trends & Statistics window, create a dashboard.
- In Alerts, check that the alert was created.
Investigate a lack of expected alerts
If the Alerts window doesn't show an alert that you expected, verify the following:
- Check if the same alert already appears in the Alerts window as a reaction to a different security instance. If yes, and this alert has not been handled yet, the sensor console does not show a new alert.
- Make sure you did not exclude this alert by using the Alert Exclusion rules in the management console.
Investigate dashboard that shows no data
When the dashboards in the Trends & Statistics window show no data, do the following:
- Check system performance.
- Make sure the time and region settings are properly configured and not set to a future time.
Investigate a device map that shows only broadcasting devices
When devices shown on the device map appear not connected to each other, something might be wrong with the SPAN port configuration. That is, you might be seeing only broadcasting devices and no unicast traffic.
- Validate that you're only seeing the broadcast traffic. To do this, in Data Mining, select Create report. In Create new report, specify the report fields. In Choose Category, choose Select all.
- Save the report, and review it to see if only broadcast and multicast traffic (and no unicast traffic) appears. If so, ask your networking team to fix the SPAN port configuration so that you can see the unicast traffic as well. Alternatively, you can record a PCAP directly from the switch, or connect a laptop by using Wireshark.
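If you record a PCAP, one way to confirm the symptom is to classify the destination addresses it contains. The sketch below assumes you have exported the destination IPv4 addresses to text, one per line. `classify_dst` and `has_unicast` are hypothetical helpers. Only the limited broadcast address 255.255.255.255 is treated as broadcast, because recognizing subnet-directed broadcasts requires the subnet mask.

```shell
#!/bin/sh
# Sketch only: classify_dst and has_unicast are hypothetical helpers.

# Print "broadcast", "multicast", or "unicast" for one IPv4 destination.
classify_dst() {
    case "$1" in
        255.255.255.255) echo broadcast; return 0 ;;
    esac
    first=${1%%.*}                      # first octet of the address
    if [ "$first" -ge 224 ] && [ "$first" -le 239 ]; then
        echo multicast                  # 224.0.0.0/4 is the multicast range
    else
        echo unicast
    fi
}

# Succeed if at least one address read from stdin is unicast.
has_unicast() {
    while read -r ip; do
        [ -n "$ip" ] || continue
        if [ "$(classify_dst "$ip")" = unicast ]; then
            return 0
        fi
    done
    return 1
}

# Example (destinations.txt is an assumed export, one address per line):
#   has_unicast < destinations.txt || echo "Only broadcast/multicast seen"
```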
Connect the sensor to NTP
You can configure a standalone sensor and a management console, with the sensors related to it, to connect to NTP.
To connect a standalone sensor to NTP:
To connect a sensor controlled by the management console to NTP:
- The connection to NTP is configured on the management console. All the sensors that the management console controls get the NTP connection automatically.
Investigate when devices aren't shown on the map, or you have multiple internet-related alerts
Sometimes ICS devices are configured with external IP addresses. These ICS devices are not shown on the map. Instead of the devices, an internet cloud appears on the map. The IP addresses of these devices are included in the cloud image. Another indication of the same problem is when multiple internet-related alerts appear. Fix the issue as follows:
- Right-click the cloud icon on the device map and select Export IP Addresses.
- Copy the public IP ranges that are used privately in your network, and add them to the subnet list. Learn more about configuring subnets.
- Generate a new data-mining report for internet connections.
- In the data-mining report, enter the administrator mode and delete the IP addresses of your ICS devices.
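To find the exported addresses that need attention, you can filter out the standard private ranges (RFC 1918). The following is a minimal sketch, assuming the exported file contains one IPv4 address per line; the export format isn't described in this article, and `is_rfc1918` and `public_candidates` are hypothetical helpers.

```shell
#!/bin/sh
# Sketch only: is_rfc1918 and public_candidates are hypothetical helpers.

# Succeed when the address is in 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
is_rfc1918() {
    case "$1" in
        10.*|192.168.*) return 0 ;;
        172.*)
            second=${1#172.}            # drop the first octet
            second=${second%%.*}        # keep only the second octet
            [ "$second" -ge 16 ] && [ "$second" -le 31 ]
            return $?
            ;;
    esac
    return 1
}

# Print every address from stdin that is NOT in a private range.
public_candidates() {
    while read -r ip; do
        [ -n "$ip" ] || continue
        is_rfc1918 "$ip" || echo "$ip"
    done
}

# Example (exported_ips.txt is an assumed file name):
#   public_candidates < exported_ips.txt
```

Addresses printed by `public_candidates` that belong to your own ICS devices are the ones to add to the subnet list.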
Clearing sensor data to factory default
In cases where the sensor needs to be relocated or erased, the sensor can be reset to factory default data.
Note
Network settings such as IP/DNS/GATEWAY will not be changed by clearing system data.
To clear system data:
Sign in to the sensor as the cyberx user.
Select Support > Clear system data, and confirm that you do want to reset the sensor to factory default data.
All allowlists, policies, and configuration settings are cleared, and the sensor is restarted.
Troubleshoot an on-premises management console
Investigate a lack of expected alerts
If you don't see an expected alert on the on-premises Alerts page, do the following to troubleshoot:
Verify whether the alert is already listed as a reaction to a different security instance. If it has, and that alert hasn't yet been handled, a new alert isn't shown elsewhere.
Verify that the alert isn't being excluded by Alert Exclusion rules. For more information, see Create alert exclusion rules.
Tweak the Quality of Service (QoS)
To save your network resources, you can limit the number of alerts sent to external systems (such as emails or SIEM) in one sync operation between an appliance and the on-premises management console.
The default is 50. This means that in one communication session between an appliance and the on-premises management console, there will be no more than 50 alerts to external systems.
To limit the number of alerts, use the notifications.max_number_to_report property available in /var/cyberx/properties/management.properties. No restart is needed after you change this property.
To tweak the Quality of Service (QoS):
Sign in as a Defender for IoT user.
Verify the default values:
`grep "notifications" /var/cyberx/properties/management.properties`
The following default values appear:
`notifications.max_number_to_report=50`
`notifications.max_time_to_report=10` (seconds)
Edit the default settings:
`sudo nano /var/cyberx/properties/management.properties`
Edit the settings of the following lines:
`notifications.max_number_to_report=50`
`notifications.max_time_to_report=10` (seconds)
Save the changes. No restart is required.
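If you prefer to script the change instead of editing in `nano`, the following sketch shows one way to read and update a `key=value` line. `get_prop` and `set_prop` are hypothetical helpers, not part of the product. The sketch assumes the file stores plain `key=value` lines and that `(seconds)` in the listing above is an annotation rather than file content. Try it on a copy of the file first.

```shell
#!/bin/sh
# Sketch only: get_prop and set_prop are hypothetical helpers.
# Assumption: the properties file holds plain key=value lines.

# Print the value of a key. $1 = properties file, $2 = key
get_prop() {
    grep -E "^$2=" "$1" | tail -n 1 | cut -d= -f2-
}

# Replace the value of an existing key. $1 = file, $2 = key, $3 = new value
set_prop() {
    tmp=$(mktemp) || return 1
    # Write the edited content to a temp file, then copy it back so the
    # original file keeps its owner and permissions.
    sed "s|^$2=.*|$2=$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    rm -f "$tmp"
}

# Example on a working copy (the copy path is an assumption):
#   cp /var/cyberx/properties/management.properties /tmp/management.properties
#   set_prop /tmp/management.properties notifications.max_number_to_report 30
#   get_prop /tmp/management.properties notifications.max_number_to_report
```

Editing the live file requires elevated permissions, as the `sudo nano` step indicates.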
Export audit logs for troubleshooting
Audit logs record key activity data at the time of occurrence. Use audit logs generated on the on-premises management console to understand which changes were made, when, and by whom.
You may also want to export your audit logs to send them to the support team for extra troubleshooting.
Note
A new audit log file is started each time the active log reaches 10 MB. One previous log is stored in addition to the current active log file.
To export audit log data:
In the on-premises management console, select System Settings > Export.
In the Export Troubleshooting Information dialog:
In the File Name field, enter a meaningful name for the exported log. The default filename uses the current date, such as 13:10-June-14-2022.tar.gz.
Select Audit Logs.
Select Export.
The file is exported and is linked from the Archived Files list at the bottom of the Export Troubleshooting Information dialog. Select the link to download the file.
Exported audit logs are encrypted for your security, and require a password to open. In the Archived Files list, select the button for your exported logs to view its password. If you're forwarding the audit logs to the support team, make sure to send the password to support separately from the exported logs.
For more information, see View audit log data on the on-premises management console.