Toolbox: New products for IT professionals
This month’s tools are all about monitoring, whether monitoring the uptime of your Web site or monitoring enterprise-wide systems and applications.
If you’re responsible for running a Web site, the expectation is that it will always be available and always have a snappy response time. If it doesn’t, people coming to your site will move on. Uptime Robot is a free remote service that lets you monitor up to 50 Web sites with health checks every five minutes. If one of your site checks fails, Uptime Robot can send you alerts via e-mail, Short Message Service (SMS), RSS, Twitter, or even push notifications to your iPad or iPhone.
To set up your site monitors, create an account and click “Add New” on the My Monitors dashboard. Next, select the type of monitor:
- HTTP looks for an HTTP 200-type response from a specified URL.
- Keyword Checking looks for a particular word (or lack thereof) in the HTTP response.
- Ping does a standard Internet Control Message Protocol (ICMP) verification.
- TCP Ports verifies a particular port is “listening” for requests.
After picking the monitor type, enter the URL or IP of the host to check along with any pertinent information. Then choose how you’d like to receive alerts when there’s a problem. Uptime Robot also has an API you can access to view, create, update, and delete monitors and alert contacts. This lets you easily integrate the service into an existing monitoring infrastructure. It also helps you script monitors as you bring sites online and offline.
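To make the monitor types above more concrete, here is a minimal Python sketch of the kind of test each one performs. It is not Uptime Robot's implementation; the function names are hypothetical and chosen only for illustration. The Ping type is left out because ICMP usually requires raw sockets and elevated privileges.

```python
import socket
import urllib.error
import urllib.request


def http_ok(status_code):
    """HTTP monitor: treat any 2xx status code as healthy."""
    return 200 <= status_code < 300


def keyword_ok(body, keyword, should_exist=True):
    """Keyword monitor: pass when the keyword's presence matches the expectation."""
    return (keyword in body) == should_exist


def tcp_port_open(host, port, timeout=5.0):
    """TCP port monitor: pass if a connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_url(url, keyword=None, timeout=10.0):
    """Fetch a URL once, then apply the HTTP check and the optional keyword check."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            status = response.status
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError):
        # Non-2xx responses raise HTTPError, a subclass of URLError.
        return False
    if not http_ok(status):
        return False
    return keyword is None or keyword_ok(body, keyword)
```

A hosted service runs checks like these on a schedule from outside your network, which is what makes it useful for spotting problems your internal monitoring can't see.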
The creators of Uptime Robot recently updated the monitoring engine with improved features, and they’re planning an overhauled interface as well. Uptime Robot is currently free, and with luck it will stay that way, though its creators may reconsider after the redesign (see the Uptime Robot blog for more information). Uptime Robot can verify your Web site from the outside world.
System and application monitoring is a key component of successful IT operations. The more in-depth the monitoring, the more aware of your applications and infrastructure you are, and the faster you’ll be able to troubleshoot and respond. One scalable monitoring solution that can handle everything—routers, switches, printers, Windows, Unix, and Linux machines and more—is the open source product Nagios.
You can install Nagios on most popular flavors of Linux as a package or by building from the source. Instructions for the source install are pretty comprehensive, so if you have an available Linux system as a host, you should be able to get up and running on the core system. Take a look at the Nagios Quickstart Installation Guides for more information.
Nagios consists of three main components: Nagios Core, plug-ins and the Web-based front end. Nagios Core is the server component that handles all the monitoring tasks. The plug-ins let you monitor various servers, applications, services and performance metrics. And the Web-based front end gives you interaction and visualization within the system. Once installed, you’ll want to set up your first host. The installation packages give you examples of the text file-based configuration files that are required for monitoring applications and services with Nagios.
To monitor Windows servers, you’ll also want to take a look at the free and open source monitoring agent NSClient++. This is a module-based monitoring agent you install on your Windows servers so you can perform checks from your Nagios server. With its standards-based protocol handling, it works with other monitoring solutions as well. Once you have the client installed, you’ll be able to query for CPU, memory, disk, process state, registry entries, service states, performance counters and uptime of your Windows servers. There are additional modules that extend Nagios’ capabilities to Event Log queries, general scripted queries and SMTP server health, to name a few.
Nagios has an active user community and support structure. Quite a few related Nagios projects provide extra functionality, enhance the UI or help manage configuration. You can find most of these extensions on Nagios Exchange, including new front ends to dress up your Nagios interface and plug-ins for new monitors. For example, there are plug-ins that run T-SQL queries against your Microsoft SQL Server instances to verify their state or to check the SQL Server Agent jobs running on those servers.
If you find text file-based configuration tedious, you can use the Lilac-Reloaded add-on to help. Lilac-Reloaded will manage your monitor configuration for you, storing Nagios configuration information in a MySQL store. When you’re ready, you can write the configuration as files out to disk with a click of a button.
Once your monitors are active, Nagios can alert you via e-mail, SMS or a custom script when a check fails or when things have gotten back to normal. You can configure the thresholds for various alerts by service, so you can help ensure real problems aren’t masked by a ton of false positives. You can also configure escalation policies for alerts and notification policies, or set policies based on the time of day.
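Thresholds are easier to picture with a small example. The sketch below is a hypothetical Python plug-in, not one that ships with Nagios. It relies on the standard plug-in convention that a check prints one line of status text and reports its result through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The script name, label and argument order are assumptions made for illustration.

```python
# Exit codes defined by the Nagios plug-in convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(value, warn, crit):
    """Map a measured percentage onto a state using two thresholds."""
    if value >= crit:
        return CRITICAL
    if value >= warn:
        return WARNING
    return OK


def status_line(label, value, warn, crit):
    """Build the one-line plug-in output for a measured value."""
    state = evaluate(value, warn, crit)
    # Text before "|" is the human-readable status; text after it is performance data.
    return (f"{label} {STATE_NAMES[state]} - {value:.1f}% used"
            f" | {label.lower()}={value:.1f}%;{warn:g};{crit:g}")


def main(argv):
    """Hypothetical usage: check_percent.py VALUE WARN CRIT"""
    try:
        value, warn, crit = (float(arg) for arg in argv[1:4])
    except ValueError:
        # Raised for missing arguments as well as non-numeric ones.
        print("UNKNOWN - usage: check_percent.py VALUE WARN CRIT")
        return UNKNOWN
    print(status_line("DISK", value, warn, crit))
    return evaluate(value, warn, crit)

# Saved as an executable script, the file would end with:
#   import sys; sys.exit(main(sys.argv))
```

Setting the warning level well below the critical level is what lets you tune out noise: a WARNING can go to a dashboard while only a CRITICAL pages someone.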
For each host, you can define maintenance windows or temporarily disable checks when you need to work on a host. If a service or host does go down, you can even configure the system to execute a script or plug-in that could bring a new host online or restart IIS. This way, you can really use the system as a proactive monitoring solution.
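As a rough illustration of that kind of automated response, here is a sketch of the decision logic an event handler script might use. It assumes the handler is configured to receive the service state, the state type (SOFT or HARD) and the current check attempt as arguments, which is how the Nagios documentation's own event handler example is wired up using the $SERVICESTATE$, $SERVICESTATETYPE$ and $SERVICEATTEMPT$ macros. The function names and the restart command are hypothetical.

```python
import subprocess


def should_restart(state, state_type, attempt, soft_attempt=3):
    """Decide whether the handler should act on this state change."""
    if state != "CRITICAL":
        return False
    if state_type == "HARD":
        # Retries are exhausted, so the problem is confirmed.
        return True
    # Act once during the SOFT retries, hoping to recover before anyone is notified.
    return state_type == "SOFT" and attempt == soft_attempt


def handle_event(state, state_type, attempt, restart_command, runner=subprocess.run):
    """Run restart_command (a hypothetical argument list) when a restart is warranted."""
    if not should_restart(state, state_type, attempt):
        return False
    runner(restart_command, check=False)
    return True
```

Passing the runner in as a parameter keeps the decision logic testable without actually restarting anything.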
Nagios has a number of built-in reports to help you understand the health of your infrastructure over time. You can report on a specified time range and see the percentage of uptime and downtime for any of your systems. There are also trend reports so you can track things like CPU usage. These types of reports can help with capacity planning by showing you system “pressure points” before they become critical. The alert reports show you the when, where and how of various alerts. They can also provide metrics on response time. The Web front end gives insight into how Nagios itself is performing, with views into system logs and metrics on the scheduling queue, performance information and process information.
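The arithmetic behind an uptime percentage is simple enough to sketch. The Python function below is an illustration of the calculation only, not how Nagios computes its availability reports, and its name and parameters are hypothetical.

```python
def availability_percent(window_seconds, outage_seconds):
    """Return percentage uptime for a reporting window, given outage durations in seconds."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    # Cap total downtime at the window length so the result never goes negative.
    downtime = min(sum(outage_seconds), window_seconds)
    return 100.0 * (window_seconds - downtime) / window_seconds
```

For a 30-day window (2,592,000 seconds) with two outages totaling 2,592 seconds, the result works out to 99.9 percent uptime.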
If you’re looking for more official support for your Nagios implementation, there are products built on Nagios Core. Nagios XI, for example, gives you simplified administration, a more polished interface, further extensibility and performance tuning in a downloadable package with a support system in place to help you get set up or if you run into trouble. Nagios XI is priced by the number of hosts you wish to monitor. It starts at $1,995 for up to 100 hosts, or $4,995 for unlimited hosts per installation.
However, if you feel comfortable getting your own instance up and running, you can use the open source version along with its active community. Having a proactive monitoring solution in place will save you time, reduce stress and help you look forward instead of constantly putting out fires.
Greg Steen is a technology professional, entrepreneur and enthusiast. He’s always on the hunt for new tools to help make operations, QA and development easier for the IT professional.