Author: Robert Agar
Monitoring IT systems is an important practice that should be in place in any complex computing environment. It provides a window into the inner workings of the systems and applications that a business or organization runs on. The statistics produced by a monitoring platform can be used to optimize systems, enhance the user experience, or plan for capacity upgrades.
A viable monitoring platform is designed with the ability to generate and send alerts. This feature increases the utility of the tool by introducing the possibility of creating immediate notifications to address potential issues or inconsistencies in the systems being observed. Alerts are routed to individuals or teams who can take action to further investigate or resolve the problems. Let’s take a closer look at why you want alerts to be created and how your organization should handle them.
Why Generate Monitoring Alerts?
Monitoring platforms can observe and report on many different aspects of IT systems or infrastructures. Wading through the accumulated data can furnish insight regarding the systems but can be very time-consuming. There are likely to be many times when nothing of importance occurs. Studying the monitored data might be interesting but will not have any impact on the operation or functionality of your environment.
Eventually, a lack of actionable data will cause the monitoring process to fade into the background. The statistics might be pulled out occasionally for a capacity planning initiative or to develop an optimization strategy, but little else. Adding the capability to generate alerts makes your monitoring platform far more useful to your enterprise, because it surfaces the moments that need attention instead of relying on someone to find them.
Alerts are created by monitoring tools based on thresholds and metrics specific to the systems under observation. The purpose of the alerts is to allow the IT team to take proactive action and resolve issues before they impact users. The thresholds can be defined statically or may evolve dynamically based on the behavior of the system being monitored.
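The difference between a static and a dynamic threshold can be shown with a minimal sketch. The function names, the sample values, and the "mean plus k standard deviations" rule are assumptions chosen for illustration; real monitoring tools may derive their baselines quite differently.

```python
from statistics import mean, stdev

def static_alert(value, limit):
    """Fire when the value exceeds a fixed, manually chosen limit."""
    return value > limit

def dynamic_alert(history, value, k=3.0):
    """Fire when the value exceeds mean + k * stdev of recent samples.

    The baseline moves with the observed behavior of the system.
    (Illustrative rule; k=3.0 is an arbitrary assumption.)
    """
    if len(history) < 2:
        return False  # not enough data to form a baseline yet
    return value > mean(history) + k * stdev(history)

# Hypothetical CPU utilization samples, in percent
history = [40, 42, 41, 39, 43, 40, 41]

print(static_alert(75, limit=90))   # False: still below the fixed limit
print(dynamic_alert(history, 75))   # True: far outside the recent baseline
```

In this example a reading of 75% slips past the static limit but is flagged by the dynamic check, because it is unusual for this particular system.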
There are many facets of an IT environment or system that warrant monitoring and possibly alerting as well. Some of the metrics used to develop monitoring plans may apply to multiple systems while others are only pertinent to specific applications or systems.
The goal of monitoring is to identify possible problems before they occur and to point to areas which should be investigated to resolve the issues. MySQL presents a large number of metrics that can be captured and used for these purposes. The broad categories of metrics which can be used to monitor your MySQL servers include:
- Performance work metrics measure the high-level health of the database by evaluating its useful output. Work metrics can be further sub-categorized into data related to the throughput and performance of the database. They also look at the success and error rate of the queries executed against the database.
- Performance resource metrics are concerned with the physical resources required for the database to operate efficiently. The key areas covered by resource metrics are database utilization and saturation. In addition, internal errors and availability are discoverable through these metrics.
Alerts based on these metrics can contribute to better performance and minimize unexpected downtime. That holds only as long as someone pays attention to the alerts and takes the action each notification calls for.
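As a rough illustration of the two categories, the sketch below derives a few work and resource metrics from two snapshots of MySQL status counters. `Questions`, `Slow_queries`, `Threads_connected`, and `Threads_running` are MySQL status variables; the function name, the snapshot values, the sampling interval, and the connection limit are all hypothetical, and the snapshots are plain dictionaries rather than the result of a live query.

```python
def work_and_resource_metrics(prev, curr, interval_s, max_connections):
    """Derive example metrics from two status snapshots taken interval_s apart."""
    queries = curr["Questions"] - prev["Questions"]
    slow = curr["Slow_queries"] - prev["Slow_queries"]
    return {
        # Work metrics: useful output of the database
        "throughput_qps": queries / interval_s,
        "slow_query_ratio": slow / queries if queries else 0.0,
        # Resource metrics: utilization and saturation
        "connection_utilization": curr["Threads_connected"] / max_connections,
        "threads_running": curr["Threads_running"],
    }

# Hypothetical snapshots taken 60 seconds apart
prev = {"Questions": 100000, "Slow_queries": 10,
        "Threads_connected": 40, "Threads_running": 3}
curr = {"Questions": 106000, "Slow_queries": 40,
        "Threads_connected": 120, "Threads_running": 9}

metrics = work_and_resource_metrics(prev, curr, interval_s=60, max_connections=151)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Any of these derived values could then be compared against a threshold to decide whether an alert is warranted.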
More Alerts Aren’t Always a Good Thing
When a new monitoring tool is introduced to the IT team, it is usually used extensively. Everyone loves a new toy, and the new information source offers a wealth of data that can be analyzed. Alerts are handled expeditiously, and management can feel satisfied that it invested in a worthwhile solution that will improve the performance of the monitored systems. Everybody is happy.
Then reality sets in. The alerts keep coming, and many of them are irrelevant to their recipients. Before long, alert fatigue takes hold and threatens to negate the benefits of your monitoring strategy. In the flood of unimportant notifications, critical alerts are missed or ignored by the teams that receive them.
This is a counterproductive situation that must be remedied for the alerts to be of any use. Too many false alarms or meaningless alerts drive individuals to set up mailbox rules that automatically file the messages in a folder or delete them. Soon enough, the first sign that something is wrong with your databases is a phone call from angry users. That is not the outcome anyone wanted when purchasing a monitoring tool.
Controlling the Number and Effectiveness of Alerts
Monitoring can take many forms and be accomplished using a variety of tools and applications. One of the most important features of a monitoring tool is the ability to set proactive alerts based on multiple baselines. Another sought-after characteristic of a robust monitoring system is the ability to define automatic responses that eliminate potential human errors in the way serious alerts are handled.
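One way to picture automatic responses, combined with a simple guard against repeated notifications, is the sketch below. It is not taken from any particular monitoring product: the `AlertRouter` class, the alert names, and the cooldown window are all assumptions made for illustration, and the cooldown is just one common technique for curbing duplicate alerts.

```python
class AlertRouter:
    """Hypothetical router: suppresses repeats and runs registered responses."""

    def __init__(self, cooldown_s=300):
        self.cooldown_s = cooldown_s   # minimum seconds between repeats
        self._last_sent = {}           # alert key -> time it was last handled
        self._responses = {}           # alert key -> automatic response

    def on(self, alert_key, handler):
        """Register an automatic response for a given alert."""
        self._responses[alert_key] = handler

    def handle(self, alert_key, now):
        """Process an alert raised at time `now` (seconds)."""
        last = self._last_sent.get(alert_key)
        if last is not None and now - last < self.cooldown_s:
            return "suppressed"        # same alert fired again too soon
        self._last_sent[alert_key] = now
        handler = self._responses.get(alert_key)
        if handler is not None:
            handler()                  # respond without human intervention
            return "auto-handled"
        return "notified"              # fall back to notifying a person


actions = []
router = AlertRouter(cooldown_s=300)
router.on("disk_full", lambda: actions.append("purge_logs"))

print(router.handle("disk_full", now=0))        # auto-handled
print(router.handle("disk_full", now=100))      # suppressed
print(router.handle("replication_lag", now=0))  # notified
```

Passing the timestamp in explicitly keeps the sketch easy to test; a real system would read the clock itself and would need to decide which alerts are safe to handle automatically.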