SQL Server Alerts : Best Practices
Over the last six and a half years, I’ve spent a great deal of my time here at SentryOne discussing SQL Server alerts with customers, evaluators, and coworkers. The purpose of this post is to share what I have found to be important and effective in designing an alerting system for meaningful events on your SQL Servers. There are certainly many different strategies employed by different DBAs, tools vendors, and industry experts, so don’t take this as one size fits all, or hesitate to pick and choose bits and pieces. I’m just sharing what has worked well for me and for many SentryOne customers.
I firmly believe that there is no such thing as a turnkey solution to provide SQL Server Alerts; every tool and strategy will require some degree of adaptation to your environment. Any specific scenarios I use are being used for example’s sake, and not because I believe everyone needs to follow that specific way of alerting for the issue used in the example. This also isn’t a list of things to alert for, as those will be different in every environment.
Planning the Framework
I have a few priorities when planning my SQL Server alerts:
- Getting notified immediately for critical issues.
- Keeping a log or other record of less critical events.
- Making sure to minimize noise and false positives.
- Ensuring that the system is flexible to account for changes in the environment.
- Delivering the alert to the right person at the right time.
- Integrating with other tools if required.
A note on noise, false positives, and alert fatigue: In my experience, this is the thing that most often compromises an otherwise successful SQL Server alerting strategy or tool. It is of paramount importance that your system is built to prevent this, and has the configurability to address noisy or false-positive prone alerts. If you get hundreds of email alerts a day about your SQL Servers, you will not pay attention to them. Only send an email, page, SMS, or ticket type alert if someone will respond to that event! Anything that doesn’t require a response should go into a report or some other means of tracking.
A nice to have here, if your solution supports it, is to have your SQL Server alerts follow your call schedule. I have blogged about doing this with our tools here: https://blogs.sentryone.com/scottfallen/alerting-call-schedule-sql-sentry/.
Striking the right balance between getting notified when I/we need to respond, and not getting notified when a response isn’t required or urgent, is the key to the long-term success of my alerting strategy. This generally results in a three- or four-tiered approach; any more than that gets unwieldy.
When I talk about a tiered approach, I’m essentially talking about severity level. Most tools, including SentryOne, use a Critical, High, Medium, Low or High, Medium, Low set of categories for alerts. Even if you’re just going to alert through native tools like database mail, SQL Agent Alerts, and SQL Server Agent Jobs, I recommend considering a tiered approach. I like, and our tools use, the four-tiered system. Here’s how I use them:
- Critical – A response to this alert is required immediately regardless of time of day or day of the week. These alerts need to be delivered right away, and will usually need to trigger an audible alert on the responsible party’s phone. SMS or high-importance email is my norm here.
- High – A response is required, but not quite as urgent. If the alert happens off-hours, it can wait until the next business day. Email is my standard for these alerts.
- Medium – Almost everything else. Alerts that are informative but don’t necessarily warrant an immediate response, even during business hours. These to go to a dashboard, digest, or report format. Staying away from email/page notifications for these will go a long way to preventing alert fatigue.
- Low – Low severity alerts might be useful to look at once in a while, but usually pass without a response. I often use this level simply for alerts that are components of others, or things I am testing. Logging is all want from these.
These severity levels ensure that I have clearly-defined notification, logging, and/or reporting mechanisms and response plans for any alert that is triggered on one of my servers.
Another thing that can be a huge time saver is to not have to deploy alerts individually. Determining which alerts would apply to which servers is a good next step once severity levels are defined. I tend to think of this as setting alerts at the global, group, or object level. The object level category is flexible; I might set an alert on a specific SQL Server instance, a SQL Agent Job, a key query or stored procedure, and so forth.
Globally, I usually want to alert only on the big stuff that I’m going to want to know about on most or all servers. Servers going down or unresponsive, HA/DR problems, key scheduled event failures, corruption, and so forth tend to be globally configured in my environments.
Group level configuration is usually done based on server lifecycle or the applications and end users the servers support. I may want very different alerts for production and non-production servers. I might not alert for much of anything on development servers, for example. I might also have some alerts that are specific to an application we are supporting, or I may need to vary the alert recipient due to different team members being responsible for different groups of servers. Designing this well is critical, especially in large environments.
Object level alerts are going to be much more varied. A solution that allows for very granular alerting is important, as there will certainly be a need for point solutions. That said, it’s still good to try to think of ways to handle things higher up the chain if possible.
Having a well thought out array of SQL Server alerts setup is important, but all the fine-tuning and customization can be a nightmare if it’s not well-documented. If you’re using SentryOne for your alerts, you don’t really have to do anything here, as you can output a report of your alert configuration at-will. If you’re rolling your own, it’s important that you, other members of your team, or even your replacement years down the road can find where things are configured, and can adjust the alerts as needed.
SentryOne Offers a Powerful and Complete Solution
If these ideas have resonated with you, you can try our tools for free! The SentryOne platform offers a comprehensive and powerful set of pre-built SQL Server alerts that can be customized to your needs. You can also build your own from scratch if needed.
- To download a trial of SentryOne, click here: SentryOne Evaluation Download
- Sign up for the trial.
- Your download will start automatically, and you will be emailed the evaluation license key.
Scott (@ScottFallen) leads SentryOne’s Solutions Engineering team. He and the rest of the team interact with customers and prospective customers at conferences and trade shows, and provide demos and online webinars to existing and future users of the SentryOne platform. Scott has been teaching the SQL Server community and our customers about our products since 2010. Scott’s blog covers events that the SentryOne team attends and offers tips on using our software. Whether you are an existing customer, evaluating the software, or just curious who we are, be sure to check out his blog to see what is going on at SentryOne.