Create Advisory Conditions for SQL Server on Amazon RDS

On a call recently, a SentryOne customer spoke about how they were migrating a portion of their SQL Server environment to Amazon RDS instances. The customer knew they would be able to monitor Amazon RDS for SQL Server with SentryOne SQL Sentry as they migrated but they wanted to walk through the functionality just to be safe. This customer relies heavily on SentryOne for both monitoring and troubleshooting their database environment, so this was understandable.

Although the workflow and data presented in the SentryOne client for Amazon RDS instances is nearly identical to that for SQL Server instances, there are a few differences. Because the server/OS layer is abstracted, the Performance Analysis Dashboard does not include OS-specific measurements, as you can see in the screenshot below, and the Disk Space and Disk Activity tabs are not present.

Performance Analysis Dashboard for Amazon RDS Instances

Performance Analysis Dashboard for Amazon RDS Instances

Since Amazon RDS is a fully hosted SQL Server instance, you should not need to worry too much about the underlying hardware in most cases. However, many data professionals still want as much information as possible.

If you are familiar with Advisory Conditions (ACs) in SentryOne, then you likely know where I am going with this blog post. ACs allow you to create your own specific conditions (alerts) and automate responses to these conditions. Sources for conditions can range from the results of a T-SQL query, a WMI query, a performance counter value, or any combination of values or queries. If you have not tried creating your own ACs, I recommend reading Lori Edwards blog posts about how to do so here. After reading Lori's blog post, you will understand what each setting within your AC means. Lori then dives deeper in the second and final part of the series here.

This blog post will focus on creating ACs specifically for SQL Server on Amazon RDS instances. I will walk through several examples and include the conditions to be imported so that you can make use of them.

Custom AC Examples

The first custom AC we will look at will check Virtual Log File (VLF) Counts for each log file. Since we don’t have access to Disk Space/Activity information for Amazon RDS instances, we also do not have access to VLF information. There is alerting around VLF count but the built-in AC currently does not evaluate against Amazon RDS instances.

With all of that in mind, we can still retrieve VLF information from SQL Server on RDS by running the following SQL query:

SELECT [name],
COUNT(lg.database_id) AS 'VLF Count'
FROM sys.databases s
CROSS APPLY sys.dm_db_log_info(s.database_id) lg
GROUP BY [name]
ORDER BY 'VLF Count' DESC 



Amazon RDS - High VLF Count

Let’s look at CPU collection. Again, with little control over virtual hardware and operating system parameters with Amazon RDS, it is easy to lose focus on these metrics. The same constraints will apply whether we are in the cloud or on-premises. With that in mind, visibility into CPU collection is still important and possible by querying sys.dm_os_ring_buffers.

The following query shows CPU usage, and, by altering the query slightly, I can replicate the built-in High CPU and High CPU for Non-SQL Server Process ACs, which rely on performance counter information.

DECLARE @TimeStamp bigint
SELECT @TimeStamp = cpu_ticks / (cpu_ticks/ms_ticks)  FROM sys.dm_os_sys_info
select top 1 CONVERT(varchar,DATEADD (ms, -1 * (@TimeStamp - [timestamp]), GETDATE()),8) + ' | SQL(' + Cast(SQLServerCPUUtilization as varchar(5)) + '%) | Other(' + Cast((100 - SystemIdle - SQLServerCPUUtilization) as varchar(30)) + '%)',  100 - SystemIdle as TotalCPU
FROM ( 
select record.value('(./Record/@id)[1]', 'int') as id,
record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') as SystemIdle,
record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') as SQLServerCPUUtilization,
timestamp
from (
select timestamp, convert(xml, record) as record
from sys.dm_os_ring_buffers
where ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR'
and record like '%<SystemHealth>%') as RingBufferInfo
) AS TabularInfo
order by id desc

The last two ACs we will investigate in this blog post are % Free Space – Disk and % Free Space – Data/Log File(s). I have set these up because I want to be notified if we drop below 10% free space on either the disk in question or any data/log file. For both of these conditions, I am creating a temporary table, inserting records from sys.master_files and sys.database_files into this table, respectively, and then using the AC to check to see if these values fall below the set threshold. Once this has all been completed and we have the results we need our AC to evaluate against, we are dropping the temporary table. The query below goes through this process for % Free Space - Disk.

CREATE TABLE #FileInfo
(DatabaseName NVARCHAR(100),
FileName NVARCHAR(100), 
MountPoint NVARCHAR(50),
Volume NVARCHAR(50),
SizeMB INT, 
FreeMB INT,
);

INSERT INTO #FileInfo(DatabaseName, FileName, MountPoint,Volume, SizeMB, FreeMB)
SELECT
@@SERVERNAME + ' | ' + db_name(f.database_id) AS DatabaseName,
f.name AS FileName,
vs.volume_mount_point AS MountPoint,
vs.logical_volume_name AS Volume,
vs.total_bytes / (1024 * 1024) AS SizeMB,
vs.available_bytes / (1024 * 1024) AS FreeMB
FROM sys.master_files AS f
CROSS APPLY sys.dm_os_volume_stats(f.database_id, f.file_id) as vs

SELECT @@ServerName +  ' | ' + MountPoint,
CONVERT (DECIMAL (3, 2), ((CAST (FreeMB AS FLOAT) / (CAST (SizeMB AS FLOAT))))) * 100
FROM #FileInfo
Group BY MountPoint, volume, SizeMB, FreeMB
Order by
MountPoint
    
DROP TABLE #FileInfo;

Tag Your Condition

Once you have created your custom ACs, feel free to create or apply a tag to them. You can do so through the Conditions list. It makes it easier to filter this list as you continue to create additional alerts.

Tag Your Custom Advisory Conditions

Modify ACs to Fit Your Environment

The main objective for this blog post was to show you how you can retrieve additional information from your SQL Server environment using ACs. In this case, we wanted to pull additional information available to us for SQL Server on Amazon RDS instances.

The code examples included in this blog post are available at the links below and are ready to be imported into the SentryOne client. Once imported, feel free to change the thresholds as you please. There is no one value that fits all database environments, and some thresholds might be too low or too high in certain cases. For more information about modifying conditions for your environment, check out Patrick Kelley’s blog post, “SQL Server Alert Tuning Basics with SentryOne.”

Download the Advisory Conditions Covered in This Blog Post

Thwack - Symbolize TM, R, and C