SentryOne Performance Analysis Dashboard Tips and Tricks

 

Patrick Kelley | SentryOne

The SentryOne Performance Analysis Dashboard is understandably one of the most-used views within SentryOne. The ability to view key metrics in real time or historically and then easily drill down into what could be causing observed resource utilization makes the dashboard an obvious starting point within SentryOne. However, in the several years I have spent training SentryOne users, I find there are still dashboard features that are often overlooked. Let’s dive into how to best utilize these dashboard features.

 

Table of Contents

 

Custom Process Groups

 

On the CPU chart shown to the right, SQL Server is broken out into its own section (displayed in blue), while other processes are grouped into the Other category (displayed in green). This differentiation makes it very easy to see if a CPU spike is related to a SQL Server process or an external process. When you see a spike in non-SQL Server processes, you can check the Processes tab to see which process outside of SQL Server might be the culprit. If you find your servers regularly or intermittently having CPU utilization spikes from a process outside of SQL Server, it can be beneficial to break out the process into its own process category.

Example of a Custom Process Group on the Dashboard

Example of a Custom Process Group on the Dashboard

Custom Wait Categories

 

If you find that most of your waits are showing up as the Other category, or you have specific waits that are most relevant to you, it can be very helpful to break waits out into custom categories. Doing so enables you to quickly identify what specific wait activity is tied to.

Below you can see a very common view of the Waits chart, with a decent amount of the waits activity being grouped into the Other category (displayed in gray).

 

Waits Chart Before Creating Custom Wait Category

Waits Chart Before Creating Custom Wait Category

 

After breaking out CXPACKET and BACKUPIO waits into their own categories, you can see in the screenshot below that it’s much easier to quickly visually identify the breakdown of that wait activity.

 Waits Chart After Creating Custom Wait Categories

Waits Chart After Creating Custom Wait Categories

 

I have also seen SentryOne users break out HADR waits if Availability Groups (AGs) are involved, or specific latch waits when dealing with blocking issues. You can break out the waits that are meaningful to you. For information about how to create custom wait categories, check out Justin Randall’s blog post "Dashboard Custom Wait Categories.”

 

Baselines

 

SentryOne provides several out of the box baselines (e.g., Previous Period, Day, Week, Month, Quarter, Year). Baselines can provide a quick glance at the available previous time ranges to assist in change analysis, trending, and pattern recognition.

A common use case is if you see a sudden spike or drop in a particular metric. Take Page Life Expectancy (PLE), for example. As you can see in the screenshot below, any time I see a major drop in PLE, I throw up the Previous Day and/or Week baselines to quickly see if there is a pattern to this behavior, which might be tied to a schedule process. The drop in PLE could certainly happen because of maintenance processes or could be related to a heavy process that runs every day/week, causing objects in buffer to get flushed back to disk. If this behavior is causing issues for other critical processes at that time, you could reschedule that heavy process to a less impactful time.

 

 Previous Day Baseline Showing That a PLE Drop Also Occurred at the Same Time the Previous Day

Previous Day Baseline Showing That a PLE Drop Also Occurred at the Same Time the Previous Day

 

Baselines can also be helpful for more macro situations such as comparing months/quarters before and after a major change (e.g., upgrading SQL Server). Note the larger resolution baselines will only become available when viewing larger ranges of time. See the SentryOne documentation for more details on performance analysis baselining.

You can also create custom baselines. Your custom baselines could be for something as small as an hour of heavy processing, before and after a change, to see if you’ve achieved the increased performance you were hoping for. Or, you could use them for something more macro, such as a lift and shift to the cloud. In that situation, you would want to create several baselines before and after during heavy load time ranges and/or entire days, weeks, etc.

To create a custom baseline for a specific time range, simply pull up the time range you want to baseline on the dashboard, highlight the time range, and click Create Baseline.

 

Creating a Custom Baseline for the Highlighted Time Range

Creating a Custom Baseline for the Highlighted Time Range

 

Doing so will auto-calculate Min, Max, Avg, and Standard Deviation for all dashboard metrics in that time period. (Note that if you’re interested in only calculating these values over a specific time period, you can click Create Baseline, note the values you care about, and simply close out the window without saving the custom baseline.)

If you want to perform change analysis, you will find all saved baselines in the Baseline drop-down menu. Selecting the ellipsis icon to the right of the drop-down menu (show in the screenshot below) when a custom baseline is selected will bring up the raw data for that baseline, as well as the option to delete them.

 

The Baseline Drop-Down Menu

The Baseline Drop-Down Menu

 

For any selected baseline, only a single counter will be referenced/displayed on each chart at a time. You can change the scope of the counter data you see for the baseline by right clicking the counter legend (as shown below for Read versus Write latency).

 

Swapping the Scope of the Baseline Between Read and Write Latency

Swapping the Scope of the Baseline Between Read and Write Latency

 

Something to be aware of is the option to make a baseline global. When not set to global, a baseline can only be overlaid where it was created. But in cases such as AGs or migrations, it could be beneficial to overlay a baseline created on one target onto another target's Performance Analysis Dashboard. Creating a global baseline will also make it available to Advisory Conditions for alerting or automated response purposes. See Lori Edward’s blog post “Alerting, Baselines, and Standard Deviations” for more details.

Sample Mode

 

Sample Mode provides more granularity related to the data being collected on the dashboard, displaying a detailed snapshot of the data SentryOne uses to plot on the historical charts. With Auto-Refresh enabled, you will see the Sample Mode update every 15 seconds or so. Sample Mode can be helpful when dealing with real-time issues, but what about getting granular details historically? You can view the Sample Mode for that historical data point by clicking a single point in time on the History view to select it, right clicking it, and selecting Sample Mode.

 

Selecting the Peak of a Spike and Jumping to the Sample Mode

Selecting the Peak of a Spike and Jumping to the Sample Mode

 

One of my favorite use cases for viewing historical data samples is for digging into waits, as Sample Mode provides a more detailed view that is easier to explore if you have a lot of activity. Note the scroll bar at the bottom (shown in the screenshot below).

 

Waits Chart in Sample Mode

Waits Chart in Sample Mode

 

I have also found the Sample Mode view to be helpful when digging into parallelism or affinity settings. The ability to see individual core and socket utilization can provide insight into whether you need to adjust these settings.

 

CPU Chart in Sample Mode Showing Individual Core and Socket Utilization

CPU Chart in Sample Mode Showing Individual Core and Socket Utilization

 

Sample Mode provides additional counters such as User Connections, Memory Grants Pending, and Context Switches. Sample Mode allows you to see point-in-time values for these extra metrics. But what about historical graphs for these other metrics? Although these are not available on the dashboard, you can generate reports for these metrics. Simply right click the metric and select Quick Report.

 

Running a Quick Report for Sample Mode-Only Counters

Running a Quick Report for Sample Mode-Only Counters

 

The report will give you a historical view for that counter. It will typically default to a 30-minute time range surrounding the timestamp you were viewing in the Sample Mode. You can see Min, Max, and Avg over the time range calculated and can drill into the raw data to find timestamps/values associated with a major drop or spike. Selecting Refresh Report in the top left corner of the report will allow you to specify the time range you desire.

 

Generated Quick Report

Generated Quick Report

 

The Performance Counter History and Performance Counter Date Range Comparison Reports

 

Although this post is centered around the dashboard, I can’t share the previously mentioned Quick Report option without taking it one step further. There are two very helpful reports that I find often get overlooked—the Performance Counter History and Performance Counter Date Range Comparison reports. SentryOne collects many more counters than just what you see on the dashboard, or even in the UI. You can enable the collection of even more counters than what SentryOne collects by default. See Justin Randall’s "Enabling Additional Performance Counters“ blog post for more details on how to do so.

The two reports I mentioned allow you to create custom reports with counters of your choosing and provide visibility into the wealth of additional counters SentryOne offers, including specific waits. The Performance Counter Date Range Comparison report takes it one step further, allowing you to see two different time ranges and the percent change between those time ranges. This report is great for change analysis or comparing month-end processing. You can get to these reports by going to Reports, clicking Performance Analysis, and then selecting Performance. I will go into more detail on these reports and others in an upcoming post.

vCenter Monitoring

 

I’m always amazed at how often the virtual layer is a black box to DBAs. It’s like a car mechanic who is tasked with maintaining or fixing a car but can’t open the hood to look at the engine and surrounding components. But nonetheless, this is often the case. If you’re using VMware, you can add a vCenter to SentryOne with just read-only access to the vCenter. Doing so will provide a topology of the vCenter, automatically overlay additional counters on your SentryOne dashboard (Co-Stop %, Ready Time %, and Ballooned Memory), and provides additional conditions. One of the most critical conditions provided is VM Moved, which notifies you if a VM has vMotioned to another host.

Co-Stop, Ready Time, and Ballooned Memory Counters Added to Dashboard

Co-Stop, Ready Time, and Ballooned Memory Counters Added to Dashboard

Conclusion

 

Hopefully, this walk through of how to leverage Sample Mode, reports, baselines, and virtualization monitoring in SentryOne will make it easier to administrate and monitor your environments. As always, if you have any questions, please feel free to reach out to the SentryOne Support team or your Customer Success Manager.

 

Check out Patrick's SentryOne Top SQL Tips and Tricks