SentryOne Top SQL Tips and Tricks
One of the most heavily utilized features in the SentryOne Monitoring Platform is Top SQL, a user interface that displays long-running and high-impact queries. The popularity of Top SQL shouldn't be a surprise given the importance of monitoring and optimizing queries executing against a given SQL Server environment.
The Top SQL tab has a lot of different ways to view collected queries and search for executions—let's look at how to navigate Top SQL and dive into several of its features.
Table of Contents
Although it's not the intent of this post to discuss Top SQL collection, I do need to highlight some important collection concepts, as I feel it's crucial to truly understanding and using Top SQL.
First is Top SQL's Minimum Duration threshold setting. This setting defines the minimum duration queries must run before SentryOne will collect them. By default, Minimum Duration is set to 5 seconds. This setting can be adjusted through Top SQL Source settings. (Select the desired level in Navigator Topology > Settings Pane > Top SQL Source > Minimum Duration.)
Top SQL's Minimum Duration Setting
Note that setting the Minimum Duration as low as possible globally isn't recommended. Doing so can flood result sets, making it more difficult to find the queries you're interested in. In addition, depending on the volume of activity in a particular environment, it could lead to an exponential increase in the size of the SentryOne database, which could potentially result in performance issues. If you have any concerns about this duration, please reach out to the SentryOne Support team.
Queries that execute very fast and often are collected on the Procedure Stats and Query Stats tabs within Top SQL. I won't go into depth on these tabs here; additional information can be found in the Top SQL documentation.
Manipulating the Result Set
There are several approaches you can take to find a query or set of queries of interest. First, always be sure that you have identified a window of time in which the query or queries have executed and that execution durations were long enough to meet the Minimum Duration threshold.
At the very top of the Top SQL tab, there's a Filter drop-down menu. Expanding the Filter menu will provide several options for filtering the result set of Completed Queries. Note that unlike other filtering options that will be discussed, the available values to filter on will be based on all collected queries, rather than just values available in the current result set. So, if you select a specific value to filter on and no results are visible, you likely need to adjust the window of time you're viewing.
After selecting a desired value to filter on, make sure to hit the Refresh button located on the top toolbar. There's also a Show Top setting, which enables you to set the number of results that will be returned in Completed Queries. Not that if you're looking at a very large window of time, or have an extremely active instance, you might need to increase this value if you can't find events you would expect to see. That said, keep in mind that when sorting by a specific column, it will show the top results based on that column and sort order. So, if you sort by CPU descending, you would not miss high CPU offenders. The max value for this setting is 20,000.
The Top SQL Filter Grid
The column headers in Top SQL can be used to sort and filter based on a specific column. Simply clicking the header will alternate sorting the selected column in ascending and descending order. In addition, you can filter based on collected column values. To do so, hover over a given column until the funnel icon displays in the top right corner of the column header, and then select it.
Filtering Column Values
For example, let's take the Application column. As you can see in the screenshot, it shows all available applications in the given result set. Selecting one of the available applications will filter the result set for only that application.
There's also a custom option available with the functionality. You can use this to search or filter based on a string defined by you.
Once you have applied a filter, it will be listed just below the result set. There you can de-select a filter or select the X to the far left to remove the filter altogether. To the right of the filters is a drop-down button, which allows you to select from previously used or created filters, which comes in handy if you have a filter set you use regularly.
Additional Filter Options
To the far right of the filters, there's an Edit Filter button that opens the filter editor. You can add and remove filters and apply Boolean logic for a more complex filter. To adjust the Boolean logic, simply select the red And to bring up the context window.
Context Window in the Filter Editor
Group By Column Values
You can also group the result set by given column values. Drag a column header to the area between the headers themselves and the Completed Queries tab, as shown in the screenshot.
Grouping a Result Set by Column Values
This capability isn't just great for organizing queries, but you can also see in the group headers that several performance metrics have been aggregated. This allows you to see the big picture impact by or against certain objects. One of my favorite use cases for this feature are environments that are hit by a large volume of ad hoc user queries. Whether it's that hour where you had issues or more macro such as entire days/ weeks, this view can easily bubble up if a specific user is the culprit in hammering the instance.
Note that you can also sort this aggregated view based on the available metrics. To do so, right-click the column you're grouping by and select Sort by Summary.
Sorting a Grouped Result Set
In addition, you can group by multiple columns. Take the high ad hoc user scenario, for example. I can add the Database column to see the sub-aggregated data of each user, per database that they queried against, to provide even more detail to the result set.
Adding the Database Column to See the Sub-Aggregated Data of Each User
Just changing which column is first can provide you with other helpful views. Rather than seeing which user could be causing problems and then drilling down into which databases they're hitting, you could group by database, then Login. Doing so allows you to see which database is getting hit the hardest, and then drill into the users that might be causing it. Keep in mind that you'll want to use this functionality for columns with distinct values. This feature doesn't work well with the Text Data column. But no worries, SentryOne offers a better option for grouping your result set by Text Data!
You can group "like" queries together by normalizing the result set. To do so, enable Show Totals, which is the sigma icon on the top toolbar (shown in the screenshot below).
The Show Totals Icon in Top SQL
This view will group queries together based on multiple column values, particularly Text Data and Database. Parameters are striped so that queries can be grouped together by Text Data, regardless of the parameters used in each execution. This view will provide aggregations of the performance-based columns, as well as additional duration metrics such as Min, Max, and Avg duration.
Show Totals View
Often, resource utilization issues aren't caused by a single heavy query, but the accumulated impact of the same or multiple queries executing at a high volume. When this is the case, normalizing the result set will allow you to see the accumulated impact of queries based on the multiple executions that might have occurred.
Additionally, the Show Totals view can be a very good view for identifying what your heaviest queries are across days/ weeks. This can help you focus on the portion of your workload that could have the greatest impact if optimized. This ties back to the 80/20 principle. If you focus on optimizing the heaviest 20% of your workload, you'll see an 80% improvement, versus if you focus on the bottom 80% of your workload, which would only provide a 20% improvement.
For both the Show Totals and column grouping views, if you're looking at larger time ranges, don't forget about the Show Top setting on the Filter tab.
You can bring up the search functionality by selecting anywhere in the Top SQL grids and pressing CTRL + F. Once the search window is open, simply type a string of interest. The result set will be filtered by only events that contain the entered string in at least one column, and the string of interest will be highlighted. A simple example is shown below.
The Result Set Filtered by Events that Contain the Entered String in at Least One Column
The string search can be very helpful if you have queries that have very similar Text Data with small differences. The beginning of a query might be the same but have different clauses or parameters. The string search can help you find the version of interest. No wild cards are needed for this feature, as the search will automatically look for an entered string in a "%String%" fashion.
Using the Query History view, you can easily look at the historical performance of a query to assist in identifying trends, patterns, or anomalies. There are a few things to be aware of to fully leverage this feature.
- You're able to freely view the history of a query using the slider in the bottom left. You don't need to update the Start/End time tied to the grid result set.
- You can view more than just Duration historically. CPU and I/O are also available. I/O is a good metric to keep an eye on if you have queries that deal with volatile statistics or index fragmentation. In these situations, you'll typically see I/O increase over time between maintenance cycles. This can help you identify if your current maintenance plan is adequate.
- You can group the result set by hour/day/week. Whereas the general view can help identify trends such as degraded performance over time, grouping could help identify patterns in degraded performance that happen at regular intervals due to issues such as job scheduling conflicts. In the screenshot below, I can see a consistent pattern with extended runtimes every 4 hours. Now that I have picked up on this pattern, I could use the Event Calendar to identify what else is going on at that time that could be causing contention.
- If we collected additional statements in the Statements section, then you can click through those to see history on the sub-statements. Doing so can help identify which sub-statement could be the actual culprit. The Mode drop-down menu will let you know which scope you're in. Switching from Statements to Procedure will take you back to the parent object.
- The colors associated with each event have to do with plan usage. Each color represents a different plan having been used. The patterns and trends created by this historically could be used to give you some insight into what could be causing plan issues.
Query History Result Set Grouping
Multiple Plan Usage for the Selected Query
Jumping Back to the Dashboard
A very common workflow when using the SentryOne Monitoring Platform is identifying some activity in question on the Dashboard, highlighting it, and jumping to Top SQL to start digging into what was executing during that time. What can sometimes be helpful is once you believe to have found that query in question, right-click it, select Jump to, and then select Dashboard. Doing so will actually overlay the event duration on the Dashboard, which can be sued to visually compare the activity you're seeking. As you can see in the screenshot below, in combination with presented data in Top SQL, I feel very confident that the query in question played a major role in the spike in CPU utilization.
Overlaying Event Duration on the Dashboard
It could also be helpful in verifying certain activity isn't related to an event. Take visible Write latency during the same event. You can see Write latency increasing before the query execution in question. Write latency also starts a downward trend as the query in question is running. This visual behavior would indicate to me that this query had little to do with the Write latency that occurred at that time.
Write Latency Compared to Event Duration
If you find other highlights getting in the way at the bottom of the chart due to Advisory Conditions triggering, you can hide them by toggling the Show Advisory Conditions Marker (i.e., the Lightning Bolt icon) in the top toolbar.
The Show Advisory Conditions Icons in the Top SQL Toolbar
Understanding Percent-Based Metrics
One important thing to keep in mind when analyzing queries through Top SQL is that the percent-based metrics (i.e. % CPU, % Reads, and % Writes) are based on the result set, not percent of resource utilization on the server itself. For example, you can see in the screenshot that the top query is listed as using 73% CPU.
The Top Query Accounting for 73% of CPU Utilization Out of Result Set
Again, this isn't 73% of available CPU on the server but rather, out of the six available queries in the result set, the top query accounts for 73% of CPU utilization out of those six queries. Any decimal values are dropped and queries that account for less than 1% of the result set will be blank. The percentages might not always add up to exactly 100% but it should be quite close.
As with most columns available through SentryOne features, if you right-click a column header and select Column Chooser from the menu, you'll access additional columns that can be dragged onto the grid. One column I would highly recommend adding is the Has Plan column. This will provide a checkbox value that, if checked, means a plan is collected for that particular query. This way, you don't have to blindly select the View option to see if a plan was collected.
The Column Chooser Option
If a query in Completed Queries is highlighted in red, it means the query failed. Something to examine in this case is the Error column. In this column, an error will usually display as 1=Error or 2=Abort. If you see Error, then it's likely due to an internal SQL Server error or deadlock. As for Abort, this is usually tied to timeouts or manually getting killed. So, whenever you see Abort, the next thing you should check is duration, to see if the query got killed right at an obvious timeout threshold such as 30 seconds.
Errors Showing for Failed Queries
In the Running Queries tab, you might also see queries highlighted in orange, which means the queries are actively causing blocking. You can then right-click the query and select Show Block if you want to view the blocking event in question in the Blocking SQL tab, or Kill Process if necessary. You won't be able to kill the SPID unless the account you're running the client under has the permissions to do so.
Viewing the Blocking Event or Killing the SPID
If you have a regularly occurring Top SQL statement that you're not interested in ever seeing in Completed Queries or being alerted on, you can hide it by right-clicking the statement in question and selecting Hide. Doing so will cause any executions of the statement in question not to show up in the result set and from being alerted on for conditions such as Top SQL: Error and Top SQL: Duration Threshold Max. This being the case, the Hide option should not generally be used as a filter. You can view hidden queries by selecting the Show Hidden Rows button in the top toolbar. Once visible, you can again use the right-click context menu to unhide particular statements.
Show Hidden Rows Button in Toolbar
Although we now have a newer Runtime Stats chart right in Top SQL, the older view could still be helpful, especially since it tracks all-time Min, Max, and Avg for each query. It also provides another view that could be helpful, as it clearly shows which occurrences were failures. You can get to Runtime Stats by right-clicking any statements in the Completed Queries section, choosing Jump To, and selecting Runtime Stats. For additional information about this feature, please refer to the blog post, “Have You Been Using Runtime Statistics?”
This information should help you get the most out of the Top SQL feature. As always, please feel free to contact the SentryOne Support team with any questions you might have regarding the functionality discussed here, or any other questions about SentryOne products.
If you haven't given SentryOne a try yet to explore the Top SQL interface further, you can download a 14-day trial here.
Patrick is a Customer Success Engineering Manager and helps to provide customers with support services, troubleshooting, and defect resolution. Patrick also does some side work with SentryOne’s Professional Services team, who provide training and consulting. Patrick’s blog focus is on helping SQL Sentry users get the most out of our products. His hope is to ensure our users are as knowledgeable and comfortable as possible with SQL Sentry products, so they have as much ammo as possible to solve real world SQL Server problems.