Monitoring and Alerting on High Context Switches with SentryOne
When the kernel switches a processor from one thread to another, a context switch occurs. Here’s an example of context switching in everyday life. Imagine you’re reading your favorite book. Suddenly, someone calls out to you from another room. You place a bookmark in your book and go into the other room to see what they need. You have switched context from one task to another, and you will resume the first task once the new one is complete.
When it comes to your servers and their performance, you will want them to finish tasks quickly, rather than placing too many bookmarks. The metric Context Switches/sec can be used to see how often context switching is occurring and gain insight into performance issues on your server. Consistently high values for Context Switches/sec and Processor: % Processor Time are indicative of processor bottlenecks.
In this blog post, we will explore how to track context switches within your SentryOne installation and alert on them, and why you should care.
Using SentryOne to Monitor Context Switches
SentryOne includes many built-in reports that you can leverage to gain insight into performance issues. The Performance Counter History report lets you choose multiple counters across several targets to build a customized report view.
To run this report, navigate to Reports > Performance Analysis > Performance > Performance Counter History in SentryOne. The window that displays will allow you to add the performance counters of choice to the report and provide you with a graph, such as the one shown below.
The counter that displays the sum of Context Switches/sec for all processors on the machine
This metric can be important to track because a high rate of context switching indicates that processes on the server are competing for processor time. Instead of allowing the processor to do its work, more time is spent suspending processes, switching threads, and starting others up.
This type of performance issue occurs more frequently on systems that are hosting other resource-hungry applications alongside SQL Server. Additionally, servers with multiple SQL Server instances are subject to this type of performance issue.
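The average Context Switches/sec value should be below 7,500 per processor. Because the counter reports the sum for all processors on the machine, divide the value by the number of processors before comparing it to this threshold. Values that stay above the threshold can indicate that the server is spending too much time switching threads instead of actively running them.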
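The per-processor comparison described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not something SentryOne provides: the function names, the sample readings, and the processor counts are all hypothetical, and the 7,500 figure is the guideline quoted in this post.

```python
from statistics import mean

# Guideline quoted in this post: average Context Switches/sec per processor.
PER_PROCESSOR_THRESHOLD = 7_500


def per_processor_average(samples, logical_processors):
    """Average system-wide Context Switches/sec samples, then normalize per processor."""
    if logical_processors < 1:
        raise ValueError("logical_processors must be at least 1")
    if not samples:
        raise ValueError("samples must not be empty")
    return mean(samples) / logical_processors


def exceeds_guideline(samples, logical_processors, threshold=PER_PROCESSOR_THRESHOLD):
    """Return True when the per-processor average is above the threshold."""
    return per_processor_average(samples, logical_processors) > threshold


# Hypothetical system-wide readings (the sum across all processors).
samples = [40_000, 44_000, 36_000, 40_000]

for processors in (8, 4):
    average = per_processor_average(samples, processors)
    flagged = exceeds_guideline(samples, processors)
    print(f"{processors} processors: {average:.0f} per processor, over guideline: {flagged}")
```

The same system-wide readings fall under the guideline on an 8-processor machine and over it on a 4-processor machine, which is why the raw counter value should not be compared to 7,500 directly.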
In addition to the Performance Counter History report, the Sample Mode of the Performance Analysis Dashboard allows you to see this metric in near real time. Click the Sample Mode button at the top of the SentryOne client when on the dashboard to switch to this view.
You can also select a snapshot from the dashboard and view this metric at a specific point in time. Click a point in time on the dashboard in History Mode, then right-click and choose Sample Mode.
CPU usage chart on a target server in History Mode
The chart below shows the number of context switches during the selected snapshot.
CPU usage chart on a target server in Sample Mode
You also have the option to run a Quick Report, which is the same report as the Performance Counter History report, except that it is limited to a single counter and its time range defaults to the range selected in History Mode. To run the Quick Report, right-click the metric in Sample Mode and select the Quick Report option.
Detecting the Problem: SentryOne Advisory Conditions
Although a spike in Context Switches is not always indicative of a problem, it certainly can help drive your investigation into performance issues. To detect potential issues and be notified when this metric spikes, we recommend leveraging SentryOne Advisory Conditions.
The Advisory Conditions Pack includes conditions for High Context Switches and High Context Switches - Warning. If you are experiencing performance issues, enable these conditions and set the action to Send Email. For more information about Advisory Conditions, please visit the SentryOne Advisory Conditions documentation.
You might find the default values in many conditions to be on the sensitive side, which is by design, as they are meant to be tuned over time. For information about how to tune Advisory Conditions, see Patrick Kelley’s blog post, “SentryOne Tips & Tricks: Tuning Advisory Conditions.”
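To make the difference between a one-off spike and a sustained problem concrete, here is a minimal Python sketch that flags a series only when several consecutive samples exceed a threshold. It does not reproduce how SentryOne evaluates Advisory Conditions. The function name, the threshold, and the sample values are assumptions chosen for illustration.

```python
def sustained_breach(samples, threshold, min_consecutive):
    """Return True if at least `min_consecutive` samples in a row exceed `threshold`."""
    if min_consecutive < 1:
        raise ValueError("min_consecutive must be at least 1")
    run = 0  # length of the current streak of samples above the threshold
    for value in samples:
        run = run + 1 if value > threshold else 0
        if run >= min_consecutive:
            return True
    return False


# Hypothetical per-processor Context Switches/sec readings.
isolated_spikes = [5_000, 9_000, 5_200, 9_400, 5_100]
sustained_load = [5_000, 9_000, 9_500, 8_200, 5_100]

print(sustained_breach(isolated_spikes, threshold=7_500, min_consecutive=3))
print(sustained_breach(sustained_load, threshold=7_500, min_consecutive=3))
```

Raising `min_consecutive` makes the check less sensitive to short bursts, and lowering it makes the check fire sooner. That is the same trade-off you weigh when tuning a condition's default values.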
Evaluating Context Switches in Combination with Other Key Metrics
Keep in mind that the Context Switches counter might not provide complete insight into performance issues when viewed in a vacuum. However, when looking at this counter in combination with other metrics such as Processor Time, Queue Length, and Parallelism Waits, you can begin to paint a more accurate picture of what is going on within your environment.
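As a rough illustration of reading these counters together, the Python sketch below lists which CPU-pressure signals are present in a single snapshot. Only the 7,500 per-processor figure comes from this post. The 80% processor time limit and the queue length limit of 2 per processor are placeholder assumptions to tune for your environment. The class and function names are hypothetical, and parallelism waits are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class CpuSnapshot:
    """One point-in-time reading of the counters discussed above."""
    context_switches_per_sec: float  # system-wide sum across processors
    processor_time_pct: float        # % Processor Time, 0 to 100
    processor_queue_length: int      # threads waiting for a processor
    logical_processors: int


def cpu_pressure_signals(snap, ctx_per_proc_limit=7_500.0,
                         cpu_pct_limit=80.0, queue_per_proc_limit=2.0):
    """Return the names of the counters that exceed their (assumed) limits."""
    if snap.logical_processors < 1:
        raise ValueError("logical_processors must be at least 1")
    signals = []
    if snap.context_switches_per_sec / snap.logical_processors > ctx_per_proc_limit:
        signals.append("context_switches")
    if snap.processor_time_pct > cpu_pct_limit:
        signals.append("processor_time")
    if snap.processor_queue_length / snap.logical_processors > queue_per_proc_limit:
        signals.append("queue_length")
    return signals


# Hypothetical snapshots from an 8-processor server.
busy = CpuSnapshot(96_000, 93.0, 24, 8)
switching_only = CpuSnapshot(96_000, 35.0, 1, 8)

print(cpu_pressure_signals(busy))
print(cpu_pressure_signals(switching_only))
```

A snapshot that trips all three signals points more strongly toward a processor bottleneck than one where only context switches are elevated, which mirrors the advice to avoid judging the counter in a vacuum.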
Dexter is a Customer Success Engineer with a passion for helping customers navigate through SentryOne. With several years of experience on the Support team, Dexter makes it a priority to resolve any issues customers might be facing. Since moving to Customer Success in November 2019, he has begun to take charge in ensuring customers have the knowledge to leverage SentryOne to resolve SQL Server related issues through Tips and Tricks, alert optimization, and blog posts.