Basic SQL Server Query Tuning Secrets Every SQL Admin Should Know
The performance of your applications is a complex, multi-layered puzzle. Performance can be negatively impacted at the application layer or even by remote calls to networked services. However, the most common bottleneck for applications is the data storage layer. The most common data storage tier for applications is a relational database, whose performance can vary widely depending on query optimization. Relational databases are a black box to many developers; however, they’re incredibly well instrumented pieces of software. In this post, you’ll learn how to unravel query performance in your SQL Server and Azure SQL databases.
Understand Query Execution
The most complex part of query execution is known as query optimization, when the database engine generates what’s known as an execution plan for a given query. This execution plan is a representation—either text or graphical—of the operations the database engine needs to perform to complete a given query. Execution plans themselves are a complex topic (and the subject of another post in this series), but having some basic understanding of plans and query metrics gives you a much better understanding of query performance.
Compilation and Caching
The first time you execute a query in SQL Server or Azure SQL Database, your query goes through a process known as compilation. The query is first parsed to ensure it uses valid syntax and references valid objects. After the parsing process is complete, the database engine generates several execution plans and chooses the one it thinks is most efficient for a given query. These execution plans are built with assumptions based on the statistics associated with columns and indexes. These statistics quantify the distribution of values within an index or column and allow the engine to make its best guess.
Secret 1: Ensuring you have up-to-date statistics allows the database engine to make its best decisions. While SQL Server auto-updates statistics from time to time, you may need to update statistics more frequently to guarantee efficient query plans.
This compilation process is expensive from a CPU perspective, so SQL Server caches execution plans in an area called the execution plan cache. If the query has the same hash value as a previously executed query (and still exists in the plan cache), the engine bypasses optimization and uses the previously compiled execution plan. This is a good thing from the perspective of CPU usage, but it can lead to variations in performance due to a behavior called parameter sniffing.
One of the most common questions application developers ask is, “Why does my query run quickly sometimes and slowly at other times?” Consider the following stored procedure:
This is a simple procedure against the AdventureWorks 2017 sample database. If you execute the procedure with the following two parameters, you’ll see significantly different performance.
The reason for the variation in performance is the query optimizer generated an execution plan optimized for a parameter value of 897 on the first execution of the query. The value of 897 only occurs twice in the table, whereas there are 4,688 instances of 870. Since the database engine caches execution plans, it examines the frequency of the parameter only when it caches the execution plan. Although this behavior may seem like a negative in this case, it generally helps the overall performance of the database engine, as building execution plans is resource-intensive. Working with data skew and how it affects parameter sniffing is a much deeper topic, but this caching behavior is something all administrators and developers should understand.
To improve something, you need to be able to measure it. The SQL Server engine includes a great deal of built-in instrumentation to measure query performance. The first and easiest thing to use is the timing and statistics mechanism built into the engine. This can be enabled in your session by executing the following code:
In the messages returned for the query, you’ll see the execution time and the number of reads the query performed. Here’s an example of the output for the stored procedure created above:
This data is really useful because it helps you see the impact of changes you may have made to your code or indexes and gives you an overall view of the efficiency (fewer reads are better) of your queries during the development process.
Secret 2: Always measure your query performance—metrics are your path to better code.
Query statistics are useful during development and testing phases, but they’re harder to use in production scenarios. Newer versions of SQL Server (SQL Server 2016+ and Azure SQL) include a feature called the Query Store, which allows you track performance and—more importantly—the variation in your performance over time. You should use this and/or a monitoring tool such as SolarWinds® SQL Sentry® to quickly and effectively monitor your performance.
Always Be Measuring Queries
As you can see, monitoring the performance of your queries is key to your application’s performance. Monitoring performance at scale across multiple servers is challenging, a big effort for DBA teams large and small. By capturing trends across this data, a SQL monitoring tool can improve your productivity and allow you to find performance issues more quickly.
When you start researching monitoring tools, take a look at SolarWinds SQL Sentry. SQL Sentry provides the necessary visibility into your data, keeping your pipeline humming. Get started with a free trial, and see for yourself what it can do.