Indexes Matter—How Poor Index Management Can Ruin Query Performance
Ideally, database queries use the fewest possible resources: time, memory, bandwidth, and so on. Lower resource consumption means better query performance. To find relevant data in a table, a database query relies on lookup operations, and a table index can help the query find the values it needs efficiently. With a well-designed table index, a query can go straight to the data it needs instead of scanning (searching through) all the table data. Index management doesn't happen automatically, however, and indexes need ongoing maintenance to stay effective. This article shows how SolarWinds® SQL Sentry® can help you easily improve index performance.
Criteria for Choosing Indexes
This article focuses on two different index types: clustered and non-clustered. A clustered index maps the physical table data order to the index, as determined by one or more table columns. This means a table can have only one clustered index because the table can have only one physical order.
A non-clustered index exists as a structure separate from the table data, and it maps to the physical table data through pointers. To use a non-clustered index, a query looks in the index for the value it needs and then follows the pointer stored with that index entry to the matching row in the table. Non-clustered indexes don’t depend on the physical table order, so a table can support more than one non-clustered index.
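The difference between the two index types can be sketched in T-SQL. The table dbo.DemoOrders and its index names are hypothetical, made up for this illustration, and are not part of the RECIPES database used later in the article.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.DemoOrders
(
    OrderID    INT           NOT NULL,
    CustomerID INT           NOT NULL,
    OrderDate  DATE          NOT NULL,
    Amount     DECIMAL(10,2) NOT NULL
);

-- Only one clustered index is allowed per table:
-- the table rows themselves are kept in OrderID order.
CREATE UNIQUE CLUSTERED INDEX CIX_DemoOrders_OrderID
    ON dbo.DemoOrders (OrderID);

-- A table can have several non-clustered indexes. Each one is a
-- separate structure whose entries point back to the table rows.
CREATE NONCLUSTERED INDEX IX_DemoOrders_CustomerID
    ON dbo.DemoOrders (CustomerID);

CREATE NONCLUSTERED INDEX IX_DemoOrders_OrderDate
    ON dbo.DemoOrders (OrderDate);
```

A second CREATE CLUSTERED INDEX statement on the same table would fail, because the table can have only one physical order.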
If a table’s indexes don’t match the columns a query filters, joins, or sorts on, the query engine can’t use them and has to examine more table data, which hurts query performance. Such indexes simply don't help the query, and they still have to be maintained.
Many criteria go into choosing which columns to index and whether to make the index clustered or non-clustered. The central criterion is which columns appear most often in a query. For instance, in a specific table with two specific columns, column A and column B, queries might request column A data more often than they request column B data. In this case, column A should probably have the index.
SQL Server WHERE, JOIN, and GROUP BY query clauses can benefit from any indexes on the columns they cover because the query engine can look directly at the indexes for the rows they need, avoiding a data scan of the relevant table. Index design should therefore focus on the table columns most likely included in these query clauses.
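As a rough sketch of this idea, the statements below index the columns that one query uses in its WHERE, JOIN, and GROUP BY clauses. The tables dbo.DemoCustomers and dbo.DemoSales, their columns, and the index names are all hypothetical.

```sql
-- Hypothetical tables, for illustration only.
CREATE TABLE dbo.DemoCustomers
(
    CustomerID INT         NOT NULL PRIMARY KEY,  -- clustered by default
    Region     VARCHAR(20) NOT NULL
);

CREATE TABLE dbo.DemoSales
(
    SaleID     INT           NOT NULL PRIMARY KEY,
    CustomerID INT           NOT NULL,
    SaleDate   DATE          NOT NULL,
    Amount     DECIMAL(10,2) NOT NULL
);

-- Index the column used in the JOIN and GROUP BY clauses.
CREATE NONCLUSTERED INDEX IX_DemoSales_CustomerID
    ON dbo.DemoSales (CustomerID);

-- Index the column used in the WHERE clause.
CREATE NONCLUSTERED INDEX IX_DemoSales_SaleDate
    ON dbo.DemoSales (SaleDate);

-- A query whose clauses line up with the indexes above.
SELECT c.Region, s.CustomerID, SUM(s.Amount) AS TotalAmount
FROM dbo.DemoSales AS s
JOIN dbo.DemoCustomers AS c
  ON c.CustomerID = s.CustomerID
WHERE s.SaleDate >= '20240101'
GROUP BY c.Region, s.CustomerID;
```

Whether the optimizer actually uses an index depends on table size and statistics. On very small tables, a scan is often cheaper than an index lookup.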
A clustered index might offer somewhat better performance than a non-clustered index because its leaf level is the table data itself, so a query needs fewer lookups to reach the rows it wants. A clustered index should also cover a column with a unique value (or a group of columns with a unique combination of values) for each table row. If the key values aren't unique, SQL Server adds a hidden 4-byte uniquifier to rows with duplicate keys so it can identify each row. This increases storage and memory use, and it can hurt query performance.
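One common way to give a clustered index a unique key is an integer identity column declared as the primary key. This is a minimal sketch; the table dbo.DemoIngredients and the constraint name are hypothetical.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE dbo.DemoIngredients
(
    IngredientID INT IDENTITY(1,1) NOT NULL,  -- unique, ever-increasing integer
    Name         NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_DemoIngredients PRIMARY KEY CLUSTERED (IngredientID)
);

-- Confirm that the clustered index is unique.
SELECT name, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.DemoIngredients');
```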
A clustered index works best on integer key columns (SMALLINT, INT, and BIGINT) because integer keys are compact and quick to compare. Other data types, such as real, float, and especially character types, take more resources to store and compare as index values.
Changes to the indexed data, whether the index is clustered or non-clustered, lead to fragmentation of the index. This means the index data gradually loses its optimized storage structure.
As table data changes, SQL Server must update the non-clustered indexes to reflect those changes. This overhead can become expensive, so you should avoid non-clustered indexes on columns not involved in searches. A non-clustered index covering the smallest possible number of columns helps reduce index maintenance overhead.
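One way to spot non-clustered indexes that cost more than they return is to compare their reads and writes in the sys.dm_db_index_usage_stats view. The query below is a sketch rather than a complete tuning method. It runs against the current database, and the column aliases are chosen for this example.

```sql
-- Compare how often each non-clustered index is read
-- with how often it has to be updated.
SELECT
    OBJECT_NAME(i.object_id)                                    AS table_name,
    i.name                                                      AS index_name,
    ISNULL(us.user_seeks + us.user_scans + us.user_lookups, 0)  AS reads,
    ISNULL(us.user_updates, 0)                                  AS writes
FROM sys.indexes AS i
LEFT JOIN sys.dm_db_index_usage_stats AS us
       ON us.object_id   = i.object_id
      AND us.index_id    = i.index_id
      AND us.database_id = DB_ID()
WHERE i.type_desc = N'NONCLUSTERED'
  AND OBJECTPROPERTY(i.object_id, 'IsUserTable') = 1
ORDER BY writes DESC, reads ASC;
```

Indexes with many writes and few or no reads are candidates for review. The counters reset when the SQL Server service restarts, so judge them over a representative period of activity.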
Example: Query Diagnostics and Repair
INSERTs, UPDATEs, and DELETEs change where index pages are stored, so the physical order of the pages no longer matches the logical order of the index keys. In this way, the index becomes gradually or rapidly fragmented. Fragmentation hurts index performance, so you must address it. In this example, we'll first measure the problem with the SQL Server sys.dm_db_index_physical_stats dynamic management function. Depending on the amount of index fragmentation detected, we'll either reorganize or rebuild the indexes.
This query shows the degree of fragmentation in a SQL Server database named RECIPES:
The results are shown in Figure 1.
Figure 1: Measure SQL Server Index Fragmentation
This query joins the sys.indexes catalog view and the sys.sysobjects compatibility view with the sys.dm_db_index_physical_stats function to show the percentage of index fragmentation in the database tables. The NULL parameters at line 5 cover all indexes, across all tables, in the RECIPES database.
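A query of the kind described here can be sketched as follows. It is not the listing behind Figure 1, so its column list and line layout are assumptions and its line numbers won't match the reference above.

```sql
USE RECIPES;
GO

-- Fragmentation for every index in the RECIPES database.
-- The four NULL arguments mean: all objects, all indexes,
-- all partitions, and the default (LIMITED) scan mode.
SELECT
    o.name                          AS table_name,
    i.name                          AS index_name,
    ps.index_type_desc,
    ps.avg_fragmentation_in_percent,
    ps.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(N'RECIPES'), NULL, NULL, NULL, NULL) AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
JOIN sys.sysobjects AS o
  ON o.id = ps.object_id
WHERE i.name IS NOT NULL            -- skip heaps, which have no index name
ORDER BY ps.avg_fragmentation_in_percent DESC;
```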
The output in Figure 1 shows us PK_RECIPES suffers from only a small amount of fragmentation: slightly more than 10%. Microsoft suggests ALTER INDEX REORGANIZE for index fragmentation between 5% and 30%, so we’ll run this command. This will optimize the index without taking the table offline.
In contrast, the PK_CATEGORIES index suffers from more than 87% fragmentation. For index fragmentation greater than 30%, Microsoft recommends ALTER INDEX REBUILD, so we’ll use this. It drops and recreates the index in a single transaction, so any error forces a complete rollback of the rebuild. ALTER INDEX REORGANIZE doesn’t have this issue: if it is interrupted, the work already completed is kept.
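The 5% and 30% thresholds can be turned into a suggested action for each index. This is a sketch that runs against the current database. The suggested_action column and its labels are made up for this example, and the thresholds are guidelines rather than hard rules.

```sql
-- Map each index's fragmentation to the action suggested above.
SELECT
    OBJECT_NAME(ps.object_id)       AS table_name,
    i.name                          AS index_name,
    ps.avg_fragmentation_in_percent,
    CASE
        WHEN ps.avg_fragmentation_in_percent > 30 THEN 'REBUILD'
        WHEN ps.avg_fragmentation_in_percent >= 5 THEN 'REORGANIZE'
        ELSE 'NONE'
    END                             AS suggested_action
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ps
JOIN sys.indexes AS i
  ON i.object_id = ps.object_id
 AND i.index_id  = ps.index_id
WHERE ps.index_id > 0               -- ignore heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;
```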
First, we'll reorganize the PK_RECIPES index. In Object Explorer, drill down to the index and click Reorganize, as shown in Figure 2.
Figure 2: Reorganize a Table Index
Click OK as shown in Figure 3.
Figure 3: Reorganize the Table Index
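The same reorganization can also be run as a T-SQL statement instead of through Object Explorer. The article doesn't name the table that owns PK_RECIPES, so the table name dbo.RECIPES below is an assumption. Replace it with the actual table name.

```sql
USE RECIPES;
GO

-- Assumption: PK_RECIPES belongs to a table named dbo.RECIPES.
-- REORGANIZE defragments the index in place and keeps the table online.
ALTER INDEX PK_RECIPES ON dbo.RECIPES REORGANIZE;
```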
To rebuild the PK_CATEGORIES index, drill down to the index in Object Explorer and click Rebuild, as shown in Figure 4.
Figure 4: Rebuild the Table Index
Click OK as shown in Figure 5.
Figure 5: Rebuild the Table Index
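The rebuild has a T-SQL equivalent as well. As before, the table name dbo.CATEGORIES is an assumption, because the article names only the index.

```sql
USE RECIPES;
GO

-- Assumption: PK_CATEGORIES belongs to a table named dbo.CATEGORIES.
-- REBUILD drops and recreates the index in a single transaction.
ALTER INDEX PK_CATEGORIES ON dbo.CATEGORIES REBUILD;
```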
Figure 6 shows the defragmented indexes.
Figure 6: The Defragmented Indexes
The PK_RECIPES index still shows a little fragmentation because a reorganization defragments the existing index pages in place and might not eliminate all fragmentation. The PK_CATEGORIES index shows complete defragmentation because the rebuild dropped the existing index and recreated it from scratch.
Don’t Take Database Performance for Granted
Database query performance depends on the quality of your table indexes. With simple tools, we can measure this quality and improve it.