Introduction to SQL Server Internals and Architecture

Kevin Kline

Published On: February 25, 2020

Categories: SQL Server, Webinar 0

New appliances and gadgets typically come with instruction manuals, but let’s be honest—who reads those? It’s true that you can probably figure out how to operate a new appliance without the manual, but it might reveal hidden features or tips for optimal use that you wouldn’t have known about otherwise.

The same holds true for SQL Server. You can effectively do many IT jobs related to SQL Server without knowing the internals of how SQL Server works. In fact, many great developers, DBAs, and designers get their daily work completed on time and with reasonable quality while never really knowing what’s happening behind the scenes. But if you want to take your skills to the next level, it’s critical to know SQL Server’s internal processes and architecture.

During a recent SentryOne Webinar on Introduction to SQL Server Internals & Architecture, I shared answers to many questions regarding the internal operations of SQL Server such as:

  • What are the various areas of memory inside of SQL Server?
  • How are queries handled behind the scenes?
  • What does SQL Server do with procedural code, like functions, procedures, and triggers?
  • What happens during checkpoints? Lazywrites?
  • How are IOs handled with regards to transaction logs and databases?
  • What happens when transaction logs and databases grow or shrinks?

For those topics we didn’t cover, I provided resources where you can get more information.

If you missed the live webinar, click here to view it on-demand.

SQL Server Internals & Architecture

During the webinar, I received some great questions from attendees. For the submitted questions that we didn’t have time to answer, I’ve provided the answers below.

Webinar Q&A

Q: Do you plan to discuss SQL Server 2019 featuresin particular, the new health check at the database level added for databases in Availability Groups (AG)? Is that a good feature use to failover the AG if the database is unhealthy? Does it not impact other databases in AG?

A: We are not diving into new features of SQL Server 2019 during this webinar. And to be honest, I haven’t studied the health check feature in detail. However, to your second question, failing over a database in an AG can affect the performance of other databases in the AG when rolling over to a secondary since the load on that secondary can dramatically change. That is a normal and expected behavior.

 

Q: In SQL standard 2016 AGs, can you use the same IP for multiple listeners?

A: You can create only one listener per availability group through SQL Server. Typically, each availability group requires only one listener. However, some customer scenarios require multiple listeners for one availability group. After creating a listener through SQL Server, you can use Windows PowerShell for failover clusters or the WSFC Failover Cluster Manager to create additional listeners. For more information, check out this resource.

 

Q: Are all the connections encrypted to the database server by default using Transport Layer Security (TLS) encryption?

A: No. Plain text is the default unless you use something like AlwaysEncrypted.

 

Q: How do sp_prepare and sp_execute affect these initial steps for a query?

A: The differences would kick in at the query optimizer phases coming up shortly. Brent Ozar has a great blog post about the differences.

 

Q: Are Trivial plans not in Cache (Cache Stats)?

A: Yes. If you want trivial plans to remain in the plan cache, you'd have to use a trace flag. This can be dangerous because it can bloat the plan cache to massive proportions, while simultaneously depriving the data cache and other parts of the buffer pool (depends a lot on the version of SQL Server as to which parts will be starved for memory, but always the data cache). And since trivial plans are so easy to create, SQL Server doesn't bother with them under normal circumstances.

 

Q: Where are query hints processed and why are they sometimes ignored?

A: When using query hints, you are virtually assured that the query optimizer will go through Full Optimization. The optimizer might ignore hints for a variety of reasons. For example, QUERYTRACEON trace_flag at the individual query level won’t return an error or warning if an unsupported trace flag number is used. In the same way, if the specified trace flag is not one that affects a query execution plan, the option will be silently ignored. In some cases, you’ll get an error 8622, but in many other places SQL Server will simply ignore a malformed hint. An example of this happens when directing a query to use an index that doesn’t satisfy any of the search arguments.

 

Q: Can we copy and restore a query plan to another server or another query in SQL Server? How do you do it on a single server?

A: It's easy to do that on a single server. I believe it is theoretically possible to do that across two different servers, especially if you are using a query store. However, I believe it would be next to impossible in practice. That's because even if we have objects all with the same name, they may not have the same object IDs. So, there could easily be differences in the plans at a binary level.

Regarding your second question—assuming you have SQL Server 2014 or later—you may want to read more about the basics of Query Store. Details can be found here.

 

Q: How does the transaction manager handle the different isolation levels?

A: Transaction isolation level settings enable you to change the default behavior of locking (and blocking) at an instance- or transaction-level. Use with caution.

The default is Read Committed. If you use the statement SET TRANSACTION ISOLATION LEVEL, you can increase or decrease the "isolation" of each request for records, whether to read or write to those records.

The transaction manager, specifically its lock manager component, enforces either the default isolation level or whatever level has been explicitly declared. So, at default, the lock manager says, "Writers block everyone else. Readers do not block readers. But readers are blocked from reading data that has been modified but not committed by other transactions."

 

Q: Does checkpoint harden the pages from data cache (memory) or disk cache (storage cache)?

A: I think I know what you mean, assuming by disk cache you mean something like a write cache on a hard disk or SSD. If so, read on. Checkpoint only watches for an acknowledgment from Windows passed to it from the I/O subsystem. Thus, data is changed in the data cache. The checkpoint tells SQL Server to harden that page(s) and SQL Server passes those pages to Windows to the I/O subsystem which sends back an acknowledgment that the data has been hardened. However, neither Windows nor SQL Server knows if that data was written to a cache on the SAN or to the lowest level (disks/SSDs) of the SAN. That's the nature of the beast.

 

Q: If Ghost Cleanup is blocking other transactions, is that an indication that more memory is needed? Or is there a way to relieve or clear that block?

A: Have you ever experienced Ghost Cleanup blocking any other operation? I suspect not, because by its very nature ghost cleanup only accesses objects in memory that have not been handled in quite a while. There are PerfMon counters and SQL Server Dynamic Management Views (DMVs) that are much better ways of determining if there's memory pressure.

 

Q: Can compression be enabled for MAX columns (text data types)? If so, what is the preferred method of MAX compression?

A: SQL Server has three ways to handle MAX data types: IN_ROW_DATA, ROW_OVERFLOW_DATA, and LOB_DATA. If you use data compression—say page compression—it =can= compress anything stored in the IN_ROW_DATA region because the page data compression algorithm only functions against IN_ROW_DATA. But that's probably only a very small percentage of data stored in a MAX data type. You can use the following query to see how much data is in each of the three regions for a MAX data type:

SELECT
     OBJECT_NAME(sp.object_id) AS [ObjectName]
     ,si.name AS IndexName
     ,sps.in_row_data_page_countas In_Row
     ,sps.row_overflow_used_page_countAS Row_Over_Flow
     ,sps.lob_reserved_page_countAS LOB_Data
FROM
     sys.dm_db_partition_stats sps
     JOIN sys.partitions sp
           ON sps.partition_id=sp.partition_id
     JOIN sys.indexes si
           ON sp.index_id=si.index_id AND sp.object_id = si.object_id
WHERE
     OBJECTPROPERTY(sp.object_id,'IsUserTable') =1
order by sps.in_row_data_page_count desc

Additional Resources

I’d be remiss if I didn’t repeat a statement I made earlier in this blog: You can effectively do many IT jobs related to SQL Server without knowing the internals of how it works, but if you want to take your skills to the next level, it’s critical to know SQL Server’s internal processes and architecture.

If you missed the live webinar, click here to view it on-demand. Also, for additional information, check out the following resources.

Kevin (@kekline) serves as Principal Program Manager at SentryOne. He is a founder and former president of PASS and the author of popular IT books like SQL in a Nutshell. Kevin is a renowned database expert, software industry veteran, Microsoft SQL Server MVP, and long-time blogger at SentryOne. As a noted leader in the SQL Server community, Kevin blogs about Microsoft Data Platform features and best practices, SQL Server trends, and professional development for data professionals.


Comments