Be Mindful of SQL Server TempDB Use (AKA TempDB Parasites)

This post is a refresh of the "tempdb parasites" topic I was covering back in 2014. Microsoft has done a lot over the last several years to help SQL Server professionals with tempdb concerns. The most recent was enabling a shift of tempdb metadata to In-Memory. Erin Stellato (b|t) covers this change and provides some great test results on it here. At the same time, we need to remain mindful of how we configure and utilize tempdb.

SQL Server instances only have one tempdb. That makes tempdb a single point of contention. What’s worse is that there are a lot of things going on in tempdb that you may not be aware of. Let’s evaluate a list of some things that can use tempdb while we’re not looking. These are all still valid concerns in 2020.

Table Variables Use Tempdb

There is still some misinformation out there telling you that table variables are treated differently from #temp tables. They are to a degree, but not in the way that logic might lead us to believe. Table variables are not stored in a special memory buffer at all times. Both table variables and #temp tables end up being stored in the same way. In SQL Server both are stored in tempdb.

For some great reference, there is an incredibly detailed answer on Stack Exchange discussing this. I can't come up with much that will add to it.

The next time you’re trying to remember this though, just think of this little lyric: “Table variables: convenient though they may be, they are still stored in tempdb.”

Query Plan Operations Use Tempdb

Certain plan operations will use or "spill" to tempdb. They're outlined in this older “Working With TempDB ” white paper. They are still important to keep in mind today.

  • Sorts (including distinct)
  • Hash (aggregates and joins)
  • Spools (table and index)

Let’s take a moment to show the effect a sort spill can have on tempdb. I was having a tough time getting a sort spill to happen. Aaron Bertrand (b|t) stopped by with an idea for a nasty query. I tweaked it just a little bit to keep it from running all day. In the end, it gave me what I was looking for. (Thanks Aaron!)

SET NOCOUNT ON

USE tempdb
GO

SELECT internal_objects_alloc_page_count * 8 / 1024 [Before MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

SELECT TOP 1000 *
FROM (
     SELECT TOP 5000 *
     FROM sys.all_columns
     ) col crossjoin(SELECT TOP 5000 * FROM sys.all_objects) obj orderby newid()

SELECT internal_objects_alloc_page_count * 8 / 1024 [After MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

Note the DMV “dm_db_task_space_usage.” I’m using this to tell me space allocation for internal objects in tempdb. It is documented here.

And the results…

Before MBs
--------------------
0

After MBs
--------------------
2221

See how I have no internal space allocated in tempdb before the query, but afterward, I have allocated 2221 MB? That is where the sort spilled to tempdb.

In the Plan Explorer, we can see a warning about it:

Sort Operation in Plan Explorer Sort Operation Tooltip in Plan Explorer

How do I know for sure it was mostly the sort? I commented that order by out and ran it again. Note that it is likely not absolutely zero, because I'm using integer math rather than something more precise.

More results…

Before MBs
--------------------
0

After MBs
--------------------
0

Cursors Use Tempdb

Keyset cursors store their keyset in a tempdb table and static cursors store the entire result set in a tempdb table. I’m not going say “never use cursors.” I firmly believe that we should never say never (oops!). I will suggest to keep the potential impact on tempdb in mind when declaring a cursor. I'm also aware that keyset cursors have limited practical use. Keep that in mind as well.

It is worth noting a post Aaron did on using different cursor options.

In that post, Aaron covers the options LOCAL and FAST_FORWARD. You can and should use these with the majority of the cursors you might need.

Let’s take a moment to show the effect a static cursor can have on tempdb.

I have a database named “tests” with a table defined:

CREATE TABLE test (
     id INT IDENTITY(1, 1) NOT NULL
     , [data] NVARCHAR(max) NULL
     )

Now, I’ll add some rows. This should give me close to 5MB per row worth of nvarchar data. Run 100 times, I should get somewhere close to a 500MB table. This does a quick check for that as well.

INSERT dbo.test ([data])
SELECT REPLICATE(CONVERT(NVARCHAR(max), 'A'), 2500000) GO 100

SELECT SUM(stat.used_page_count) * 8 / 1024 MBs
FROM sys.dm_db_partition_stats stat
WHERE stat.[object_id] = object_id('dbo.test')

Results…

MBs
--------------------
487

Not quite 500, but close enough for this. ;)

This query will check tempdb size, declare a static cursor, then check tempdb size again to see what we caused:

SET NO COUNT ON

USE tempdb
GO

SELECT internal_objects_alloc_page_count * 8 / 1024 [Before MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

DECLARE c CURSOR STATIC
FOR
SELECT [data]
FROM tests.dbo.test

OPEN c

SELECT internal_objects_alloc_page_count * 8 / 1024 [After MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

CLOSE c

DEALLOCATE c

SELECT internal_objects_dealloc_page_count * 8 / 1024 [Deallocate MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

Here are my results:

Before MBs
--------------------
0

After MBs
--------------------
487

Deallocate MBs
--------------------
487



As you can see, before the cursor, I have 0 allocated internal object pages, then the entire thing is stuffed into tempdb. Finally, when I deallocate, we can see those pages are marked for deallocation as well.

One thing I was thinking while I was doing this was questioning whether or not this happened if I never even opened the cursor. I tried that out, and it doesn’t. It is not until you open the cursor that the tempdb objects are allocated.

Triggers Use Tempdb

Since SQL Server 2005, triggers use the version store, which is kept in tempdb. So, if you’re using triggers, they are implicitly using tempdb. Remember that triggers are set-based, and you’ll get version data for every affected row. For really big batches, this can turn out to be quite a bit of version data.

Let’s see an example. This is using my test table from the cursors section above. I’m going to define an update trigger:

USE tests
GO

CREATE TRIGGER tUpdateTest ON dbo.test
FOR UPDATE
AS
DECLARE @var NVARCHAR(max)
DECLARE @pages INT

SELECT @pages = internal_objects_alloc_page_count * 8 / 1024
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

PRINT '[During MBs] ' + cast(@pages AS VARCHAR)

SELECT @var = deleted.[data]
FROM deleted

This technically doesn’t do much of anything. I wanted to make sure we’re accessing the deleted table and that we’re checking tempdb internal object allocation inside the trigger.

Now I’ll run a batch that checks tempdb allocation before I run an update against every row. Our results will have the starting tempdb allocation size and the print statement from the trigger. Let’s see what happens.

SELECT internal_objects_alloc_page_count * 8 / 1024 [Before MBs]
FROM sys.dm_db_task_space_usage
WHERE session_id = @@SPID
     AND database_id = 2

UPDATE tests.dbo.test
SET [data] = [data] + 'x'

And the results…

Before MBs
--------------------
0


And from the print statement

[During MBs] 493

See how all that version data got dumped into tempdb? We generally know about, or easily discover, this for things like snapshot isolation or for the redo queue on Availability Group replicas. For triggers, I had been overlooking it. That is likely because triggers have been in SQL Server since before I was spending a lot of time considering more advanced performance topics. Remember to keep this one on your list of things to look out for with tempdb.

LOB Variables Use Tempdb

Do you need a variable to hold a big XML document? Maybe you need a parameter holding a large nvarchar(max)? These can be stored as internal objects in tempdb. Let’s see what impact this might have.

This time I don’t need a table. I only need a variable and a loop. Each loop we will add another 100k characters to the string. On each trip through the loop we'll review tempdb usage for this task along with the length of the variable. This will let us see at what point the variable goes to tempdb.

DECLARE @bigVar NVARCHAR(max) = ''

WHILE (LEN(@bigVar) < 1000000)
BEGIN
     SELECT @bigVar = @bigVar + REPLICATE(CONVERT(NVARCHAR(max), 'x'), 100000)

     SELECT LEN(@bigVar) [Variable Length]
          , internal_objects_alloc_page_count * 8 / 1024 [TempDB MBs]
     FROM sys.dm_db_task_space_usage
     WHERE session_id = @@SPID
          AND database_id = 2
END

I’m adding 100k to the string variable each loop, so I should get 10 rows in my results:

Variable Length      TempDB MBs
-------------------- --------------------
100000               0


Variable Length      TempDB MBs
-------------------- --------------------
200000               0


Variable Length      TempDB MBs
-------------------- --------------------
300000               1


Variable Length      TempDB MBs
-------------------- --------------------
400000               2


Variable Length      TempDB MBs
-------------------- --------------------
500000               3


Variable Length      TempDB MBs
-------------------- --------------------
600000               4


Variable Length      TempDB MBs
-------------------- --------------------
700000               6


Variable Length      TempDB MBs
-------------------- --------------------
800000               7


Variable Length      TempDB MBs
-------------------- --------------------
900000               9


Variable Length      TempDB MBs
-------------------- --------------------
1000000              11

For the first 2 loops, The string in stored in memory, but somewhere between 200k and 300k characters it was moved to tempdb storage. By the time I get to 1m characters, I’m using over 10MB in tempdb. This may not have a profound effect on tempdb usage or performance. At the same time, you can imagine what might happen if you had an application with 5000+ concurrent connections all performing operations like this. It adds up.

Summary

The purpose of this post isn’t so much to define everything that SQL Server does with tempdb. The intention is point out some specific scenarios that can cause tempdb usage when tempdb usage is not explicit. For these cases, tempdb might is not likely to be top of mind while coding. These are parasites that can slowly leech tempdb throughput little by little. We should make an effort to be mindful of them when designing and developing a system that utilizes SQL Server. Interested in learning more about SQL Sentry? Download a free trial today.