T-SQL Tuesday: Always Encrypted Limitations

T-SQL Tuesday

This month's T-SQL Tuesday event is being hosted by Ken Wilson (@_KenWilson), and the topic is encryption. I've been playing with SQL Server 2016 quite a bit, so I thought I would talk about a new feature there, Always Encrypted. There are plenty of materials out there singing its praises (and rightfully so); I decided to focus on the limitations, so you can understand what you are bound to in the current implementation of the feature.

Note that these observations are drawn mostly from experience with the CTP 2.2 build, and anything here is subject to change.

Background

As a primer, Always Encrypted is different from Transparent Data Encryption (TDE) in two major ways:

  • You encrypt at the column level, rather than the entire database.
  • The data is encrypted both at rest and in memory (decryption is handled by the client driver), meaning you can protect the data from both malicious administrators and man-in-the-middle attacks. (TDE only encrypts data at rest.)

There are two styles of encryption: deterministic and randomized. Deterministic gets you the same encrypted value every time, while randomized arrives at - at least theoretically - a different value every time (there is no guarantee about global uniqueness or anything like that). Deterministic is best used for columns where you are likely to perform point lookups or seeks, such as LastName perhaps (pseudo-code: WHERE LastName = <encrypt('Smith')>). Randomized is more secure, but should only be used for columns that are display only, such as Salary - I am not going to be looking for all of the people in the Employees table making exactly $84,500 (and if we're encrypting salary, we know we're not going to be performing range queries, either).

Ok, now on with the limitations...

Encryption Algorithms

Only one algorithm option is currently supported, and it's certainly a mouthful: AEAD_AES_256_CBC_HMAC_SHA_256. This is not something I'm ever going to memorize, and when testing out this feature, it quickly made it to the top of my clipboard list.

Columns/Data Types

String-based columns that are encrypted with deterministic encryption must use a _BIN2 collation (e.g. Latin1_General_BIN2).

The following data types are *not* supported as encrypted columns, per the documentation:

  • text/ntext/image
  • XML/hierarchyid/geography/geometry
  • alias types/user-defined data types
  • SQL_VARIANT
  • rowversion (timestamp)

These additional items are unsupported, but aren't listed in the documentation:

  • Sparse columnset (sparse columns are okay, as long as the table does not contain a columnset)
    • Msg 1740, Level 16
      Cannot encrypt the sparse column 'col' in the 'dbo.tbl' table because the table contains a sparse column
      set.
      Msg 1741, Level 16
      Cannot create the sparse column set 'col' in the 'dbo.tbl' table because the table contains one or more
      encrypted sparse columns.
  • Built-in alias types, e.g. SYSNAME
    • Msg 33280, Level 16
      Cannot create encrypted column 'col' because type 'sysname' is not supported for encryption.
  • IDENTITY columns
    • Msg 2749, Level 16
      Identity column 'col' must be of data type int, bigint, smallint, tinyint, or decimal or numeric
      with a scale of 0, unencrypted, and constrained to be nonnullable.
  • Computed columns
    • Msg 206, Level 16
      Operand type clash: nvarchar(32) encrypted with (encryption_type = 'DETERMINISTIC', 
      encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 
      'ColumnKey', column_encryption_key_database_name = 'EncryptTest') is incompatible with varchar

Other Feature-Related Restrictions

Currently, several features don't play well with Always Encrypted. These are the ones I found, but I am not certain whether these are limitations due to the early stage of the CTPs, the v1 stage of the feature, or if they will be permanent limitations.

  • Temporal tables - while you can add temporal columns to a table with encrypted columns, you get an error message as soon as you try to turn system versioning on:
    • Msg 13579, Level 16
      Table 'dbo.tbl' contains system-time PERIOD or SYSTEM_VERSIONING is ON and cannot have
      encrypted columns.
       

      And if you try to add an encrypted column to a table that already has system versioning on:

    • Msg 13550, Level 16
      Add column operation failed on table 'AE.dbo.tbl' because it is not supported operation on
      system-versioned temporal tables.
  • Triggers are partially supported, as long as you don't reference any of the encrypted columns (and you don't even have to do so directly - something as simple as SELECT * FROM inserted; will yield the operand type clash error (Msg 206)). I did not try to see if there were clean ways to make triggers actually succeed at runtime as long as they didn't mention the encrypted columns. I did try to create a trigger with a table variable that had encrypted columns as well (this required creating the master and column keys in tempdb too), but this only yielded this slightly different Msg 206 error:
    • Msg 206, Level 16, State 2
      Procedure TriggerName Operand type clash: nvarchar(32) encrypted with (encryption_type =
      'DETERMINISTIC', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', 
      column_encryption_key_name = 'ColumnKey', column_encryption_key_database_name = 'EncryptTest')
      is incompatible with nvarchar(32) encrypted with (encryption_type = 'DETERMINISTIC', 
      encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 
      'ColumnKey', column_encryption_key_database_name = 'tempdb')

I presume this is because the two certificates don't match exactly (try it, script them out; the ENCRYPTED_VALUE is different). This means that #temp tables, @table variables, etc. will pose difficulties.

  • Full-text search is not supported; if you try to create a full-text index, you will receive an error like the following:
    • Column 'col' cannot be used for full-text search because it is not a character-based, XML, 
      image or varbinary(max) type column or it is encrypted. (Microsoft SQL Server, Error: 7670)
  • Column 'col' cannot be used for full-text search because it is not a character-based, XML, image or varbinary(max) type column or it is encrypted. (Microsoft SQL Server, Error: 7670)
  • Replication is not supported; I was able to set up a basic publication, but just because it works now doesn't mean it will, or is meant to. I actually couldn't get a subscription initialized (though there were no obvious errors implicating encryption), and I just tried basic transactional replication, nothing exotic. Mirroring and Availability Groups are both supported (and Log Shipping should be okay as well, though I haven't tested that, either).
  • Change Data Capture could be turned on, but did not appear to work end-to-end. The CDC jobs never succeed:
    • Msg 206, Level 16
      Operand type clash: nvarchar(32) encrypted with (encryption_type = 'DETERMINISTIC', 
      encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 
      'ColumnKey', column_encryption_key_database_name = 'Archive') is incompatible with nvarchar 
      For more information, query the sys.dm_cdc_errors dynamic management view. 
      Msg 22863, Level 16
      Failed to insert rows into Change Data Capture change tables. Refer to previous errors in the
      current session to identify the cause and correct any associated problems. For more 
      information, query the sys.dm_cdc_errors dynamic management view.
  • (Change Tracking, on the other hand, works fine with Always Encrypted. I believe this is because only the key values are recorded, so by definition the history could never include encrypted columns.)
  • In-Memory OLTP - you cannot have always encrypted columns within in-memory tables:
    • Msg 10794, Level 16
      The feature 'ALWAYS ENCRYPTED' is not supported with memory optimized tables.
  • Stretch Database, a new feature in SQL Server 2016 which allows you to migrate portions of tables to Azure SQL Database, is documented as not supporting Always Encrypted.
  • Some other features I tried and seemed to work okay: filestream and columnstore indexes (both variations; they can even be in the key list for non-clustered columnstore indexes). I did not try filetable.

Operations

Columns using deterministic encryption support WHERE equality comparisons, as well as DISTINCT, JOIN, and GROUP BY. You cannot perform inequality, range, or LIKE queries, or any other operations against encrypted columns (arithmetic, date/time operations, etc.):

Msg 206, Level 16
Operand type clash: nvarchar(32) encrypted with (encryption_type = 'DETERMINISTIC', 
encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 
'ColumnKey', column_encryption_key_database_name = 'AE') is incompatible with nvarchar

Columns using randomized encryption are not supported in clauses at all:

Msg 33299, Level 16
Encryption scheme mismatch for columns/variables '@col', 'col'. The encryption scheme for the 
columns/variables is (encryption_type = 'RANDOMIZED', encryption_algorithm_name = 
'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'ColumnKey', 
column_encryption_key_database_name = 'AE') and the expression near line '7' expects it to be
(encryption_type = 'DETERMINISTIC') (or weaker).

Indexes / Constraints

You cannot create an index or constraint on a column that uses randomized encryption:

Msg 33282, Level 16
Column 'dbo.tbl.col' is encrypted using a randomized encryption type and is therefore not 
valid for use as a key column in a constraint, index, or statistics.

And foreign keys must match encryption types:

Msg 33281, Level 16
Column 'dbo.tbl1.col' has a different encryption scheme than referencing column 'tbl2.col' 
in foreign key 'FKName'.

Client Code

Client libraries need to be updated to support encryption and decryption of columns and parameters. Not all drivers will support this functionality, and some may never. Currently the only technology that supports this is .NET 4.6; ODBC and JDBC updates should come soon.

The connection string needs to be updated with the following additional attribute:

Column Encryption Setting=enabled;

Management Studio itself will require each client to have a copy of the column encryption key, or direct access to it. I will go over a more elaborate scenario in a future post, but for now you can assume that you will need to grant application users (including those of SSMS) the ability to see the master/column key metadata, with the following code:

GRANT VIEW ANY COLUMN MASTER KEY DEFINITION TO database_user;
GRANT VIEW ANY COLUMN ENCRYPTION KEY DEFINITION TO database_user;

As long as the Column Encryption Setting is added as an additional connection parameter, SSMS users will be able to see the decrypted values in the result of a query. But they will not be able to pass parameters to a stored procedure that inserts or updates encrypted values, because - unlike with an application, which handles the encryption - there is no round-trip between declaring the parameters and passing them to the stored procedure. Result:

Msg 33299, Level 16
Encryption scheme mismatch for columns/variables '@param'. The encryption scheme for the 
columns/variables is (encryption_type = 'PLAINTEXT') and the expression near line '0' expects 
it to be (encryption_type = 'DETERMINISTIC', encryption_algorithm_name = 
'AEAD_AES_256_CBC_HMAC_SHA_256', column_encryption_key_name = 'ColumnKey', 
column_encryption_key_database_name = 'AE') (or weaker).

There may be some way to work around this using SQLCMD mode, but I haven't explored too deeply yet.

For the client applications on machines other than where SQL Server 2016 is installed, they'll need an updated version of .NET Framework (not necessary on Windows 10 or if you have Visual Studio 2015), and the certificate installed (as I describe here). If the framework isn't modern enough, the connection string setting simply won't work (though I haven't confirmed the exact error message because all of my VMs running SQL Server 2016 are Windows 10). If the certificate used for the keys isn't there, the application will crash with errors like this, depending on the type of keys you generated on the database server:

Unhandled Exception: System.Data.SqlClient.SqlException: Failed to decrypt a column encryption 
key using key store provider: 'MSSQL_CERTIFICATE_STORE'. The last 10 bytes of the encrypted 
column encryption key are: '60-8E-2E-D3-09-8E-1D-18-10-E3'.
Certificate with thumbprint '8D2106DABD34DD557D208DD5BE9DFCD1D23CE1BF' not found in certificate
store 'My' in certificate location 'CurrentUser'. Verify the certificate path in the column 
master key definition in the database is correct, and the certificate has been imported 
correctly into the certificate location/store.
Parameter name: masterKeyPath —> System.ArgumentException: Certificate with thumbprint 
'8D2106DABD34DD557D208DD5BE9DFCD1D23CE1BF' not found in certificate store 'My' in certificate 
location 'CurrentUser'. Verify the certificate path in the column master key definition in the
database is correct, and the certificate has been imported correctly into the certificate 
location/store.

(It knows where to look for that certificate on the local machine because it gets the metadata from SQL Server. For some other information about master/column keys and particularly about key rotation, see this SQL Server Security Blog post.)

Encrypted parameters need to be passed as properly typed parameters. This type of ad hoc SQL will break, even when passed from your C# 4.6 application with the proper column encryption setting in the connection string:

SELECT cols FROM dbo.tbl WHERE EncryptedColumn = 'some value';

Msg 206, Level 16
Operand type clash: varchar is incompatible with varchar(8000) encrypted with (encryption_type =
'DETERMINISTIC', encryption_algorithm_name = 'AEAD_AES_256_CBC_HMAC_SHA_256', 
column_encryption_key_name = 'ColumnKey', column_encryption_key_database_name = 'AE') 
collation_name = 'SQL_Latin1_General_CP1_CI_AS'

You'll need to use SqlCommand and SqlParameter, with types and lengths that match exactly, in order to perform any of the supported operations against encrypted columns.

Finally, there is additional overhead required to negotiate the encryption handshakes, so if your application is really chatty, you'll definitely want to baseline your network performance and then compare once your columns are encrypted. There is also the consideration that encrypted data can take a lot more space than its unencrypted equivalent, whether or not you are using compression. As a companion to this post, I took a look at these two impacts in a separate post over on SQLPerformance.com.

There are some additional considerations for client code development, and there is a dedicated topic on that in Books Online.

Summary

With all of the data breaches happening these days, I am quite happy that there are advances being made in the protection of data. But understanding the limitations before jumping into it can be a valuable learning exercise. I'll repeat, because it can't be stressed enough, that these are the limitations I am observing in current builds today, and the status of any of these issues can change at any time. I'll also admit that this isn't necessarily an exhaustive list in terms of features; just the ones I thought most obvious to check.

Follow the #TSQL2sDay hash tag and Ken's post for more posts on encryption and SQL Server.

Thwack - Symbolize TM, R, and C