
This month's T-SQL Tuesday event is being hosted by Ken Wilson (@_KenWilson), and the topic is encryption. I've been playing with SQL Server 2016 quite a bit, so I thought I would talk about a new feature there, Always Encrypted. There are plenty of materials out there singing its praises (and rightfully so); I decided to focus on the limitations, so you can understand what you are bound to in the current implementation of the feature.
Note that these observations are drawn mostly from experience with the CTP 2.2 build, and anything here is subject to change.
Background
As a primer, Always Encrypted is different from Transparent Data Encryption (TDE) in two major ways:
- You encrypt at the column level, rather than the entire database.
- The data is encrypted both at rest and in memory (decryption is handled by the client driver), meaning you can protect the data from both malicious administrators and man-in-the-middle attacks. (TDE only encrypts data at rest.)
There are two styles of encryption: deterministic and randomized. Deterministic gets you the same encrypted value every time, while randomized arrives at - at least theoretically - a different value every time (there is no guarantee about global uniqueness or anything like that). Deterministic is best used for columns where you are likely to perform point lookups or seeks, such as LastName perhaps (pseudo-code: WHERE LastName = <encrypt('Smith')>
). Randomized is more secure, but should only be used for columns that are display only, such as Salary - I am not going to be looking for all of the people in the Employees table making exactly $84,500 (and if we're encrypting salary, we know we're not going to be performing range queries, either).
Ok, now on with the limitations...
Encryption Algorithms
Only one algorithm option is currently supported, and it's certainly a mouthful: AEAD_AES_256_CBC_HMAC_SHA_256. This is not something I'm ever going to memorize, and when testing out this feature, it quickly made it to the top of my clipboard list.
Columns/Data Types
String-based columns that are encrypted with deterministic encryption must use a _BIN2 collation (e.g. Latin1_General_BIN2).
The following data types are *not* supported as encrypted columns, per the documentation:
- text/ntext/image
- XML/hierarchyid/geography/geometry
- alias types/user-defined data types
- SQL_VARIANT
- rowversion (timestamp)
These additional items are unsupported, but aren't listed in the documentation:
- Sparse columnset (sparse columns are okay, as long as the table does not contain a columnset)
- Built-in alias types, e.g. SYSNAME
- IDENTITY columns
- Computed columns
Other Feature-Related Restrictions
Currently, several features don't play well with Always Encrypted. These are the ones I found, but I am not certain whether these are limitations due to the early stage of the CTPs, the v1 stage of the feature, or if they will be permanent limitations.
- Temporal tables - while you can add temporal columns to a table with encrypted columns, you get an error message as soon as you try to turn system versioning on:
- Triggers are partially supported, as long as you don't reference any of the encrypted columns (and you don't even have to do so directly - something as simple as
SELECT * FROM inserted;
will yield the operand type clash error (Msg 206)). I did not try to see if there were clean ways to make triggers actually succeed at runtime as long as they didn't mention the encrypted columns. I did try to create a trigger with a table variable that had encrypted columns as well (this required creating the master and column keys in tempdb too), but this only yielded this slightly different Msg 206 error:
I presume this is because the two certificates don't match exactly (try it, script them out; the ENCRYPTED_VALUE
is different). This means that #temp tables, @table variables, etc. will pose difficulties.
- Full-text search is not supported; if you try to create a full-text index, you will receive an error like the following:
-
Column 'col' cannot be used for full-text search because it is not a character-based, XML, image or varbinary(max) type column or it is encrypted. (Microsoft SQL Server, Error: 7670)
- Replication is not supported; I was able to set up a basic publication, but just because it works now doesn't mean it will, or is meant to. I actually couldn't get a subscription initialized (though there were no obvious errors implicating encryption), and I just tried basic transactional replication, nothing exotic. Mirroring and Availability Groups are both supported (and Log Shipping should be okay as well, though I haven't tested that, either).
- Change Data Capture could be turned on, but did not appear to work end-to-end. The CDC jobs never succeed:
- (Change Tracking, on the other hand, works fine with Always Encrypted. I believe this is because only the key values are recorded, so by definition the history could never include encrypted columns.)
- In-Memory OLTP - you cannot have always encrypted columns within in-memory tables:
- Stretch Database, a new feature in SQL Server 2016 which allows you to migrate portions of tables to Azure SQL Database, is documented as not supporting Always Encrypted.
- Some other features I tried and seemed to work okay: filestream and columnstore indexes (both variations; they can even be in the key list for non-clustered columnstore indexes). I did not try filetable.
Operations
Columns using deterministic encryption support WHERE equality comparisons, as well as DISTINCT, JOIN, and GROUP BY. You cannot perform inequality, range, or LIKE queries, or any other operations against encrypted columns (arithmetic, date/time operations, etc.):
Columns using randomized encryption are not supported in clauses at all:
Indexes / Constraints
You cannot create an index or constraint on a column that uses randomized encryption:
And foreign keys must match encryption types:
Client Code
Client libraries need to be updated to support encryption and decryption of columns and parameters. Not all drivers will support this functionality, and some may never. Currently the only technology that supports this is .NET 4.6; ODBC and JDBC updates should come soon.
The connection string needs to be updated with the following additional attribute:
Management Studio itself will require each client to have a copy of the column encryption key, or direct access to it. I will go over a more elaborate scenario in a future post, but for now you can assume that you will need to grant application users (including those of SSMS) the ability to see the master/column key metadata, with the following code:
As long as the Column Encryption Setting is added as an additional connection parameter, SSMS users will be able to see the decrypted values in the result of a query. But they will not be able to pass parameters to a stored procedure that inserts or updates encrypted values, because - unlike with an application, which handles the encryption - there is no round-trip between declaring the parameters and passing them to the stored procedure. Result:
There may be some way to work around this using SQLCMD mode, but I haven't explored too deeply yet.
For the client applications on machines other than where SQL Server 2016 is installed, they'll need an updated version of .NET Framework (not necessary on Windows 10 or if you have Visual Studio 2015), and the certificate installed (as I describe here). If the framework isn't modern enough, the connection string setting simply won't work (though I haven't confirmed the exact error message because all of my VMs running SQL Server 2016 are Windows 10). If the certificate used for the keys isn't there, the application will crash with errors like this, depending on the type of keys you generated on the database server:
(It knows where to look for that certificate on the local machine because it gets the metadata from SQL Server. For some other information about master/column keys and particularly about key rotation, see this SQL Server Security Blog post.)
Encrypted parameters need to be passed as properly typed parameters. This type of ad hoc SQL will break, even when passed from your C# 4.6 application with the proper column encryption setting in the connection string:
You'll need to use SqlCommand and SqlParameter, with types and lengths that match exactly, in order to perform any of the supported operations against encrypted columns.
Finally, there is additional overhead required to negotiate the encryption handshakes, so if your application is really chatty, you'll definitely want to baseline your network performance and then compare once your columns are encrypted. There is also the consideration that encrypted data can take a lot more space than its unencrypted equivalent, whether or not you are using compression. As a companion to this post, I took a look at these two impacts in a separate post over on SQLPerformance.com.
There are some additional considerations for client code development, and there is a dedicated topic on that in Books Online.
With all of the data breaches happening these days, I am quite happy that there are advances being made in the protection of data. But understanding the limitations before jumping into it can be a valuable learning exercise. I'll repeat, because it can't be stressed enough, that these are the limitations I am observing in current builds today, and the status of any of these issues can change at any time. I'll also admit that this isn't necessarily an exhaustive list in terms of features; just the ones I thought most obvious to check.
Follow the #TSQL2sDay hash tag and Ken's post for more posts on encryption and SQL Server.
Aaron (@AaronBertrand) is a Data Platform MVP with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com.
Aaron's blog focuses on T-SQL bad habits and best practices, as well as coverage of updates and new features in Plan Explorer, SentryOne, and SQL Server.
Comments