
SQL 2019: Always Encrypted with Secure Enclaves, Part 1

I’m going to start this blog post series with a basic introduction to what’s required in order to deploy Always Encrypted with Secure Enclaves, and how the pieces fit together.

First, some background on using encryption with SQL Server.

Cell/column-level

Column-level encryption is described here.

It uses keys and certificates to encrypt data, and the application (or its T-SQL) must be changed to perform the actual encryption and decryption. As you can imagine, changing applications is not too popular. Both the encryption keys and the data are available to highly privileged technology staff (DBAs).
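
Here's a minimal sketch of what that looks like in T-SQL; the table, key, and certificate names are illustrative, and it assumes a database master key already exists:

-- a symmetric key protected by a certificate, used explicitly in DML
CREATE CERTIFICATE CellCert WITH SUBJECT = 'Cell-level encryption demo';

CREATE SYMMETRIC KEY CellKey
WITH ALGORITHM = AES_256
ENCRYPTION BY CERTIFICATE CellCert;

-- every read/write has to open the key and call the crypto functions itself
OPEN SYMMETRIC KEY CellKey DECRYPTION BY CERTIFICATE CellCert;

UPDATE dbo.Patients
SET SSN_Encrypted = ENCRYPTBYKEY(KEY_GUID('CellKey'), SSN_Plain);

SELECT CONVERT(varchar(11), DECRYPTBYKEY(SSN_Encrypted)) AS SSN
FROM dbo.Patients;

CLOSE SYMMETRIC KEY CellKey;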

At rest

Transparent Data Encryption, also known as TDE, encrypts data “at rest”. You can’t restore or attach a database or database backup to a server that doesn’t have the same certificate as the source environment.

TDE has been the de facto standard for encrypting SQL Server databases for ages. It’s even been recently enhanced to allow you to pause and resume the encryption scan, which is great functionality to have for large databases.

See this link for more details on pausing and restarting TDE.
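
For reference, the SQL 2019 syntax for pausing and resuming the TDE encryption scan looks like this (the database name is a placeholder):

ALTER DATABASE YourDatabase SET ENCRYPTION SUSPEND;

-- later, pick the encryption scan back up where it left off
ALTER DATABASE YourDatabase SET ENCRYPTION RESUME;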

TDE is I/O-based encryption. That means SQL Server features that circumvent the I/O stack can’t use it. This was the case in SQL 2016 if you used NVDIMM-N/PMEM, because those hardware solutions for storage latency circumvent the I/O stack (that changes in SQL 2019, where using the Hybrid Buffer Pool allows you to use TDE).
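
As a point of reference, enabling the Hybrid Buffer Pool in SQL 2019 is an instance-level setting; a minimal sketch, assuming SQL 2019 with PMEM devices already in place:

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED HYBRID_BUFFER_POOL = ON;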

The Database Encryption Key (DEK) resides in the database boot page of the encrypted database, so that it can be referenced during database recovery.

The main issue with TDE from a compliance/auditing perspective is that highly privileged users (DBAs and sysadmins) have access to both the database and the encryption keys.

In transit

TDE is great for encrypting data “at rest” but doesn’t handle encrypting data “across communication channels”, as the documentation states.

For “across the wire” encryption, you’ll need to use TLS, and configure things properly. Details here.
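
Once TLS is configured, one way to verify which sessions are actually encrypted on the wire is to check sys.dm_exec_connections:

-- encrypt_option shows TRUE for connections that are encrypted
SELECT session_id, encrypt_option, auth_scheme
FROM sys.dm_exec_connections;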

In use

Always Encrypted resolves the “highly privileged user” issue – the certificates used for encryption don’t reside in the SQL Server environment. Data can only be decrypted on the client, so it’s also encrypted in transit. And unlike TDE, data is also encrypted in memory.
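
To give a flavor of how this shows up in the schema, here’s a minimal sketch of a column defined with Always Encrypted; the table name is illustrative, and it assumes a column encryption key named CEK1 already exists (encrypted string columns must use a BIN2 collation):

CREATE TABLE dbo.Customers
(
    CustomerID int IDENTITY(1,1) PRIMARY KEY,
    CustomerName nvarchar(100) NOT NULL,
    SSN char(11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH
        (
            COLUMN_ENCRYPTION_KEY = CEK1,
            ENCRYPTION_TYPE = RANDOMIZED,
            ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256'
        ) NOT NULL
);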

Always Encrypted in SQL 2016/2017 had limited functionality, including (but not limited to) the following:

  • Initial encryption and key rotation required moving all data to be encrypted out of the database, encrypting it, and writing it back to the database (encryption happens on the client)
  • Columns using deterministic encryption could only be referenced by equality predicates, and columns using randomized encryption couldn’t be used in predicates at all
  • It was not possible to index columns with random encryption

Enter the Enclave

SQL 2019 supports an enhanced version of Always Encrypted, known as “Secure Enclaves”. What is an enclave? It’s like a consulate: “…a state that is enclosed within the territory of another state”.

It takes the form of a protected region of memory within the SQL Server environment, requiring special credentials for access. Data in the secure enclave lives in an unencrypted state.

However, as I’ll discuss later in this series, depending on how your organization implements Always Encrypted with Secure Enclaves, it might not be as secure as you had hoped.

HGS and Attestation

Client applications need to be assured that the enclave they are connecting to (on the SQL Server) is safe and secure. That’s where “attestation” comes in – a separate computer/VM assumes the responsibility of attesting to the validity of the enclave. This attestation comes in the form of a Windows service named “Host Guardian Service”, or HGS.

There are at least three required components for encrypting/decrypting data using Always Encrypted with Secure Enclaves:

  • Client/application server
  • SQL Server
  • HGS server

Clients (e.g. an app server) and SQL Server must have connectivity to the HGS server.

Data in the enclave is plaintext, which allows SQL Server to delegate non-equality comparisons to the enclave. A DLL resides in the enclave that allows this to happen.
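
On the SQL Server side, a quick way to check whether an enclave type has been configured for the instance is the “column encryption enclave type” server configuration option (0 means no enclave, 1 means a VBS enclave):

SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = 'column encryption enclave type';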

HA/DR

Just like your production SQL Server instance, your production HGS environment should be redundant. Microsoft supports creating a Windows Server Failover Cluster (WSFC) for HGS, and in this specialized form of a WSFC, there is no shared storage.

For lower environments, you might consider running HGS on a single computer, but you should be aware that if the HGS environment suffers an outage, queries against encrypted data fail.

Unlike other forms of encryption for SQL Server, Always Encrypted uses the client side to encrypt/decrypt data. In this context, “client side” could be an application server, but it could also be a properly configured SSMS session.

What the enclave buys you

Not unlike many 1.0 features, Always Encrypted (which was released in SQL 2016) had many severe limitations that prevented it from being widely adopted. There were no changes to Always Encrypted in SQL 2017.

Some of the limitations of using Always Encrypted without an enclave are addressed when using an enclave:

  • Initial encryption and key rotation can be performed “in place”
  • Non-equality predicates are supported, e.g. LIKE, >=, <= (see the sketch after this list)
  • Indexes can be created on columns with randomized encryption
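
For example, assuming an illustrative dbo.Employees table whose Salary column uses randomized, enclave-enabled encryption, and a session configured for Always Encrypted, a range predicate like this becomes possible (the client driver turns the variable into an encrypted parameter):

DECLARE @MinSalary money = 50000;

-- the comparison against the encrypted Salary column is evaluated inside the enclave
SELECT EmployeeID, Salary
FROM dbo.Employees
WHERE Salary >= @MinSalary;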

Under certain circumstances enclaves won’t be able to manage your encrypted data “in place”. So where does it go? The encrypted data to be manipulated must be copied to memory on the client, decrypted, manipulated, re-encrypted, and written back to the database. That could be incredibly painful, but as usual, it depends.

Under the hood

The initial size of the enclave is 29 kilobytes, and it can grow to a maximum size of 35 megabytes. The amount of memory allocated to the VM or physical server does not affect the size of the enclave.

The enclave can grow, but once it grows it cannot shrink.

The size of the enclave can be interrogated via sys.dm_column_encryption_enclave:
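
A minimal check looks like this:

SELECT * FROM sys.dm_column_encryption_enclave;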

Don’t ditch that TDE just yet

You’re not supposed to be using Always Encrypted on your entire database – it’s meant to be used on specific columns, and in conjunction with TDE (not instead of it).

Why, you might ask?

I can think of at least three reasons:

  • Performance impact: client-side encryption has overhead
  • Storage impact: using Always Encrypted on a significant number of columns in your database could drastically increase the storage footprint. See “ciphertext length” at this link for more info.
  • Risk: if you remove TDE, it’s even easier for rogue folks to get hold of your data. Maybe not the columns encrypted with Always Encrypted, but all the other formerly TDE-encrypted data is exposed to anyone who has access. All they have to do is grab a backup and restore it on another instance of SQL Server.

SSMS

Microsoft has done a great job of making our lives easier, specifically in SSMS. The client driver knows which columns are encrypted with Always Encrypted (encryption metadata is cached), and if you reference one of those columns in a query and use a variable as a predicate, the client driver converts this to a parameterized query behind the scenes.
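
For example, with Parameterization for Always Encrypted enabled in SSMS, and the connection using Column Encryption Setting = Enabled, a query like this against the illustrative dbo.Customers table just works; the driver converts the variable declaration into an encrypted parameter behind the scenes:

DECLARE @SSN char(11) = '795-73-9838';

SELECT CustomerID, CustomerName
FROM dbo.Customers
WHERE SSN = @SSN;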

More details about how .NET automagically works with Always Encrypted can be found here.

High level limitations

Always Encrypted still has a lot of limitations, but mainly what you have to do is change the way you think about how SQL Server operates on data.

For example, you can’t do the following in a stored procedure, if the column is encrypted with Always Encrypted:
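
Here’s a minimal sketch of the sort of thing that fails, using the illustrative dbo.Customers table with its encrypted SSN column:

CREATE PROCEDURE dbo.AddCustomer
AS
BEGIN
    -- the hard-coded SSN is plain text, and SQL Server cannot encrypt it
    INSERT dbo.Customers (CustomerName, SSN)
    VALUES (N'Jane Doe', '795-73-9838');
END;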

Why will this fail?

Because Always Encrypted is specifically designed to disallow this behavior.

Remember – only the client side can encrypt/decrypt data – SQL Server has no ability to do this. In this scenario, SQL Server must insert data that’s already encrypted. But the data referenced above is “plain text”, so the INSERT will fail. Predicates used within stored procedures that will be compared with Always Encrypted columns must also be encrypted, and that includes parameters.
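
The pattern that does work is to push the value in as a parameter, so the Always Encrypted-enabled client driver encrypts it before it ever reaches SQL Server; a sketch, again with illustrative names:

CREATE OR ALTER PROCEDURE dbo.AddCustomer
    @CustomerName nvarchar(100),
    @SSN char(11)  -- arrives already encrypted when bound by an Always Encrypted-enabled driver
AS
BEGIN
    INSERT dbo.Customers (CustomerName, SSN)
    VALUES (@CustomerName, @SSN);
END;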

SQL Server data masking

Always Encrypted is not compatible with data masking or user-defined types (UDT). That’s unfortunate, as many companies have a requirement to obfuscate data in lower environments.

Linux support

Support for secure enclaves is on the roadmap for Always Encrypted on Linux. Secure enclaves in their present form use Virtualization Based Security (VBS), which is a Windows-only technology, so Linux can’t use it. Microsoft will implement secure enclaves on Linux based on Intel SGX. More info about SGX can be found here.

Availability Groups

All replicas must be configured to use a secure enclave in order to be fully functional. If a secure enclave is not configured for a replica, and a query is executed on the replica that references encrypted columns, the query will fail.

Backup/Restore

Restore is similar to AGs in that a secure enclave must be configured in order to run queries against encrypted columns. But restore also has an additional requirement. Remember that the enclave must be attested to by a computer running HGS. Therefore, if you create a backup of a production database that uses a secure enclave and restore it in a lower environment (which likely has its own HGS configuration), your queries will fail unless you re-encrypt the data with the lower environment certificate (this is known as key rotation). You can rotate the keys without an enclave, but it will not be done “in place”; all encrypted data will have to be copied to the client, decrypted, re-encrypted, and written back to the database.

Replication

Only columns that are not encrypted with Always Encrypted can be replicated. I tested this with Transactional Replication, and it works.

In the next post, we’ll configure Always Encrypted in SQL 2019 to use a secure enclave.

Hekatonized Tempdb

At PASS Summit 2018, I attended a session led by Pam Lahoud (t) of the SQL Tiger Team, entitled “TempDB: The Good, The Bad, and The Ugly”. If you have access to the PASS recordings from 2018, I highly recommend watching this session.

It was a really fantastic presentation, detailing the full history of how the SQL Server engineering team has attempted to optimize TempDB in various ways. The two problems that busy servers can have with regard to TempDB are allocation page contention and metadata contention, and the engineering team should be applauded for its clever approaches to solving these types of contention throughout the years. To be clear, all of the optimizations were related to temp table usage in stored procedures, not scripts.

However, none of those solutions for contention scaled – some only relocated the issue. As part of Pam’s presentation, she did a demo with a single TempDB metadata table that was “Hekatonized”  – actually using the In-Memory OLTP engine – and the difference in throughput was significant. She said that Microsoft intends to convert the remaining system tables in TempDB to be memory-optimized (you’ll need SQL 2019 CTP 3.0 or later to test).

So once you’ve got it installed or have started a container running it – how do you automagically convert TempDB system tables to be memory-optimized? With TSQL, of course:

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = ON;

Like other changes to TempDB, a restart of the SQL Server service is required for the new memory-optimization to take effect. Once the service is restarted, system tables in TempDB are memory-optimized (it should be that way for RTM, but in CTP 3.0, it could be the case that not all system tables have been converted to Hekaton). You can reverse this setting with the following command, followed again by a restart of the SQL Server service:

ALTER SERVER CONFIGURATION SET MEMORY_OPTIMIZED TEMPDB_METADATA = OFF;
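
After the restart, you can confirm whether the optimization is in effect; in SQL 2019 this is exposed through SERVERPROPERTY (1 means enabled, 0 means not):

SELECT SERVERPROPERTY('IsTempdbMetadataMemoryOptimized') AS IsTempdbMetadataMemoryOptimized;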

Unless your workload was truly hammering TempDB, you probably won’t see much difference in TempDB performance.

We need to be careful with this new In-Memory power, because depending on workload characteristics, we might need a whole lot more memory just to handle what’s going on in TempDB. Also, if you have scripts and/or monitoring that interrogate system tables in TempDB, you might be affected by some of the restrictions if TempDB system tables are memory-optimized. As the CTP release notes state:

“A single transaction may not access memory-optimized tables in more than one database. This means that any transactions that involve a memory-optimized table in a user database will not be able to access TempDB system views in the same transaction.”

Another thing I want to make clear is that this new TempDB optimization only affects system tables in TempDB – not the tables you create; #table and ##table do not become memory-optimized as a result of this new feature.

After all, its name is MEMORY_OPTIMIZED TEMPDB_METADATA.

SQL 2019 In-Memory hotness

SQL 2019 is on track to become one of the most awesome releases – the product touches so many realms of the data platform, it’s truly mind-boggling.

Since I have such a keen interest in Hekaton/In-Memory OLTP, when the CTPs are released for a new version of SQL Server, I look forward to any potential announcements about that feature.

So far, there’s been only one publicly announced enhancement for In-Memory OLTP in SQL 2019: system tables in TempDB will be “Hekatonized”. This will forever solve the issue of system table contention in TempDB, which is a fantastic use of Hekaton. I’m told it will be “opt in”, so you can use this enhancement if you want to, but you can also back out of it, which would require a restart of the SQL Server service.

But there’s at least one other enhancement that’s not been announced, although the details of its implementation are not yet known.

When people start to research the Hekaton feature, most are shocked to learn that CHECKDB does not verify anything about durable In-Memory tables: it silently ignores them.

That appears to have changed in SQL 2019, although either the informational message about what it does is misleading, or behind the scenes it does something different.

This is the output for DBCC CHECKDB of a memory-optimized database in SQL 2017:

Object ID 949578421 (object ‘inmem_table’): The operation is not
supported with memory optimized tables. This object has been skipped and will not be processed.

(the emphasis was added by me)

This is the output for DBCC CHECKDB of a memory-optimized database in SQL 2019:

DBCC results for ‘inmem_table’.
There are 101 rows in 3 pages for object “inmem_table”.

Why do I say the message is misleading?

Because durable data for memory-optimized tables is not stored in pages, but instead in a streaming fashion in files known as checkpoint file pairs (or data and delta files). Also, while it’s true that there are 101 rows in this table, the engine pre-creates a number of data and delta files, and it would make DBAs sleep a lot better at night if all of those files were verified as being corruption-free.

We’ll just have to stay tuned to the future CTPs and RTM of SQL 2019 to see how all of this shakes out.