Anyone who has worked with or studied In-Memory OLTP should know that upon database restart (which occurs for a variety of reasons), data for durable memory-optimized databases must be streamed from checkpoint file pairs that reside on disk (CFPs), into memory.
During recovery, data files are read in parallel, using as many threads as there are logical CPUs. If all CFPs reside on a single volume, the speed of recovery is limited by how much IO is available from just that single volume. That’s why RTO is affected by not only how many containers exist for a database, but how many volumes those containers are spread across.
That means that other than keeping the amount of data stored in durable memory-optimized tables to a minimum, the only option you have for reducing the amount of time it takes to stream data into memory, is to spread your containers across multiple volumes.
Let’s say you’ve got 5 containers, and your RTO is 15 minutes. You’ve tested recovery with your current data set, and the actual time required to restart your database is 5 minutes. Everything is ok, you’ve got RTO minutes to spare.
Over time, the amount of memory consumed by your durable memory-optimized tables – and the required storage footprint for CFPs – continue to grow. But you’re ok with all of that, because hey – memory is cheap, and so is storage, so you’ll just make sure that you have enough of both to continue doing business.
Now fast forward to some point in the future, and the size of your durable data set has quadrupled. You can’t get application owners to agree to migrate data from memory-optimized tables to disk, because of who-knows-what, or perhaps it’s because of some of the reasons mentioned here.
And now, due to the continued growth of your durable memory-optimized data, you can’t make your RTO. How will you solve this problem?
If you only create additional containers, the In-Memory OLTP engine will be able to use them for writing new data. But that won’t get you where you want to be, because the existing CFPs don’t get redistributed among all existing containers – the new containers get their own CFPs.
The moment you add the new containers, 100% of your existing data still resides in the 5 original containers. Even though you theoretically have enough containers to make your RTO (the original 5 containers plus 5 new ones), it’s still blown, because the pre-existing CFPs have not been redistributed across all containers (10 containers total).
The solution is to:
- Create new containers on the pre-existing volumes
- Create new containers on the new volumes
- Drop the old containers
Dropping the old containers will cause the existing CFPs to be redistributed among all new containers (10).
Now we have a way to redistribute pre-existing CFPs across new containers/volumes so that you can make your RTO. And in case you were wondering, Microsoft says that redistribution of CFPs can be performed while your database is online (obviously you should test this entire process in a non-production environment to see if it works as promised).
How much durable data can you store in a SQL Server database? For SQL 2016, you are limited only by what the operating system can handle. With Windows Server 2012R2, the maximum memory is 4TB, but Windows Server 2016 supports 24TB of memory. That could mean you got a whole lotta CPF redistribution goin’ on…..