In-Memory OLTP Resources, Part 3: OOS (Out of Storage)

Zero free space

This is a continuation of Part 1 and Part 2 of this blog post series, related to resource issues/requirements for memory-optimized databases.

In this post, we’ll continue by simulating what happens to a memory-optimized database when all volumes run out of free space.

In my lab, I’m running Windows Server 2012. Let’s use PowerShell to install the File Server Resource Manager, which will allow us to create a quota for the relevant folder:

Add-WindowsFeature -Name FS-Resource-Manager -IncludeManagementTools

After installing the Windows feature we can set the quota for the folder, but we shouldn’t enable it just yet, because first we have to verify the current size of the folder.

On my server, I created a quota of 1.5GB, and then enabled it.
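Here’s a sketch of that using the FileServerResourceManager cmdlets; the container folder path is a placeholder, and I’m assuming the -Disabled parameter is accepted by Set-FsrmQuota as it is by New-FsrmQuota:

# create the quota disabled, so we can first verify the folder's current size
New-FsrmQuota -Path 'C:\InMemOOMTest' -Size 1.5GB -Disabled
# once the folder size is verified, enable the quota
Set-FsrmQuota -Path 'C:\InMemOOMTest' -Disabled:$false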

Now let’s INSERT rows into the table, in batches of 1,000, until we reach the limit (the INSERT script is listed in Part 2; I’m trying to keep this post from getting too long).

Once the quota has been reached, we receive the dreaded 41822 error – this is what you’ll see when all of the volumes where your containers reside run out of free space (if even one of the volumes has free space, your workload can still execute).

[Screenshot: error 41822]

Just out of curiosity, we’ll verify how many rows actually got inserted. On my server, I’ve got 4,639 rows in that table, and the folder consumes 1.44GB. So theoretically, there was enough space on the drive to create more checkpoint files, but it seems the engine won’t simply create whatever files fit in the remaining space. More likely, the engine attempts to precreate an entire set of files, and that attempt succeeds or fails as a unit, but I’ve not confirmed that.

I disabled the quota, executed a manual CHECKPOINT, and ran the diagnostic queries again:

[Screenshot: DMV output after the manual CHECKPOINT]
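My exact diagnostic script isn’t reproduced here, but it boils down to inspecting the checkpoint file metadata DMV covered in Part 1; a minimal sketch of that kind of query:

SELECT state_desc, file_type_desc, COUNT(*) AS file_count,
       SUM(file_size_in_bytes) / 1048576 AS size_mb
FROM sys.dm_db_xtp_checkpoint_files
GROUP BY state_desc, file_type_desc
ORDER BY state_desc;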

File Merge

Data files persist rows that reside in durable memory-optimized tables, and delta files store references to logically deleted rows. As more and more rows become logically deleted across different sets of CFPs, two things happen:

  1. the storage footprint increases (imagine that all data files have 50% of their rows logically deleted)
  2. query performance gets worse, because result sets must be filtered by entries in the delta files, which are increasing in size

Microsoft killed both of these birds with one stone: File Merge (aka Garbage Collection for data/delta files)

In the background – while your workload is running – the File Merge process attempts to combine adjacent sets of CFPs, and this is where we get to one of the file states that we didn’t cover in Part 1: MERGE TARGET

A file in the MERGE TARGET state (it’s a file state, not a file type) is the new set of combined data/delta files produced by the File Merge process. Once the merge has completed, the MERGE TARGET transitions to ACTIVE, and as we stated earlier in this series, ACTIVE files can no longer be populated.

But what about the source files that the MERGE TARGET is derived from? After a CHECKPOINT, these files transition to WAITING FOR LOG TRUNCATION, and can be removed. It should be noted that it can take several checkpoints and transaction log backups for CFPs to transition to a state where they can actually be removed. That’s why Microsoft recommends 4x durable memory-optimized data size for the initial storage footprint.
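In practice, nudging CFPs through those states looks something like the following, repeated as needed (the database name and backup path are placeholders):

CHECKPOINT;
BACKUP LOG OOM_DB TO DISK = N'X:\Backups\OOM_DB_log.trn';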

In the images that follow, we can see that the formerly distinct transaction ranges of 101 to 200, and 201 to 300, have been combined into a single CFP, which has the range of 101 to 300.

[Screenshots: DMV output before, during, and after File Merge, showing transaction ranges 101 to 200 and 201 to 300 combined into a single CFP covering 101 to 300]

Effect on backup size

File Merge – and the requisite file state changes that CFPs must go through – explain why backups for memory-optimized databases can be considerably larger than the amount of data stored in memory. Until CFPs go through the required state changes, they must be included in backups.

IOPS

The File Merge process requires both storage and IOPS, as it reads from both sets of CFPs, and writes to a new set. Let’s say your workload requires 500 IOPS to perform well. We’ve just added another 1,000 IOPS as a requirement for your workload to maintain the same level of performance: 500 IOPS each for the read and write components of File Merge. That’s why Microsoft recommends 3x workload IOPS for your memory-optimized storage.

Potential remedies, real and imagined

What happens to your memory-optimized database when all volumes run out of free space?

In my testing of inserts that breached the quota for the folder, I saw no effect on database status. However, if I created the database, set the quota to a much lower value, and then created a memory-optimized table, the database status became SUSPECT. In a real-world situation, with hundreds of gigabytes or more of memory-optimized data, the last thing you want to do is restore the database just to return it to a usable state.

I was able to set the database OFFLINE and then ONLINE, and that cleared the SUSPECT status. But keep in mind that setting the database OFFLINE/ONLINE will restream all your data, so database recovery will be delayed accordingly.
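For reference, that sequence is simply the following (OOM_DB being the database name, inferred here from the container names shown in Part 1):

ALTER DATABASE OOM_DB SET OFFLINE WITH ROLLBACK IMMEDIATE;
ALTER DATABASE OOM_DB SET ONLINE;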

What can you do if your volumes run out of free space?

Well, in SQL 2014, your database went into “SUSPENDED” mode (not SUSPECT), and it was offline until, perhaps, you added more space and restarted the database (I’m not sure; I didn’t test that). In SQL 2016+, the database goes into what’s known as “delete-only mode”, where you can still SELECT data, but modifying data is limited to deleting rows and/or dropping indexes/tables. Of course, SELECT, DELETE, and DROP do nothing to solve your problem: you need more free space.
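To make that concrete, here’s a sketch of what delete-only mode permits, using a hypothetical durable memory-optimized table dbo.InMemTable:

-- delete-only mode: reads and deletes succeed, inserts/updates are rejected
SELECT COUNT(*) FROM dbo.InMemTable;            -- works
DELETE FROM dbo.InMemTable WHERE SomeKey = 1;   -- works
INSERT dbo.InMemTable (SomeKey) VALUES (2);     -- fails in delete-only mode
DROP TABLE dbo.InMemTable;                      -- works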

When a database transitions to delete-only mode, that fact is written to the SQL errorlog:

[WARNING] Database ID: [9]. Checkpoint hit an error code 0x8300000a. Database is now in DeleteOnlyMode

You might think that you can issue CHECKPOINT manually, and do transaction log backups, hoping that File Merge will kick in. Or you could manually execute File Merge, with this uber-long thing:

EXEC sys.sp_xtp_checkpoint_force_garbage_collection <dbname>

But keep in mind that if there was no additional free space on the volumes to precreate CFPs, then it’s not likely that there will be enough free space to write a new set of CFPs for DBA-initiated File Merge.

The only thing you can do to remedy this situation is to either free up some space on the existing volumes, or create a new container on a new volume that has free space.

In Part 4, we’ll discuss memory in the same ways we’ve discussed storage – how it’s allocated, and what happens to your memory-optimized workload when you run out of it.

In-Memory OLTP Resources, Part 1: The Foundation

This multi-part blog post will cover various resource conditions that can affect memory-optimized workloads. We’ll first lay the foundation for what types of resources are required for In-Memory OLTP, and why.

The following topics will be covered:

  • causes of OOM (Out of Memory)
  • how files that persist durable memory-optimized data affect backup size
  • how memory is allocated, including resource pools, if running Enterprise Edition
  • potential effect on disk-based workloads (buffer pool pressure)
  • what happens when volumes that store durable memory-optimized data run out of free space
  • what you can and cannot do when a memory-optimized database runs out of resources
  • database restore/recovery
  • garbage collection (GC) for row versions and files (file merge)
  • BPE (buffer pool extension)

Like most everything in the database world, In-Memory OLTP requires the following resources:

  • storage
  • IOPS
  • memory
  • CPU

Let’s take storage first – why would a memory-optimized database require storage, what is it used for, and how much storage is required?

Why and What?

You’ll need more storage than you might expect: enough to hold the files that persist your durable memory-optimized data, plus backups.

How much storage? 

No one can exactly answer that question, as we’ll explain over the next few blog posts. However, Microsoft’s recommendation is that you have 4x durable memory-optimized data size as a starting point for storage capacity planning.

Architecture

A memory-optimized database must have a special filegroup designated for memory-optimized data, known as a memory-optimized filegroup. This special filegroup is logically associated with one or more “containers”. What the heck is a “container”? Well, it’s just a fancy word for “folder”, nothing more, nothing less. But what is actually stored in those fancy folders?

Containers hold files known as “checkpoint file pairs”, which are also known as “data and delta files”, and these files persist durable memory-optimized data (in this blog post series, I’ll use the terms CFP and data/delta files interchangeably). You’ll note on the following image that it clearly states in bold red letters, “NO MAXSIZE” and “STREAMING”. “NO MAXSIZE” means that you can’t specify how large these files will grow, nor can you specify how large the container that houses them can grow (unless you set a quota, but you should NOT do that). And there’s also no way at the database level to control the size of anything having to do with In-Memory OLTP storage – you simply must have enough available free space for the data and delta files to grow.

This is the first potential resource issue for In-Memory OLTP: certain types of data modifications are no longer allowed if the volume your container resides upon runs out of free space. I’ll cover workload recovery from resource depletion in a future blog post.

“STREAMING” means that the data stored within these files is different than what’s stored in MDF/LDF/NDF files. Data files for disk-based tables store data rows on 8K pages, a group of which is known as an extent. Data for durable memory-optimized tables is not stored on pages or extents. Instead, memory-optimized data is written in a sequential, streaming fashion, like the FILESTREAM feature (it should be noted that you do not have to enable the FILESTREAM feature in order to use In-Memory OLTP, and that statement has been true since In-Memory OLTP was first released in SQL 2014).   

[Diagram: the memory-optimized filegroup with its containers and data/delta files, annotated “NO MAXSIZE” and “STREAMING”]

How do these data/delta files get populated? All that is durable in SQL Server is written to the transaction log, and memory-optimized tables are no exception. After first being written to the transaction log, a process known as “offline checkpoint” harvests changes related to memory-optimized tables, and persists those changes in the data/delta files. In SQL 2014, there was a single offline checkpoint thread, but as of SQL 2016, there are multiple offline checkpoint threads. 

[Diagram: the transaction log feeding the offline checkpoint process, which writes to the data/delta files]

Let’s create a sample database:
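The DDL below is a reconstruction consistent with the folder and container names shown in the rest of this post; the database name OOM_DB, the drive letter, and the file names are assumptions:

CREATE DATABASE OOM_DB
ON PRIMARY
    (NAME = N'OOM_DB_data', FILENAME = N'C:\InMemOOMTest\OOM_DB_data.mdf'),
FILEGROUP OOM_DB_inmem_fg CONTAINS MEMORY_OPTIMIZED_DATA
    (NAME = N'OOM_DB_inmem1', FILENAME = N'C:\InMemOOMTest\OOM_DB_inmem1'),
    (NAME = N'OOM_DB_inmem2', FILENAME = N'C:\InMemOOMTest\OOM_DB_inmem2')
LOG ON
    (NAME = N'OOM_DB_log', FILENAME = N'C:\InMemOOMTest\OOM_DB_log.ldf');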

After creating the database, the InMemOOMTest folder looks like this:

[Screenshot: the InMemOOMTest folder, containing the OOM_DB_inmem1 and OOM_DB_inmem2 container folders]

OOM_DB_inmem1 and OOM_DB_inmem2 are containers (folders), and they’ll be used to hold checkpoint file pairs. You’ll note in the DDL listed above that under the memory-optimized filegroup, each container has both a name and a filename entry. The name is the logical name of the container, while the filename is the actual container name, which represents the folder that gets created on disk. Initially there are no CFPs in the containers, but as soon as you create your first memory-optimized table, CFPs get created in both containers.

If we have a look in one of the containers, we can see files that have GUIDs as names, and are created with different sizes.

[Screenshot: GUID-named checkpoint files of varying sizes inside one of the containers]

This is definitely not human-readable, but luckily, Microsoft has created a DMV, sys.dm_db_xtp_checkpoint_files, that allows us to figure out what these files represent.
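A minimal sketch of querying it, trimmed to the columns discussed below:

SELECT container_id, file_type_desc, state_desc,
       relative_file_path, file_size_in_bytes
FROM sys.dm_db_xtp_checkpoint_files
ORDER BY container_id, state_desc;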

Below we can clearly see that there are different types of files, and that files can have different “states”, which is central to the discussion of the storage footprint for memory-optimized databases, and backups of those databases. There are different values for container_id – remember we said that a memory-optimized database can have one or more containers. Next, we should pay attention to the fact that all entries for the “relative_file_path” column begin with “$HKv2\”. This means that in each container, we have a folder with the name “$HKv2”, and all data/delta files for that container are located there.

[Screenshot: sys.dm_db_xtp_checkpoint_files output, showing file types, states, container_ids, and relative_file_path values beginning with $HKv2\]

At this point, it’s time for a discussion of the various file states. I’ll stick to SQL 2016+ (because SQL 2014 had more file states).

The possible file states are:

  • PRECREATED
  • UNDER CONSTRUCTION
  • ACTIVE
  • MERGE TARGET
  • WAITING FOR LOG TRUNCATION

We’ll discuss the first three now, and save MERGE TARGET and WAITING FOR LOG TRUNCATION for later.

PRECREATED: as a performance optimization technique, the In-Memory engine will “precreate” files. These precreated files have nothing in them – they are completely empty, from a durable data perspective. A file in this state cannot yet be populated.

UNDER CONSTRUCTION: when the engine starts adding data to a file, the state of the file changes from PRECREATED to UNDER CONSTRUCTION. Data and delta files are shared by all durable memory-optimized tables, so it’s entirely possible that the first entry is for TableA, the next entry for TableB, and so on. “UNDER CONSTRUCTION” could be interpreted as “able to be populated”.

ACTIVE: when a file that was previously UNDER CONSTRUCTION gets closed, the state transitions to ACTIVE. That means it has entries in it, but it can no longer be populated. What causes a file to be closed? The CHECKPOINT process closes the checkpoint, changing all UNDER CONSTRUCTION files to ACTIVE.

That’s the basic rundown of the file states we need to know about at this point.

In Part 2, we’ll dive deeper into the impact of data/delta file states and the storage footprint for memory-optimized databases.