Category Archives: SQL Server

In-Memory OLTP Resources, Part 2: Checkpoint File Pairs

In Part 1, we created a memory-optimized database, and explained the different states that checkpoint file pairs (CFPs) can have.

In this post, we’ll take note of the changes to free space on the volumes that host our containers, before/after creating a memory-optimized table.

To begin with, we can see that the folder currently consumes 100MB of storage, because that’s how much we allocated for the MDF file when we created our database.

[screenshots: storage consumed on the container volumes before creating a memory-optimized table]

I’ve written a script that, among other things, displays summary and detail information about memory-optimized databases, which can be found here. After changing the script to only return details about the OOM_DB database, the relevant sections are listed below. In the somewhat voluminous output from the script, you can scroll down to the section titled “Database layout”.

We should note the following:

  1. First, the general database layout is reported, which includes all mdf and ldf files, as well as listing all containers. If you are not familiar with what a container is, please go back to Part 1.
  2. The second section displays details about each container, i.e. how large it is, and how many files reside there.
  3. Next we drill further down, and summarize at the fileType and fileState level, regardless of which container the files belong to.
  4. And finally, for each container, we detail the number of files, and aggregate the amount of storage consumed, per fileType and fileState.

This information is extremely valuable when assessing the storage state of a memory-optimized database.
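If you only need the last two summaries, a rough sketch can be had directly from the checkpoint file DMV; this is my own simplified aggregation, not the script referenced above:

-- Rough sketch: file count and storage per container, fileType, and fileState,
-- using sys.dm_db_xtp_checkpoint_files (run in the context of the memory-optimized database)
USE OOM_DB
GO
SELECT
     f.container_id
    ,f.file_type_desc AS fileType
    ,f.state_desc AS fileState
    ,COUNT(*) AS fileCount
    ,SUM(f.file_size_in_bytes) / 1048576 AS sizeMB
    ,SUM(f.file_size_used_in_bytes) / 1048576 AS sizeMBUsed
FROM sys.dm_db_xtp_checkpoint_files AS f
GROUP BY f.container_id, f.file_type_desc, f.state_desc
ORDER BY f.container_id, f.file_type_desc, f.state_desc;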

OOM1

The containers consume 584MB and 568MB respectively (but after running these tests several times, it seems that the numbers fluctuate slightly), and all of the files in each container are “PRECREATED”. As we mentioned in Part 1, as a performance optimization, the In-Memory engine precreates files, and this has some interesting implications, which we’ll see later on.

The image above is what you’ll see when you have created a memory-optimized database, and created at least one memory-optimized table. But as I said earlier, if you’ve only created a memory-optimized database, the containers will be empty.

Let’s create our table (which is specifically designed to consume a lot of memory for each row):

USE OOM_DB
GO
CREATE TABLE dbo.InMemTable1
(
     rowKey INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,col01 CHAR(8000) NOT NULL
    ,col02 CHAR(8000) NOT NULL
    ,col03 CHAR(8000) NOT NULL
    ,col04 CHAR(8000) NOT NULL
    ,col05 CHAR(8000) NOT NULL
    ,col06 CHAR(8000) NOT NULL
    ,col07 CHAR(8000) NOT NULL
    ,col08 CHAR(8000) NOT NULL
    ,col09 CHAR(8000) NOT NULL
    ,col10 CHAR(8000) NOT NULL
    ,col11 CHAR(8000) NOT NULL
    ,col12 CHAR(8000) NOT NULL
    ,col13 CHAR(8000) NOT NULL
    ,col14 CHAR(8000) NOT NULL
    ,col15 CHAR(8000) NOT NULL
    ,col16 CHAR(8000) NOT NULL
    ,col17 CHAR(8000) NOT NULL
    ,col18 CHAR(8000) NOT NULL
    ,col19 CHAR(8000) NOT NULL
    ,col20 CHAR(8000) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

I ran the diagnostic script again, and now you can see that all of the data and delta files are PRECREATED, because none of them have been populated yet.

[screenshot: diagnostic script output – all data and delta files PRECREATED]

Let’s INSERT 10 rows, and run the diagnostic script again.

SET NOCOUNT ON;
DECLARE @stringVal VARCHAR(8000) = REPLICATE('A', 8000)
INSERT dbo.InMemTable1
(
    col01, col02, col03, col04, col05, col06, col07, col08, col09, col10,col11, col12, col13, col14,col15, col16, col17, col18, col19, col20
)
VALUES
(
    @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal
   ,@stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal, @stringVal
   ,@stringVal, @stringVal 
)

GO 10

After adding 10 rows, we have:

[screenshot: diagnostic script output after inserting 10 rows]

It’s clear that before we inserted any data, we had 20 files in the PRECREATED state. After inserting 10 rows, we now have 18 PRECREATED files and 2 UNDER CONSTRUCTION files, which means the In-Memory engine is populating those two files, i.e. they are “open” in terms of their CHECKPOINT status. If you don’t understand what these terms mean, please read Part 1.

But there’s one thing that doesn’t look right here: we’ve inserted data into the table, but sizeBytesUsed is still zero for the UNDER CONSTRUCTION files. Why is that?

The Microsoft documentation explains it:

“file_size_used_in_bytes: for checkpoint file pairs that are still being populated, this column will be updated after the next checkpoint.”

After executing a manual CHECKPOINT, the following images show the before/after state of our memory-optimized database. We can see the difference: we now have values in the sizeBytesUsed column for the UNDER CONSTRUCTION rows.

Please note:

  • The ‘Container details by fileType and fileState’ section had only PRECREATED and UNDER CONSTRUCTION data and delta files
  • All files that were UNDER CONSTRUCTION before executing the manual CHECKPOINT are now ACTIVE. This was discussed in Part 1 – when a CHECKPOINT occurs, the files that were being populated are now closed to further entries. This happens automatically, but sometimes you need to do it manually (more on that in a future post).
  • In the ‘Before’ image, data and delta files have two states: ACTIVE and PRECREATED.
  • In the ‘After’ image, data and delta files have three states: ACTIVE, PRECREATED, and UNDER CONSTRUCTION.
  • For the first time, we’re seeing the fileState of ‘WAITING FOR LOG TRUNCATION’ (which we’ll explain in Part 3)

Before #################

[screenshot: database layout before the manual CHECKPOINT]

After #################

[screenshot: database layout after the manual CHECKPOINT]
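If you want to reproduce the before/after comparison without the full diagnostic script, here is a minimal sketch (same OOM_DB database, summarizing by fileState):

USE OOM_DB
GO
-- 'Before': note the UNDER CONSTRUCTION files and their file_size_used_in_bytes values
SELECT f.state_desc, COUNT(*) AS fileCount, SUM(f.file_size_used_in_bytes) AS sizeBytesUsed
FROM sys.dm_db_xtp_checkpoint_files AS f
GROUP BY f.state_desc;
GO
-- close the files that are currently being populated
CHECKPOINT
GO
-- 'After': the formerly UNDER CONSTRUCTION files should now be ACTIVE, with sizeBytesUsed populated
SELECT f.state_desc, COUNT(*) AS fileCount, SUM(f.file_size_used_in_bytes) AS sizeBytesUsed
FROM sys.dm_db_xtp_checkpoint_files AS f
GROUP BY f.state_desc;
GO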

In Part 3, we’ll dive deeper into IOPS and free space requirements, and how to reset the database status when all volumes run out of free space.

Entire database in memory: Fact or fiction?

HP Servers and Persistent Memory

Advances in hardware and software have converged to allow storing your entire database in memory (depending on how large it is), even if you don’t use Microsoft’s In-Memory OLTP feature.

HP Gen9 servers support NVDIMM-N (known as Persistent Memory), which at that time had a maximum size of 8GB; with 16 slots, that offered a total server capacity of 128GB. That’s hardly large enough to run today’s mega-sized databases, and there was also no way to actually store your database there. So the use case for SQL Server 2016 was to store transaction log blocks there. This could be beneficial in general, but particularly when using durable memory-optimized tables, because WRITELOG waits for the transaction log can be a scalability bottleneck, which reduces the benefit of migrating to In-Memory OLTP.

There were other potential issues when using Persistent Memory, detailed in this blog post. But what’s not covered in that post is the fact that deploying NVDIMM-N reduced memory speed and/or capacity, because NVDIMM-N is not compatible with LRDIMM. That forces you to use RDIMM, which reduces capacity, and because NVDIMM-N operates at a slower speed than RDIMM, it also reduces total memory speed.

HP has since released Gen10 servers, and they have changed the landscape for those seeking reduced latency by storing larger data sets in memory. For one thing, they raise the bar with what’s now referred to as Scalable Persistent Memory, offering a total server capacity of 1TB. To be clear, NVDIMM-N is not used in this configuration. Instead, regular DIMMs are used, and they are persisted to flash via a power source (this was also the case for NVDIMM-N, but the flash, DIMM, and power source were all located on the NVDIMM-N module).

In this video, Bob Ward demonstrates a ~5x performance increase for the industry’s first “diskless” database, using an HPE Gen10 server, SUSE Linux, Scalable Persistent Memory, and columnstore (presumably on a “traditional/formerly on-disk” table, not a memory-optimized table, although that’s not specifically detailed in the video).

Brett Gibbs, Persistent Memory Category Manager for HP servers, states in this video that even databases that use In-Memory OLTP can benefit from Scalable Persistent Memory, because the time required to restart the database can be significantly reduced. He stated that a 200GB memory-optimized database that took 20 minutes to restart on SAS drives took 45 seconds using Scalable Persistent Memory. However, no details are provided about the circumstances under which those results were obtained.

We are left to guess about the number of containers used, and the IOPS available from storage. It may be that in both cases, they tested using a single container, which would be a worst practice. And if that’s correct, to reduce database restart time all you had to do was spread the containers across more volumes, to “parallelize” the streaming from storage to memory.

I’m assuming that the 45 seconds specified represents the amount of time required to get durable memory-optimized data from flash storage back into memory. If that’s correct, then the reduction of time required to restart the database has nothing to do with the Scalable Persistent Memory (other than memory speed), and everything to do with how fast flash storage can read data.

Licensing

The HP video also details how there might be a licensing benefit. It’s stated that if your workload requires 32 cores to perform well, and you reduce latency through the use of Scalable Persistent Memory, then you might be able to handle the same workload with fewer cores. I’d love to see independent test results of this.

In-Memory OLTP

If you are considering placing a database entirely in memory, and don’t want to be tied to a specific hardware vendor’s solution, In-Memory OLTP might be an option to consider.

This is an extremely vast topic that I’ve been interested in for quite a while, and I’ll summarize some of the potential benefits:

  • Maintaining referential integrity – Microsoft recommends keeping cold data in on-disk tables, and hot data in memory-optimized tables. But there’s just one problem with that: FOREIGN KEY constraints are not supported between on-disk and memory-optimized tables. Migrating all data to memory-optimized tables solves this specific issue (a quick demonstration follows this list).
  • Native compilation – if you want to use native compilation, it can only be used against memory-optimized tables. If you can deal with the potential TSQL surface area restrictions, migrating all data to memory-optimized tables might allow greater use of native compilation.
  • Single table structure – if you were to keep cold data on disk, and hot data in-memory, you would need to use two different table names, and perhaps reference them through a view. Migrating all data to memory-optimized tables solves this problem.
  • Unsupported isolation levels for cross-container transactions – it’s possible to reference both on-disk and memory-optimized tables in a single query, but memory-optimized tables only support a subset of the isolation levels that are available for on-disk tables, and some combinations are not supported (SNAPSHOT, for example).
  • Near-zero index maintenance – other than potentially reconfiguring the bucket count for HASH indexes, HASH and RANGE indexes don’t require any type of index maintenance. FILLFACTOR and fragmentation don’t exist for any of the indexes that are supported for memory-optimized tables.
  • Very large memory-optimized database size – Windows Server 2016 supports 24TB of memory, and most of that could be assigned to In-Memory OLTP, if you are using Enterprise Edition. This is way beyond the capacity supported by the current line of HP servers using Scalable Persistent Memory.
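As a quick demonstration of the referential integrity bullet, here is a sketch (the table names are hypothetical, created only to show the restriction):

-- Hypothetical parent (on-disk) and child (memory-optimized) tables
CREATE TABLE dbo.ParentOnDisk
(
    ParentID INT NOT NULL PRIMARY KEY CLUSTERED
);
GO

-- This fails: a foreign key on a memory-optimized table can only reference
-- another memory-optimized table
CREATE TABLE dbo.ChildInMem
(
     ChildID INT NOT NULL PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL
    ,CONSTRAINT FK_ChildInMem_Parent FOREIGN KEY (ParentID) REFERENCES dbo.ParentOnDisk (ParentID)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO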

One extremely crucial point to make is that if you decide to migrate an entire database to In-Memory OLTP, then database recovery time must be rigorously tested. You will need to have enough containers spread across enough volumes to meet your RTO SLA.
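As a sketch of what “enough containers spread across enough volumes” can look like in T-SQL (the database, file, path, and filegroup names below are hypothetical):

-- Add two more containers, on separate volumes, to the memory-optimized filegroup,
-- so that streaming during database recovery can be parallelized
ALTER DATABASE BigInMemDB
ADD FILE (NAME = N'BigInMemDB_container2', FILENAME = N'E:\InMemContainers\BigInMemDB_container2')
TO FILEGROUP BigInMemDB_mod_fg;
GO
ALTER DATABASE BigInMemDB
ADD FILE (NAME = N'BigInMemDB_container3', FILENAME = N'F:\InMemContainers\BigInMemDB_container3')
TO FILEGROUP BigInMemDB_mod_fg;
GO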


Technology can change your life

Roy Dorman

When I first met Roy Dorman, he was in his mid-thirties, and completing a degree in philosophy at Fordham University here in New York City. Roy was coupled with a very close friend; I took a liking to him immediately.

It’s a tall order to find a teaching position in philosophy under the best of circumstances, and upon graduating, he struggled to find work (his age was likely not an asset). To meet the financial demands of living in this city, he toiled at whatever jobs came his way: limousine dispatcher, real-estate manager, etc.

After getting to know him for a while, I could see that Roy had the right combination of natural skills that would be perfect for a role in technology, and I told him so. He was extremely responsible, persevering, detail oriented, and enjoyed solving problems.

Remembering my own struggles to wrap my brain around technology (detailed here), I said to Roy – “Hey man, if you ever want to do something in the tech world, just let me know – I’d be glad to teach you.”

Time ticked on by, and then one day the phone rang – Roy was ready to begin his studies.

First ascent

The first hurdle we faced was that Roy had zero disposable income, so I offered to lend him the money to purchase a computer. Our plan was that he would do self-study from home, with my oversight. However, there was one gigantic problem with this scenario – he had a very young child at home (I don’t know what we were thinking). As the great jazz musician Ahmad Jamal told me years ago – “you cannot serve two masters.”

With no measurable progress after a fair amount of study time, the likelihood of Roy being gainfully employed in the field of technology was hovering around impossible.

We needed another plan.

Second ascent

I could see that the obligations of a husband and father were not easily circumvented. On the other hand, Roy had worked a series of gigs that had zero financial upside, so he was highly motivated – an essential requirement of all heavy lifting.

Our second plan was more structured and required a larger commitment from both of us. Also built into the plan were some “teeth”.

I asked Roy how much money he needed to survive, and the figure he came up with was $1,000 per week, and I agreed to sustain him financially throughout his studies. But because lending money to a friend can jeopardize the relationship, we approached the financial aspect from a different perspective. I told him that I would not lend him the money he needed – I would give him the money he needed. And should he one day be in a financial position to return the gesture, so be it. We went forward without my having any expectation of seeing the money that flowed to him.

Roy had no idea if he could morph into a SQL Server DBA, and I was keenly aware of the vast amount of trust he placed in me. It was an awesome responsibility, to say the least.

I had recently started a contract for a large migration of 100+ SQL Servers to the latest version, and this put me in a position to help Roy out. I wrote him a letter explaining what was expected of each of us, and what would happen if we did not succeed. With the letter I enclosed a check for $2,000, saying that there would be subsequent payments of $1,000 per week. My estimation was that it would take approximately nine months for Roy to become employable, but our timeline was open-ended.

Tough love

My letter in part said:

“If I was you, I’d be as frugal as humanly possible with this check, for the following reasons:

· I eat before you do

· Anything can happen

· Don’t assume there will be a next check, because I don’t. You’re a consultant now; your income is unpredictable.”

“I will evaluate your performance each month and discuss with you where you stand. If for whatever reason I deem that you are not living up to your end of the bargain – and it does not change – the deal is off, I become yet another line item in your long list of creditors, and our lives resume as before.”

I didn’t like to be that hard on Roy, but I felt it was the only way we could get to the finish line. It was for his own good, and he knew it.

I told him that he had to treat his SQL Server education exactly like a job. He received a set of keys to my studio apartment in Greenwich Village, and had to be there five days a week, eight hours a day.

To the grindstone

In addition to being a SQL Server DBA, I’ve had a life in music. Many people have come to me for music lessons, and almost none of them have ever returned after the initial encounter. Perhaps it’s because I don’t sugar-coat what’s involved with pursuing the subtleties of creative improvised music. Roy stands alone as the only student I’ve ever had that got into the long run with me.

Keep in mind that he had only very basic knowledge of computers when we began, and he knew absolutely nothing about operating systems or database software. I’ve sometimes thought that if Roy knew how much he’d have to learn in such a short span of time, he might have backed out. But ignorance is bliss, as they say.

To be honest, we worked like mad men.

I drilled him.

I grilled him.

I imparted mindful after mindful of technology upon Roy Dorman each and every day. Weeks turned to months, and slowly, the veil of technological ignorance lifted, giving way to comprehension and knowledge. After seven months, I had taught Roy everything I could think of (but I did leave off some critical items, such as how to determine how much memory is installed on a SQL Server – sorry, Roy!).

One day I arrived home to find a package he had left for me. Inside was a collage that he made about our collaboration, and I laughed out loud when I saw it. It had photos of the two of us, plus artwork for Sybase, Oracle and MS SQL – really hysterical. Also included was a very small pair of scissors. I called Roy and asked him what the scissors were for. He said: “To cut the cord!”

It was time for Roy to fly.

Looking for a gig

Roy wanted to work as a consultant, but at that point in time, there seemed to be more opportunities for him as an employee. Also, with a wife and young child, he needed the benefits that came with full time employment.

One day he called and said that he was going to have an interview, and there was something in his voice that sounded kind of funny. His appointment was to be at the same company I was working for, and in the same building where I worked. He asked me: “Are your manager’s initials ‘XYZ’?”

They were indeed.

Unbelievably, Roy would have an interview in my office the following Monday (I wonder how long those odds were).

As luck would have it, when Roy arrived on site, my manager asked me to give him a technical interview, and then he “introduced” me to Roy.

You will not be shocked to learn that I gave Roy the green light. Not because I had trained him, but because he was a good candidate, and would have been an asset to the team. But as it turned out, the company decided not to engage the services of Roy Dorman. I was really disappointed – we would have had a ball working together.

Opportunity knocks

After interviewing for a while, Roy told me that he was being considered for positions at two different companies – there was now third party endorsement of his skills. He had studied hard and done well.

One of the companies had three servers, and the other had fifteen. I told him that he had to take the job at the fifteen-server company, but he was extremely hesitant to do so. I stated in no uncertain terms that the larger company was his only move, and when he asked me why, I told him: “Because you’ll learn more there.”

Roy accepted the position at the fifteen-server company, and it was the beginning of a career that would encompass working at some very large corporations, such as Viacom, SAC Capital and others.

Through our collaboration, an intensely personal bond between us was formed – I looked upon Roy as a brother, and we stayed close throughout the many years that followed.

The Eternal Optimist

Roy called one day, revealing that he had Stage IV colon cancer. One thing I remember him saying to me was that he was looking at it as one of life’s “bumps in the road”. That was Roy’s essence – he was The Eternal Optimist.

For a while he did well with treatment. But after about a year, it was clear that Roy was losing the battle. When I picked him up from the hospital to take him home for the last time, he thanked me for giving him an opportunity to do better. I told him that my role was simply one of guidance (and ass kicker drill sergeant) – he did all the work.

Tragically, Roy passed away not long after that conversation, leaving behind a wife and two beautiful young daughters. For about ten years, he got to live the life he wanted (if you think that working is living, that is).

Coda

What can we take away from all of this?

· Back in the day, there was a lot less to learn in order to become gainfully employed as a SQL Server DBA

· It’s never too late to change your life

· Motivation is more important than raw talent

· Good karma is good

In New York City, for decades the final destination for many who were down on their luck was the infamous Bowery. My mother told me that every time she walked along that famed stretch known as Skid Row, she came home broke – you need not be Sigmund Freud to figure out where my generous side originates.

Some of you reading this post might think that I had ulterior motives for helping Roy – that nobody does something for nothing. I can assure you there were no obligations on Roy’s part, but he did remit every penny of the $29,600 that he received.

Now that I think of it, perhaps I did get more from Roy than what I initially gave him. Every time I think of him, and what we achieved together, a warm smile spreads across my face.

Good DBAs always strive to be better at their tradecraft. How many of us attempt to improve our humanity?

I challenge everyone reading this post to commit to making another person’s life better in some way, large or small – no strings attached.

Thanks for reading –

Ned Otter

New York City, 2013

The subtleties of In-Memory OLTP Indexing

For this post, I wanted to cover some of the indexing subtleties for memory-optimized tables, with an accent on columnstore indexes.

Let’s create a memory-optimized table:

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

Now, let’s attempt to create a NONCLUSTERED COLUMNSTORE INDEX:

ALTER TABLE dbo.InMem 
ADD INDEX NCCI_InMem NONCLUSTERED COLUMNSTORE (col1);

Msg 10794, Level 16, State 76, Line 76
The feature ‘NONCLUSTERED COLUMNSTORE’ is not supported with memory optimized tables.

It fails because we can only create a CLUSTERED columnstore index (CCI). For 25 years, Microsoft SQL Server differentiated between indexes that physically ordered data on storage (CLUSTERED) and those that did not (NONCLUSTERED). Unfortunately, they chose to ignore that pattern when creating the syntax for memory-optimized tables; using the word CLUSTERED is required when creating a columnstore index on memory-optimized tables.

Can we create a clustered columnstore index on a memory-optimized table that is defined as SCHEMA_ONLY?

Only one way to find out:

DROP TABLE IF EXISTS [dbo].[InMem];
GO

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

ALTER TABLE dbo.InMem ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Msg 35320, Level 16, State 1, Line 39
Column store indexes are not allowed on tables for which the durability option SCHEMA_ONLY is specified.

That won’t work, so let’s create our table with SCHEMA_AND_DATA:

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY NONCLUSTERED
   ,[col1] [CHAR](100) NOT NULL,
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Now, let’s create a clustered columnstore index:

ALTER TABLE dbo.InMem ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Success! Let’s attempt to create a NONCLUSTERED index….

ALTER TABLE dbo.InMem ADD INDEX IX_Index1 (col1);

Msg 10794, Level 16, State 15, Line 117
The operation ‘ALTER TABLE’ is not supported with memory optimized tables that have a column store index.

Ooops – no can do. Once you add a clustered columnstore index to a memory-optimized table, the schema is totally locked down.

What about if we create the CCI and nonclustered index inline?

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX CCI_InMem CLUSTERED COLUMNSTORE
   ,INDEX IX_InMem1 (col1)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Awesome! We’ve proven that we can create both clustered columnstore and nonclustered indexes, but we must create them inline.

Now that we’ve got our indexes created, let’s try to add a column:

ALTER TABLE dbo.InMem ADD col2 INT NULL;

Msg 12349, Level 16, State 1, Line 68
Operation not supported for memory optimized tables having columnstore index.

Hey, when I said that the schema is locked down once you add a clustered columnstore index, I meant it!

What type of index maintenance is possible for indexes on memory-optimized tables?

For HASH indexes there is only one possible type of index maintenance, and that’s to modify/adjust the bucket count. There is zero index maintenance for RANGE/NONCLUSTERED indexes.

Let’s create a memory-optimized table with a HASH index, and verify the syntax for rebuilding the bucket count.

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX IX_InMem1 HASH (col1) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Here’s the syntax for rebuilding the bucket count for a HASH INDEX:

ALTER TABLE dbo.InMem 
ALTER INDEX IX_InMem1 
REBUILD WITH(BUCKET_COUNT = 1001)
GO

We can add a column, as long as we don’t have a CCI in place:

ALTER TABLE dbo.InMem ADD col2 INT NULL;
GO

How about trying to rebuild the bucket count if we created the memory-optimized table with inline CCI and HASH indexes?

DROP TABLE IF EXISTS [dbo].[InMem];
GO

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX CCI_InMem CLUSTERED COLUMNSTORE
   ,INDEX IX_InMem1 HASH (col1) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE dbo.InMem ALTER INDEX IX_InMem1 REBUILD WITH(BUCKET_COUNT = 1001)
GO

Msg 10794, Level 16, State 13, Line 136
The operation ‘ALTER TABLE’ is not supported with memory optimized tables that have a column store index.

You can’t rebuild that index if you also have a columnstore index on the table. We would have to drop the columnstore index, reconfigure the bucket count for the HASH index, and then recreate the columnstore index. Both the drop and the create of the columnstore index will be fully logged, and executed serially. Not a huge problem if the amount of data is not too large, but it’s a potentially much larger problem if you’ve got a lot of data.
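Sketching that workaround with the index names from the example above (my own illustration of the sequence just described):

-- 1. Drop the clustered columnstore index (fully logged, serial)
ALTER TABLE dbo.InMem DROP INDEX CCI_InMem;
GO
-- 2. Now the bucket count can be adjusted
ALTER TABLE dbo.InMem ALTER INDEX IX_InMem1 REBUILD WITH (BUCKET_COUNT = 1001);
GO
-- 3. Recreate the clustered columnstore index (fully logged, serial)
ALTER TABLE dbo.InMem ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE;
GO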

We can create a clustered columnstore index on a #temp table (on-disk):

DROP TABLE IF EXISTS #temp
GO

CREATE TABLE #temp
(
    col1 INT NOT NULL
);
GO

-- succeeds
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_InMem ON #temp (col1);

-- OR clustered, which also succeeds (a table can only have one columnstore index,
-- so drop the nonclustered one before trying this)
CREATE CLUSTERED COLUMNSTORE INDEX CCI_temp ON #temp;

We can create multiple indexes with a single command:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId]      [INT]           IDENTITY NOT NULL
   ,[StoreID]      INT             NOT NULL
   ,[Notes1]       [VARCHAR](8000) NULL
   ,[ValidFrom]    [DATETIME2](7)  NOT NULL
   ,[ValidTo]      [DATETIME2](7)  NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE [InMemLOB]
 ADD 
 INDEX IX_1 (OrderId) 
,INDEX IX_2 (OrderId) 
,INDEX IX_3 (OrderId) 
,INDEX IX_4 (OrderId) 
,INDEX IX_5 (OrderId) 
,INDEX IX_6 (OrderId) 
,INDEX IX_7 (OrderId)

Can we create a columnstore index on a memory-optimized table variable?

DROP TYPE IF EXISTS dbo.typeTableMem
GO

CREATE TYPE dbo.typeTableMem AS TABLE
(
    [PK] INT NOT NULL PRIMARY KEY NONCLUSTERED
   ,[col1] NVARCHAR(255) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON);
GO

DECLARE @Table1 dbo.typeTableMem;
-- fails--Msg 102, Level 15, State 1--Incorrect syntax near '@Table1'.
ALTER TABLE @Table1 ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Next, create a table that includes a LOB column (a MAX data type), then attempt to add a clustered columnstore index:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[CustomerID] INT NOT NULL
   ,[OrderDate] [DATETIME] NOT NULL
   ,[DeliveryDate] DATETIME NULL
   ,[Amount] FLOAT NOT NULL
   ,[Notes] [NVARCHAR](MAX) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE dbo.[InMemLOB] 
ADD INDEX CCI_InMemLOB CLUSTERED COLUMNSTORE

Msg 35343, Level 16, State 1, Line 22    The statement failed. Column ‘Notes’ has a data type that cannot participate in a columnstore index. Omit column ‘Notes’.   

Msg 1750, Level 16, State 1, Line 22    Could not create constraint or index. See previous errors.

For memory-optimized tables, LOB columns prevent creation of a clustered columnstore index.

Now let’s try creating a table using CHAR(8000) columns, and then adding a clustered columnstore index. Astute readers will notice that the following table would create rows that are 32,060 bytes wide – this would fail for on-disk tables, but is perfectly valid for memory-optimized tables:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[CustomerID] INT NOT NULL
   ,[OrderDate] [DATETIME] NOT NULL
   ,[DeliveryDate] DATETIME NULL
   ,[Amount] FLOAT
   ,[Notes1] [CHAR](8000) NULL
   ,[Notes2] [CHAR](8000) NULL
   ,[Notes3] [CHAR](8000) NULL
   ,[Notes4] [CHAR](8000) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE dbo.[InMemLOB] 
ADD INDEX CCI_InMemLOB CLUSTERED COLUMNSTORE

Msg 41833, Level 16, State 1, Line 29    Columnstore index ‘CCI_InMemLOB’ cannot be created, because table ‘InMemLOB’ has columns stored off-row.   
Columnstore indexes can only be created on memory-optimized table if the columns fit within the 8060 byte limit for in-row data.   
Reduce the size of the columns to fit within 8060 bytes.

Now create a table with a large non-MAX column that is stored in-row, then add a clustered columnstore index:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[Notes1] [CHAR](8000) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

-- works
ALTER TABLE InMemLOB ADD INDEX IndOrders_StoreID (StoreID);
GO

-- works
ALTER TABLE dbo.[InMemLOB] ADD INDEX CCI_InMemLOB CLUSTERED COLUMNSTORE
GO

--Add data to table:
SET NOCOUNT ON
GO
INSERT dbo.[InMemLOB]
SELECT
    3 AS [StoreID]
   ,'ABC' AS [Notes1]
   ,GETDATE() AS [ValidFrom]
   ,GETDATE() AS [ValidTo]
GO 1000

SELECT *
FROM dbo.[InMemLOB]

Let’s create a natively compiled module that selects from this table:

DROP PROCEDURE IF EXISTS dbo.InMem_native_sp
GO
CREATE PROCEDURE dbo.InMem_native_sp
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT
        [OrderId]
       ,[StoreID]
       ,[Notes1]
       ,[ValidFrom]
       ,[ValidTo]
    FROM dbo.[InMemLOB];
END;
GO

Enable “Actual Plan” and run the SELECT – which index is used?

SELECT
    [OrderId]
   ,[StoreID]
   ,[Notes1]
   ,[ValidFrom]
   ,[ValidTo]
FROM dbo.[InMemLOB];

CCIPlan1

Now highlight the EXEC statement, and click “Estimated Plan” – which index is used?

EXEC dbo.InMem_native_sp

CCIPlan2

The SELECT statement uses the columnstore index, but the natively compiled procedure does not (that’s because natively compiled procedures ignore columnstore indexes).

Summing up

In this post, we’ve covered some of the finer points of indexing memory-optimized tables. You never know when they might come in handy….

SQL Server on Linux, Part 1

SQL 2017 is just about to be released, and one of the big ticket items is that SQL Server is now supported on the Linux platform.

In subsequent posts, I’ll be reporting on In-Memory OLTP on Linux, but first we’ll need to cover some Linux basics. I flirted with Unix ages ago, and I’ll be the first to admit that my brain doesn’t really work that way (perhaps no one’s brain does).

First, a note about environments – I usually like to work on a server in my home lab, because it has a lot of cores, 64GB of memory, and there’s no hourly cost for using it (and also because I built it….).

So I downloaded a copy of Ubuntu, CentOS, and a trial copy of Redhat Enterprise Linux, and attempted to install each one in my VMware Workstation environment. I spun my wheels for a few hours, and could not get any of them up and running in the way that I required. So, in the interest of saving time, I hit my Azure account, created a VM running Redhat, and proceeded to install SQL 2017 CTP2. Instructions for installing SQL 2017 on Linux can be found at this link. It should be noted that the installation varies by Linux distribution.

Those of us who don’t know Linux commands by heart, and are used to firing up GUI-based virtual machines, are in for a bit of a rude awakening. While it is possible to install GNOME on RHEL, you can’t simply RDP into the VM without a lot of Linux admin setup for xrdp (I never did get it to work). So how do you connect to your Linux VM running SQL Server to do basic tasks? The answer is: PuTTY

PuTTY can be downloaded from this link, and after you install it on your client machine (your laptop or home workstation), connecting to your Azure VM is very easy. When you run PuTTY, you’re presented with the following window, and you can simply enter your IP address into the “Host Name (or IP address)” section, and click the “Open” button:

PUTTY

(you might receive a warning to confirm you want to connect).

Once you connect to the Azure VM, you are prompted for your user name and password, and after logging in to the VM, you arrive at the home directory for your login.

HomeDir

Once you’ve installed SQL Server according to the instructions at this link, you can use SSMS from your desktop to connect over the public internet, and manage your SQL Server environment. It’s a really good idea to limit the inbound connections for your VM to only your IP address, otherwise bots from all over the globe will attempt to hack your machine (you have been warned….).

Now that SQL Server is installed and running, we can attempt to connect, and create a database.

In SSMS, click connect, choose “Database Engine”, and when prompted, enter the user name and password. Make sure “SQL Server Authentication” is chosen, and not “Windows Authentication”.

The first thing I did was to determine where the system databases were stored, so I executed:

sp_helpdb master

master
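An alternative way to find the same paths, shown here only as a small sketch against the standard catalog view:

-- Where do the master database files live on this Linux instance?
SELECT name, physical_name
FROM sys.master_files
WHERE database_id = DB_ID('master');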

I used the same path as the master database files to create a test database:

USE master
GO
CREATE DATABASE [TestDB]
ON PRIMARY
       (
           NAME = N'TestDBData'
          ,FILENAME = N'/var/opt/mssql/data/TestDB.mdf'
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N'TestDBLog'
       ,FILENAME = N'/var/opt/mssql/data/TestDB.ldf'
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

That worked fine, but what if we want to create a database in a separate folder?

Using PuTTY, we can create a folder using the mkdir command (xp_cmdshell is not currently supported for SQL Server running on Linux):

mkdir /var/opt/sqldata

mkdir1

Unfortunately, that didn’t go as planned! We don’t have permission to create that folder, so we’ll try using sudo (more on sudo at this link):

sudo mkdir /var/opt/sqldata

sudomkdir

sudo prompts you for your password, after which it will create the directory.

Now that the directory has been created, we can attempt to create a new database there.

USE master
GO
CREATE DATABASE [TestDB2]
ON PRIMARY
       (
           NAME = N'TestDB2Data'
          ,FILENAME = N'/var/opt/sqldata/TestDB2.mdf'
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N'TestDB2Log'
       ,FILENAME = N'/var/opt/sqldata/TestDB2.ldf'
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

error1

Still no luck – what could be the issue?

Let’s check the security context of the mssql service:

ps aux | grep mssql

mssql service

So, the sqlserver process executes under the mssql user account. Let’s check permissions in the sqldata directory with:

stat --format "%A" /var/opt/sqldata

On my VM, the results are:

rwxr-xr-x

Permissions for Linux files are separated into three sections:

  • owner
  • group (for the file or directory)
  • others

Each section can have the following attributes:

  • (r)ead
  • (w)rite
  • e(x)ecute

For more information on these attributes, please visit this link.

It’s easier to interpret the output if we break it up:

[rwx] [r-x] [r-x]

  • the directory owner has read, write, and execute permission
  • the directory group has read and execute permission
  • others have read and execute permission

When we create a directory, it’s owned by root. The problem with creating a database in this directory should be obvious: only the owner of the directory has write permission.

Let’s make the mssql user the owner of the sqldata directory:

sudo chown mssql:mssql /var/opt/sqldata

chown
And finally, we’ll check the permissions for the sqldata folder:

final

Now let’s retry our CREATE DATABASE statement.

USE master
GO
CREATE DATABASE [TestDB2]
ON PRIMARY
       (
           NAME = N'TestDB2Data'
          ,FILENAME = N'/var/opt/sqldata/TestDB2.mdf'
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N'TestDB2Log'
       ,FILENAME = N'/var/opt/sqldata/TestDB2.ldf'
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

Voila! We successfully created a database in the intended folder.

Seasoned DBAs might be wondering about Instant File Initialization (IFI), a best practice on Windows that greatly increases the speed of creating or extending data files.

When IFI is not configured, data files must be zeroed when created or extended. Does Linux have something akin to IFI? The answer is: IFI does not exist as something you can configure on the file systems that SQL Server on Linux supports (EXT4, available on all distributions, or XFS, available only on Redhat).

However, the good news is that on the Linux platform, data files are not initialized with zeros when created or extended – Linux takes care of this without any intervention from DBAs.

Anthony Nocentino (@centinosystems) just blogged about the internals of file initialization on the Linux platform in this post.

SQL 2017 In-Memory roundup

SQL Server 2017 includes enhancements to many features, and some of those enhancements include In-Memory OLTP.

  • Microsoft states that ALTER TABLE for memory-optimized tables is now “usually substantially faster”. I asked for clarity about that – whether it means that ALTER TABLE is faster for the same operations that could already be executed in parallel and minimally logged in SQL 2016, or whether there are new ALTER TABLE operations that now execute in parallel. They replied that there is no change to the set of operations that execute in parallel. So the ALTER TABLE commands that already executed fast now (theoretically) execute faster.
  • Up to and including SQL 2016, the maximum number of nonclustered indexes on a memory-optimized table was eight, but that limitation has been removed for SQL 2017. I’ve tested this with almost 300 indexes, and it worked. With this many supported indexes, it’s no wonder they had to….
  • Enhance the index rebuild performance for nonclustered indexes during database recovery. I confirmed with Microsoft that the database does not have to be in SQL 2017 compatibility mode (140) to benefit from the index rebuild enhancement. This type of rebuild happens not only for database restore and failover, but also for other “recovery events” – see my blog post here.
  • In SQL 2017, memory-optimized tables now support JSON in native modules (functions, procedures and check constraints).
  • Computed columns, and indexes on computed columns are now supported
  • TSQL enhancements for natively compiled modules include CASE, CROSS APPLY, and TOP (N) WITH TIES
  • Transaction log redo of memory-optimized tables is now done in parallel. This has been the case for on-disk tables since SQL 2016, so it’s great that this potential bottleneck for REDO has been removed.
  • Memory-optimized filegroup files can now be stored on Azure Storage, and you can also back up and restore memory-optimized files on Azure Storage.
  • sp_spaceused is now supported for memory-optimized tables
  • And last but definitely not least, drum roll, please… we can now rename memory-optimized tables and natively compiled modules
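For the last item, a quick sketch of the syntax (the object names here are hypothetical):

-- Rename a memory-optimized table (new in SQL 2017)
EXEC sp_rename N'dbo.InMemTable_Old', N'InMemTable_New';
GO
-- Rename a natively compiled module
EXEC sp_rename N'dbo.Proc_Native_Old', N'Proc_Native_New';
GO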

While Microsoft continues to improve columnstore indexes for on-disk tables, unfortunately columnstore for memory-optimized tables gets left further and further behind. Case in point would be support for LOB columns for on-disk tables in SQL 2017, but no such support for memory-optimized tables. And my good friend Niko Neugebauer (b|t) just reminded me that computed columns for on-disk CCI are supported in SQL 2017, but they are not supported for in-memory CCI. For an in-depth comparison of columnstore differences between on-disk and memory-optimized tables, see my post here.

In addition to what’s listed above, I tested the following functionality for natively compiled stored procedures:

STRING_AGG()

This works, but you can’t use character functions, such as CHAR(13):

CREATE PROCEDURE dbo.Proc_VehicleRegistration
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT
        STRING_AGG(VehicleRegistration, CHAR(13)) AS csv -- fails
    FROM Warehouse.VehicleTemperatures
    WHERE Warehouse.VehicleTemperatures.VehicleTemperatureID BETWEEN 65190 AND 65200
END;
GO
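For comparison, here is a variant that does compile (my own sketch, simply swapping the CHAR(13) separator for a string literal):

CREATE OR ALTER PROCEDURE dbo.Proc_VehicleRegistrationCSV
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT
        STRING_AGG(VehicleRegistration, ', ') AS csv -- a string literal works
    FROM Warehouse.VehicleTemperatures
    WHERE Warehouse.VehicleTemperatures.VehicleTemperatureID BETWEEN 65190 AND 65200
END;
GO

EXEC dbo.Proc_VehicleRegistrationCSV
GO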

CONCAT_WS()

CREATE OR ALTER PROCEDURE dbo.Proc_VehicleTemperatures
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT CONCAT_WS( ' - ', VehicleTemperatureID, VehicleRegistration) AS DatabaseInfo
    FROM Warehouse.VehicleTemperatures

END;
GO

EXEC dbo.Proc_VehicleTemperatures
GO

TRIM()

CREATE OR ALTER PROCEDURE dbo.Proc_TrimTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRIM(VehicleRegistration) AS col1
    FROM Warehouse.VehicleTemperatures

END;
GO
EXEC dbo.Proc_TrimTest
GO

TRANSLATE()

CREATE OR ALTER PROCEDURE dbo.Proc_TranslateTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()');
END;
GO
EXEC dbo.Proc_TranslateTest
GO

sys.dm_db_stats_histogram()

CREATE STATISTICS stat_VehicleTemperatures ON Warehouse.VehicleTemperatures(VehicleRegistration)

SELECT s.object_id, OBJECT_NAME(s.object_id), hist.step_number, hist.range_high_key, hist.range_rows, 
    hist.equal_rows, hist.distinct_range_rows, hist.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.[object_id], s.stats_id) AS hist
WHERE OBJECT_NAME(s.object_id) = 'VehicleTemperatures'

My wish list for the In-Memory OLTP feature is huge, but it’s great to see Microsoft continually improve and extend it.

Which events cause durable memory-optimized data to be streamed to memory?

Those who have studied In-Memory OLTP are aware that in the event of “database restart”, durable memory-optimized data must be streamed from disk to memory. But that’s not the only time data must be streamed, and the complete set of events that cause this is not intuitive. To be clear, if your database had to stream data back to memory, that means all your memory-optimized data was cleared from memory. The amount of time it takes to do this depends on:

  • the amount of data that must be streamed
  • the number of indexes that must be rebuilt (SQL 2017 has a much faster index rebuild process, see below)
  • the number of containers in the memory-optimized database, and how many volumes they’re spread across
  • the number of LOB columns
  • whether the BUCKET_COUNT is properly configured for HASH indexes
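To get a rough sense of the first item for your own database, here is a sketch using the memory-consumption DMV (keep in mind that index memory is rebuilt, not streamed):

-- Rough sketch: how much durable memory-optimized data exists per table
-- (run in the context of the memory-optimized database)
SELECT
     OBJECT_NAME(t.object_id) AS tableName
    ,m.memory_used_by_table_kb
    ,m.memory_used_by_indexes_kb
FROM sys.dm_db_xtp_table_memory_stats AS m
INNER JOIN sys.tables AS t ON t.object_id = m.object_id
WHERE t.durability_desc = 'SCHEMA_AND_DATA'
ORDER BY m.memory_used_by_table_kb DESC;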

The following list is for standalone SQL Server instances (under some circumstances, the streaming requirements are different between FCIs and AGs).

Database RESTORE – this is a no brainer – if you restore a database with durable memory-optimized data, of course your data must be streamed from disk into memory. And if you are under the impression that SQL Server verifies if your server has enough memory to complete the RESTORE, you would be mistaken. See my post here.

SQL Service restart – in this case, all databases will go through the recovery process, and all memory-optimized databases will stream durable memory-optimized data to memory.

Server reboot – same as “SQL Service restart”

In addition to the list above, there are a few database settings that cause data to be streamed.

  • Changing a database from READ_ONLY to READ_WRITE, or from READ_WRITE to READ_ONLY
  • Setting READ_COMMITTED_SNAPSHOT OFF or ON
  • Taking a database OFFLINE and then ONLINE

A database that contains durable memory-optimized data will not be online until all memory-optimized data is finished streaming, which affects the availability of “traditional” tables (while a database is waiting for streaming to complete, the wait type is “WAIT_XTP_RECOVERY”). There’s nothing you can do to speed up the process, other than having initially defined enough containers on enough volumes, so that streaming executes in parallel.
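A minimal sketch for spotting that wait while recovery is in progress:

-- Any sessions currently waiting on In-Memory OLTP recovery?
SELECT r.session_id, r.wait_type, r.wait_time, DB_NAME(r.database_id) AS databaseName
FROM sys.dm_exec_requests AS r
WHERE r.wait_type = 'WAIT_XTP_RECOVERY';

-- Cumulative time spent on this wait since the instance started
SELECT ws.wait_type, ws.waiting_tasks_count, ws.wait_time_ms
FROM sys.dm_os_wait_stats AS ws
WHERE ws.wait_type = 'WAIT_XTP_RECOVERY';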

SQL 2017 enhancements

Because modifications to HASH and NONCLUSTERED/RANGE indexes are not logged for memory-optimized tables, they must be rebuilt when data is streamed to memory. Both SQL 2014 and SQL 2016 have a limit of 8 NONCLUSTERED indexes per memory-optimized table (any combination of HASH and RANGE). Microsoft has designed a new process for enhancing index rebuild speed in SQL 2017. This dovetails perfectly with the removal of the 8-indexes-per-table limit in SQL 2017 (I have personally created a table with 298 NONCLUSTERED indexes in SQL 2017).

Migrating tables to In-Memory OLTP

One of the first things you might do when considering migrating a table to In-Memory OLTP, is to run the “Transaction Performance Analysis Overview” report:

BlogTPAO

Then you arrive at the following window:

BlogTPAnalysis

If you click on “Tables Analysis”, a sort of “magic quadrant” appears:

BlogQuadrant

The chart attempts to display both the best candidates and the degree of difficulty for migration. But there are a few problems with the “difficulty” rating of this internal query, and as a result, if we really want to know how easy/difficult the process might be, we’ll have to roll our own query.

The original query counts the following conditions (the list is out of order on purpose):

nullable columns
LOB data types, i.e. NVARCHAR(MAX)
unique indexes/constraints
default constraints
check constraints
UDT
foreign keys
triggers
index with type > 2
computed columns
sparse columns
IDENTITY <> 1, 1
assembly
unsupported data types, i.e. DATETIMEOFFSET
filestream columns

Some of those conditions are 100% supported in SQL 2016, without any issue, while others have variable levels of migration difficulty, and still others are not supported at all. But even if we remove the items that are completely supported, all of the remaining items have the same weight. That could be pretty misleading, and might cause you to rule out migrating a table to In-Memory that could potentially be an excellent candidate.

Now let’s look at the list in another way:

**************************
supported without any issues
**************************
nullable columns
LOB data types, i.e. NVARCHAR(MAX)
unique indexes/constraints

**********************************************************************
supported with a range of migration difficulty (from no difficulty to very difficult)
**********************************************************************
default constraints
check constraints
UDT
foreign keys
triggers

index with type > 2
0 = Heap
1 = Clustered
2 = Nonclustered
3 = XML
4 = Spatial
5 = Clustered columnstore index
6 = Nonclustered columnstore index
7 = Nonclustered hash index

********************
unsupported/blocking
********************
computed columns
sparse columns
IDENTITY <> 1, 1
assembly
unsupported data types, i.e. DATETIMEOFFSET
filestream columns

My version of the script removes the checks for nullable and LOB columns, and also for UNIQUE indexes/constraints.

And for the remaining conditions, since they’re all weighted the same by virtue of counting them, I wanted to place them in different buckets. After running my script on a sample database, I can see that the AuditTrail table has the following potential migration “issues”:

BlogIssues

There are a total of 8 issues, although migrating default constraints, user-defined data types, and LOB columns will be easy. It’s the foreign keys that might prove difficult, potentially leading to a long chain of tables that would have to be migrated (because foreign key constraints on a memory-optimized table can only reference other memory-optimized tables).

We definitely have a much clearer picture of the potential migration difficulties. With this version of the script, you can make informed choices about which tables you might want to migrate to In-Memory OLTP.

Also note that computed columns are supported in SQL 2017, so this script could have some intelligence added to allow for that.

/*
    Ned Otter
    www.NedOtter.com
    21 June 2017
        
    THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
    ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
    TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
    PARTICULAR PURPOSE. 

    In-Memory OLTP can improve workload performance for highly concurrent, write-intensive workloads.
    This script attempts to determine the locking/latching characteristics, and migration difficulty.

    This script is my version of the code that's generated when you run the
    "Transaction Performance Analysis Overview Report" for tables. In its original form, 
    that script counts the number of conditions that might affect migration, giving them all equal weight. 
    But migrating a user-defined-datatype will be very easy, and migrating foreign keys could be very difficult. 
    The second result set breaks the issues into separate buckets, so you can more easily decide whether or not a table 
    is a good candidate for migration. 

    It does not attempt to determine anything about tempdb contention, which In-Memory can also help with, 
    and it does not attempt to determine which objects might benefit from Native Compilation.


    Instructions:

    Step 1: 
        Execute the first query, so you can determine which tables have high levels of latch/lock activity. 
        NOTE: The results of the first query will be meaningless, unless you execute it in production, or on a test 
        system that has recent production-like activity. 

    Step 2: 
        Based on the output of the first query, adjust the @table_id for the second query.
*/


/*
    Question 1: How much locking/latching does this table have?
*/

/***************************************************
    Performance statistics, specifically latch/lock
***************************************************/
DECLARE @table_id INT --= OBJECT_ID('dbo.YourTable');

DECLARE @sum_table TABLE
(
    database_id              INT NOT NULL
   ,page_latch_lock_wait_sum INT NULL
);

INSERT INTO @sum_table
SELECT i.database_id
      ,ISNULL(NULLIF(SUM(i.page_latch_wait_in_ms) + SUM(i.page_lock_wait_in_ms), 0), 1) AS page_latch_lock_wait_sum
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) i
WHERE i.object_id IN (SELECT object_id FROM sys.tables)
   OR @table_id IS NULL
GROUP BY i.database_id;

SELECT SCHEMA_NAME(t.schema_id) AS schema_name
      ,t.name AS table_name
      ,SUM(i.page_latch_wait_count) AS page_latch_wait_count
      ,SUM(i.page_lock_count) AS page_lock_count
      ,SUM(i.page_lock_wait_count) AS page_lock_wait_count
      ,SUM(ROUND(CAST(i.page_latch_wait_in_ms AS FLOAT) / CAST((CASE i.page_latch_wait_count WHEN 0 THEN 1 ELSE i.page_latch_wait_count END) AS FLOAT), 2)) AS avg_time_per_page_latch_wait
      ,SUM(ROUND(CAST(i.page_lock_wait_in_ms AS FLOAT) / CAST((CASE i.page_lock_wait_count WHEN 0 THEN 1 ELSE i.page_lock_wait_count END) AS FLOAT), 2)) AS avg_time_per_page_lock_wait
      ,SUM(CAST(ROUND((page_latch_wait_in_ms + page_lock_wait_in_ms) * 100.0 / st.page_latch_lock_wait_sum, 2) AS DECIMAL(5, 2))) AS percentage_of_wait_for_entire_db
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) i
INNER JOIN sys.tables t ON t.object_id = i.object_id
INNER JOIN sys.indexes ix ON ix.object_id = i.object_id
                   AND ix.index_id = i.index_id
INNER JOIN @sum_table st ON st.database_id = i.database_id
WHERE t.object_id = @table_id
   OR @table_id IS NULL
GROUP BY t.schema_id
        ,t.object_id
        ,t.name
ORDER BY page_latch_wait_count DESC;
GO

/*
Evaluation 

Question 2: How difficult will the migration be?

    ******************************
    supported without any issues 
    ******************************
        nullable columns
        LOB data types, i.e. NVARCHAR(MAX)
        unique indexes/constraints

    ******************************
    supported with varying degrees of migration difficulty
    ******************************
        default constraints           -- easy
        check constraints           -- easy
        UDT                           -- easy

        foreign keys               -- potentially difficult
        triggers                   -- potentially difficult/impossible

        index with type > 2        -- might not be supported, i.e. XML, Spatial
            0 = Heap               
            1 = Clustered
            2 = Nonclustered    
            3 = XML
            4 = Spatial
            5 = Clustered columnstore index. Applies to: SQL Server 2014 through SQL Server 2017.
            6 = Nonclustered columnstore index. Applies to: SQL Server 2012 through SQL Server 2017.
            7 = Nonclustered hash index. Applies to: SQL Server 2014 through SQL Server 2017.                   

    ******************************
    unsupported/blocking
    ******************************
        computed columns (SQL 2017+ supports computed columns)
        sparse columns
        IDENTITY <> 1, 1
        assembly
        unsupported data types, i.e. DATETIMEOFFSET
        filestream columns

*/

DECLARE @table_id INT --= OBJECT_ID('dbo.YourTable');

DECLARE @MemoryOptimizedTables TABLE
(
    object_id INT NOT NULL
);

INSERT INTO @MemoryOptimizedTables
SELECT [object_id]
FROM sys.tables
WHERE [is_memory_optimized] = 1;

DECLARE @Tables TABLE
(
    [table_id]                 INT NOT NULL
   ,[CheckConstraintCount]     INT NULL
   ,[DefaultConstraintCount]   INT NULL
   ,[ForeignKeyCount]          INT NULL
   ,[ProblemIndexCount]        INT NULL
   ,[ComputedColumnCount]      INT NULL
   ,[IdentityViolations]       INT NULL
   ,[UnsupportedDataTypeCount] INT NULL
   ,[FilestreamColumnCount]    INT NULL
   ,[LOBColumnCount]           INT NULL
   ,[AssemblyTypeCount]        INT NULL
   ,[UDTColumnCount]           INT NULL
   ,[SparseColumnCount]        INT NULL
   ,[TriggerCount]             INT NULL
);

INSERT @Tables
(
    table_id
)
SELECT [object_id]
FROM sys.tables
WHERE [is_memory_optimized] = 0
  AND (@table_id IS NULL OR [object_id] = @table_id);

DECLARE @DefaultConstraintTables TABLE
(
    parent_object_id INT NOT NULL
);

INSERT @DefaultConstraintTables
SELECT [parent_object_id]
FROM sys.default_constraints;

WITH CheckConstraints
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.check_constraints t
    JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET [CheckConstraintCount] = CheckConstraints.CountRows
FROM @Tables AS Tables
INNER JOIN CheckConstraints ON CheckConstraints.table_id = Tables.table_id;

WITH DefaultConstraints
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.default_constraints t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT * FROM @DefaultConstraintTables)
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET DefaultConstraintCount = DefaultConstraints.CountRows
FROM @Tables AS Tables
INNER JOIN DefaultConstraints ON DefaultConstraints.table_id = Tables.table_id;

WITH ForeignKeys
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.foreign_keys t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET ForeignKeyCount = ForeignKeys.CountRows
FROM @Tables AS Tables
INNER JOIN ForeignKeys ON ForeignKeys.table_id = Tables.table_id;

/*
    Type of index:
    0 = Heap
    1 = Clustered
    2 = Nonclustered
    3 = XML
    4 = Spatial
    5 = Clustered columnstore index. Applies to: SQL Server 2014 through SQL Server 2017.
    6 = Nonclustered columnstore index. Applies to: SQL Server 2012 through SQL Server 2017.
    7 = Nonclustered hash index. Applies to: SQL Server 2014 through SQL Server 2017.
*/

WITH Indexes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(tbl.object_id) AS CountRows
    FROM sys.indexes t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.object_id, DB_ID()) IS NOT NULL
      --AND (((t.is_unique = 1 OR t.is_unique_constraint = 1) AND t.is_primary_key = 0) OR 
      --AND t.type > 2
      AND t.type NOT IN (0, 1, 2, 5) -- clustered columnstore is supported for in-memory, but it's very different. 
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET ProblemIndexCount = Indexes.CountRows
FROM @Tables AS Tables
INNER JOIN Indexes ON Indexes.table_id = Tables.table_id;

-- nullable columns are supported....
--  SELECT 
--t.object_id AS [table_id], 
--      COUNT(*) AS CountRows
--  FROM sys.index_columns ic
--  JOIN sys.tables t ON t.object_id = ic.object_id
--  JOIN sys.all_columns c ON ic.column_id = c.column_id AND t.object_id = c.object_id
--  WHERE t.is_ms_shipped = 0 AND c.is_nullable = 1
--  UNION

WITH Triggers
AS
(
    SELECT tb.object_id AS [table_id]
          ,COUNT(tr.object_id) AS CountRows
    FROM sys.triggers tr
    INNER JOIN sys.tables tb ON tr.parent_id = tb.object_id
    WHERE tr.is_ms_shipped = 0
      AND tr.parent_class = 1
      AND tb.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY tb.object_id
)
UPDATE @Tables
SET TriggerCount = Triggers.CountRows
FROM @Tables AS Tables
INNER JOIN Triggers ON Triggers.table_id = Tables.table_id;


/*
    NOTE:
        If migrating to SQL 2017, computed columns are supported.
*/

WITH ComputedColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND (t.is_computed = 1)
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET ComputedColumnCount = ComputedColumns.CountRows
FROM @Tables AS Tables
INNER JOIN ComputedColumns ON ComputedColumns.table_id = Tables.table_id;

WITH SparseColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND (t.is_sparse = 1)
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET SparseColumnCount = SparseColumns.CountRows
FROM @Tables AS Tables
INNER JOIN SparseColumns ON SparseColumns.table_id = Tables.table_id;

WITH IdentityViolation
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    LEFT JOIN sys.identity_columns ic ON t.object_id = ic.object_id
                                     AND t.column_id = ic.column_id
    WHERE tbl.is_ms_shipped = 0
      AND 
      (
           t.is_identity = 1
      AND (ic.increment_value != 1 OR ic.seed_value != 1)
      )
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET IdentityViolations = IdentityViolation.CountRows
FROM @Tables AS Tables
INNER JOIN IdentityViolation ON IdentityViolation.table_id = Tables.table_id;

WITH UDTColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.user_type_id != t.system_type_id
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE
    @Tables
SET
    UDTColumnCount = UDTColumns.CountRows
FROM @Tables AS Tables
INNER JOIN UDTColumns ON UDTColumns.table_id = Tables.table_id;

WITH AssemblyTypes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND tp.is_assembly_type = 1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET AssemblyTypeCount = AssemblyTypes.CountRows
FROM @Tables AS Tables
INNER JOIN AssemblyTypes ON AssemblyTypes.table_id = Tables.table_id;

WITH UnsupportedDataTypes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND UPPER(tp.name) IN ( 'DATETIMEOFFSET', 'GEOGRAPHY', 'GEOMETRY', 'SQL_VARIANT', 'HIERARCHYID', 'XML', 'IMAGE', 'TEXT', 'NTEXT', 'TIMESTAMP' )
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET [UnsupportedDataTypeCount] = UnsupportedDataTypes.CountRows
FROM @Tables AS Tables
INNER JOIN UnsupportedDataTypes ON UnsupportedDataTypes.table_id = Tables.table_id;

WITH FileStream
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.is_filestream = 1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET FilestreamColumnCount = FileStream.CountRows
FROM @Tables AS Tables
INNER JOIN FileStream ON FileStream.table_id = Tables.table_id;

WITH LOBColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.max_length = -1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET LOBColumnCount = LOBColumns.CountRows
FROM @Tables AS Tables
INNER JOIN LOBColumns ON LOBColumns.table_id = Tables.table_id;

SELECT
    OBJECT_NAME(table_id) AS TableName
   ,IIF(CheckConstraintCount IS NULL, 0, CheckConstraintCount) + 
    IIF(DefaultConstraintCount IS NULL, 0, DefaultConstraintCount) + 
    IIF(ForeignKeyCount IS NULL, 0, ForeignKeyCount) + 
    IIF(ProblemIndexCount IS NULL, 0, ProblemIndexCount) + 
    IIF(ComputedColumnCount IS NULL, 0, ComputedColumnCount) + 
    IIF(IdentityViolations IS NULL, 0, IdentityViolations) + 
    IIF(UnsupportedDataTypeCount IS NULL, 0, UnsupportedDataTypeCount) + 
    IIF(LOBColumnCount IS NULL, 0, LOBColumnCount) + 
    IIF(AssemblyTypeCount IS NULL, 0, AssemblyTypeCount) + 
    IIF(UDTColumnCount IS NULL, 0, UDTColumnCount) + 
    IIF(SparseColumnCount IS NULL, 0, SparseColumnCount) + 
    IIF(TriggerCount IS NULL, 0, TriggerCount) AS IssueCount
   ,CheckConstraintCount
   ,DefaultConstraintCount
   ,ForeignKeyCount
   ,UDTColumnCount
   ,LOBColumnCount
   ,ProblemIndexCount
   ,ComputedColumnCount
   ,IdentityViolations
   ,UnsupportedDataTypeCount
   ,FilestreamColumnCount
   ,AssemblyTypeCount
   ,SparseColumnCount
   ,TriggerCount
FROM @Tables
--WHERE OBJECT_NAME(table_id) = 'YourTable'
ORDER BY IssueCount DESC;    

Availability Groups and Native Compilation

For disk-based tables, query plans for interpreted (traditional) stored procedures are recompiled when statistics change: updating statistics invalidates the cached plans, which automatically recompile the next time they're executed. This is true whether the interpreted stored procedure references disk-based tables, memory-optimized tables, or both.

As of SQL 2016, the database engine automatically updates statistics for memory-optimized tables (documentation here), but recompilation of native modules must still be performed manually. But hey, that’s way better than SQL 2014, when you couldn’t recompile at all; you had to drop/recreate the native module. And natively compiled stored procedures don’t reside in the plan cache, because they are executed directly by the database engine.
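The manual recompile itself is just a call to sp_recompile against the native module, which is then recompiled on its next execution. For example (the module name here is only a placeholder):

EXEC sys.sp_recompile @objname = N'dbo.YourNativeProc';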

This post attempts to determine if the requirement to manually recompile native modules is any different for AG secondary replicas.

Stats on the primary

Statistics that are updated on the primary replica will eventually make their way to all secondary replicas. This blog post by Sunil Agarwal details what happens on the secondary replica if the statistics are stale (relative to any temporary statistics that were created on the secondary).

How do we…?

The first question we must answer is: how do you determine the last time a natively compiled stored procedure was compiled?

We can do that by checking the value of the cached_time column from the following query:

SELECT OBJECT_NAME(object_id, database_id) AS procedure_name
      ,cached_time
FROM sys.dm_exec_procedure_stats
WHERE database_id = DB_ID()
  AND object_id = OBJECT_ID('<YourModule>');

The query is simple, but you won’t get any results unless you enable the collection of stored procedure execution statistics for natively compiled procedures. Execution statistics can be collected at the object level or instance level.

NOTE: Enabling the collection of stored procedure statistics for natively compiled procedures can crush your server, potentially resulting in disastrous performance impact. You must be extremely careful with this method of troubleshooting.
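For reference, here's a minimal sketch of both options: sys.sp_xtp_control_proc_exec_stats enables procedure-level collection instance-wide, while sys.sp_xtp_control_query_exec_stats can be scoped down to a single module (the module name below is a placeholder).

-- Instance level: procedure-level statistics for all natively compiled modules
EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 1;

-- Object level: query-level statistics for a single natively compiled module
DECLARE @db INT = DB_ID(), @obj INT = OBJECT_ID('dbo.YourNativeProc');
EXEC sys.sp_xtp_control_query_exec_stats
     @new_collection_value = 1
    ,@database_id = @db
    ,@xtp_object_id = @obj;

-- Turn collection back off (set @new_collection_value = 0) as soon as you're done troubleshooting.

Given the warning above, enable these only for short troubleshooting windows.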

Once you’ve enabled stats collection for native procedures, you should get results from the query above.

How I tested

Here are the steps I executed, after creating an AG that used synchronous mode (containing a single database with a memory-optimized filegroup):

  1. Create a sample table
  2. Insert some rows
  3. Create a natively compiled procedure that selects from the sample table (a sketch of steps 1 through 3 follows this list)
  4. Execute the native procedure on the primary and secondary (it must be executed at least once in order to have usage stats collected)
  5. Enable collection of stored procedure execution statistics on the primary and secondary replicas
  6. Again execute the native procedure on the primary and secondary
  7. Note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary
  8. Recompile the native procedure on the primary
  9. Execute the native procedure on the primary and secondary
  10. Again note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary
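Something like the following covers steps 1 through 3 (the table and procedure names are just placeholders, not the exact objects from my test):

CREATE TABLE dbo.AGTestTable
(
    rowKey INT IDENTITY PRIMARY KEY NONCLUSTERED
   ,col01  VARCHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

INSERT dbo.AGTestTable (col01)
VALUES ('row1'), ('row2'), ('row3');
GO

CREATE PROCEDURE dbo.NativeSelect_AGTestTable
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT rowKey, col01
    FROM dbo.AGTestTable;
END;
GO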

Results

The cached_time value on the secondary did not get updated when the native module was recompiled on the primary.

What does this mean for DBAs who are responsible for maintaining AGs that use native compilation?  It means that when you recompile native modules on the primary replica (which you would always do after updating statistics on the primary), those modules must also be recompiled on all secondary replicas. The recompilation on the secondary can be performed manually or through some automated mechanism. For example, if you have a SQL Agent job on the primary replica to update statistics, one of the job steps might mark all natively compiled stored procedures on the secondary for recompilation, using sp_recompile.

How would that job step handle the recompile for all secondary replicas?

Perhaps after defining linked servers, you could do something like:

EXEC SecondaryServer1.msdb.dbo.sp_start_job @job_name = N'Recompile native procs';

EXEC SecondaryServer2.msdb.dbo.sp_start_job @job_name = N'Recompile native procs';
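On each secondary, the "Recompile native procs" job could consist of a single step that marks every natively compiled procedure for recompilation. Here's a rough sketch of what that step might look like: it walks sys.sql_modules for modules with uses_native_compilation = 1 and calls sp_recompile for each.

DECLARE @proc NVARCHAR(300);

DECLARE native_procs CURSOR LOCAL FAST_FORWARD FOR
    SELECT QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(o.name)
    FROM sys.sql_modules m
    INNER JOIN sys.objects o ON o.object_id = m.object_id
    WHERE m.uses_native_compilation = 1
      AND o.type = 'P'; -- natively compiled stored procedures

OPEN native_procs;
FETCH NEXT FROM native_procs INTO @proc;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- sp_recompile marks the module; it recompiles on its next execution
    EXEC sys.sp_recompile @objname = @proc;
    FETCH NEXT FROM native_procs INTO @proc;
END;

CLOSE native_procs;
DEALLOCATE native_procs;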

But wiring this up for every secondary replica could get involved – it sounds like a topic for another post…

Happy recompiling –