Technology can change your life

Roy Dorman

When I first met Roy Dorman, he was in is mid-thirties, and completing a degree in philosophy at Fordham University here in New York City. Roy was coupled with a very close friend; I took a liking to him immediately.

It’s a tall order to find a teaching position in philosophy under the best of circumstances, and upon graduating, he struggled to find work (his age was likely not an asset). To meet the financial demands of living in this city, he toiled at whatever jobs came his way: limousine dispatcher, real-estate manager, etc.

After getting to know him for a while, I could see that Roy had the right combination of natural skills that would be perfect for a role in technology, and I told him so. He was extremely responsible, persevering, detail oriented, and enjoyed solving problems.

Remembering my own struggles to wrap my brain around technology (detailed here), I said to Roy – “Hey man, if you ever want to do something in the tech world, just let me know – I’d be glad to teach you.”

Time ticked on by, and then one day the phone rang – Roy was ready to begin his studies.

First ascent

The first hurdle we faced was that Roy had zero disposafble income, so I offered to lend him the money to purchase a computer. Our plan was that he would do self-study from home, with my oversight. However, there was one gigantic problem with this scenario – he had a very young child at home (I don’t know what we were thinking). As the great jazz musician Ahmad Jamal told me years ago – “you cannot serve two masters.”

With no measurable progress after a fair amount of study time, the likelihood of Roy being gainfully employed in the field of technology was hovering around impossible.

We needed another plan.

Second ascent

I could see that the obligations of a husband and father were not easily circumvented. On the other hand, Roy had worked a series of gigs that had zero financial upside, so he was highly motivated – an essential requirement of all heavy lifting.

Our second plan was more structured and required a larger commitment from both of us. Also built in to the plan were some “teeth”.

I asked Roy how much money he needed to survive, and the figure he came up with was $1,000 per week, and I agreed to sustain him financially throughout his studies. But because lending money to a friend can jeopardize the relationship, we approached the financial aspect from a different perspective. I told him that I would not lend him the money he needed – I would give him the money he needed. And should he one day be in a financial position to return the gesture, so be it. We went forward without my having any expectation of seeing the money that flowed to him.

Roy had no idea if he could morph into a SQL Server DBA, and I was keenly aware of the vast amount of trust he placed in me. It was an awesome responsibility, to say the least.

I had recently started a contract for a large migration of 100+ SQL Servers to the latest version, and this put me in a position to help Roy out. I wrote him a letter explaining what was expected of each of us, and what would happen if we did not succeed. With the letter I enclosed a check for $2,000, saying that there would be subsequent payments of $1,000 per week. My estimation was that it would take approximately nine months for Roy to become employable, but our timeline was open-ended.

Tough love

My letter in part said:

“If I was you, I’d be as frugal as humanly possible with this check, for the following reasons:

· I eat before you do

· Anything can happen

· Don’t assume there will be a next check, because I don’t. You’re a consultant now; your income is unpredictable.”

“I will evaluate your performance each month and discuss with you where you stand. If for whatever reason I deem that you are not living up to your end of the bargain – and it does not change – the deal is off, I become yet another line item in your long list of creditors, and our lives resume as before.“

I didn’t like to be that hard on Roy, but I felt it was the only way we could get to the finish line. It was for his own good, and he knew it.

I told him that he had to treat his SQL Server education exactly like a job. He received a set of keys to my studio apartment in Greenwich Village, and had to be there five days a week, eight hours a day.

To the grindstone

In addition to being a SQL Server DBA, I’ve had a life in music. Many people have come to me for music lessons, and almost none of them have ever returned after the initial encounter. Perhaps it’s because I don’t sugar-coat what’s involved with pursuing the subtleties of creative improvised music. Roy stands alone as the only student I’ve ever had that got into the long run with me.

Keep in mind that he had only very basic knowledge of computers when we began, and he knew absolutely nothing about operating systems or database software. I’ve sometimes thought that if Roy knew how much he’d have to learn in such a short span of time, he might have backed out. But ignorance is bliss, as they say.

To be honest, we worked like mad men.

I drilled him.

I grilled him.

I imparted mindful after mindful of technology upon Roy Dorman each and every day. Weeks turned to months, and slowly, the veil of technological ignorance lifted, giving way to comprehension and knowledge. After seven months, I had taught Roy everything I could think of (but I did leave off some critical items, such as how to determine how much memory is installed on a SQL Server – sorry, Roy!).

One day I arrived home to find a package he had left for me. Inside was a collage that he made about our collaboration, and I laughed out loud when I saw it. It had photos of the two of us, plus artwork for Sybase, Oracle and MS SQL – really hysterical. Also included was a very small pair of scissors. I called Roy and asked him what the scissors were for. He said: “To cut the cord!”

It was time for Roy to fly.

Looking for a gig

Roy wanted to work as a consultant, but at that point in time, there seemed to be more opportunities for him as an employee. Also, with a wife and young child, he needed the benefits that came with full time employment.

One day he called and said that he was going to have an interview, and there was something in his voice that sounded kind of funny. His appointment was to be at the same company I was working for, and in the same building where I worked. He asked me: “Are your manager’s initials ‘XYZ’?”

They were indeed.

Unbelievably, Roy would have an interview in my office the following Monday (I wonder how long those odds were).

As luck would have it, when Roy arrived on site, my manager asked me to give him a technical interview, and then he “introduced” me to Roy.

You will not be shocked to learn that I gave Roy the green light. Not because I had trained him, but because he was a good candidate, and would have been an asset to the team. But as it turned out, the company decided not to engage the services of Roy Dorman. I was really disappointed – we would have had a ball working together.

Opportunity knocks

After interviewing for a while, Roy told me that he was being considered for positions at two different companies – there was now third party endorsement of his skills. He had studied hard and done well.

One of the companies had three servers, and the other had fifteen. I told him that he had to take the job at the fifteen-server company, but he was extremely hesitant to do so. I stated in no uncertain terms that the larger company was his only move, and when he asked me why, I told him: “Because you’ll learn more there.”

Roy accepted the position at the fifteen-server company, and it was the beginning of a career that would encompass working at some very large corporations, such as Viacom, SAC Capital and others.

Through our collaboration, an intensely personal bond between us was formed – I looked upon Roy as a brother, and we stayed close throughout the many years that followed.

The Eternal Optimist

Roy called one day, revealing that he had Stage IV colon cancer. One thing I remember him saying to me was that he was looking at it as one of life’s “bumps in the road”. That was Roy’s essence – he was The Eternal Optimist.

For a while he did well with treatment. But after about a year, it was clear that Roy was losing the battle. When I picked him up from the hospital to take him home for the last time, he thanked me for giving him an opportunity to do better. I told him that my role was simply one of guidance (and ass kicker drill sergeant) – he did all the work.

Tragically, Roy passed away not long after that conversation, leaving behind a wife and two beautiful young daughters. For about ten years, he got to live the life he wanted (if you think that working is living, that is).

Coda

What can we take away from all of this?

· Back in the day, there was a lot less to learn, in order to become gainfully employed as a SQL Server DBA

· it’s never too late to change your life

· Motivation is more important than raw talent

· Good karma is good

In New York City, for decades the final destination for many who were down on their luck was the infamous Bowery. My mother told me that every time she walked along that famed stretch known as Skid Row, she came home broke – you need not be Sigmund Freud to figure out where my generous side originates.

Some of you reading this post might think that I had ulterior motives for helping Roy – that nobody does something for nothing. I can assure you there were no obligations on Roy’s part, but he did remit every penny of the $29,600 that he received.

Now that I think of it, perhaps I did get more from Roy than what I initially gave him. Every time I think of him, and what we achieved together, a warm smile spreads across my face.

Good DBAs always strive to be better at their tradecraft. How many of us attempt to improve our humanity?

I challenge everyone reading this post to commit to making another person’s life better in some way, large or small – no strings attached.

Thanks for reading –

Ned Otter

New York City, 2013

In-Memory OLTP diagnostic script

For a while now, I’ve been working on a script to evaluate a SQL Server environment for anything related to In-Memory OLTP.

I now offer it to the world, with the usual caveat: use at your own risk. It’s is a work in progress, so check back here now and then to see if anything has been added.

The script reports on two categories: instance level and database level.

Instance level evaluates the following:

  • which databases are memory-optimized
  • if running Enterprise, if there are any resource groups defined, and which databases are bound to them
  • version/edition of SQL server
  • ‘max memory’ setting
  • whether or not instance-level collection of execution statistics has been enabled for all natively compiled stored procedures
  • memory clerks for the buffer pool and In-Memory OLTP
  • the value of the committed_target_kb column from sys.dm_os_sys_info
  • display any event notifications (because they conflict with deploying In-Memory OLTP

Database level evaluates the following:

For each memory-optimized database:

  • database files, including container names, size, and location
  • memory-optimized tables
  • indexes on all memory-optimized tables
  • count of indexes per memory-optimized table
  • natively compiled stored procedures
  • which native modules are loaded (stored procedures only)
  • whether or not the collection of execution statistics is enabled for any natively compiled procedures
  • count of natively compiled procedures
  • if using the temporal feature for memory-optimized tables, the amount of memory consumed by hidden temporal internal tables
  • memory structures for LOB columns (off-row)
  • average chain length for HASH indexes
  • memory-optimized table types

In the comments, please let me know other things about the in-memory environment or databases you’d like to see included in the script.

The subtleties of In-Memory OLTP Indexing

For this post, I wanted to cover some of the indexing subtleties for memory-optimized tables, with an accent on columnstore indexes

Let’s create a memory-optimized table:

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

Now, let’s attempt to create a NONCLUSTERED COLUMNSTORE INDEX:

ALTER TABLE dbo.InMemADD 
ADD INDEX NCCI_InMem NONCLUSTERED COLUMNSTORE (col1);

Msg 10794, Level 16, State 76, Line 76
The feature ‘NONCLUSTERED COLUMNSTORE’ is not supported with memory optimized tables.

It fails because we can only create a CLUSTERED columnstore index (CCI). For 25 years, Microsoft SQL Server differentiated between indexes that physically ordered data on storage (CLUSTERED) and those that did not (NONCLUSTERED). Unfortunately, they chose to ignore that pattern when creating the syntax for memory-optimized tables; using the word CLUSTERED is required when creating a columnstore index on memory-optimized tables.

Can we create a clustered columnstore index on a memory-optimized table that is defined as SCHEMA_ONLY?

Only one way to find out:

DROP TABLE IF EXISTS [dbo].[InMem];
GO

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

ALTER TABLE dbo.InMem ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Msg 35320, Level 16, State 1, Line 39
Column store indexes are not allowed on tables for which the durability option SCHEMA_ONLY is specified.

That won’t work, so let’s create our table with SCHEMA_AND_DATA:

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL PRIMARY KEY NONCLUSTERED
   ,[col1] [CHAR](100) NOT NULL,
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Now, let’s create a clustered columnstore index:

ALTER TABLE dbo.InMem ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Success! Let’s attempt to create a NONCLUSTERED index….

ALTER TABLE dbo.InMem ADD INDEX IX_Index1 (col1);

Msg 10794, Level 16, State 15, Line 117
The operation ‘ALTER TABLE’ is not supported with memory optimized tables that have a column store index.

Ooops – no can do. Once you add a clustered columnstore index to a memory-optimized table, the schema is totally locked down.

What about if we create the CCI and nonclustered index inline?

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX CCI_InMem CLUSTERED COLUMNSTORE
   ,INDEX IX_InMem1 (col1)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Awesome! We’ve proven that we can create both clustered columnstore and nonclustered indexes, but we must create them inline.

Now that we’ve got our indexes created, let’s try to add a column:

ALTER TABLE dbo.InMem ADD col2 INT NULL;

Msg 12349, Level 16, State 1, Line 68
Operation not supported for memory optimized tables having columnstore index.

Hey, when I said that the schema is locked down once you add a clustered columnstore index, I mean it!

What type of index maintenance is possible for indexes on memory-optimized tables?

For HASH indexes there is only one possible type of index maintenance, and that’s to modify/adjust the bucket count. There is zero index maintenance for RANGE/NONCLUSTERED indexes.

Let’s create a memory-optimized table with a HASH index, and verify the syntax for rebuilding the bucket count.

DROP TABLE IF EXISTS [dbo].[InMem];
GO
CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX IX_InMem1 HASH (col1) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Here’s the syntax for rebuilding the bucket count for a HASH INDEX:

ALTER TABLE dbo.InMem 
ALTER INDEX IX_InMem1 
REBUILD WITH(BUCKET_COUNT = 1001)
GO

We can add a column, as long as we don’t have a CCI in place:

ALTER TABLE dbo.InMem ADD col2 INT NULL;
GO

How about trying to rebuild the bucket count if we created the memory-optimized table with inline CCI and HASH indexes?

DROP TABLE IF EXISTS [dbo].[InMem];
GO

CREATE TABLE [dbo].[InMem]
(
    [PK] [INT] IDENTITY(1, 1) NOT NULL
   ,[col1] [CHAR](100) NOT NULL
   ,PRIMARY KEY NONCLUSTERED HASH ([PK]) WITH (BUCKET_COUNT = 1000)
   ,INDEX CCI_InMem CLUSTERED COLUMNSTORE
   ,INDEX IX_InMem1 HASH (col1) WITH (BUCKET_COUNT = 1000)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE dbo.InMem ALTER INDEX IX_InMem1 REBUILD WITH(BUCKET_COUNT = 1001)
GO

Msg 10794, Level 16, State 13, Line 136
The operation ‘ALTER TABLE’ is not supported with memory optimized tables that have a column store index.

You can’t rebuild that index if you also have a columnstore index on the table. We would have to drop the columnstore index, reconfigure the bucket count for the HASH index, and then recreate the columnstore index. Both the drop and the create of the columnstore index will be fully logged, and executed serially. Not a huge problem if the amount of data is not too large, but it’s a potentially much larger problem if you’ve got a lot of data.

We can create a clustered columnstore index on a #temp table (on-disk):

DROP TABLE IF EXISTS #tempgo
GO

CREATE TABLE #temp
(
    col1 INT NOT NULL
);
GO

-- succeeds
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_InMem ON #temp (col1);

--OR clustered, which also succeeds
CREATE CLUSTERED COLUMNSTORE INDEX NCCI_InMem ON #temp;

We can create multiple indexes with a single command:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId]      [INT]           IDENTITY NOT NULL
   ,[StoreID]      INT             NOT NULL
   ,[Notes1]       [VARCHAR](8000) NULL
   ,[ValidFrom]    [DATETIME2](7)  NOT NULL
   ,[ValidTo]      [DATETIME2](7)  NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE [InMemLOB]
 ADD 
 INDEX IX_1 (OrderId) 
,INDEX IX_2 (OrderId) 
,INDEX IX_3 (OrderId) 
,INDEX IX_4 (OrderId) 
,INDEX IX_5 (OrderId) 
,INDEX IX_6 (OrderId) 
,INDEX IX_7 (OrderId)

Can we create a columnstore index on a memory-optimized table variable?

DROP TYPE IF EXISTS dbo.typeTableMem
GO

CREATE TYPE dbo.typeTableMem AS TABLE
(
    [PK] INT NOT NULL PRIMARY KEY NONCLUSTERED
   ,[col1] NVARCHAR(255) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON);
GO

DECLARE @Table1 dbo.typeTableMem;
-- fails--Msg 102, Level 15, State 1--Incorrect syntax near '@Table1'.
ALTER TABLE @Table1 ADD INDEX CCI_InMem CLUSTERED COLUMNSTORE

Create a table that includes a LOB column with a MAX datatype, then add a clustered columnstore index:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[CustomerID] INT NOT NULL
   ,[OrderDate] [DATETIME] NOT NULL
   ,[DeliveryDate] DATETIME NULL
   ,[Amount] FLOAT NOT NULL
   ,[Notes] [NVARCHAR](MAX) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

ALTER TABLE dbo.[InMemLOB] 
ADD INDEX CCI_InMemLOB CLUSTERED COLUMNSTORE

Msg 35343, Level 16, State 1, Line 22    The statement failed. Column ‘Notes’ has a data type that cannot participate in a columnstore index. Omit column ‘Notes’.   

Msg 1750, Level 16, State 1, Line 22    Could not create constraint or index. See previous errors.

For memory-optimized tables, LOB columns prevent creation of a clustered columnstore index.

Now let’s try creating a table using CHAR(8000). Astute readers will notice that the following table would create rows that are 32,060 bytes wide – this would fail for on-disk tables, but is perfectly valid for memory-optimized tables:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[CustomerID] INT NOT NULL
   ,[OrderDate] [DATETIME] NOT NULL
   ,[DeliveryDate] DATETIME NULL
   ,[Amount] FLOAT
   ,[Notes1] [CHAR](8000) NULL
   ,[Notes2] [CHAR](8000) NULL
   ,[Notes3] [CHAR](8000) NULL
   ,[Notes4] [CHAR](8000) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

Msg 41833, Level 16, State 1, Line 29    Columnstore index ‘CCI_InMemLOB’ cannot be created, because table ‘InMemLOB’ has columns stored off-row.   
Columnstore indexes can only be created on memory-optimized table if the columns fit within the 8060 byte limit for in-row data.   
Reduce the size of the columns to fit within 8060 bytes.

Create a table with non-MAX LOB columns, but they are stored on-row,  then add a clustered columnstore index:

DROP TABLE IF EXISTS [dbo].[InMemLOB];
GO
CREATE TABLE [dbo].[InMemLOB]
(
    [OrderId] [INT] IDENTITY NOT NULL
   ,[StoreID] INT NOT NULL
   ,[Notes1] [CHAR](8000) NULL
   ,[ValidFrom] [DATETIME2](7) NOT NULL
   ,[ValidTo] [DATETIME2](7) NOT NULL
   ,CONSTRAINT [PK_InMemLOB_OrderID] PRIMARY KEY NONCLUSTERED (OrderId)
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);
GO

-- works
ALTER TABLE InMemLOB ADD INDEX IndOrders_StoreID (StoreID);
GO

-- works
ALTER TABLE dbo.[InMemLOB] ADD INDEX CCI_InMemLOB CLUSTERED COLUMNSTORE
GO

--Add data to table:
SET NOCOUNT ON
GO
INSERT dbo.[InMemLOB]
SELECT
    3 AS [StoreID]
   ,'ABC' AS [Notes1]
   ,GETDATE() AS [ValidFrom]
   ,GETDATE() AS [ValidTo]
GO 1000

SELECT *
FROM dbo.[InMemLOB]

Let’s create a natively compiled module that selects from this table:

DROP PROCEDURE IF EXISTS dbo.InMem_native_sp
GO
CREATE PROCEDURE dbo.InMem_native_sp
WITH NATIVE_COMPILATION, SCHEMABINDING, EXECUTE AS OWNER
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
    SELECT
        [OrderId]
       ,[StoreID]
       ,[Notes1]
       ,[ValidFrom]
       ,[ValidTo]
    FROM dbo.[InMemLOB];
END;
GO

ENABLE “Actual Plan” and SELECT – which index is used?

SELECT
    [OrderId]
   ,[StoreID]
   ,[Notes1]
   ,[ValidFrom]
   ,[ValidTo]
FROM dbo.[InMemLOB];

CCIPlan1

Now highlight the EXEC statement, and click “Estimated Plan” – which index is used?

EXEC dbo.InMem_native_sp

CCIPlan2

The SELECT statement uses the columnstore index, but the natively compiled procedure does not (that’s because natively compiled procedures ignore columnstore indexes).

Summing up

In this post, we’ve covered some of the finer points of indexing memory-optimized tables. Never know when they might come in handy….

SQL Server on Linux, Part 1

SQL 2017 is just about to be released, and one of the big ticket items is that SQL Server is now supported on the Linux platform.

In subsequent posts, I’ll be reporting on In-Memory OLTP on Linux, but first we’ll need to cover some Linux basics. I flirted with Unix ages ago, and I’ll be the first to admit that my brain doesn’t really work that way (perhaps no one’s brain does).

First, a note about environments – I usually like to work on a server in my home lab, because it has a lot of cores, 64GB of memory, and there’s no hourly cost for using it (and also because I built it….).

So I downloaded a copy of Ubuntu, CentOS, and a trial copy of Redhat Enterprise Linux, and attempted to install each one in my VMware Workstation environment. I spun my wheels for a few hours, and could not get any of them up and running in the way that I required. So, in the interest of saving time, I hit my Azure account, created a VM running Redhat, and proceeded to install SQL 2017 CTP2. Instructions for installing SQL 2017 on Linux can be found at this link. It should be noted that the installation varies by Linux distribution.

Those of us who don’t know Linux commands by heart, and are used to firing up GUI-based virtual machines, are in for a bit of a rude awakening. While it is possible to install GNOME on RHEL, you can’t simply RDP into the VM without a lot of Linux admin setup for xdp (I never did get it to work). So how do you connect to your Linux VM running SQL Server to do basic tasks? The answer is: PuTTY

PuTTY can be downloaded from this link, and after you install it on your client machine (your laptop or home workstation), connecting to your Azure VM is very easy. When you run PuTTY, you’re presented with the following window, and you can simply enter your IP address into the “Host Name (or IP address)” section, and click the “Open” button:

PUTTY

(you might receive a warning to confirm you want to connect).

Once you connect to the Azure VM, you are prompted for your user name and password, and after logging in to the VM, you arrive at the home directory for your login.

HomeDir

Once you’ve installed SQL Server according to the instructions at this link, you can use SSMS from your desktop to connect over the public internet, and manage your SQL Server environment. It’s a really good idea to limit the inbound connections for your VM to only your IP address, otherwise bots from all over the globe will attempt to hack your machine (you have been warned….).

Now that SQL Server is installed and running, we can attempt to connect, and create a database.

In SSMS, click connect, choose “Database Engine”, and when prompted, enter the user name and password. Make sure “SQL Server Authentication” is chosen, and not “Windows Authentication”.

The first thing I did was to determine where the system databases were stored, so I executed:

sp_helpdb master

master

I used the same path as the master database files to create a test database:

USE master
GO
CREATE DATABASE [TestDB]
ON PRIMARY
       (
           NAME = N’TestDBData’
          ,FILENAME = N’/var/opt/mssql/data/TestDB.mdf’
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N’TestDBLog’
       ,FILENAME = N’/var/opt/mssql/data/TestDB.ldf’
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

That worked fine, but what if we want to create a database in a separate folder?

Using PuTTY, we can create a folder using the mkdir command (xp_cmdshell is not currently supported for SQL Server running on Linux):

mkdir /var/opt/sqldata

mkdir1

Unfortunately, that didn’t go as planned! We don’t have permission to create that folder, so we’ll try using sudo (more on sudo at this link):

sudo mkdir /var/opt/sqldata

sudomkdir

sudo prompts you for your password, after which it will create the directory.

Now that the directory has been created, we can attempt to create a new database there.

USE master
GO
CREATE DATABASE [TestDB2]
ON PRIMARY
       (
           NAME = N’TestDB2Data’
          ,FILENAME = N’/var/opt/sqldata/TestDB2.mdf’
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N’TestDB2Log’
       ,FILENAME = N’/var/opt/sqldata/TestDB2.ldf’
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

error1

Still no luck – what could be the issue?

Let’s check the security context of the mssql service:

ps aux | grep mssql

mssql service

So, the sqlserver process executes under the mssql user account. Let’s check permissions in the sqldata directory with:

stat –format “%A” /var/opt/sqldata

On my VM, the results are:

rwxr-xr-x

Permissions for Linux files are separated into three sections:

  • owner
  • group (for the file or directory)
  • others

Each section can have the following attributes:

  • (r)ead
  • (w)rite
  • e(x)ecute

For more information on these attributes, please visit this link.

It’s easier to interpret the output if we break it up:

[rwx] [r-x] [r-x]

  • the directory owner has read, write, and execute permission
  • the directory group has read and execute permission
  • others have read and execute permission

When we create a directory, it’s owned by root. The problem with creating a database in this directory should be obvious: only the owner of the directory has write permission.

Let’s make the mssql user the owner of the sqldata directory:

chown mssql:mssql /var/opt/sqldata

chown
And finally, we’ll check the permissions for the sqldata folder:

final

Now let’s retry our CREATE DATABASE statement.

USE master
GO
CREATE DATABASE [TestDB2]
ON PRIMARY
       (
           NAME = N’TestDB2Data’
          ,FILENAME = N’/var/opt/sqldata/TestDB2.mdf’
          ,SIZE = 100MB
          ,MAXSIZE = UNLIMITED
          ,FILEGROWTH = 100MB
       )
LOG ON
    (
        NAME = N’TestDB2Log’
       ,FILENAME = N’/var/opt/sqldata/TestDB2.ldf’
       ,SIZE = 100MB
       ,MAXSIZE = 2048GB
       ,FILEGROWTH = 100MB
    );
GO

Voila! We successfully created a database in the intended folder.

Seasoned DBAs might be wondering about Instant File Initialization (IFI), a best practice on Windows that greatly increases the speed of creating or extending data files.

When IFI is not configured, data files must be zeroed when created or extended. Does Linux have something akin to IFI? The answer is…..IFI does not exist as a thing you can configure on the file systems that SQL on Linux supports (EXT4, available on all distributions, or XFS file system, available only on Redhat).

However, the good news is that on the Linux platform, data files are not initialized with zeros when created or extended – Linux takes care of this without any intervention from DBAs.

Anthony Nocentino (@centinosystems) just blogged about the internals of file initialization on the Linux platform in this post.

Using native compilation to insert parent/child tables

This blog post demonstrates various approaches when using native compilation to insert rows into parent/child tables.

First, let’s create tables named Parent and Child, and relate them with a FOREIGN KEY constraint. Note that the Parent table uses the IDENTITY property for the PRIMARY KEY column.

DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO

Next, we attempt to create a natively compiled procedure that performs an INSERT to the Parent table, and tries to reference the key value we just inserted, with @@IDENTITY.

Scenario 1

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    VALUES
    (
        'Parent1'
       ,'SomeDescription'
    )

    DECLARE @NewParentID INT
    SELECT @NewParentID  = SCOPE_IDENTITY()

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentID
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

 

Results4

This works, but there are other approaches to solving this problem.

Next, we’ll try to DECLARE a table variable, and OUTPUT the new key value.

Scenario 2

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID TABLE (ParentID INT NOT NULL)
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    /*
        Msg 12305, Level 16, State 24, Procedure Proc_InsertParentAndChild, Line 7 [Batch Start Line 64]
        Inline table variables are not supported with natively compiled modules.
    */
    
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    ) 
END
GO

But again we have issues with unsupported T-SQL.

Now we’ll try creating a memory-optimized table variable outside the native procedure, and then declare a variable of that type inside the native procedure.

Scenario 3

CREATE TYPE dbo.ID_Table AS TABLE
(
    ParentID INT NOT NULL PRIMARY KEY NONCLUSTERED
)
WITH (MEMORY_OPTIMIZED = ON)

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID dbo.ID_Table 
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    )

    DECLARE @NewParentValue INT = (SELECT ParentID FROM @NewParentID)

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

This compiles, so now let’s test it.

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

Results3
This works great, but for completeness, we should test other possibilities.

This time, we’ll recreate the tables, but we’ll leave off the IDENTITY property for the Parent table. Instead of IDENTITY, we’ll create a SEQUENCE, and attempt to generate the next value within the native module.

Scenario 4

DROP PROCEDURE IF EXISTS dbo.Proc_InsertParentAndChild  
go
DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT PRIMARY KEY NONCLUSTERED – no IDENTITY property used here!
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO


CREATE SEQUENCE dbo.ParentSequence AS INT

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NextParentSequence INT = NEXT VALUE FOR dbo.ParentSequence

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NextParentSequence
       ,'Parent1' 
       ,'SomeDescription' 
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NextParentSequence
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

/*
    Msg 10794, Level 16, State 72, Procedure Proc_InsertParentAndChild, Line 19 [Batch Start Line 176]
    The operator 'NEXT VALUE FOR' is not supported with natively compiled modules.
*/

But this fails, because as the error states, we can’t use NEXT VALUE FOR within native modules.

Scenario 5

How about if we generate the next value for the sequence outside the module, and pass that value?

Let’s see —

 

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
(
    @NewParentValue INT
)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NewParentValue
       ,'Parent1' -- Name - char(50)
       ,'SomeDescription' -- Description - char(100)
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

Results1

This also works, so we’ll add it to our arsenal. But there’s one weird thing – the value that was inserted into the Parent table is –2147483647, which is probably not what we intended. So we’ll have to tidy up our SEQUENCE a bit.

DROP SEQUENCE dbo.ParentSequence 
GO
CREATE SEQUENCE dbo.ParentSequence AS INT START WITH 1
GO
DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

Everything looks good now:

Results2

In this post, we have verified three different ways to successfully insert into parent/child records, when using native compilation.

SQL 2017 In-Memory roundup

SQL Server 2017 includes enhancements to many features, and some of those enhancements include In-Memory OLTP.

  • Microsoft states that ALTER TABLE for memory-optimized tables is now “usually substantially faster”. I asked for clarity about that – if it means that ALTER TABLE is faster for the same events that were able to be executed in parallel and minimally logged in SQL 2016, or if there are new ALTER TABLE statements which now execute in parallel. They replied that there is no change to the set of operations that executed in parallel. So the ALTER TABLE commands that executed fast now (theoretically) execute faster.
  • Up to and including SQL 2016, the maximum number of nonclustered indexes on a memory-optimized table was eight, but that limitation has been removed for SQL 2017. I’ve tested this with almost 300 indexes, and it worked. With this many supported indexes, it’s no wonder they had to….
  • Enhance the index rebuild performance for nonclustered indexes during database recovery. I confirmed with Microsoft that the database does not have be in SQL 2017 compatibility mode (140) to benefit from the index rebuild enhancement. This type of rebuild happens not only for database restore and failover, but also for other “recovery events” – see my blog post here.
  • In SQL 2017, memory-optimized tables now support JSON in native modules (functions, procedures and check constraints).
  • Computed columns, and indexes on computed columns are now supported
  • TSQL enhancements for natively compiled modules include CASE, CROSS APPLY, and TOP (N) WITH TIES
  • Transaction log redo of memory-optimized tables is now done in parallel. This has been the case for on-disk tables since SQL 2016, so it’s great that this potential bottleneck for REDO has been removed.
  • Memory-optimized filegroup files can now be stored on Azure Storage, and you can also backup and restore memory-optimized files on Azure Storage.
  • sp_spaceused is now supported for memory-optimized tables
  • And last but definitely not least,  drum roll, please…….we can now rename memory-optimized tables and natively compiled modules

While Microsoft continues to improve columnstore indexes for on-disk tables, unfortunately columnstore for memory-optimized tables gets left further and further behind. Case in point would be support for LOB columns for on-disk tables in SQL 2017, but no such support for memory-optimized tables. And my good friend Niko Neugebauer (b|t) just reminded me that computed columns for on-disk CCI are supported in SQL 2017, but they are not supported for in-memory CCI. For an in-depth comparison of columnstore differences between on-disk and memory-optimized tables, see my  post here.

In addition to what’s listed above, I tested the following functionality for natively compiled stored procedures:

My wish list for the In-Memory OLTP feature is huge, but it’s great to see Microsoft continually improve and extend it.

All about In-Memory isolation levels, Part 2

In the Part 1, we covered the basics of transaction initiation modes and isolation levels. Today we’ll continue with more details about isolation levels and initiation modes for memory-optimized tables, and finally we’ll see how to reference both types of tables in a query.

But first, let’s summarize supported isolation levels.

List 1:

OnDiskIsolation_thumb4

Last time, we had this for “List 2”:

InMemIsolation_thumb1

But that’s not the whole truth – the complete picture of isolation levels and initiation modes for memory-optimized tables is summarized in the following table:

InMemComplete

In Part 1, we said that READ COMMITED is supported for memory-optimized tables, but we didn’t explain how. Here we can see that it’s supported, but only for single statement, “autocommit” transactions. Autocommit transactions are not possible within a native module, so you’re limited to interpreted TSQL (un-compiled), as indicated in the table above.

Let’s work through an example.

If the transaction isolation level is set to READ COMMITED SNAPSHOT – which, as detailed in the last post, can only be set with an ALTER DATBASE command – then you can execute the following:

That’s a single statement that will be autocommitted.

But you cannot execute the following:

Why will it fail?

It will fail because the initiation mode of this transaction is not autocommit, which is required for READ COMMITED SNAPSHOT when referencing memory-optimized tables (the initiation mode is explicit, because we explicitly defined a transaction).  So to be totally clear, for queries that only reference memory-optimized tables, we can use the READ COMMITTED or READ COMMITTED SNAPSHOT isolation levels, but the transaction initiation mode must be autocommit. Keep this in mind, because in a moment, you’ll be questioning that statement….

Now let’s put it all together and understand the rules for referencing on-disk and memory-optimized tables in the same query.

Cross-Container

A query that references both on-disk and memory-optimized tables is known as a “cross-container” transaction.

The following table lists the rules:

Interop1

If the on-disk or database isolation level is READ UNCOMMITTED, READ COMMITTED, or READ COMMITTED SNAPSHOT, then you can reference memory-optimized tables using SNAPSHOT, REPEATABLE READ, or SERIALIZABLE isolation levels. An example would be:

But wait – a moment ago we proved that when we use the READ COMMITTED isolation level, and we query a memory-optimized table, the transaction initiation mode must be autocommit. The code we just displayed uses an explicit transaction to do the same thing, so we’ve got some explaining to do….

The answer is that for queries that only reference memory-optimized tables, we must use autocommit. But the rules are different for cross-container transactions, and in that case, we can use explicit transactions.

Back to SNAPSHOT

What if we converted some tables to be memory-optimized, and they were referenced everywhere in our TSQL code?

Would we have to change all of our code to use WITH (SNAPSHOT)?

Fear not, there is a solution, and it’s a database setting known as MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT. If this option is enabled, then you don’t have to change your code to use WITH (SNAPSHOT) for interop statements. The engine will automagically elevate the isolation level to SNAPSHOT for all memory-optimized tables that use interop/cross-container. More information on MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT is available at this link.

Just to recap what we covered last time about the different forms of snapshot isolation:

  • READ COMMITTED SNAPSHOT isolation is “statement-level consistency”
  • SNAPSHOT isolation is “transaction-level consistency”

A cross-container transaction doesn’t support snapshot isolation, because it’s actually two sub-transactions, each with its own internal transaction ID and start time. As a result, it’s impossible to synchronize transaction-level consistency between on-disk and memory-optimized tables.

Wrapping up

In the first post on transaction processing, we covered isolation levels for both on-disk and memory-optimized tables, but we left out some of the details for memory-optimized tables. This post has filled in those details, and also introduced the rules for cross-container transactions.

All about In-Memory isolation levels, Part 1

 

Transaction initiation modes

If you want to understand the details of transaction isolation requirements for memory-optimized tables, you must first understand transaction initiation modes. That’s because the initiation mode affects what type of isolation levels are possible when referencing memory-optimized tables.

There are four different modes that describe the way in which a transaction is initiated:

Atomic Block – An atomic block is a unit of work that occurs within a natively compiled module (procedure, function, or trigger). Native modules can only reference memory-optimized tables.

Explicit – We’re all familiar with this mode, because it requires defining an explicit beginning for the transaction, and then either a commit or rollback.

Implicit – We’ll cover this mode for the sake of completeness, but I’ve not seen an implicit transaction in all my years of SQL Server. Implicit transactions require you to SET IMPLICIT_TRANSACTIONS ON, which then  – for specific types of TSQL statements – has the effect of beginning a transaction for you. It’s only benefit is that it spares you from having to write your own BEGIN TRAN statement (woo hoo).

Documentation for implicit transactions can be found here.

Autocommit – If you execute TSQL statements outside of an explicit or implicit transaction, and outside of an atomic block, then for each individual statement, the SQL Server engine starts a transaction. That transaction is automatically committed or rolled back.

An example of an autocommit transaction would be:

We did not create an explicit transaction with BEGIN TRAN, and we didn’t SET IMPLICIT_TRANSACTIONS ON, which would have allowed the engine to implicitly start a transaction. Therefore, this TSQL statement will be automatically committed or rolled back by the engine.

Isolation levels

Now that we have a basic understanding of transaction initiation modes, let’s move on to isolation levels. Isolation levels are what determine whether certain “concurrency side effects” are allowed, such as dirty reads (uncommitted data), or phantom reads. Please refer to the SQL Server documentation on isolation levels at this link or this link for specific details.

First, let’s display which types of isolation levels are available for each type of table.

List 1:

OnDiskIsolation

“Snapshot”

One thing I want to clear up right way, is how freely the word “snapshot” is used in the SQL Server documentation, the technology world in general, and how confusing this label is in the context of transaction isolation levels.

Some editions of SQL Server have the ability to create database snapshots, which use NTFS sparse files to make a “copy on write”, read-only version of a database. This type of snapshot has absolutely nothing to do with isolation levels.

The word “snapshot” is also used to describe saving the state of a virtual machine, i.e. Hyper-V, VMware, etc.

And there are also SAN snapshots, which create an image of your storage at a fixed point in time. Again, none of these types of snapshots have anything to do with isolation levels in SQL Server.

There are two variations of snapshot isolation in SQL Server:

  • statement-level consistency – Within the context of a transaction, each statement sees consistent data as of the moment the statement executed. Other transactions can modify data while your transaction is executing, potentially affecting results.
  • transaction-level consistency – All data that is referenced within the context of a transaction is guaranteed to be consistent as of the transaction start time. While your transaction is executing, modifications by other transactions cannot be seen by any statement within your transaction. When you attempt to COMMIT there can be conflicts, but we won’t cover that in this post.

Statement-level consistency is known as “read committed snapshot isolation”, while transaction-level consistency is known as “snapshot isolation”. Both can be enabled at the database level, while only transaction-level consistency can be set with the SET TRANSACTION ISOLATION command.

OnDiskSnapshot

(wrapping your brain around variations of snapshot isolation will help you understand some of the nuances in the next post)

List 2*:

(*READ COMMITTED isolation is supported for memory-optimized tables, and we’ll cover that in the next post, but for now let’s concentrate on the isolations listed here)

InMemIsolation

If you are only querying on-disk tables, you can use any of the isolations levels from List 1. And if you are only querying memory-optimized tables, you can use any of the isolation levels from List 2.

But what if you want to reference both on-disk and memory-optimized tables in the same query? Of course, the answer is “it depends”, with transaction initiation modes and isolation levels being the components of that dependency.

As mentioned earlier, you can’t use native compilation to reference both on-disk and memory-optimized tables – instead you must use interpreted TSQL, otherwise known as “interop”. In the next post we’ll discuss the requirements for using interop to reference both on-disk and memory-optimized tables within a single query.

 

Using temporal memory-optimized tables

The temporal feature works for both on-disk and memory-optimized tables, but has a slightly different implementation for memory-optimized tables.

As changes are made to rows in the temporal memory-optimized table, before being transferred to the history table on disk, they are first migrated to an internal memory-optimized staging table. That means when you query the “history table”, you could be retrieving rows from both the on-disk history table, and internal staging table. Because no custom indexing was possible on the internal staging table, there could be performance implications when executing queries against historical data. Microsoft addressed these potential performance issues in SQL 2016 SP1 (detailed in this CAT blog post).

The internal staging table only gets flushed to the on-disk history table when it reaches 8% of the size of the temporal table. Given the current capacities of Windows Server 2016 (24TB memory), it’s now possible to have very large memory-optimized tables. 8% of one of those large memory-optimized tables could be quite large, which will affect query performance, if predicates don’t match available indexes.

As of SP1 you can address the performance issues by adding (multiple) indexes to the internal staging table, and while that’s a fantastic improvement, there are some things to be aware of:

  • The fix requires Trace Flag 10316, so it’s one more thing you’ll have to remember to configure in your environments.
  • The name of the internal staging table is dynamic. This means that the script you maintain to recreate indexes on the internal table must first determine the name of the internal staging table (the CAT post details how to do this). And you’ll have the same issue for failover, whether it’s FCI or AG.

Now imagine you have dozens (or multiple dozens) of memory-optimized tables that use the temporal feature, and you want to customize indexing differently on all of them. The current SP1 solution doesn’t seem like such a great fix when DBAs must maintain dozens of scripts to apply custom indexing upon server reboot or failover.

There is perhaps a simpler solution, and that would be to monitor the number of rows in the internal staging table, and flush it to disk once it hits a threshold. Don’t assume that you’ll want to flush at the same threshold for all tables. And this won’t solve the custom indexing issue, but it could make queries perform less-worse until they’re flushed to the history table, where presumably you already have custom indexing in place. But again, you’ll have the dynamic table name issue.

You’d have to create a SQL Agent job that checks the rowcount for internal staging tables, and then call sys.sp_xtp_flush_temporal_history if required.

Your script would have to be intelligent enough to determine which tables are memory-optimized, and whether or not a given memory-optimized table uses the temporal feature.

And when you add/remove the temporal feature for a memory-optimized table, you’ll have to remember to update the custom script and Agent job. And of course this custom script will have to be executed upon reboot and/or failover.

This is just one more thing to be aware of when you consider deploying the temporal feature with In-Memory OLTP.

DML for memory-optimized tables in partitioned views

As part of a conversation on the #sqlhelp, I was giving feedback to a member of the Microsoft Tiger team about how In-Memory OLTP might be improved. One of my responses was about implementing a hot/cold data solution, and  Adam Machanic (b|t) responded, suggesting that partitioned views might work. I had already tested partitioned views and memory-optimized tables for queries, and confirmed that it worked as expected, but I had never tested DML for this combination of features.

Based on my new testing, I can state with confidence that partitioned view support for DML with memory-optimized tables will properly handle INSERT and DELETE operations, but some UPDATE operations might fail, and that failure occurs in a most ungraceful way. After verifying that this is an issue on SQL 2016 SP1/CU3 and SQL 2017 CTP2.1, I filed this Connect item. Microsoft has confirmed this issue, but has not yet determined how it will be fixed. For example, they could decide to disallow all UPDATEs when a memory-optimized table  belongs to a partitioned view, or instead decide to support it under limited circumstances. I’ll update this post when I have more detail from Microsoft.

Let’s assume that we have two tables that belong to a partitioned view. Both tables can be memory-optimized, or one table can be memory-optimized, and the other on-disk.

Success condition

an UPDATE occurs to a row in a table, and the UPDATE does not change where the row would reside, i.e. does not cause it to “move” to another table, based on defined CONSTRAINTS

Failure conditions:

   a. UPDATE occurs to a row in the memory-optimized table that causes it to move to either another memory-optimized table, or a on-disk table

   b. UPDATE occurs to a row in the on-disk table that causes it to move to the memory-optimized table

The failure looks like this:

PartitionedViewError

Repro script: