Category Archives: In-Memory OLTP

Using native compilation to insert parent/child tables

This blog post demonstrates various approaches when using native compilation to insert rows into parent/child tables.

First, let’s create tables named Parent and Child, and relate them with a FOREIGN KEY constraint. Note that the Parent table uses the IDENTITY property for the PRIMARY KEY column.

DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO

Next, we attempt to create a natively compiled procedure that performs an INSERT to the Parent table, and tries to reference the key value we just inserted, with @@IDENTITY.

Scenario 1

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    VALUES
    (
        'Parent1'
       ,'SomeDescription'
    )

    DECLARE @NewParentID INT
    SELECT @NewParentID  = SCOPE_IDENTITY()

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentID
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This works, but there are other approaches to solving this problem.

Next, we’ll try to DECLARE a table variable, and OUTPUT the new key value.

Scenario 2

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID TABLE (ParentID INT NOT NULL)
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    /*
        Msg 12305, Level 16, State 24, Procedure Proc_InsertParentAndChild, Line 7 [Batch Start Line 64]
        Inline table variables are not supported with natively compiled modules.
    */
    
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    ) 
END
GO

But again we have issues with unsupported T-SQL.

Now we’ll try creating a memory-optimized table variable outside the native procedure, and then declare a variable of that type inside the native procedure.

Scenario 3

CREATE TYPE dbo.ID_Table AS TABLE
(
    ParentID INT NOT NULL PRIMARY KEY NONCLUSTERED
)
WITH (MEMORY_OPTIMIZED = ON)

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID dbo.ID_Table 
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    )

    DECLARE @NewParentValue INT = (SELECT ParentID FROM @NewParentID)

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

This compiles, so now let’s test it.

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This works great, but for completeness, we should test other possibilities.

This time, we’ll recreate the tables, but we’ll leave off the IDENTITY property for the Parent table. Instead of IDENTITY, we’ll create a SEQUENCE, and attempt to generate the next value within the native module.

Scenario 4

DROP PROCEDURE IF EXISTS dbo.Proc_InsertParentAndChild  
go
DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT PRIMARY KEY NONCLUSTERED – no IDENTITY property used here!
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO


CREATE SEQUENCE dbo.ParentSequence AS INT

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NextParentSequence INT = NEXT VALUE FOR dbo.ParentSequence

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NextParentSequence
       ,'Parent1' 
       ,'SomeDescription' 
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NextParentSequence
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

/*
    Msg 10794, Level 16, State 72, Procedure Proc_InsertParentAndChild, Line 19 [Batch Start Line 176]
    The operator 'NEXT VALUE FOR' is not supported with natively compiled modules.
*/

But this fails, because as the error states, we can’t use NEXT VALUE FOR within native modules.

Scenario 5

How about if we generate the next value for the sequence outside the module, and pass that value?

Let’s see —

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
(
    @NewParentValue INT
)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NewParentValue
       ,'Parent1' -- Name - char(50)
       ,'SomeDescription' -- Description - char(100)
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescriptioin' 
    )
END
GO

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This also works, so we’ll add it to our arsenal. But there’s one weird thing – the value that was inserted into the Parent table is –2147483647, which is probably not what we intended. So we’ll have to tidy up our SEQUENCE a bit.

DROP SEQUENCE dbo.ParentSequence 
GO
CREATE SEQUENCE dbo.ParentSequence AS INT START WITH 1
GO
DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

Everything looks good now:

In this post, we have verified three different ways to successfully insert into parent/child records, when using native compilation.

SQL 2017 In-Memory roundup

1 Reply

SQL Server 2017 includes enhancements to many features, and some of those enhancements include In-Memory OLTP.

Microsoft states that ALTER TABLE for memory-optimized tables is now “usually substantially faster”. I asked for clarity about that – if it means that ALTER TABLE is faster for the same events that were able to be executed in parallel and minimally logged in SQL 2016, or if there are new ALTER TABLE statements which now execute in parallel. They replied that there is no change to the set of operations that executed in parallel. So the ALTER TABLE commands that executed fast now (theoretically) execute faster.
Up to and including SQL 2016, the maximum number of nonclustered indexes on a memory-optimized table was eight, but that limitation has been removed for SQL 2017. I’ve tested this with almost 300 indexes, and it worked. With this many supported indexes, it’s no wonder they had to….
Enhance the index rebuild performance for nonclustered indexes during database recovery. I confirmed with Microsoft that the database does not have be in SQL 2017 compatibility mode (140) to benefit from the index rebuild enhancement. This type of rebuild happens not only for database restore and failover, but also for other “recovery events” – see my blog post here.
In SQL 2017, memory-optimized tables now support JSON in native modules (functions, procedures and check constraints).
Computed columns, and indexes on computed columns are now supported
TSQL enhancements for natively compiled modules include CASE, CROSS APPLY, and TOP (N) WITH TIES
Transaction log redo of memory-optimized tables is now done in parallel. This has been the case for on-disk tables since SQL 2016, so it’s great that this potential bottleneck for REDO has been removed.
Memory-optimized filegroup files can now be stored on Azure Storage, and you can also backup and restore memory-optimized files on Azure Storage.
sp_spaceused is now supported for memory-optimized tables
And last but definitely not least, drum roll, please…….we can now rename memory-optimized tables and natively compiled modules

While Microsoft continues to improve columnstore indexes for on-disk tables, unfortunately columnstore for memory-optimized tables gets left further and further behind. Case in point would be support for LOB columns for on-disk tables in SQL 2017, but no such support for memory-optimized tables. And my good friend Niko Neugebauer (b|t) just reminded me that computed columns for on-disk CCI are supported in SQL 2017, but they are not supported for in-memory CCI. For an in-depth comparison of columnstore differences between on-disk and memory-optimized tables, see my post here.

In addition to what’s listed above, I tested the following functionality for natively compiled stored procedures:

STRING_AGG()

This works, but you can’t use character functions, such as CHAR(13):

CREATE PROCEDURE dbo.Proc_VehicleRegistration
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT
        STRING_AGG(VehicleRegistration, CHAR(13)) AS csv – fails
    FROM Warehouse.VehicleTemperatures
    WHERE Warehouse.VehicleTemperatures.VehicleTemperatureID BETWEEN 65190 AND 65200
END;
GO

CONCAT_WS()

CREATE OR ALTER PROCEDURE dbo.Proc_VehicleTemperatures
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT CONCAT_WS( ' - ', VehicleTemperatureID, VehicleRegistration) AS DatabaseInfo
    FROM Warehouse.VehicleTemperatures

END;
GO

EXEC dbo.Proc_VehicleTemperatures
GO

TRIM()

CREATE OR ALTER PROCEDURE dbo.Proc_TrimTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRIM(VehicleRegistration) AS col1
    FROM Warehouse.VehicleTemperatures

END;
GO
EXEC dbo.Proc_TrimTest
GO

TRANSLATE()

CREATE OR ALTER PROCEDURE dbo.Proc_TranslateTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()');
END;
GO
EXEC dbo.Proc_TranslateTest
GO

sys.dm_db_stats_histogram()

CREATE STATISTICS stat_VehicleTemperatures ON Warehouse.VehicleTemperatures(VehicleRegistration)

SELECT s.object_id, OBJECT_NAME(s.object_id), hist.step_number, hist.range_high_key, hist.range_rows, 
    hist.equal_rows, hist.distinct_range_rows, hist.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.[object_id], s.stats_id) AS hist
WHERE OBJECT_NAME(s.object_id) = 'VehicleTemperatures'

My wish list for the In-Memory OLTP feature is huge, but it’s great to see Microsoft continually improve and extend it.

All about In-Memory isolation levels, Part 2

3 Replies

In the Part 1, we covered the basics of transaction initiation modes and isolation levels. Today we’ll continue with more details about isolation levels and initiation modes for memory-optimized tables, and finally we’ll see how to reference both types of tables in a query.

But first, let’s summarize supported isolation levels.

List 1:

Last time, we had this for “List 2”:

But that’s not the whole truth – the complete picture of isolation levels and initiation modes for memory-optimized tables is summarized in the following table:

In Part 1, we said that READ COMMITED is supported for memory-optimized tables, but we didn’t explain how. Here we can see that it’s supported, but only for single statement, “autocommit” transactions. Autocommit transactions are not possible within a native module, so you’re limited to interpreted TSQL (un-compiled), as indicated in the table above.

Let’s work through an example.

If the transaction isolation level is set to READ COMMITED SNAPSHOT – which, as detailed in the last post, can only be set with an ALTER DATBASE command – then you can execute the following:

SELECT * 
FROM dbo.InMemTable1

That’s a single statement that will be autocommitted.

But you cannot execute the following:

BEGIN TRAN

  SELECT * 
  FROM dbo.InMemTable1 

COMMIT

Why will it fail?

It will fail because the initiation mode of this transaction is not autocommit, which is required for READ COMMITED SNAPSHOT when referencing memory-optimized tables (the initiation mode is explicit, because we explicitly defined a transaction). So to be totally clear, for queries that only reference memory-optimized tables, we can use the READ COMMITTED or READ COMMITTED SNAPSHOT isolation levels, but the transaction initiation mode must be autocommit. Keep this in mind, because in a moment, you’ll be questioning that statement….

Now let’s put it all together and understand the rules for referencing on-disk and memory-optimized tables in the same query.

Cross-Container

A query that references both on-disk and memory-optimized tables is known as a “cross-container” transaction.

The following table lists the rules:

If the on-disk or database isolation level is READ UNCOMMITTED, READ COMMITTED, or READ COMMITTED SNAPSHOT, then you can reference memory-optimized tables using SNAPSHOT, REPEATABLE READ, or SERIALIZABLE isolation levels. An example would be:

BEGIN TRANSACTION

  SELECT *
  FROM dbo.OnDiskT1 (READCOMMITTED)
  INNER JOIN dbo.InMemT1 WITH (SNAPSHOT) ON InMemT1.pk = OnDiskT1.pk

ROLLBACK

But wait – a moment ago we proved that when we use the READ COMMITTED isolation level, and we query a memory-optimized table, the transaction initiation mode must be autocommit. The code we just displayed uses an explicit transaction to do the same thing, so we’ve got some explaining to do….

The answer is that for queries that only reference memory-optimized tables, we must use autocommit. But the rules are different for cross-container transactions, and in that case, we can use explicit transactions.

Back to SNAPSHOT

What if we converted some tables to be memory-optimized, and they were referenced everywhere in our TSQL code?

Would we have to change all of our code to use WITH (SNAPSHOT)?

Fear not, there is a solution, and it’s a database setting known as MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT. If this option is enabled, then you don’t have to change your code to use WITH (SNAPSHOT) for interop statements. The engine will automagically elevate the isolation level to SNAPSHOT for all memory-optimized tables that use interop/cross-container. More information on MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT is available at this link.

Just to recap what we covered last time about the different forms of snapshot isolation:

READ COMMITTED SNAPSHOT isolation is “statement-level consistency”
SNAPSHOT isolation is “transaction-level consistency”

A cross-container transaction doesn’t support snapshot isolation, because it’s actually two sub-transactions, each with its own internal transaction ID and start time. As a result, it’s impossible to synchronize transaction-level consistency between on-disk and memory-optimized tables.

Wrapping up

In the first post on transaction processing, we covered isolation levels for both on-disk and memory-optimized tables, but we left out some of the details for memory-optimized tables. This post has filled in those details, and also introduced the rules for cross-container transactions.

All about In-Memory isolation levels, Part 1

1 Reply

Transaction initiation modes

If you want to understand the details of transaction isolation requirements for memory-optimized tables, you must first understand transaction initiation modes. That’s because the initiation mode affects what type of isolation levels are possible when referencing memory-optimized tables.

There are four different modes that describe the way in which a transaction is initiated:

Atomic Block – An atomic block is a unit of work that occurs within a natively compiled module (procedure, function, or trigger). Native modules can only reference memory-optimized tables.

Explicit – We’re all familiar with this mode, because it requires defining an explicit beginning for the transaction, and then either a commit or rollback.

BEGIN TRANSACTION

IF 
    COMMIT
ELSE
    ROLLBACK

Implicit – We’ll cover this mode for the sake of completeness, but I’ve not seen an implicit transaction in all my years of SQL Server. Implicit transactions require you to SET IMPLICIT_TRANSACTIONS ON, which then – for specific types of TSQL statements – has the effect of beginning a transaction for you. It’s only benefit is that it spares you from having to write your own BEGIN TRAN statement (woo hoo).

Documentation for implicit transactions can be found here.

Autocommit – If you execute TSQL statements outside of an explicit or implicit transaction, and outside of an atomic block, then for each individual statement, the SQL Server engine starts a transaction. That transaction is automatically committed or rolled back.

An example of an autocommit transaction would be:

UPDATE dbo.MyTable
SET name = ‘Joseph’
WHERE TableID = 5

We did not create an explicit transaction with BEGIN TRAN, and we didn’t SET IMPLICIT_TRANSACTIONS ON, which would have allowed the engine to implicitly start a transaction. Therefore, this TSQL statement will be automatically committed or rolled back by the engine.

Isolation levels

Now that we have a basic understanding of transaction initiation modes, let’s move on to isolation levels. Isolation levels are what determine whether certain “concurrency side effects” are allowed, such as dirty reads (uncommitted data), or phantom reads. Please refer to the SQL Server documentation on isolation levels at this link or this link for specific details.

First, let’s display which types of isolation levels are available for each type of table.

List 1:

“Snapshot”

One thing I want to clear up right way, is how freely the word “snapshot” is used in the SQL Server documentation, the technology world in general, and how confusing this label is in the context of transaction isolation levels.

Some editions of SQL Server have the ability to create database snapshots, which use NTFS sparse files to make a “copy on write”, read-only version of a database. This type of snapshot has absolutely nothing to do with isolation levels.

The word “snapshot” is also used to describe saving the state of a virtual machine, i.e. Hyper-V, VMware, etc.

And there are also SAN snapshots, which create an image of your storage at a fixed point in time. Again, none of these types of snapshots have anything to do with isolation levels in SQL Server.

There are two variations of snapshot isolation in SQL Server:

statement-level consistency – Within the context of a transaction, each statement sees consistent data as of the moment the statement executed. Other transactions can modify data while your transaction is executing, potentially affecting results.
transaction-level consistency – All data that is referenced within the context of a transaction is guaranteed to be consistent as of the transaction start time. While your transaction is executing, modifications by other transactions cannot be seen by any statement within your transaction. When you attempt to COMMIT there can be conflicts, but we won’t cover that in this post.

Statement-level consistency is known as “read committed snapshot isolation”, while transaction-level consistency is known as “snapshot isolation”. Both can be enabled at the database level, while only transaction-level consistency can be set with the SET TRANSACTION ISOLATION command.

(wrapping your brain around variations of snapshot isolation will help you understand some of the nuances in the next post)

List 2*:

(*READ COMMITTED isolation is supported for memory-optimized tables, and we’ll cover that in the next post, but for now let’s concentrate on the isolations listed here)

If you are only querying on-disk tables, you can use any of the isolations levels from List 1. And if you are only querying memory-optimized tables, you can use any of the isolation levels from List 2.

But what if you want to reference both on-disk and memory-optimized tables in the same query? Of course, the answer is “it depends”, with transaction initiation modes and isolation levels being the components of that dependency.

As mentioned earlier, you can’t use native compilation to reference both on-disk and memory-optimized tables – instead you must use interpreted TSQL, otherwise known as “interop”. In the next post we’ll discuss the requirements for using interop to reference both on-disk and memory-optimized tables within a single query.

Using temporal memory-optimized tables

1 Reply

The temporal feature works for both on-disk and memory-optimized tables, but has a slightly different implementation for memory-optimized tables.

As changes are made to rows in the temporal memory-optimized table, before being transferred to the history table on disk, they are first migrated to an internal memory-optimized staging table. That means when you query the “history table”, you could be retrieving rows from both the on-disk history table, and internal staging table. Because no custom indexing was possible on the internal staging table, there could be performance implications when executing queries against historical data. Microsoft addressed these potential performance issues in SQL 2016 SP1 (detailed in this CAT blog post).

The internal staging table only gets flushed to the on-disk history table when it reaches 8% of the size of the temporal table. Given the current capacities of Windows Server 2016 (24TB memory), it’s now possible to have very large memory-optimized tables. 8% of one of those large memory-optimized tables could be quite large, which will affect query performance, if predicates don’t match available indexes.

As of SP1 you can address the performance issues by adding (multiple) indexes to the internal staging table, and while that’s a fantastic improvement, there are some things to be aware of:

The fix requires Trace Flag 10316, so it’s one more thing you’ll have to remember to configure in your environments.
The name of the internal staging table is dynamic. This means that the script you maintain to recreate indexes on the internal table must first determine the name of the internal staging table (the CAT post details how to do this). And you’ll have the same issue for failover, whether it’s FCI or AG.

Now imagine you have dozens (or multiple dozens) of memory-optimized tables that use the temporal feature, and you want to customize indexing differently on all of them. The current SP1 solution doesn’t seem like such a great fix when DBAs must maintain dozens of scripts to apply custom indexing upon server reboot or failover.

There is perhaps a simpler solution, and that would be to monitor the number of rows in the internal staging table, and flush it to disk once it hits a threshold. Don’t assume that you’ll want to flush at the same threshold for all tables. And this won’t solve the custom indexing issue, but it could make queries perform less-worse until they’re flushed to the history table, where presumably you already have custom indexing in place. But again, you’ll have the dynamic table name issue.

You’d have to create a SQL Agent job that checks the rowcount for internal staging tables, and then call sys.sp_xtp_flush_temporal_history if required.

Your script would have to be intelligent enough to determine which tables are memory-optimized, and whether or not a given memory-optimized table uses the temporal feature.

And when you add/remove the temporal feature for a memory-optimized table, you’ll have to remember to update the custom script and Agent job. And of course this custom script will have to be executed upon reboot and/or failover.

This is just one more thing to be aware of when you consider deploying the temporal feature with In-Memory OLTP.

DML for memory-optimized tables in partitioned views

4 Replies

As part of a conversation on the #sqlhelp, I was giving feedback to a member of the Microsoft Tiger team about how In-Memory OLTP might be improved. One of my responses was about implementing a hot/cold data solution, and Adam Machanic (b|t) responded, suggesting that partitioned views might work. I had already tested partitioned views and memory-optimized tables for queries, and confirmed that it worked as expected, but I had never tested DML for this combination of features.

Based on my new testing, I can state with confidence that partitioned view support for DML with memory-optimized tables will properly handle INSERT and DELETE operations, but some UPDATE operations might fail, and that failure occurs in a most ungraceful way. After verifying that this is an issue on SQL 2016 SP1/CU3 and SQL 2017 CTP2.1, I filed this Connect item. Microsoft has confirmed this issue, but has not yet determined how it will be fixed. For example, they could decide to disallow all UPDATEs when a memory-optimized table belongs to a partitioned view, or instead decide to support it under limited circumstances. I’ll update this post when I have more detail from Microsoft.

Let’s assume that we have two tables that belong to a partitioned view. Both tables can be memory-optimized, or one table can be memory-optimized, and the other on-disk.

Success condition

an UPDATE occurs to a row in a table, and the UPDATE does not change where the row would reside, i.e. does not cause it to “move” to another table, based on defined CONSTRAINTS

Failure conditions:

a. UPDATE occurs to a row in the memory-optimized table that causes it to move to either another memory-optimized table, or a on-disk table

b. UPDATE occurs to a row in the on-disk table that causes it to move to the memory-optimized table

The failure looks like this:

Repro script:

/*
	Repro for UPDATEs on a partitioned view
*/

SET NOCOUNT ON 

DROP VIEW IF EXISTS dbo.vwPartitioned;
GO
DROP TABLE IF EXISTS dbo.Table1;
GO
DROP TABLE IF EXISTS dbo.Table2;
GO

CREATE TABLE dbo.Table1
(
    TxDate DATE
   ,OrderID INT 
   CONSTRAINT PK_Table1 PRIMARY KEY NONCLUSTERED (TxDate, OrderID)
)
--WITH (DURABILITY = SCHEMA_AND_DATA, MEMORY_OPTIMIZED = ON);
GO
CREATE TABLE dbo.Table2
(
    TxDate DATE
   ,OrderID INT
   CONSTRAINT PK_Table2 PRIMARY KEY NONCLUSTERED (TxDate, OrderID)
)
WITH (DURABILITY = SCHEMA_AND_DATA, MEMORY_OPTIMIZED = ON);
GO

INSERT dbo.Table1
(
    TxDate
   ,OrderID
)
VALUES
 ('2014-01-01', 1)
,('2014-01-10', 2)
,('2014-03-15', 3)
,('2014-07-20', 4);

INSERT dbo.Table2
(
    TxDate
   ,OrderID
)
VALUES
 ('2015-01-07', 5)
,('2015-02-10', 6)
,('2015-03-18', 7)
,('2015-10-28', 8);
GO

CREATE VIEW dbo.vwPartitioned
WITH SCHEMABINDING AS
(
	SELECT
	    TxDate
	   ,OrderID
	FROM dbo.Table1
	UNION ALL
	SELECT
	    TxDate
	   ,OrderID
	FROM dbo.Table2
);
GO
SET STATISTICS IO ON;
GO

ALTER TABLE dbo.Table2 
ADD CONSTRAINT CC_Table2_TxDate CHECK (TxDate >= '2015-01-01');
GO
ALTER TABLE dbo.Table1 
ADD CONSTRAINT CC_Table1_TxDate CHECK (TxDate < '2015-01-01');
GO

ALTER VIEW dbo.vwPartitioned
WITH SCHEMABINDING AS
(
	SELECT
	    TxDate
	   ,OrderID
	FROM dbo.Table1
	UNION ALL
	SELECT
	    TxDate
	   ,OrderID
	FROM dbo.Table2
);
GO

INSERT dbo.vwPartitioned
(
    TxDate
   ,OrderID
)
VALUES
 ('2014-01-13', 91)
,('2015-01-08', 85);
GO
UPDATE dbo.vwPartitioned 
SET OrderID = 2
WHERE TxDate = '2014-01-13';
GO
DELETE dbo.vwPartitioned
WHERE TxDate = '2014-01-13';
GO
-- check location of rows
SELECT *
FROM dbo.Table2
ORDER BY OrderID
GO
SELECT *
FROM dbo.Table1
ORDER BY OrderID
GO

-- This UPDATE succeeds, because it does NOT attempt to "move" the row to another table
UPDATE dbo.vwPartitioned
SET OrderID = 9999
WHERE TxDate = '2015-01-07';
GO

-- This UPDATE fails with a stack dump, due to row "movement"
UPDATE dbo.vwPartitioned
SET TxDate = '2015-01-01'
WHERE TxDate = '2014-01-01';
GO

-- This UPDATE fails with a stack dump, due to row "movement"
UPDATE dbo.vwPartitioned
SET TxDate = '2014-01-01'
WHERE OrderID = 9999;
GO

-- check location of rows
SELECT *
FROM dbo.Table2
ORDER BY OrderID
GO
SELECT *
FROM dbo.Table1
ORDER BY OrderID

Which events cause durable memory-optimized data to be streamed to memory?

1 Reply

Those who have studied In-Memory OLTP are aware that in the event of “database restart”, durable memory-optimized data must be streamed from disk to memory. But that’s not the only time data must be streamed, and the complete set of events that cause this is not intuitive. To be clear, if your database had to stream data back to memory, that means all your memory-optimized data was cleared from memory. The amount of time it takes to do this depends on:

the amount of data that must be streamed
the number of indexes that must be rebuilt
the number of containers in the memory-optimized database, and how many volumes they’re spread across
how many indexes must be recreated (SQL 2017 has a much faster index rebuild process, see below)
the number of LOB columns
BUCKET count being properly configured for HASH indexes

The following list is for standalone SQL Server instances (under some circumstances, the streaming requirements are different between FCIs and AGs).

Database RESTORE – this is a no brainer – if you restore a database with durable memory-optimized data, of course your data must be streamed from disk into memory. And if you are under the impression that SQL Server verifies if your server has enough memory to complete the RESTORE, you would be mistaken. See my post here.

SQL Service restart – in this case, all databases will go through the recovery process, and all memory-optimized databases will stream durable memory-optimized data to memory.

Server reboot – same as “SQL Service restart”

In addition to the list above, there are a few database settings that cause data to be streamed.

Changing a database from READ_ONLY to READ_WRITE, or from READ_WRITE to READ_ONLY
Setting READ_COMMITTED_SNAPSHOT OFF or ON
Taking a database OFFLINE and then ONLINE

A database that contains durable memory-optimized data will not be online until all memory-optimized data is finished streaming, which affects the availability of “traditional” tables (while a database is waiting for streaming to complete, the wait type is “WAIT_XTP_RECOVERY”). There’s nothing you can do to speed up the process, other than having initially defined enough containers on enough volumes, so that streaming executes in parallel.

SQL 2017 enhancements

Because modifications to HASH and NONCLUSTERED/RANGE indexes are not logged for memory-optimized tables, they must be rebuilt when data is streamed to memory. Both SQL 2014 and SQL 2016 have a limit of 8 NONCLUSTERED indexes per memory-optimized table (any combination of HASH and RANGE). Microsoft has designed a new process for enhancing index rebuild speed in SQL 2017. This dovetails perfectly with the removal of the 8-indexes-per-table limit in SQL 2017 (I have personally created a table with 298 NONCLUSTERED indexes in SQL 2017).

Migrating tables to In-Memory OLTP

6 Replies

One of the first things you might do when considering migrating a table to In-Memory OLTP, is to run the “Transaction Performance Analysis Overview” report:

Then you arrive at the following window:

If you click on “Tables Analysis”, a sort of “magic quadrant” appears:

The chart attempts to display both the best candidates and the degree of difficulty for migration. But there are a few problems with the “difficulty” rating of this internal query, and as a result, if we really want to know how easy/difficult the process might be, we’ll have to roll our own query.

The original query counts the following conditions (the list is out of order on purpose):

nullable columns
LOB data types, i.e. NVARCHAR(MAX)
unique indexes/constraints
default constraints
check constraints
UDT
foreign keys
triggers
index with type > 2
computed columns
sparse columns
IDENTITY <> 1, 1
assembly
unsupported data types, i.e. DATETIMEOFFSET
filestream columns

Some of those conditions are 100% supported in SQL 2016, without any issue, while others have variable levels of migration difficulty, and still others are not supported at all. But even if we remove the items that are completely supported, all of the remaining items have the same weight. That could be pretty misleading, and might cause you to rule out migrating a table to In-Memory that could potentially be an excellent candidate.

Now let’s look at the list in another way:

**************************
supported without any issues
**************************
nullable columns
LOB data types, i.e NVARCHAR(MAX)
unique indexes/constraints

**********************************************************************
supported with a range of migration difficulty (from no difficulty to very difficult)
**********************************************************************
default constraints
check constraints
UDT
foreign keys
triggers

index with type > 2
0 = Heap
1 = Clustered
2 = Nonclustered
3 = XML
4 = Spatial
5 = Clustered columnstore index
6 = Nonclustered columnstore index
7 = Nonclustered hash index

********************
unsupported/blocking
********************
computed columns
sparse columns
IDENTITY <> 1, 1
assembly
unsupported data types, i.e. DATETIMEOFFSET
filestream columns

My version of the script removes the checks for nullable and LOB columns, and also for UNIQUE indexes/constraints.

And for the remaining conditions, since they’re all weighted the same by virtue of counting them, I wanted to place them in different buckets. After running my script on a sample database, I can see that the AuditTrail table has the following potential migration “issues”:

There are a total of 8 issues, although migrating default constraints, user-defined data types, and LOB columns will be easy. It’s the foreign keys that might prove difficult, potentially leading to a long chain of tables that would have to be migrated (because foreign key constraints on a memory-optimized table can only reference other memory-optimized tables).

We definitely have a much clearer picture of the potential migration difficulties. With this version of the script, you can make informed choices about which tables you might want to migrate to In-Memory OLTP.

Also note that computed columns are supported in SQL 2017, so this script could have some intelligence added to allow for that.

/*
    Ned Otter
    www.NedOtter.com
    21 June 2017
        
    THIS CODE AND INFORMATION ARE PROVIDED "AS IS" WITHOUT WARRANTY OF 
    ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED 
    TO THE IMPLIED WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A
    PARTICULAR PURPOSE. 

    In-Memory OLTP can improve workload performance for highly concurrent, write-intensive workloads.
    This script attempts to determine the locking/latching characteristics, and migration difficulty.

    This script is my version of the code that's generated when you run the
    "Transaction Performance Analysis Overview Report" for tables. In its original form, 
    that script counts the number of conditions that might affect migration, giving them all equal weight. 
    But migrating a user-defined-datatype will be very easy, and migrating foreign keys could be very difficult. 
    The second result set breaks the issues into separate buckets, so you can more easily decide whether or not a table 
    is a good candidate for migration. 

    It does not attempt to determine anything about tempdb contention, which In-Memory can also help with, 
    and it does not attempt to determine which objects might benefit from Native Compilation.


    Instructions:

    Step 1: 
        Execute the first query, so you can determine which tables have high levels of latch/lock activity. 
        NOTE: The results of the first query will be meaningless, unless you execute it in production, or on a test 
        system that has recent production-like activity. 

    Step 2: 
        Based on the output of the first query, adjust the @table_id for the second query.
*/


/*
    Question 1: How much locking/latching does this table have?
*/

/***************************************************
    Performance statistics, specifically latch/lock
***************************************************/
DECLARE @table_id INT --= OBJECT_ID('dbo.YourTable');

DECLARE @sum_table TABLE
(
    database_id              INT NOT NULL
   ,page_latch_lock_wait_sum INT NULL
);

INSERT INTO @sum_table
SELECT i.database_id
      ,ISNULL(NULLIF(SUM(i.page_latch_wait_in_ms) + SUM(i.page_lock_wait_in_ms), 0), 1) AS page_latch_lock_wait_sum
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) i
WHERE i.object_id IN (SELECT object_id FROM sys.tables)
   OR @table_id IS NULL
GROUP BY i.database_id;

SELECT SCHEMA_NAME(t.schema_id) AS schema_name
      ,t.name AS table_name
      ,SUM(i.page_latch_wait_count) AS page_latch_wait_count
      ,SUM(i.page_lock_count) AS page_lock_count
      ,SUM(i.page_lock_wait_count) AS page_lock_wait_count
      ,SUM(ROUND(CAST(i.page_latch_wait_in_ms AS FLOAT) / CAST((CASE i.page_latch_wait_count WHEN 0 THEN 1 ELSE i.page_latch_wait_count END) AS FLOAT), 2)) AS avg_time_per_page_latch_wait
      ,SUM(ROUND(CAST(i.page_lock_wait_in_ms AS FLOAT) / CAST((CASE i.page_lock_wait_count WHEN 0 THEN 1 ELSE i.page_lock_wait_count END) AS FLOAT), 2)) AS avg_time_per_page_lock_wait
      ,SUM(CAST(ROUND((page_latch_wait_in_ms + page_lock_wait_in_ms) * 100.0 / st.page_latch_lock_wait_sum, 2) AS DECIMAL(5, 2))) AS percentage_of_wait_for_entire_db
FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) i
INNER JOIN sys.tables t ON t.object_id = i.object_id
INNER JOIN sys.indexes ix ON ix.object_id = i.object_id
                   AND ix.index_id = i.index_id
INNER JOIN @sum_table st ON st.database_id = i.database_id
WHERE t.object_id = @table_id
   OR @table_id IS NULL
GROUP BY t.schema_id
        ,t.object_id
        ,t.name
ORDER BY page_latch_wait_count DESC;
GO

/*
Evaluation 

Question 2: How difficult will the migration be?

    ******************************
    supported without any issues 
    ******************************
        nullable columns
        LOB data types, i.e NVARCHAR(MAX)
        unique indexes/constraints

    ******************************
    supported with varying degrees of migration difficulty
    ******************************
        default constraints           -- easy
        check constraints           -- easy
        UDT                           -- easy

        foreign keys               -- potentially difficult
        triggers                   -- potentially difficult/impossible

        index with type > 2        -- might not be supported, i.e. XML, Spatial
            0 = Heap               
            1 = Clustered
            2 = Nonclustered    
            3 = XML
            4 = Spatial
            5 = Clustered columnstore index. Applies to: SQL Server 2014 through SQL Server 2017.
            6 = Nonclustered columnstore index. Applies to: SQL Server 2012 through SQL Server 2017.
            7 = Nonclustered hash index. Applies to: SQL Server 2014 through SQL Server 2017.                   

    ******************************
    unsupported/blocking
    ******************************
        computed columns (SQL 2017+ supports computed 
        sparse columns
        IDENTITY <> 1, 1
        assembly
        unsupported data types, i.e. DATETIMEOFFSET
        filestream columns

*/

DECLARE @table_id INT --= OBJECT_ID('dbo.YourTable');

DECLARE @MemoryOptimizedTables TABLE
(
    object_id INT NOT NULL
);

INSERT INTO @MemoryOptimizedTables
SELECT [object_id]
FROM sys.tables
WHERE [is_memory_optimized] = 1;

DECLARE @Tables TABLE
(
    [table_id]                 INT NOT NULL
   ,[CheckConstraintCount]     INT NULL
   ,[DefaultConstraintCount]   INT NULL
   ,[ForeignKeyCount]          INT NULL
   ,[ProblemIndexCount]        INT NULL
   ,[ComputedColumnCount]      INT NULL
   ,[IdentityViolations]       INT NULL
   ,[UnsupportedDataTypeCount] INT NULL
   ,[FilestreamColumnCount]    INT NULL
   ,[LOBColumnCount]           INT NULL
   ,[AssemblyTypeCount]        INT NULL
   ,[UDTColumnCount]           INT NULL
   ,[SparseColumnCount]        INT NULL
   ,[TriggerCount]             INT NULL
);

INSERT @Tables
(
    table_id
)
SELECT [object_id]
FROM sys.tables
WHERE [is_memory_optimized] = 0
  AND @table_id IS NULL OR [object_id] = @table_id;

DECLARE @DefaultContraintTables TABLE
(
    parent_object_id INT NOT NULL
);

INSERT @DefaultContraintTables
SELECT [parent_object_id]
FROM sys.default_constraints;

WITH CheckConstraints
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.check_constraints t
    JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET [CheckConstraintCount] = CheckConstraints.CountRows
FROM @Tables AS Tables
INNER JOIN CheckConstraints ON CheckConstraints.table_id = Tables.table_id;

WITH DefaultConstraints
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.default_constraints t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT * FROM @DefaultContraintTables)
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET DefaultConstraintCount = DefaultConstraints.CountRows
FROM @Tables AS Tables
INNER JOIN DefaultConstraints ON DefaultConstraints.table_id = Tables.table_id;

WITH ForeignKeys
AS
(
    SELECT t.parent_object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.foreign_keys t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.parent_object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.parent_object_id, DB_ID()) IS NOT NULL
      AND t.parent_object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.parent_object_id
)
UPDATE @Tables
SET ForeignKeyCount = ForeignKeys.CountRows
FROM @Tables AS Tables
INNER JOIN ForeignKeys ON ForeignKeys.table_id = Tables.table_id

/*
    Type of index:
    0 = Heap
    1 = Clustered
    2 = Nonclustered
    3 = XML
    4 = Spatial
    5 = Clustered columnstore index. Applies to: SQL Server 2014 through SQL Server 2017.
    6 = Nonclustered columnstore index. Applies to: SQL Server 2012 through SQL Server 2017.
    7 = Nonclustered hash index. Applies to: SQL Server 2014 through SQL Server 2017.
*/
;

WITH Indexes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(tbl.object_id) AS CountRows
    FROM sys.indexes t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    WHERE tbl.is_ms_shipped = 0
      AND OBJECT_NAME(t.object_id, DB_ID()) IS NOT NULL
      --AND (((t.is_unique = 1 OR t.is_unique_constraint = 1) AND t.is_primary_key = 0) OR 
      --AND t.type > 2
      AND t.type NOT IN (0, 1, 2, 5) -- clustered columnstore is supported for in-memory, but it's very different. 
      AND t.object_id NOT IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET ProblemIndexCount = Indexes.CountRows
FROM @Tables AS Tables
INNER JOIN Indexes ON Indexes.table_id = Tables.table_id;

-- nullable columns are supported....
--  SELECT 
--t.object_id AS [table_id], 
--      COUNT(*) AS CountRows
--  FROM sys.index_columns ic
--  JOIN sys.tables t ON t.object_id = ic.object_id
--  JOIN sys.all_columns c ON ic.column_id = c.column_id AND t.object_id = c.object_id
--  WHERE t.is_ms_shipped = 0 AND c.is_nullable = 1
--  UNION

WITH Triggers
AS
(
    SELECT tb.object_id AS [table_id]
          ,COUNT(tr.object_id) AS CountRows
    FROM sys.triggers tr
    INNER JOIN sys.tables tb ON tr.parent_id = tb.object_id
    WHERE tr.is_ms_shipped = 0
      AND tr.parent_class = 1
      AND tb.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY tb.object_id
)
UPDATE @Tables
SET TriggerCount = Triggers.CountRows
FROM @Tables AS Tables
INNER JOIN Triggers ON Triggers.table_id = Tables.table_id;


/*
    NOTE:
        If migrating to SQL 2017, computed columns are supported.
*/

WITH ComputedColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND (t.is_computed = 1)
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET ComputedColumnCount = ComputedColumns.CountRows
FROM @Tables AS Tables
INNER JOIN ComputedColumns ON ComputedColumns.table_id = Tables.table_id;

WITH SparseColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND (t.is_sparse = 1)
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET SparseColumnCount = SparseColumns.CountRows
FROM @Tables AS Tables
INNER JOIN SparseColumns ON SparseColumns.table_id = Tables.table_id;

WITH IdentityViolation
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(*) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    LEFT JOIN sys.identity_columns ic ON t.object_id = ic.object_id
                                     AND t.column_id = ic.column_id
    WHERE tbl.is_ms_shipped = 0
      AND 
      (
           t.is_identity = 1
      AND (ic.increment_value != 1 OR ic.seed_value != 1)
      )
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET IdentityViolations = IdentityViolation.CountRows
FROM @Tables AS Tables
INNER JOIN IdentityViolation ON IdentityViolation.table_id = Tables.table_id;

WITH UDTColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.user_type_id != t.system_type_id
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE
    @Tables
SET
    UDTColumnCount = UDTColumns.CountRows
FROM @Tables AS Tables
INNER JOIN UDTColumns ON UDTColumns.table_id = Tables.table_id;

WITH AssemblyTypes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND tp.is_assembly_type = 1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET AssemblyTypeCount = AssemblyTypes.CountRows
FROM @Tables AS Tables
INNER JOIN AssemblyTypes ON AssemblyTypes.table_id = Tables.table_id;

WITH UnsupportedDataTypes
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND UPPER(tp.name) IN ( 'DATETIMEOFFSET', 'GEOGRAPHY', 'GEOMETRY', 'SQL_VARIANT', 'HIERARCHYID', 'XML', 'IMAGE', 'XML', 'TEXT', 'NTEXT', 'TIMESTAMP' )
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET [UnsupportedDataTypeCount] = UnsupportedDataTypes.CountRows
FROM @Tables AS Tables
INNER JOIN UnsupportedDataTypes ON UnsupportedDataTypes.table_id = Tables.table_id;

WITH FileStream
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.is_filestream = 1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET FilestreamColumnCount = FileStream.CountRows
FROM @Tables AS Tables
INNER JOIN FileStream ON FileStream.table_id = Tables.table_id;

WITH LOBColumns
AS
(
    SELECT t.object_id AS [table_id]
          ,COUNT(t.object_id) AS CountRows
    FROM sys.all_columns t
    INNER JOIN sys.tables tbl ON tbl.object_id = t.object_id
    INNER JOIN sys.types tp ON t.user_type_id = tp.user_type_id
    WHERE tbl.is_ms_shipped = 0
      AND t.max_length = -1
      AND t.object_id IN (SELECT table_id FROM @Tables)
    GROUP BY t.object_id
)
UPDATE @Tables
SET LOBColumnCount = LOBColumns.CountRows
FROM @Tables AS Tables
INNER JOIN LOBColumns ON LOBColumns.table_id = Tables.table_id;

SELECT
    OBJECT_NAME(table_id) AS TableName
   ,IIF(CheckConstraintCount IS NULL, 0, CheckConstraintCount) + 
    IIF(DefaultConstraintCount IS NULL, 0, DefaultConstraintCount) + 
    IIF(ForeignKeyCount IS NULL, 0, ForeignKeyCount) + 
    IIF(ProblemIndexCount IS NULL, 0, ProblemIndexCount) + 
    IIF(ComputedColumnCount IS NULL, 0, ComputedColumnCount) + 
    IIF(IdentityViolations IS NULL, 0, IdentityViolations) + 
    IIF(UnsupportedDataTypeCount IS NULL, 0, UnsupportedDataTypeCount) + 
    IIF(LOBColumnCount IS NULL, 0, LOBColumnCount) + 
    IIF(AssemblyTypeCount IS NULL, 0, AssemblyTypeCount) + 
    IIF(UDTColumnCount IS NULL, 0, UDTColumnCount) + 
    IIF(SparseColumnCount IS NULL, 0, SparseColumnCount) + 
    IIF(TriggerCount IS NULL, 0, TriggerCount) AS IssueCount
   ,CheckConstraintCount
   ,DefaultConstraintCount
   ,ForeignKeyCount
   ,UDTColumnCount
   ,LOBColumnCount
   ,ProblemIndexCount
   ,ComputedColumnCount
   ,IdentityViolations
   ,UnsupportedDataTypeCount
   ,FilestreamColumnCount
   ,AssemblyTypeCount
   ,SparseColumnCount
   ,TriggerCount
FROM @Tables
--WHERE OBJECT_NAME(table_id) = 'YourTable'
ORDER BY IssueCount DESC;

SQL Server specialization: the journey

Availability Groups and Native Compilation

3 Replies

For disk-based tables, query plans for interpreted/traditional stored procedures will be recompiled when statistics have changed. That’s because when you update statistics, cached query plans for interpreted stored procedures are invalidated, and will automatically recompile the next time they’re executed. That’s true for an interpreted stored procedure that references disk-based tables, and/or memory-optimized tables.

As of SQL 2016, the database engine automatically updates statistics for memory-optimized tables (documentation here), but recompilation of native modules must still be performed manually. But hey, that’s way better than SQL 2014, when you couldn’t recompile at all; you had to drop/recreate the native module. And natively compiled stored procedures don’t reside in the plan cache, because they are executed directly by the database engine.

This post attempts to determine if the requirement to manually recompile native modules is any different for AG secondary replicas.

Stats on the primary

Statistics that are updated on the primary replica will eventually make their way to all secondary replicas. This blog post by Sunil Agarwal details what happens on the secondary replica if the statistics are stale (relative to any temporary statistics that were created on the secondary).

How do we…?

The first question we must answer is: how do you determine when the last time a natively compiled stored procedure was compiled?

We can do that by checking the value of the cached_time column from the following query:

SELECT *
FROM sys.dm_exec_procedure_stats
WHERE OBJECT_NAME(object_id) = '<YourModule>'

The query is simple, but you won’t get any results unless you enable the collection of stored procedure execution statistics for natively compiled procedures. Execution statistics can be collected at the object level or instance level.

NOTE: Enabling the collection of stored procedure statistics for natively compiled procedures can crush your server, potentially resulting in disastrous performance impact. You must be extremely careful with this method of troubleshooting.

Once you’ve enabled stats collection for native procedures, you should get results from the query above.

How I tested

Here are the steps I executed, after creating an AG that used synchronous mode (containing a single database with a memory-optimized filegroup):

Create a sample table
Insert some rows
Create a natively compiled procedure that selects from the sample table
Execute the native procedure on the primary and secondary (it must be executed at least once in order to have usage stats collected)
Enable collection of stored procedure execution statistics on the primary and secondary replicas
Again execute the native procedure on the primary and secondary
Note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary
Recompile the native procedure on the primary
Execute the native procedure on the primary and secondary
Again note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary

Results

The cached_time value on the secondary did not get updated when the native module was recompiled on the primary.

What does this mean for DBAs that are responsible for maintaining AGs that use native compilation? It means that when you recompile native modules on the primary replica (which you would always do after updating statistics on the primary), those modules must be recompiled on all secondary replicas. The recompilation on the secondary can be performed manually or perhaps through some automated mechanism. For example, if you have a SQL Agent job on the primary replica to update statistics, one of the job steps might be for marking all natively compiled stored procedures on the secondary for recompilation, using sp_recompile.

How would that job step handle the recompile for all secondary replicas?

Perhaps after defining linked servers, you could do something like:

EXEC SecondaryServer1.msdb.dbo.sp_start_job @job_name = N’Recompile native procs’;

EXEC SecondaryServer2.msdb.dbo.sp_start_job @job_name = N’Recompile native procs’;

But it might be involved to define this for all secondary replicas – it sounds like a topic for another post…..

Happy recompiling –

Ned Otter Blog

SQL Server DBA and Musician

Category Archives: In-Memory OLTP

Using native compilation to insert parent/child tables

SQL 2017 In-Memory roundup

All about In-Memory isolation levels, Part 2

Cross-Container

Back to SNAPSHOT

Wrapping up

All about In-Memory isolation levels, Part 1

Transaction initiation modes

Isolation levels

Using temporal memory-optimized tables

DML for memory-optimized tables in partitioned views

Which events cause durable memory-optimized data to be streamed to memory?

Migrating tables to In-Memory OLTP

SQL Server specialization: the journey

Availability Groups and Native Compilation