Category Archives: Native compilation

Using native compilation to insert parent/child tables

This blog post demonstrates various approaches when using native compilation to insert rows into parent/child tables.

First, let’s create tables named Parent and Child, and relate them with a FOREIGN KEY constraint. Note that the Parent table uses the IDENTITY property for the PRIMARY KEY column.

DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO

Next, we create a natively compiled procedure that performs an INSERT to the Parent table, and references the key value we just inserted with SCOPE_IDENTITY().

Scenario 1

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    VALUES
    (
        'Parent1'
       ,'SomeDescription'
    )

    DECLARE @NewParentID INT
    SELECT @NewParentID  = SCOPE_IDENTITY()

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentID
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This works, but there are other approaches to solving this problem.

Next, we’ll try to DECLARE a table variable, and OUTPUT the new key value.

Scenario 2

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID TABLE (ParentID INT NOT NULL)
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    /*
        Msg 12305, Level 16, State 24, Procedure Proc_InsertParentAndChild, Line 7 [Batch Start Line 64]
        Inline table variables are not supported with natively compiled modules.
    */
    
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    ) 
END
GO

But here we run into unsupported T-SQL: as the error states, inline table variable declarations are not allowed in natively compiled modules.

Now we’ll try creating a memory-optimized table type outside the native procedure, and then declare a variable of that type inside the native procedure.

Scenario 3

CREATE TYPE dbo.ID_Table AS TABLE
(
    ParentID INT NOT NULL PRIMARY KEY NONCLUSTERED
)
WITH (MEMORY_OPTIMIZED = ON)

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NewParentID dbo.ID_Table 
    INSERT dbo.Parent
    (
        Name
       ,Description
    )
    OUTPUT Inserted.ParentID INTO @NewParentID
    VALUES
    (
        'Parent1' 
       ,'SomeDescription' 
    )

    DECLARE @NewParentValue INT = (SELECT ParentID FROM @NewParentID)

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

This compiles, so now let’s test it.

EXEC dbo.Proc_InsertParentAndChild

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This works great, but for completeness, we should test other possibilities.

This time, we’ll recreate the tables, but we’ll leave off the IDENTITY property for the Parent table. Instead of IDENTITY, we’ll create a SEQUENCE, and attempt to generate the next value within the native module.

Scenario 4

DROP PROCEDURE IF EXISTS dbo.Proc_InsertParentAndChild  
GO
DROP TABLE IF EXISTS dbo.Child
GO
DROP TABLE IF EXISTS dbo.Parent
GO

CREATE TABLE dbo.Parent
(
     ParentID INT PRIMARY KEY NONCLUSTERED -- no IDENTITY property used here!
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
CREATE TABLE dbo.Child
(
     ChildID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,ParentID INT NOT NULL FOREIGN KEY REFERENCES dbo.Parent (ParentID) INDEX IX_Child_ParentID 
    ,Name CHAR(50) NOT NULL
    ,Description CHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO


CREATE SEQUENCE dbo.ParentSequence AS INT

GO

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    DECLARE @NextParentSequence INT = NEXT VALUE FOR dbo.ParentSequence

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NextParentSequence
       ,'Parent1' 
       ,'SomeDescription' 
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NextParentSequence
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

/*
    Msg 10794, Level 16, State 72, Procedure Proc_InsertParentAndChild, Line 19 [Batch Start Line 176]
    The operator 'NEXT VALUE FOR' is not supported with natively compiled modules.
*/

But this fails, because as the error states, we can’t use NEXT VALUE FOR within native modules.

Scenario 5

How about if we generate the next value for the sequence outside the module, and pass that value?

Let’s see:

CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild  
(
    @NewParentValue INT
)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')  

    INSERT dbo.Parent
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
         @NewParentValue
       ,'Parent1' -- Name - char(50)
       ,'SomeDescription' -- Description - char(100)
    )

    INSERT dbo.Child
    (
        ParentID
       ,Name
       ,Description
    )
    VALUES
    (
        @NewParentValue
       ,'Child1'
       ,'SomeDescription' 
    )
END
GO

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID
GO

This also works, so we’ll add it to our arsenal. But there’s one weird thing: the value that was inserted into the Parent table is -2147483647, which is probably not what we intended. That’s because an INT sequence created without START WITH begins at the minimum value of its data type. So we’ll have to tidy up our SEQUENCE a bit.

DROP SEQUENCE dbo.ParentSequence 
GO
CREATE SEQUENCE dbo.ParentSequence AS INT START WITH 1
GO
DECLARE @NextParentSequence INT 
SELECT @NextParentSequence = NEXT VALUE FOR dbo.ParentSequence
EXEC dbo.Proc_InsertParentAndChild  @NextParentSequence

SELECT *
FROM Parent
ORDER BY ParentID

SELECT *
FROM Child
ORDER BY ParentID

Everything looks good now.

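In this post, we have verified three different ways to successfully insert parent/child records when using native compilation: SCOPE_IDENTITY(), OUTPUT into a variable of a memory-optimized table type, and a SEQUENCE value generated outside the native module.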
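
Because NEXT VALUE FOR has to run outside the native module, one option is to hide that step behind a small interpreted (interop) wrapper. The following is a minimal sketch, assuming the dbo.ParentSequence sequence and the parameterized dbo.Proc_InsertParentAndChild procedure from Scenario 5 exist; the wrapper name dbo.Proc_InsertParentAndChild_Wrapper is hypothetical.

```sql
-- Hypothetical interop wrapper: fetches the sequence value in interpreted T-SQL,
-- then passes it to the natively compiled procedure from Scenario 5.
CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild_Wrapper
AS
BEGIN
    SET NOCOUNT ON;

    -- NEXT VALUE FOR is allowed here because this procedure is not natively compiled
    DECLARE @NextParentSequence INT = NEXT VALUE FOR dbo.ParentSequence;

    EXEC dbo.Proc_InsertParentAndChild @NextParentSequence;
END
GO

EXEC dbo.Proc_InsertParentAndChild_Wrapper;
GO
```

Callers then never have to know that the key comes from a SEQUENCE rather than from IDENTITY.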

SQL 2017 In-Memory roundup

SQL Server 2017 includes enhancements to many features, and some of those enhancements include In-Memory OLTP.

Microsoft states that ALTER TABLE for memory-optimized tables is now “usually substantially faster”. I asked for clarity about that – if it means that ALTER TABLE is faster for the same events that were able to be executed in parallel and minimally logged in SQL 2016, or if there are new ALTER TABLE statements which now execute in parallel. They replied that there is no change to the set of operations that executed in parallel. So the ALTER TABLE commands that executed fast now (theoretically) execute faster.
Up to and including SQL 2016, the maximum number of nonclustered indexes on a memory-optimized table was eight, but that limitation has been removed for SQL 2017. I’ve tested this with almost 300 indexes, and it worked. With this many supported indexes, it’s no wonder they had to….
Enhance the index rebuild performance for nonclustered indexes during database recovery. I confirmed with Microsoft that the database does not have to be in SQL 2017 compatibility mode (140) to benefit from the index rebuild enhancement. This type of rebuild happens not only for database restore and failover, but also for other “recovery events” – see my blog post here.
In SQL 2017, memory-optimized tables now support JSON in native modules (functions, procedures and check constraints).
Computed columns, and indexes on computed columns are now supported
TSQL enhancements for natively compiled modules include CASE, CROSS APPLY, and TOP (N) WITH TIES
Transaction log redo of memory-optimized tables is now done in parallel. This has been the case for on-disk tables since SQL 2016, so it’s great that this potential bottleneck for REDO has been removed.
Memory-optimized filegroup files can now be stored on Azure Storage, and you can also backup and restore memory-optimized files on Azure Storage.
sp_spaceused is now supported for memory-optimized tables
And last but definitely not least, drum roll, please... we can now rename memory-optimized tables and natively compiled modules.

While Microsoft continues to improve columnstore indexes for on-disk tables, unfortunately columnstore for memory-optimized tables gets left further and further behind. Case in point would be support for LOB columns for on-disk tables in SQL 2017, but no such support for memory-optimized tables. And my good friend Niko Neugebauer just reminded me that computed columns for on-disk CCI are supported in SQL 2017, but they are not supported for in-memory CCI. For an in-depth comparison of columnstore differences between on-disk and memory-optimized tables, see my post here.

In addition to what’s listed above, I tested the following functionality for natively compiled stored procedures:

STRING_AGG()

This works, but you can’t use character functions, such as CHAR(13):

CREATE PROCEDURE dbo.Proc_VehicleRegistration
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT
        STRING_AGG(VehicleRegistration, CHAR(13)) AS csv -- fails
    FROM Warehouse.VehicleTemperatures
    WHERE Warehouse.VehicleTemperatures.VehicleTemperatureID BETWEEN 65190 AND 65200
END;
GO

CONCAT_WS()

CREATE OR ALTER PROCEDURE dbo.Proc_VehicleTemperatures
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT CONCAT_WS( ' - ', VehicleTemperatureID, VehicleRegistration) AS DatabaseInfo
    FROM Warehouse.VehicleTemperatures

END;
GO

EXEC dbo.Proc_VehicleTemperatures
GO

TRIM()

CREATE OR ALTER PROCEDURE dbo.Proc_TrimTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRIM(VehicleRegistration) AS col1
    FROM Warehouse.VehicleTemperatures

END;
GO
EXEC dbo.Proc_TrimTest
GO

TRANSLATE()

CREATE OR ALTER PROCEDURE dbo.Proc_TranslateTest
WITH NATIVE_COMPILATION, SCHEMABINDING AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT TRANSLATE('2*[3+4]/{7-2}', '[]{}', '()()');
END;
GO
EXEC dbo.Proc_TranslateTest
GO

sys.dm_db_stats_histogram()

CREATE STATISTICS stat_VehicleTemperatures ON Warehouse.VehicleTemperatures(VehicleRegistration)

SELECT s.object_id, OBJECT_NAME(s.object_id), hist.step_number, hist.range_high_key, hist.range_rows, 
    hist.equal_rows, hist.distinct_range_rows, hist.average_range_rows
FROM sys.stats AS s
CROSS APPLY sys.dm_db_stats_histogram(s.[object_id], s.stats_id) AS hist
WHERE OBJECT_NAME(s.object_id) = 'VehicleTemperatures'

My wish list for the In-Memory OLTP feature is huge, but it’s great to see Microsoft continually improve and extend it.

Availability Groups and Native Compilation

For disk-based tables, query plans for interpreted/traditional stored procedures will be recompiled when statistics have changed. That’s because when you update statistics, cached query plans for interpreted stored procedures are invalidated, and will automatically recompile the next time they’re executed. That’s true for an interpreted stored procedure that references disk-based tables, and/or memory-optimized tables.

As of SQL 2016, the database engine automatically updates statistics for memory-optimized tables (documentation here), but recompilation of native modules must still be performed manually. But hey, that’s way better than SQL 2014, when you couldn’t recompile at all; you had to drop/recreate the native module. And natively compiled stored procedures don’t reside in the plan cache, because they are executed directly by the database engine.

This post attempts to determine if the requirement to manually recompile native modules is any different for AG secondary replicas.

Stats on the primary

Statistics that are updated on the primary replica will eventually make their way to all secondary replicas. This blog post by Sunil Agarwal details what happens on the secondary replica if the statistics are stale (relative to any temporary statistics that were created on the secondary).

How do we…?

The first question we must answer is: how do you determine the last time a natively compiled stored procedure was compiled?

We can do that by checking the value of the cached_time column from the following query:

SELECT *
FROM sys.dm_exec_procedure_stats
WHERE OBJECT_NAME(object_id) = '<YourModule>'

The query is simple, but you won’t get any results unless you enable the collection of stored procedure execution statistics for natively compiled procedures. Execution statistics can be collected at the object level or instance level.

NOTE: Enabling the collection of stored procedure statistics for natively compiled procedures can crush your server, potentially resulting in disastrous performance impact. You must be extremely careful with this method of troubleshooting.

Once you’ve enabled stats collection for native procedures, you should get results from the query above.
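
As a sketch of what enabling collection can look like: the system procedure sys.sp_xtp_control_proc_exec_stats toggles procedure-level statistics for the whole instance, and sys.sp_xtp_control_query_exec_stats toggles query-level statistics, either instance-wide or for a single procedure. The name dbo.YourModule below is a placeholder. Given the warning above, collection should be switched off again as soon as you have what you need.

```sql
-- Enable procedure-level execution statistics for natively compiled procedures (instance-wide)
EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 1;

-- Optionally enable query-level statistics for just one native procedure
DECLARE @db_id INT = DB_ID();
DECLARE @proc_id INT = OBJECT_ID(N'dbo.YourModule');   -- placeholder name
EXEC sys.sp_xtp_control_query_exec_stats
     @new_collection_value = 1
    ,@database_id = @db_id
    ,@xtp_object_id = @proc_id;

-- ... execute the native procedure, then query sys.dm_exec_procedure_stats ...

-- Turn collection off again when finished
EXEC sys.sp_xtp_control_query_exec_stats
     @new_collection_value = 0
    ,@database_id = @db_id
    ,@xtp_object_id = @proc_id;
EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 0;
```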

How I tested

Here are the steps I executed, after creating an AG that used synchronous mode (containing a single database with a memory-optimized filegroup):

Create a sample table
Insert some rows
Create a natively compiled procedure that selects from the sample table
Execute the native procedure on the primary and secondary (it must be executed at least once in order to have usage stats collected)
Enable collection of stored procedure execution statistics on the primary and secondary replicas
Again execute the native procedure on the primary and secondary
Note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary
Recompile the native procedure on the primary
Execute the native procedure on the primary and secondary
Again note the value of sys.dm_exec_procedure_stats.cached_time on the primary and secondary
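
The core of these steps can be sketched as follows. The table and procedure names (dbo.AGTest, dbo.Proc_AGTest) are hypothetical, and the sketch assumes that execution statistics collection has already been enabled on each replica.

```sql
-- On the primary: sample table, some rows, and a native procedure
CREATE TABLE dbo.AGTest
(
     ID INT IDENTITY PRIMARY KEY NONCLUSTERED
    ,Name CHAR(50) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)
GO
INSERT dbo.AGTest (Name) VALUES ('Row1'), ('Row2'), ('Row3')
GO
CREATE OR ALTER PROCEDURE dbo.Proc_AGTest
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    SELECT ID, Name FROM dbo.AGTest
END
GO

-- On the primary AND on the secondary: execute, then note cached_time
EXEC dbo.Proc_AGTest
SELECT OBJECT_NAME(object_id) AS ProcName, cached_time
FROM sys.dm_exec_procedure_stats
WHERE database_id = DB_ID()
  AND object_id = OBJECT_ID(N'dbo.Proc_AGTest')
GO

-- On the primary only: mark the native procedure for recompilation
EXEC sys.sp_recompile N'dbo.Proc_AGTest'
GO
-- Then repeat the EXEC and the cached_time query on both replicas and compare the values
```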

Results

The cached_time value on the secondary did not get updated when the native module was recompiled on the primary.

What does this mean for DBAs that are responsible for maintaining AGs that use native compilation? It means that when you recompile native modules on the primary replica (which you would always do after updating statistics on the primary), those modules must be recompiled on all secondary replicas. The recompilation on the secondary can be performed manually or perhaps through some automated mechanism. For example, if you have a SQL Agent job on the primary replica to update statistics, one of the job steps might be for marking all natively compiled stored procedures on the secondary for recompilation, using sp_recompile.

How would that job step handle the recompile for all secondary replicas?

Perhaps after defining linked servers, you could do something like:

EXEC SecondaryServer1.msdb.dbo.sp_start_job @job_name = N'Recompile native procs';

EXEC SecondaryServer2.msdb.dbo.sp_start_job @job_name = N'Recompile native procs';

But it might be involved to define this for all secondary replicas – it sounds like a topic for another post…..
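
What might the “Recompile native procs” job step itself contain? One possible sketch, not something tested in this post, is a loop that calls sp_recompile for every natively compiled stored procedure in the current database. The cursor and variable names are illustrative, and whether sp_recompile behaves as desired on a given secondary replica is something to verify in your own environment.

```sql
-- Hypothetical body for the 'Recompile native procs' job step:
-- marks every natively compiled stored procedure in the current database for recompilation.
DECLARE @ProcName NVARCHAR(600);

DECLARE NativeProcs CURSOR LOCAL FAST_FORWARD FOR
    SELECT QUOTENAME(OBJECT_SCHEMA_NAME(m.object_id)) + N'.' + QUOTENAME(OBJECT_NAME(m.object_id))
    FROM sys.sql_modules AS m
    WHERE m.uses_native_compilation = 1
      AND OBJECTPROPERTY(m.object_id, 'IsProcedure') = 1;   -- skip native functions and triggers

OPEN NativeProcs;
FETCH NEXT FROM NativeProcs INTO @ProcName;

WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC sys.sp_recompile @ProcName;
    FETCH NEXT FROM NativeProcs INTO @ProcName;
END

CLOSE NativeProcs;
DEALLOCATE NativeProcs;
```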

Happy recompiling –

How NOT to benchmark In-Memory OLTP

Transactional Replication meets In-Memory OLTP

Transactional replication hasn’t changed much since it was re-written for SQL 2005. However, with the release of SQL 2014, there is at least one new possibility: memory-optimized tables at the subscriber.

With the release of SQL 2016, some of the restrictions for memory-optimized subscriber tables have been lifted:

snapshot schema files that create the memory-optimized tables no longer have to be manually modified (but see “Gotcha #6, silent schema killer” below)
tables replicated to memory-optimized tables on a subscriber are no longer limited to the SQL 2014 max row length of 8060 bytes for memory-optimized tables. This seems sort of moot, because published tables cannot themselves be memory-optimized, and are therefore still restricted to 8060 bytes. However, if for some reason you needed to add a lot of columns to the subscriber table that cause it to be greater than 8060 bytes, you can do it. Note that there is no limit on how large a row can be for memory-optimized tables in SQL 2016. The following statement is perfectly valid:

CREATE TABLE [dbo].[T01]
(
     [PKcol] [INT] IDENTITY(1, 1) NOT NULL
    ,[col2] CHAR(5000) NOT NULL
    ,[col3] CHAR(5000) NOT NULL
    ,[col4] CHAR(5000) NOT NULL
    ,[col5] CHAR(5000) NOT NULL
    ,[col6] CHAR(5000) NOT NULL
    ,[col7] CHAR(5000) NOT NULL
    ,PRIMARY KEY NONCLUSTERED HASH 
(
    [PKcol]
) WITH (BUCKET_COUNT = 1000)
) WITH (MEMORY_OPTIMIZED = ON , DURABILITY = SCHEMA_AND_DATA)

Why would you want to use memory-optimized tables in a subscriber database? There can only be one answer: speed.

Subscriber latency due to data volume could be a result of the following, in combination with each other, or individually:

excessive logging – changes to indexes are not logged for memory-optimized tables, and in general logging is much more efficient than for traditional/on-disk tables
locking – no locks are taken for DML statements that touch memory-optimized tables
blocking – blocking as a result of a transaction making changes to rows is not possible for memory-optimized tables
latching – no latches are taken on memory-optimized tables

The design of the In-Memory OLTP engine can alleviate latency due to these issues – BUT – before you start jumping for joy, you’ll need to be aware of the impact of deploying In-Memory OLTP in general.

DBAs love to tune things (indexes, queries, etc.), and subscriber tables are no exception. Until SQL 2014, when memory-optimized subscriber tables were introduced, some of the things that DBAs tuned on the subscriber included:

compression settings
different ways that the data in subscriber tables can be reinitialized, i.e. TRUNCATE TABLE, DELETE, DROP/CREATE table, or do nothing (these choices are for the ‘Action if name is in use’ section of the ‘Destination Object’).
custom indexes
snapshot isolation

For reinitializing, being able to use TRUNCATE TABLE is a great benefit, because all custom indexes and compression settings are retained for the destination table. If you choose drop/create, all compression settings and custom indexing must be reapplied to the subscriber table upon (re)initialization.

Deployment considerations

Article properties

On the dialog for Article Properties, you’ll need to make sure that both “Enable Memory Optimization” and “Convert clustered index to nonclustered index for memory optimized article” are set to “True”. Despite what you might have read, there is no concept of a “clustered” index for a memory-optimized table. If you have a clustered index on the published table, the DDL will fail when applied on the subscriber unless you set this option.

Subscription Properties

The Subscription Properties can be configured when initially creating the subscription, or from the Subscription Properties dialog if the subscription already exists.

Gotcha #1, DML support

Reinitialization is likely to happen at some point in the future, and so we’ll need to make the correct choice for “Action if name is in use”, on the same Article Properties dialog.

TRUNCATE TABLE is not supported for memory-optimized tables. If the table must be dropped, you’ll have to reapply scripts to handle any subscriber-level customization.

Gotcha #2, compression

On-disk tables are stored in pages. Memory-optimized tables are not stored in pages, and therefore don’t support any form of compression (columnstore indexes on memory-optimized tables create a separate compressed copy of the rows, but the primary data source remains the rows in memory).

Gotcha #3, potential WRITELOG bottleneck

All DML operations on durable memory-optimized tables are fully logged, regardless of database-level recovery settings (for more details, see my post on “Optimizing Data Load” here). If deploying In-Memory OLTP solves the latency issues your app was experiencing, WRITELOG is likely to become one of the top waits. This prevents realizing the full potential of deploying In-Memory OLTP, but fear not – as of SQL 2016/SP1, NVDIMM is supported for the transaction log, reducing/eliminating the log as a performance bottleneck. See the link here for more detail.

Gotcha #4, impact on RTO

If by chance you must restore a subscriber database that contains a lot of durable memory-optimized data (I realize that “a lot” is subjective), RTO will be affected. That’s because the number and placement of containers has a significant effect on the amount of time required to recover a database that contains durable memory-optimized data. See my post “In-Memory OLTP: The moving target that is RTO” here for more details. You might also be interested in “Backup and Recovery for SQL Server databases that contain durable memory-optimized data” here.

Gotcha #5, resource consumption

Updates on memory-optimized tables are performed as DELETE + INSERT, and INSERTs create row versions, and the newly inserted row becomes the current version. Older versions consume additional memory, and must be retained as long as any processes that reference them are still executing (like queries running on the subscriber). It’s possible to have long chains of versioned rows, and that means your environment might require additional memory. For a detailed explanation of row versioning, including the Garbage Collection process, see my post on “Row version lifecycle for In-Memory OLTP” here. There are additional considerations if your workload uses memory-optimized table variables (also detailed in that post).

Gotcha #6, silent schema killer

Let’s say you’ve done your homework, and that your configuration for memory-optimized subscriber tables is perfect. There is additional database configuration that must be done to support memory-optimized tables, and without that, your subscriber tables will not be initialized/reinitialized as memory-optimized (they’ll still be created on the subscriber, but will be traditional/on-disk tables). In the stored procedure that executes on the subscriber, there is validation to determine if there is a memory-optimized filegroup for the subscriber database (there are other conditions, but this is the one we’re interested in).

IF NOT EXISTS(select top 1 1 from sys.filegroups FG where type = 'FX')

If you look up the definition of sys.filegroups, it relates to sys.data_spaces, and there we see a column named “type” that can have the following values:

FG = Filegroup
FD = FILESTREAM data filegroup
FX = Memory-optimized tables filegroup
PS = Partition scheme

If the query finds a filegroup of type “FX”, the table is created as memory-optimized, and if not (along with some other conditions), it’s created as a traditional/on-disk table.

While it seems obvious that you should have already configured your database to have a memory-optimized filegroup, if you skipped that step, there is no warning, error, or other type of message received, stating that the subscriber database is not memory-optimized. Of course, simply having a memory-optimized filegroup is not enough to create memory-optimized tables, because you must also have containers that belong to that memory-optimized filegroup. The “memory-optimized filegroup exists” validation will pass, but the (re)initialization will fail because no containers exist, and you’ll receive an error about that.
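
Since nothing warns you about a missing filegroup, it can be worth checking the subscriber database yourself before (re)initializing. The following is a minimal sketch, run in the subscriber database; the column aliases are my own.

```sql
-- Lists the memory-optimized filegroup (type FX) and any containers that belong to it.
-- No rows returned: the database has no memory-optimized filegroup.
-- A row with NULL container columns: the filegroup exists, but has no container.
SELECT fg.name AS FilegroupName
      ,fg.type
      ,df.name AS ContainerName
      ,df.physical_name
FROM sys.filegroups AS fg
LEFT JOIN sys.database_files AS df ON df.data_space_id = fg.data_space_id
WHERE fg.type = 'FX'
```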

Index limitations

As of this writing (SQL 2016, SP1), a memory-optimized table can have a maximum of 9 indexes (if one of them is a columnstore index). That may or may not be an issue for your environment, but it’s a much lower number than traditional/on-disk tables.

Stored procedure execution

A quick review of Interop vs. Native Compilation:

Interop – interpreted TSQL as existed prior to SQL 2014. The full TSQL surface area is available with interop mode, and you can access both on-disk and memory-optimized tables.
Native Compilation – for maximum speed, you’ll want to use natively compiled stored procedures. There are restrictions for natively compiled modules, the most significant being that they can only reference memory-optimized tables, and the full TSQL surface area is not available. As of SQL 2016/SP1, natively compiled modules don’t support CASE expressions or views, and there are many other restrictions. For more details, check “Transact-SQL Constructs Not Supported by In-Memory OLTP” here.

If you execute an UPDATE or DELETE that affects a large number of rows, then that statement is turned into individual UPDATE or DELETE statements that are sent to the distributor, and finally to the subscriber(s). To avoid the overhead of sending all those changes, it’s possible to publish the “execution” of a stored procedure. The documentation says: “..only the procedure execution is replicated, bypassing the need to replicate the individual changes for each row..” Please refer to the document about replicating stored procedure execution here.

The documentation also states that you can customize the stored procedure on the subscriber. Although the documentation doesn’t mention it, the stored procedure can be natively compiled, which should greatly increase performance on the subscriber for transactions that affect a large number of rows. Keep in mind that any changes made to the procedure at the publisher are sent to the subscriber. If this isn’t the behavior you want, disable the propagation of schema changes before executing ALTER PROCEDURE.

IDENTITY crisis

You’ll likely be disappointed with native compilation if you’re trying to INSERT many rows at the subscriber, and the destination table includes an IDENTITY column. That’s because it’s not possible to insert an explicit value into an IDENTITY column in a natively compiled stored procedure. Even if you SET IDENTITY_INSERT on before calling the procedure, the insert still fails with: “The function ‘setidentity’ is not supported with natively compiled modules.”

Custom stored procedures

There is a difference between “replicating stored procedure execution”, and using “custom stored procedures”. Microsoft does not support anything you might create as a “custom stored procedure”, whether or not it’s natively compiled.

Please check the documentation here.

Wrapping up

In-Memory OLTP is steadily making its way into the full feature set offered by SQL Server. If you’re running SQL 2016 SP1, In-Memory OLTP is now included with all editions of SQL 2016, except LocalDB.

Troubleshooting Natively Compiled Stored Procedures, Part 1

A subset of the tools available for troubleshooting interpreted stored procedures is available for troubleshooting natively compiled procedures.

The following table highlights the differences:

Method	Interpreted	Natively compiled
Recompile specific statements	Supported	Not supported – but theoretically not required, due to the impossibility of parameter sniffing
Execute procedure with RECOMPILE	Supported	Not supported – but theoretically not required, due to the impossibility of parameter sniffing
Estimated/Actual plan	Supported	“Estimated Plan” makes no sense in the context of natively compiled stored procedures. The plan that will be executed is available from SHOWPLAN_XML or by clicking “Estimated Plan” in SSMS (but it’s not “estimated”).
Remove plan from plan cache	Supported	Not supported – plans for natively compiled stored procedures are not stored in the plan cache.
DBCC FREEPROCCACHE	Supported	No effect, because plans for natively compiled stored procedures are not stored in the plan cache.
SET STATISTICS IO ON	Supported	Not supported/required, because there is no such thing as IO for memory-optimized tables.
SET STATISTICS TIME ON	Supported	Supported, but might not be 100% accurate, because execution times less than 1 millisecond are reported as 0 seconds. Total_worker_time may not be accurate if many executions take less than 1 millisecond.
SET FMTONLY	Supported	Not supported, but you can use sp_describe_first_result_set.
SHOWPLAN_XML	Supported	Supported
SHOWPLAN_ALL	Supported	Not supported
SHOWPLAN_TEXT	Supported	Not supported
Mismatched datatypes		xEvents hekaton_slow_parameter_passing, with reason = parameter_conversion.
Named parameters, i.e. EXEC dbo.Proc @Param1 = @Param1	Supported	Supported, but not recommended, due to performance impact. You can track this type of execution with xEvents hekaton_slow_parameter_passing, with reason = named_parameters.

If any SET options are in effect, statistics are gathered at the procedure level and not at the statement level.

Note 1: Statement-level execution statistics can be gathered with xEvents by capturing the sp_statement_completed event. They can also be seen using Query Store (detailed in a future post).

Note 2: Due to the nature of working with memory-optimized tables in general, it’s likely that you will have to implement retry logic. Because of this, and also because of feature limitations within the natively compiled space, Microsoft suggests using an interpreted TSQL wrapper when calling natively compiled stored procedures.
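
As a rough illustration of Note 2, here is a minimal sketch of an interpreted wrapper with retry logic. The wrapper name, the retry budget, and the delay are assumptions made for this example. It calls the dbo.Proc_InsertParentAndChild procedure from the earlier post, and it assumes the wrapper is not called from inside an open user transaction.

```sql
-- Hypothetical interpreted wrapper; name, retry count and delay are illustrative.
CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild_Retry
AS
BEGIN
    SET NOCOUNT ON;

    DECLARE @RetriesLeft INT = 5;   -- assumed retry budget

    WHILE @RetriesLeft > 0
    BEGIN
        BEGIN TRY
            EXEC dbo.Proc_InsertParentAndChild;
            SET @RetriesLeft = 0;   -- success: leave the loop
        END TRY
        BEGIN CATCH
            SET @RetriesLeft -= 1;

            -- Retry only the errors that are specific to memory-optimized tables:
            -- 41302 (write conflict), 41305 (repeatable read validation failure),
            -- 41325 (serializable validation failure), 41301 (commit dependency failure).
            IF ERROR_NUMBER() NOT IN (41301, 41302, 41305, 41325) OR @RetriesLeft = 0
                THROW;

            WAITFOR DELAY '00:00:00.010';   -- short pause before the next attempt
        END CATCH
    END
END
GO

EXEC dbo.Proc_InsertParentAndChild_Retry
GO
```

Any other error, or running out of retries, is re-thrown to the caller unchanged.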

The following query references sys.dm_exec_query_stats to get statistics for natively compiled procedures:

SELECT  st.objectid
       ,OBJECT_NAME(st.objectid) AS 'object name'
       ,SUBSTRING(st.text, ( qs.statement_start_offset / 2 ) + 1,
                  ( ( qs.statement_end_offset - qs.statement_start_offset )
                    / 2 ) + 1) AS 'query text'
       ,qs.creation_time
       ,qs.last_execution_time
       ,qs.execution_count
       ,qs.total_worker_time
       ,qs.last_worker_time
       ,qs.min_worker_time
       ,qs.max_worker_time
       ,qs.total_elapsed_time
       ,qs.last_elapsed_time
       ,qs.min_elapsed_time
       ,qs.max_elapsed_time
       ,qs.total_rows
       ,qs.min_rows
       ,qs.max_rows
       ,qs.last_rows
FROM    sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) st
WHERE   st.dbid = DB_ID()
        AND st.objectid IN ( SELECT object_id
                             FROM   sys.sql_modules
                             WHERE  uses_native_compilation = 1 )
ORDER BY qs.total_worker_time DESC;
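
The query above returns rows for natively compiled procedures only when collection of execution statistics has been enabled, because it is disabled by default. The sketch below shows one way to turn it on, using the system procedures sys.sp_xtp_control_proc_exec_stats and sys.sp_xtp_control_query_exec_stats. The choice of dbo.Proc_InsertParentAndChild as the target is an assumption for this example. Collection adds overhead, so you would probably enable it only while troubleshooting.

```sql
-- Procedure-level statistics for natively compiled procedures
EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 1;

-- Statement-level statistics for a single procedure.
-- EXEC does not accept function calls as arguments, so variables are used.
DECLARE @DbId   INT = DB_ID();
DECLARE @ProcId INT = OBJECT_ID(N'dbo.Proc_InsertParentAndChild');  -- example target

EXEC sys.sp_xtp_control_query_exec_stats
     @new_collection_value = 1
    ,@database_id = @DbId
    ,@xtp_object_id = @ProcId;

-- ...run the workload and query sys.dm_exec_query_stats, then switch collection off...

EXEC sys.sp_xtp_control_query_exec_stats
     @new_collection_value = 0
    ,@database_id = @DbId
    ,@xtp_object_id = @ProcId;

EXEC sys.sp_xtp_control_proc_exec_stats @new_collection_value = 0;
GO
```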

Note 3: The documentation for sys.dm_exec_query_stats states that the total_rows, min_rows, max_rows, and last_rows columns cannot be NULL, but NULL is still returned. A Connect item has been filed to have those columns return 0.

Parallelism

Parallelism is supported for memory-optimized tables for all index types. While that statement is true when using interpreted stored procedures that reference memory-optimized tables, unfortunately it’s not true when using natively compiled stored procedures.

Actual vs. Estimated

These terms have confused generations of SQL Server technologists.

For natively compiled procedures, enabling “Actual Plan” in SSMS does not return any plan information, but still executes the procedure. Enabling “Estimated Plan” in SSMS for natively compiled procedures is the same as setting SHOWPLAN_XML ON, but does not actually execute the stored procedure. The plan that will be executed is displayed.
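
For example, the plan can be retrieved without executing the procedure as follows. This sketch uses dbo.Proc_InsertParentAndChild from the earlier post as a stand-in for whatever procedure you are troubleshooting.

```sql
SET SHOWPLAN_XML ON
GO

-- Not executed: only the plan is returned
EXEC dbo.Proc_InsertParentAndChild
GO

SET SHOWPLAN_XML OFF
GO
```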

Removing plans from the cache

For interpreted stored procedures, DBAs have the ability to remove an offending plan from the plan cache. This is not possible with natively compiled stored procedures, because the plan is not stored in the plan cache.

DBCC FREEPROCCACHE

If you execute DBCC FREEPROCCACHE and expect your natively compiled plans to magically disappear, you will no doubt be disappointed. That’s because DBCC FREEPROCCACHE has no effect on compiled modules, as they are not stored in the plan cache that’s used for interpreted TSQL. But executing DBCC FREEPROCCACHE will of course remove all existing plans for interpreted TSQL from the plan cache (so don’t do that…unless you’re really, really sure you want to recompile all of your interpreted procs).

Parameter sniffing

With interpreted stored procedures, parameter sniffing can severely impact performance. Parameter sniffing is not possible for natively compiled procedures, because all natively compiled procedures are executed with OPTIMIZE FOR UNKNOWN.

Statistics

SQL 2016 has the ability to automatically update statistics on memory-optimized tables if your database has a compatibility level of at least 130. If you don’t want to depend on SQL Server to determine when stats should be updated, you can update statistics manually (and we no longer have to use FULLSCAN, as was the case in SQL 2014). Statistics for index key columns are created when an index is created.

Database upgrades and statistics

As mentioned earlier, if your database was set to compatibility level 120 (SQL 2014), and you want to take advantage of auto-update statistics, you must change the compatibility level to 130. But statistics still won’t be auto-updated unless you manually update them one last time.
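
A minimal sketch of that upgrade sequence, assuming the Parent and Child tables from the earlier post are the memory-optimized tables in question:

```sql
-- Raise the compatibility level of the current database to SQL 2016
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130
GO

-- One last manual update, after which auto-update can take over
UPDATE STATISTICS dbo.Parent
UPDATE STATISTICS dbo.Child
GO
```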

Recompilation

When you create a natively compiled stored procedure, it gets compiled, and execution plans for the queries contained within the procedure are created. As the data changes, those execution plans will be based on older statistics, and might not perform at the highest possible level. Many people think that if you update statistics, natively compiled stored procedures will magically recompile. Unfortunately, this is not correct – natively compiled stored procedures are only recompiled under the following circumstances:

When you execute sp_recompile (this should be done after statistics are updated)
Database restart
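
The first of those circumstances might look like this in practice. This is a sketch that again assumes the Parent and Child tables and the dbo.Proc_InsertParentAndChild procedure from the earlier post.

```sql
-- Refresh statistics first, so the new plans are based on current data
UPDATE STATISTICS dbo.Parent
UPDATE STATISTICS dbo.Child
GO

-- Mark the natively compiled procedure for recompilation
EXEC sys.sp_recompile N'dbo.Proc_InsertParentAndChild'
GO
```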

Database restart includes at least the following events:

Database RESTORE
OFFLINE/ONLINE of database
Failover (FCI or Availability Group)
SQL Server service restart
Server boot

Unlike memory-optimized tables – which are all created, compiled, and placed into memory upon database restart – natively compiled stored procedures are recompiled when first executed. This reduces the amount of time required for database recovery, but affects the first-time execution of the procedure.

Plan operators

For traditional (disk-based) tables, the number of pages expected to be returned by an operator has a significant impact on the cost, and therefore affects the plan. Since memory-optimized tables are not stored in pages, this type of calculation is irrelevant.

For memory-optimized tables, the engine keeps track of how many rows are in each table. This means that estimates for full table scans and index scans are always accurate (because they are always known). For memory-optimized tables, the most important factor for costing is the number of rows that will be processed by a single operator. Older statistics might reference row counts that are no longer valid, and this can affect plan quality.

Nested execution

In SQL 2014, it was not possible for one natively compiled stored procedure to call another natively compiled stored procedure. This restriction has been lifted in SQL 2016.
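
As a rough sketch of nested execution on SQL 2016 or later, the outer procedure below calls dbo.Proc_InsertParentAndChild from the earlier post. The outer procedure name is hypothetical.

```sql
-- Hypothetical outer procedure that calls another natively compiled procedure
CREATE OR ALTER PROCEDURE dbo.Proc_InsertParentAndChild_Outer
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC
WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT,  LANGUAGE = N'English')

    -- Both procedures run within the same transaction
    EXECUTE dbo.Proc_InsertParentAndChild

END
GO

EXEC dbo.Proc_InsertParentAndChild_Outer
GO
```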

We will continue troubleshooting natively compiled stored procedures in a future post.

Ned Otter Blog

SQL Server DBA and Musician
