Unique Constraints on Partitioned Tables28 Nov 2019
In this post, you can find 2 scenarios that explain how selected partition key affects unique constraints defined for your partitioned tables. One of them will be straightforward to handle, the other one is more complex.
Selecting Proper Column As a Partition Key
- If you have table structure similar to the below one and you want to partition your table according to month, then you should compute your partition key with a selected
DateTimecolumn. So, at first look, it is very reasonable to take
createdTimecolumn into account to computation. Since when the record is inserted to the table, it will be automatically placed on the correct partition. However, there is a unique combined index on table which constraints it as there cannot be more than one record with the same (partitionKey, name, startTime, endTime) combination in the same partition. If we choose, createdTime as our partition key then it is guaranteed that this combination is unique in each partition but other partitions can have the same combination as well. So, it violates our unique constraint. To overcome this problem, it is better to choose endTime column for computation of a partition key so that (name, startTime, endTime) combination can be unique across whole table. We have faced with the similar scenario in one of our projects and that is how we managed to resolve it.
CREATE TABLE [MySchema].[MyTableName] ( [id] BIGINT NOT NULL IDENTITY(1,1), [name] NVARCHAR(20) NOT NULL, [startTime] DATETIME2 NOT NULL, [endTime] DATETIME2 NOT NULL, [createdTime] DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME(), [partitionKey] AS (DATEPART(YEAR, [endTime]) * 100 + DATEPART(MONTH, [endTime])) PERSISTED NOT NULL ) ON MyPSMonth ([partitionKey]); GO CREATE UNIQUE NONCLUSTERED INDEX [MyUniqueIndex] ON [MySchema].[MyTableName]([partitionKey] ASC, [name] ASC, [startTime] ASC) GO
Ensure Uniqueness with
- Again same scenario with above, we are trying to ensure uniqueness on our partitioned table but the table structure is similar to the below one. We can choose partition key as DateCreated column. Actually, there is no another column that we can choose other than DateCreated column since it is the only column that has
CREATE TABLE [MySchema].[MyTableName] ( [Id] BIGINT NOT NULL IDENTITY(1,1), [SpecificId] NVARCHAR(40) NOT NULL, [DateCreated] DATETIME NOT NULL DEFAULT (GETUTCDATE()), [PartitionKey] AS (DATEPART(YEAR, [DateCreated]) * 100 + DATEPART(MONTH, [DateCreated])) PERSISTED NOT NULL ) ON [MyPSMonth]([PartitionKey]); GO CREATE UNIQUE NONCLUSTERED INDEX [MyUniqueIndex] ON [MySchema].[MyTableName] ([PartitionKey], [SpecificId]); GO
- So, at first it can be seen as a straightforward task. However, the table also has a unique index on some
SpecificIdcolumn. By applying partitioning without considering this fact results in violating uniqueness of SpecificId. It is just ensured that SpecificId is unique on each partition but there can be records with same SpecificId value on different partitions. So, it is not desired by design. As you can see, the same thing happened for above scenario. But, we had another
DateTimefield in key constraint and we used it in a partition key computation. We choose endTime rather than createdTime column and it was resolved. But we cannot apply the same solution for this scenario since we don’t have any other option other than DateCreated column. In order to overcome this problem, we can define a
DMLtrigger that will be triggered every time a new record is intended to be inserted. So, first it is better to explain this trigger.
CREATE TRIGGER [MySchema].[MyTrigger] ON [MySchema].[MyTable] INSTEAD OF INSERT AS BEGIN SET NOCOUNT ON; SET XACT_ABORT ON; DECLARE @SpecificId NVARCHAR(40) = NULL, @MessageText NVARCHAR(200) = '', @MessageNumber INT = 0, @MessageState INT = 0; SELECT TOP 1 @SpecificId = T.[SpecificId] FROM [inserted] I INNER JOIN [MySchema].[MyTable] T ON I.[SpecificId] = T.[SpecificId]; IF (@SpecificId IS NOT NULL) BEGIN SET @MessageText = 'The SpecificId with value ' + @SpecificId + ' does already exist in table ' + '[MySchema].[MyTable]'; SET @MessageState = 1; SET @MessageNumber=50102; Throw @MessageNumber = @MessageNumber, @MessageText = @MessageText, @MessageState = @MessageState; END ELSE BEGIN INSERT INTO [MySchema].[MyTable] ( [SpecificId], [DateCreated] ) SELECT [SpecificId], [DateCreated] FROM [inserted]; SELECT [inserted].[Id] FROM [inserted] WHERE @@ROWCOUNT > 0 AND [inserted].[Id] = scope_identity(); END END
- Basically, this trigger first checks if there is any record with same SpecificId present in the whole table before inserting new ones. Note that, we are joining inserted and target table rather than just selecting from target table. The reason is, there can be
bulk insertsituations in the future. If there is no duplicated SpecificId, then it is safe to insert. Otherwise, just throw an exception. In fact, this trigger decreases performance but since we do not have any other option to ensure uniqueness and there is an index on SpecificId column, performance is not a first priority. There is another interesting point in this trigger definition which is the line:
SELECT [inserted].[Id] FROM [inserted] WHERE @@ROWCOUNT > 0 AND [inserted].[Id] = scope_identity();
- When insert is tested manually through
SSMSeverything works, nothing wrong. But while related application is trying to insert records to the table by using
EF Core, the trigger did not work. The reason to this bug is,
EF Corethrows DbUpdateConcurrencyException when entities do not have IdentityColumn that is a primary key. And also
EF Coreonly cares about latest inserted identity column value, we have to add the final line of code in order it to be worked with
EF Coreas well.
You have to be careful while selecting your partition key before applying a table partitioning. Especially, if you have unique constraints since uniqueness of the selected column combinations are enforced in each partition rather than the entire table. Also, if you don’t have a chance to select another column as your partition key to enforce the uniqueness in the entire table, you can define custom
DML trigger. This solution decreases the performance of your inserts but it is an option when you don’t have any other.