Recently I tried to give a live example of a design pattern for optimistic concurrency. It had been a while (more than 12 months) since I last demonstrated it in practice, so I was a bit rusty.
Some time ago, I was working on a project at Microsoft with a member of the SQL team (Mr. Danny Tambs), where he theorised that we could use timestamp columns for more than just single-row versioning.
The idea was to detect changes across a set of rows by summing the timestamps for a given range.
Take the following example: we create a table with a column called Version, of type timestamp.
For the sake of simplicity, let's use an integer identity column as the primary key.
USE <Database>
GO
IF EXISTS(SELECT * FROM sys.tables WHERE name = 'Widget')
DROP TABLE Widget
GO
CREATE TABLE [dbo].[Widget]
(
[WidgetId] [int] IDENTITY(1,1) NOT NULL,
[SomeValue] [nvarchar](50) NOT NULL,
[State] [int] NULL,
[Version] [timestamp] NOT NULL,
CONSTRAINT [PK_Widget] PRIMARY KEY CLUSTERED
(
[WidgetId] ASC
)
WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 1', 1)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 2', 2)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 3', 3)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 4', 1)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 5', 3)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 6', 1)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 7', 3)
INSERT INTO [Widget]([SomeValue],[State]) VALUES ('Test Value 8', 1)
When we do a select query, we will return a set of data inclusive of the timestamp values. When this data is modified, we might want to compare the timestamps of the individual rows to determine whether the data has changed since the original read.
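As a sketch of that per-row check, the update can be guarded by the timestamp itself (here @OriginalVersion is a hypothetical variable holding the [Version] value returned by the original read):

-- Per-row optimistic update sketch: succeeds only if the row is
-- unchanged since @OriginalVersion was read.
UPDATE [Widget]
SET [SomeValue] = 'New Value'
WHERE [WidgetId] = 1
  AND [Version] = @OriginalVersion

IF @@ROWCOUNT = 0
    PRINT 'Row changed since the original read - resolve the conflict'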
We can save time by first computing a master version for the whole set, summing the individual timestamps, to decide whether any rows need processing at all.
This can be achieved by:
SELECT SUM(CONVERT(bigint, [Version]))
FROM [Widget]
WHERE [State] = 1
Thus, the following SQL can be executed which demonstrates how the timestamp sum indicates the changed state of one or more rows in the set of data:
SELECT SUM(CONVERT(bigint, [Version]))
FROM [Widget]
WHERE [State] = 1
GO
UPDATE [Widget]
SET [SomeValue] = 'Blah', [State] = 2
WHERE [WidgetId] = 1
GO
SELECT SUM(CONVERT(bigint, [Version]))
FROM [Widget]
WHERE [State] = 1
GO
UPDATE [Widget]
SET [SomeValue] = 'Test Data 1', [State] = 1
WHERE [WidgetId] = 1
GO
SELECT SUM(CONVERT(bigint, [Version]))
FROM [Widget]
WHERE [State] = 1
Recently, when we talked, Danny suggested using checksums of the versions in combination with CHECKSUM_AGG to create a more sensitive calculation.
SELECT CHECKSUM_AGG(CHECKSUM([Version]))
FROM [Widget]
WHERE [State] = 1
This gets around the need for converting the 8-byte timestamps into ints or bigints, and a checksum is more sensitive to changes in the underlying data.
I’ll need to do some more work to investigate the performance aspects, but it looks like a nice little pattern for validating whether data has changed between reads (when using optimistic concurrency).
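Putting the pieces together, the set-level validation might look something like the following sketch (the variable name and PRINT messages are just illustrative, not part of the pattern itself):

DECLARE @OriginalChecksum int

-- Capture the aggregate version for the set at read time.
SELECT @OriginalChecksum = CHECKSUM_AGG(CHECKSUM([Version]))
FROM [Widget]
WHERE [State] = 1

-- ... later, before processing the set again ...
IF @OriginalChecksum = (SELECT CHECKSUM_AGG(CHECKSUM([Version]))
                        FROM [Widget]
                        WHERE [State] = 1)
    PRINT 'Set unchanged - the data read earlier is still valid'
ELSE
    PRINT 'Set changed - re-read the rows'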