In the previous posts, we examined two transaction isolation levels: READ COMMITTED and READ UNCOMMITTED. In this post we will look at REPEATABLE READ isolation level. Before we discuss REPEATABLE READ, let me show a few queries that demonstrate the problem REPEATABLE READ tries to solve.
Create the sample tables with this script.
---------------------------------------------------------------------
-- Create tables
---------------------------------------------------------------------
IF OBJECT_ID('Consultants') IS NOT NULL DROP TABLE Consultants
GO
CREATE TABLE Consultants (
ConsultantID INT IDENTITY,
ConsultantName VARCHAR(20),
Technology VARCHAR(15),
HireDate DATETIME,
HourlyRate MONEY)
---------------------------------------------------------------------
-- Fill some data
---------------------------------------------------------------------
INSERT INTO Consultants (ConsultantName, Technology, HireDate, HourlyRate)
SELECT 'Jacob', 'SQL Server', '2000-10-03', 120
INSERT INTO Consultants (ConsultantName, Technology, HireDate, HourlyRate)
SELECT 'Smith', 'ASP.NET', '2005-01-01', 140
INSERT INTO Consultants (ConsultantName, Technology, HireDate, HourlyRate)
SELECT 'Michael', 'C#.NET', '2002-10-30', 100
SELECT * FROM Consultants
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
2 Smith ASP.NET 2005-01-01 00:00:00.000 140000.00
3 Michael C#.NET 2002-10-30 00:00:00.000 80000.00
*/
Open a query window (Query Window 1) and run the following code. This code locks the 3rd row.
BEGIN TRAN
-- lock the second row
UPDATE Consultants
SET Technology = Technology
WHERE ConsultantID = 3
Now open another query window (Query Window 2) and run the following.
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
This will read the first two rows and then will block when it attempt to read the third row. After reading the first and second rows, the query identifies the first row matches the criteria and retrieves it. Now, let us see what happens if another transaction modifies the first row before this transaction completes. Run the following from Query Window 1.
UPDATE Consultants
SET Technology = 'C#.NET'
WHERE ConsultantID = 1
UPDATE Consultants
SET Technology = 'SQL Server'
WHERE ConsultantID = 2
COMMIT TRAN
Since the transaction is committed, the lock on the row will be released. This will clear the block on the second query and it will complete execution. Move to Query Window 1 and you will see the following results.
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
*/
But, note that the results are wrong. Run the query once again and see what is the result it returns.
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
2 Smith SQL Server 2005-01-01 00:00:00.000 140.00
*/
See that we got a completely different result when we ran the query again. While this behavior is natural, some times we might need to control this behavior. For example, assume that you have a stored procedure that retrieves customers with a certain set of characteristics. You might want to do some processing with each customer. However, it is possible that some other transaction might update the customers while you are still processing it and the data you retrieved for processing may not be relevant any more.
So you need a way to make sure that when you run a query for the second time, you should get the same results. All the active customers that you received when you started your processing should remain active until you complete the operation. No other transaction should be allowed to update any of the active customers to inactive. Similarly, no other transaction should be allowed to turn an inactive customer to active. Having this restriction makes sure that the customers that your first query retrieved will remain valid until you complete your process.
This can be achieved by REPEATABLE READ transaction isolation level. SQL Server will lock each row that you touch irrespective of whether it matches the criteria or not. So the rows you have already read cannot be modified by other transactions.
Let us revisit the previous example and see how REPEATABLE READ handles this. Run this query in Query Window 1.
BEGIN TRAN
-- lock the third row
UPDATE Consultants
SET Technology = Technology
WHERE ConsultantID = 3
Run the following in query window 2.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
As expected, this will block because the third row is locked by another session. Now go back to Query Window 1 and try to update the first row.
UPDATE Consultants
SET Technology = 'C#.NET'
WHERE ConsultantID = 1
UPDATE Consultants
SET Technology = 'SQL Server'
WHERE ConsultantID = 2
COMMIT TRAN
Note that the transaction will try to acquire a lock on the first row which is locked by another session. This will cause a dead lock. Move back to Query Window 2 and you will see that the query failed with a deadlock error. This is acceptable because it is better to fail the query rather than giving incorrect data.
Now let us see a slightly different version of the above query. Run the script that recreates the tables and fills sample data. Run the following on query window 2.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
*/
Now move to Query Window 1 and try to update the second row.
UPDATE Consultants
SET Technology = 'SQL Server'
WHERE ConsultantID = 1
Note that this query blocks. It blocks because, under REPEATABLE READ, SQL Server will lock all rows it reads. Even rows that do not match with the given criteria. The first row does not match with the criteria we specified in the query we ran in Query Window 2. But still the row is locked. No other transaction can modify this row until the REPEATABLE READ transaction on Query Window 2 completes.
Move to query window 2 and run the query again.
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
*/
COMMIT TRAN
Note that you got the same results. Note also that we have committed the transaction and the lock is released now. The transaction waiting on Query Window 1 can go ahead and complete the update now.
Phantom Read
REPEATABLE READ isolation level can cause Phantom Reads. Under REPEATABLE READ, SQL Server will lock the rows it reads. But it does not prevent from inserting new rows. So, it can happen that when you run the same query for second time, under REPEATABLE READ, you might find new rows in the second query. Such a row is called 'Phantom Row' and a read that returns a Phantom Row is called a Phantom Read.
Let us look at the example that we saw earlier. Run the scripts to recreate the table and fill sample data. Then run the following query in Query Window 2.
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ
BEGIN TRANSACTION
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
*/
Now run the following in Query Window 1.
INSERT INTO Consultants (ConsultantName, Technology, HireDate, HourlyRate)
SELECT 'Mike', 'SQL Server', '2000-10-03', 120
Note that this does not block. This row is new and hence the session on the other window does not have a lock on this. Now move back to Query Window 2 and run the query again.
SELECT *
FROM Consultants
WHERE Technology = 'SQL Server'
/*
ConsultantID ConsultantName Technology HireDate HourlyRate
------------ -------------------- --------------- ----------------------- ---------------------
1 Jacob SQL Server 2000-10-03 00:00:00.000 120.00
4 Mike SQL Server 2000-10-03 00:00:00.000 120.00
*/
COMMIT TRAN
Note that we have 2 rows. The previous query returned only one row, but this one returns two rows. The second row is a Phantom Row, that did not exist in the first query and appeared all of a sudden.
Note that a REPEATABLE READ locks only the row that it reads. If your query causes a table scan then all the rows in the table will be locked. However, if there is an index on the filter expression, the query processor does not need to scan all the rows in the table. In this case, the transaction will not lock all the rows. It will read only the rows that it has read. Other transactions can still update rows that are not locked. This can also cause phantom read.
So, REPEATABLE READ does not guarantee that you will always get the same result. But it guarantees that the rows that SQL Server has read to process the query are locked and no other transaction can modify it.
SERIALIZABLE isolation level prevents Phantom rows. Under SERIALIZABLE isolation level, SQL Server will apply a range lock that prevents rows being inserted within the given range. We will examine SERIALIZABLE isolation level in the next post.
See also
- Introduction to SQL Server Transaction Isolation Levels
- SQL Server Transaction Isolation Level - READ COMMITTED
- SQL Server Transaction Isolation Level - READ UNCOMMITTED
- SQL Server Transaction Isolation Level - REPEATABLE READ
- SQL Server Transaction Isolation Level - SERIALIZABLE
- SQL Server Transaction Isolation Level - SNAPSHOT