When I was looking the plan cache for the queries having “Table spool”, I came across this query.
INSERT INTO @Result (OID,Partkey)
SELECT OID, dbo.fgetPartKey(OID)
FROM fn_getSplit( @P_OIDS,',' )
WHERE OID > 0
Query optimizer used the below plan to execute the above query.
Here Eager spool occupies 50% of the plan cost. Before understanding the plan, lets know a bit about the spools.
Spools Behavior:
There are different types of spools in the query engine. However, all spools write data to TempDB and re-use that data again. We can see spools in action, when we are performing an expensive correlated subquery which is performing same computations for the same set of values. By writing subquery results to TempDB, the optimizer avoids performing same calculations again by simply reading from TempDB.
Eager Spool:
What it does
Eager spool will read all values from input at once, and write all the values to tempdb. This is a blocking iterator, as it will read all values at one go.
When is it used
Eager spool will be mostly used to solve the problems - e.g., where a query engine suspects that the target table suffers from DML statements, which goes into an unending iterative loop. A simple way to reproduce Eager spool is below.
DECLARE @t TABLE
(
ID INT
)
INSERT INTO @t VALUES (1),(2)
INSERT INTO @t
SELECT * FROM @t
The execution plan of insert statement.
Problem Without Eagerspool:
Step 1 : Read Cursor reads 1 from table and hands over to write cursor, which in turn will write to table.
Step 2: Read Cursor reads 2 from table and hands over to write cursor, which in turn writes to table.
Step 3: Read cursor again reads 1 from table and hands over to write cursor. This tries to write the same record again.
Step 4 : Insert 2 again… and so on….
Query engine will get stuck into infinite loop by re-inserting the records again and again.
Solution By using Eager Spool:
To prevent this scenario, Query Engine came up with Eager Spool. Here, it initially reads all rows from table and writes it to tempDB. Now instead of reading from table, it will read from TempDB table and writes to table one by one record.
My Current Query:
In the query below, a function fgetPartKey() is called for each row. As SQL server doesn’t know what is inside the function, It assumes it mayl modify source tables. In this case, to avoid the repetitive insertions, it spools data to TempDB and reads from TempDB.
INSERT INTO @Result (OID,Partkey)
SELECT OID, dbo.fgetPartKey(OID)
FROM fn_getSplit( @P_OIDS,',' )
WHERE OID > 0
Trick to avoid Spooling in this case:
By specifying the option “With SchemaBinding”, we can guarantee the query engine, that tables are not going to be affected. With that assurance, query engine avoids the spooling and directly reads from disk.
CREATE FUNCTION fgetPartKey (@i INT )
RETURNS BIT WITH schemabinding
Now the resultant plan is without spooling and saves extra I/o