Facebook Sign in | Join
Getting Started with Adobe After Effects - Part 6: Motion Blur

TSQL Challenge 47

rated by 0 users
This post has 40 Replies | 13 Followers

Top 10 Contributor
Posts 831
Points 12,705

MisterMagoo:

I might have a play with pseudo-recursion to see if that can help at all with this one...

So you are the second person mentioning this term 'pseudo-recursion'.
I would be curious to know what this refers to.
Stefan mentioned it in the previous challenge but the solutions are not yet posted.
Thus I'm crying out of curiosity. :-)

  • Post Points: 35
Top 50 Contributor
Posts 83
Points 1,165

dishdy:

MisterMagoo:

I might have a play with pseudo-recursion to see if that can help at all with this one...


So you are the second person mentioning this term 'pseudo-recursion'.
I would be curious to know what this refers to.
Stefan mentioned it in the previous challenge but the solutions are not yet posted.
Thus I'm crying out of curiosity. :-)

 

Well, I picked up the term from looking at one of Stefan's solutions here : http://beyondrelational.com/tc/submissions/3196/default.aspx

Basically, it seems to involve a recursive CTE that selects zero rows in the recursive part deliberately.

This is the first time I have tried it - but for me it did not provide any benefit - in fact it made my solution slower.

I am not sure exactly what the effect is - but when I have time to investigate the execution plan i am sure it will be obvious.

  • Post Points: 20
Top 50 Contributor
Male
Posts 43
Points 205

This article sorts of explains it

http://bradsruminations.blogspot.com/2010/03/this-article-on-recurson-is-entitled.html

My understanding is that it prevents the input to a recursive cte from being evaluated lots of times.  Instead it's spooled into a work table.

For my solution is decreased the time but increased the number of reads

 

regards

David

 

  • Post Points: 5
Top 200 Contributor
Posts 11
Points 45

dishdy:

@jimbobmcgee
I see what you're asking.
You want to see the output from the performance data to see if your solution, besides going like lightning, also produces correct results. :-)
I can give you an answer later today when I get to my computer.

I've given up on 'going like lightning' for the time being.  Too much talk of sub-400ms execution times for me!

Correct results FTW!

  • Post Points: 5
Top 200 Contributor
Posts 11
Points 45

Stefan_G:

jimbobmcgee:

Stefan_G; Since your performance test code is essentially adding, I am right in thinking that you can summarise the expected results as follows...?

 

 No, that is not correct.

Yeah, I realised that pretty soon after!!

Stefan_G:

The expected result from the performance test is this: 

FirstCardNo LastCardNo  Changes
----------- ----------- -----------
11          116         204
196         200         165
...
9994 9971 41 (98 row(s) affected)

Thanks for these.  At this rate, I reckon if I could ever get my V1 to actually finish recursing 10,000+ rows, I might have a good chance of matching the rowcount, at least!!

  • Post Points: 5
Top 10 Contributor
Posts 144
Points 2,645

dishdy:

MisterMagoo:

I might have a play with pseudo-recursion to see if that can help at all with this one...


So you are the second person mentioning this term 'pseudo-recursion'.
I would be curious to know what this refers to.
Stefan mentioned it in the previous challenge but the solutions are not yet posted.
Thus I'm crying out of curiosity. :-)

Pseudo-recursion is my name for a general purpose method of forcing SQL server to insert a spool at almost any point in a execution plan.
This method is quite useful for these challenges but in real life using a temporary table is almost always better.

Lets look at an example

Use the following script to generate some test data.

if object_id('tempdb..#t') is not null drop table #t
create table #t (dt datetime, value decimal(20,4))

declare @i int=0
set nocount on
begin tran
while @i<10000 begin
	insert into #t (dt, value) (select dateadd(second, rand()*1000000, '20110101'), rand()*10)
	set @i=@i+1
end
commit

We now calculate a running sum using a few different methods:

-- triangular join
select dt, value, sum(aval) as rsum
from (
	select b.dt, b.value, a.value as aval
	from #t a, #t b
	where a.dt<=b.dt
) t
group by dt, value
order by dt
--   CPU time = 26582 ms,  elapsed time = 27036 ms.

-- recursion
; with 
cte1 as (
	select dt, value, row_number() over (order by dt) as rn
	from #t
)
, cte2 as (
	select dt, value, value as rsum, rn
	from cte1
	where rn=1
	
	union all
	
	select n.dt, n.value, cast(p.rsum+n.value as decimal(20,4)) as rsum, n.rn
	from cte2 p, cte1 n
	where p.rn+1 = n.rn
)
select * from cte2
option (maxrecursion 0)
--   CPU time = 78999 ms,  elapsed time = 80245 ms.

-- Now add some pseudo-recursion to the recursive solution
; with 
cte1 as (
	select dt, value, row_number() over (order by dt) as rn
	from #t
	
	union all
	
	-- pseudo-recursion. This query should not return any rows
	select * from cte1 where rn<0
)
, cte2 as (
	select dt, value, value as rsum, rn
	from cte1
	where rn=1
	
	union all
	
	select n.dt, n.value, cast(p.rsum+n.value as decimal(20,4)) as rsum, n.rn
	from cte2 p, cte1 n
	where p.rn+1 = n.rn
)
select * from cte2
option (maxrecursion 0)
--   CPU time = 327 ms,  elapsed time = 410 ms.

As you can see pseudo-recursion can in some cases increase performance quite dramatically.

To see what is happening we can look at the query plan for the recursive query and the query with pseudo-recursion:

This query executes the sort to generate row numbers for each iteration in the recursion.

This query first uses pseudo-recursion to store the result of cte1 in a index spool. This means that the sort is perfored only once. The data is then moved to an index spool which is used to find the next row using the join condition in cte2.

The performance increase in this case is dramatic!

  • Post Points: 20
Top 50 Contributor
Posts 83
Points 1,165

Stefan_G:

Pseudo-recursion is my name for a general purpose method of forcing SQL server to insert a spool at almost any point in a execution plan.

Thanks for that last post - it is a very interesting technique - but I figure the hardest part is understanding when it will help ...

  • Post Points: 5
Top 10 Contributor
Posts 831
Points 12,705

MisterMagoo:

Stefan_G:

Good job!
Impressive that you manage to do this with only three scans of the original table.

Thanks - I must admit it was a shock to me too! I had to run it again and again before I believed it!
The sudden change from 8 seconds to 300ms came about from 1 change in the code that I kind of felt might help, but I had no idea it would help this much.
I have since added a little bit to the reads on the worktable as I noticed that chains that consisted of one row of "no change" data (e.g. 77=>77) were not being picked up , but the change to the stats was so small not to bother posting.
I might have a play with pseudo-recursion to see if that can help at all with this one...

Now that I'm down to 8 seconds, I'm looking for that 1 code change...
There was a moment where a 'hash join' hint got me there (under 300ms). Then I put in the final code and it gets rejected.
Right now it's a 'hash group' hint that made me go from 28 seconds to 8 seconds.
Guess it's time to look at that bloody execution plan.

  • Post Points: 5
Top 50 Contributor
Posts 83
Points 1,165

Congratulations Leszek - that was a very close race in the end!

  • Post Points: 20
Top 10 Contributor
Male
Posts 208
Points 1,120

MisterMagoo:

Congratulations Leszek - that was a very close race in the end!

Thanks, and yes, differences are minimal.

 

  • Post Points: 5
Top 200 Contributor
Male
Posts 11
Points 65

I just received a mail that my solution is not accepted because it timed out.

Strange, because when I ran the load test, I have following results :

 

SQL Server parse and compile time:

   CPU time = 47 ms, elapsed time = 53 ms.

 

(296 row(s) affected)

Table 'Worktable'. Scan count 59485, logical reads 754060, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'TC47'. Scan count 15, logical reads 1560, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

SQL Server Execution Times:

   CPU time = 4632 ms,  elapsed time = 2107 ms.

 

 

  • Post Points: 5
Page 3 of 3 (41 items) Prev123
| RSS
Contact US

Copyright © Rivera Informatic Private Ltd.