Getting Started with Adobe After Effects - Part 6: Motion Blur

Upload Image Close it
Select File

Welcome to my blog. I work as Database lead at Synaptic Digital. Hope you find some interesting stuff here.
Browse by Tags · View All
BRH 17
SQL Server 15
#SQL Server 11
#BI 10
BI 7
#SQLServer 6
SSRS 2008R2 5

Archive · View All
January 2011 6
December 2010 5
September 2012 4
May 2012 4
March 2011 4
November 2012 2
October 2012 2
January 2012 2
February 2011 2
November 2010 2

Distinct using Group By

Nov 9 2012 12:00AM by Chintak Chhapia   

Adding Distinct keyword hurt the query performance badly, but in real world we need to add Distinct may be because of not properly done normalization or may be we are working on a system which is intentionally designed that way. But, this post is not about that.

We use Distinct to get the unique rows , but we can achieve same goal by adding group by. Optimizer will create same plans for both in most of the cases . But, in case of some computation done on returned columns, the optimizer can choose a better plan for Group By compared to Distinct.

Lets have a look at example

--Create schema and dummy data
use tempdb
if object_id('dbo.addValues','fn') is not null
    drop function dbo.addValues
Create function dbo.addValues( @c1 smallint, @c2 smallint)
returns int
    return @c1 + @c2
if object_id('dbo.dummyTable','u') is not null
    drop table  dbo.dummyTable
create table dbo.dummyTable
      c1 smallint
    , c2 smallint
insert into dbo.dummyTable(c1,c2)
select top (500000) sc.column_id
from master.sys.columns sc
cross apply master.sys.columns sc1


use tempdb
select distinct c1,c2,cast(c1 as int)+c2 c3 
from dbo.dummyTable
select distinct c1,c2,dbo.addValues(c1,c2) c3 
from dbo.dummyTable
select c1,c2,cast(c1 as int)+c2 c3 
from dbo.dummyTable group by c1,c2
select c1,c2,dbo.addValues(c1,c2) c3 
from dbo.dummyTable group by c1,c2

If we run above queries, returned data will be same of each one. But, if we look at the execution plan, in case of distinct the sort is performed after the compute scalar, but in case of group by the sort is performed before compute scalar operator. This is gain in performance as in case of group by scalar function is executed less number of times.  For query with Distinct the scalar function gets executed each time for row, while in group by query scalar function gets executed only for distinct rows.  You can use profiler sp_count template or DMOs to confirm this. As a side note, this is the reason for why we should avoid scalar function if possible.

Distinct By Group By

Apart from the less number of executions, this will also impact the  memory grant of queries as the in case of query using group by, the aggregate operator  will need less memory. This can really helps in production environment where queries need to wait long for memory grant and also spill to tempdb can be avoided in some situations.

Distinct By Group By Memory Grant

Finally, I would like to thank Rob Farly's this for this post, which I have refereed.  

Hope this helps you somewhere. Also, apart from performance gain, it can be a good coding practice to use Group BY in place of Distinct as we have one more change to look into column list to see we are returning any column which is not required in result set and if somebody using select * (which we should not use ever in production code), in group by column name needs to be specified.

Tags: Distinct, Group By, SQL

Chintak Chhapia
40 · 5% · 1477



  • Thanks a lot for sharing, Chintak! Keep them coming! It's fun learning about how SQL Server works under the hood.

    commented on Nov 12 2012 5:01AM
    Nakul Vachhrajani
    4 · 36% · 11648

Your Comment

Sign Up or Login to post a comment.

"Distinct using Group By" rated 5 out of 5 by 3 readers
Distinct using Group By , 5.0 out of 5 based on 3 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]