Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

Browse by Tags · View All
sql_server 217
t-sql 211
tsql 116
sqlserver 96
BRH 78
#SQLServer 66
#TSQL 56
SQL Server 34
function 11
SSMS 9

Archive · View All
August 2007 17
August 2010 8
June 2012 7
June 2011 7
November 2007 7
August 2012 6
May 2012 6
November 2011 6
August 2011 6
October 2011 6

Madhivanan's TSQL Blog

Efficient filtering with HAVING Clause

Jul 15 2013 12:00AM by Madhivanan   

HAVING Clause is used to filter the grouped set of result. It is always used along with GROUP BY Clause. Mostly people use it to find count, minimum or maximum value, sum etc. It can also be used effectively apply many business logic.


Let use consider the following tables
create table #products(product_id char(6) primary key, product_name varchar(75))
insert into #products (product_id,product_name)
select 'PR0001','Television' union all
select 'PR0002','Mobile' union all
select 'PR0003','Air conditioner' 

create table #customers(cust_id char(6) primary key, cust_name varchar(75))
insert into #customers  (cust_id,cust_name)
select 'CU0001','Kumar' union all
select 'CU0002','Sankar' union all
select 'CU0003','Madhivanan' union all
select 'CU0004','Saravanan' 

create table #sales_details
(
	cust_id char(6), product_id char(6),sales_date datetime default getdate(),sales_amount decimal(12,2)
)

insert into #sales_details (cust_id,product_id,sales_amount)
select 'CU0001','PR0001',12000 union all
select 'CU0002','PR0001',12000 union all
select 'CU0002','PR0002',8000 union all
select 'CU0002','PR0003', 25000 union all
select 'CU0003','PR0001', 12000 union all
select 'CU0003','PR0002', 8000 union all
select 'CU0004','PR0002', 8000

Now we will explore how to use HAVING Clause for some specific cases 

1 Find the customers who purchased only one product
select 
	cust_id  
from 
	#sales_details 
group by 
	cust_id 
having 
	min(product_id)=max(product_id)

or
select 
	cust_id  
from 
	#sales_details 
group by 
	cust_id 
having 
	count(distinct product_id)=1
Result is
cust_id
-------
CU0001
CU0004
2 Find the customers who purchased all the products
select 
	cust_id  
from 
	#sales_details 
group by 
	cust_id 
having count(distinct product_id)=(select count(*) from #products) 

Result is
cust_id
-------
CU0002
3 Find the customers who purchased both Television and Mobile and no other products
select 
	sales.cust_id 	
from 
	#sales_details as sales 
	inner join #products as prod on sales.product_id=prod.product_id 
group by 
	sales.cust_id 
having 
	max(case when product_name='Television' then 1 else 0 end)=1 and
	max(case when product_name='Mobile' then 1 else 0 end)=1 and 
	count(distinct product_name)=2
Result is
cust_id
-------
CU0003
4 Find the customers who purchased any product and never purchased Air conditioner
select 
	sales.cust_id  
from 
	#sales_details as sales inner join #products as prod on sales.product_id=prod.product_id 
group by 
	sales.cust_id 
having 
	max(case when product_name='Air conditioner' then 1 else 0 end)=0
Result is
cust_id
-------
CU0001
CU0003
CU0004

Tags: 


Madhivanan
3 · 40% · 12912
4
 
0
Lifesaver
 
 
0
Learned
 
0
Incorrect



Submit

9  Comments  

  • I like a lot of what you've shown, though I have a small quibble with queries 3 and 4. In those, the queries cannot make use of an index on the product_id to gain a seek, but would require a full scan. This would almost certainly be a performance problem. Here's revised query #3 that can possibly get a seek into the sales_details table (depending on indexes):

    select 
        sales.cust_id 	
    from 
        #sales_details as sales 
        inner join #products as prod on sales.product_id=prod.product_id 
    where
       product_name in ('Television', 'Mobile')
    group by 
        sales.cust_id 
    having 
        count(distinct product_name)=2
    ;
    

    Also, when a query is looking for only one occurrence of something, there is no need to count all the occurrences (nor to scan through them) to find this out. Looking at query #4, again there may be a performance improvement by rewriting like so:

    select 
        cust.cust_id 	
    from 
        #customers as cust
    where
       exists (
          select *
          from #sales_details as sales
          where cust.cust_id = sales.cust_id
          and sales.product_id <> (select product_id from #products where cust_name = 'Air conditioner')
       )
    ;
    

    This query could probably use some improvement, but if there are a huge number of sales for many customers, ultimately, with the correct indexes, it should offer a gigantic improvement over any method that requires a full scan in every case.

    commented on Jul 15 2013 1:05PM
    ErikEckhardt
    65 · 3% · 898
  • Hi ErikEckhardt , Thanks for your reply. Your alternate code for method 3 may not work for all set of data. For example, if the customer purchased both Television and Air conditioner and not a mobile, that data will be excluded but your code will still include that customer.

    commented on Jul 16 2013 12:37AM
    Madhivanan
    3 · 40% · 12912
  • Ah, thanks for pointing that out--I missed part of the requirements. I will re-examine as soon as I can.

    commented on Jul 16 2013 3:30AM
    ErikEckhardt
    65 · 3% · 898
  • Thanks. Also can you check your code for method 4? It throws an error. I think "#sales_details as sales" after exists clause is not needed

    commented on Jul 16 2013 4:22AM
    Madhivanan
    3 · 40% · 12912
  • Yes, you made another good catch--it was late and I was in a hurry, I guess. I've updated the query in my post to remove the extraneous snippet.

    About query #3, I did miss the extra requirement "and no other products". So here's a revised query for #3 that still should not scan the entire table.

    select 
       sales.cust_id    
    from 
       #sales_details as sales
       inner join #products as prod on sales.product_id=prod.product_id
    where
       prod.product_name in ('Television', 'Mobile')
       and not exists (
          select *
          from
             #sales_details as sales2
             inner join #products as prod2 on sales2.product_id = prod2.product_id
          where
             sales.cust_id = sales2.cust_id
             and prod2.product_name not in ('Television', 'Mobile')
       )
    group by 
       sales.cust_id 
    having 
       count(distinct product_id) = 2
    ;
    

    I realize this is a much more complicated query in terms of number of clauses and so on, but in a database where there are a huge number of sales records, it should outperform any solution that requires a full scan. Proper indexes are required, of course.

    There are other possibilities--I would especially want to explore queries that could use (SELECT TOP 1) to limit the number of rows searched for for each customer.

    Now, I could be completely wrong, and there could be way more customers than there are sales, so that in fact scanning the sales table would be best. So I'm quite ready to be wrong on this in a particular system. It's just worth it to me to explore other possibilities, too. :)

    commented on Jul 16 2013 11:32AM
    ErikEckhardt
    65 · 3% · 898
  • Hi ErikEckhardt, Thanks for your points on table scan. I think you should use the NOT EXISTS to get the correct result. Here is your modified code

    select 
       sales.cust_id    
    from 
       #sales_details as sales
       inner join #products as prod on sales.product_id=prod.product_id
    where
       prod.product_name in ('Television', 'Mobile')
       and not exists (
          select *
          from
             #sales_details as sales2
             inner join #products as prod2 on sales2.product_id = prod2.product_id
          where
             sales.cust_id = sales2.cust_id
             and prod2.product_name not in ('Television', 'Mobile')
       )
    group by 
       sales.cust_id 
    having 
       count(distinct prod.product_id) = 2
    

    Also, is your method 4 a latest code? It returns all customers for the above data. I think you may need to change your code

    commented on Jul 17 2013 1:25AM
    Madhivanan
    3 · 40% · 12912
  • I'm so glad you're catching my mistakes, because I'm not doing it, obviously! You are 100% correct: I meant "not exists". I should actually be running my queries against your supplied data before posting them, shouldn't I? :)

    About my query number 4, I now see that I am answering a different question than you, but it comes down to a matter of interpretation. The requirement was "Find the customers who purchased any product except Air conditioner". All of the customers qualify because even the customer who purchased an air conditioner also purchased a product that is not an air conditioner.

    Using the new interpretation of the requirements (which I would state more like "Find the customers have made any purchase but have never purchased an air conditioner" would be a different query:

    select 
       cust.cust_id    
    from 
       #customers as cust
       cross apply (
          select product_id from #products where product_name = 'Air conditioner'
       ) x
    where
       exists (
          select *
          from #sales_details as sales
          where
             cust.cust_id = sales.cust_id
             and sales.product_id <> x.product_id
       ) and not exists (
          select *
          from #sales_details as sales
          where
             cust.cust_id = sales.cust_id
             and sales.product_id = x.product_id      
       )
    ;
    

    This time I ran the query to see that it produces the correct results.

    After thinking about this a bit I would like to re-emphasize that the queries I presented are not necessarily better performers. It depends on the nature of the data and the indexes involved. If there are few customers with many purchases, my queries may do well. If there are a ton of customers and they have few purchases, the queries may not do so well. Testing is in order!

    commented on Jul 17 2013 12:15PM
    ErikEckhardt
    65 · 3% · 898
  • Thanks. I actually meant what you said "Find the customers who have made any purchase but have never purchased an air conditioner". If I want to use alternate method, I would use the following

    select 
        cust.cust_id 
    from
        #customers as cust
    left join
    (
        select 
           sales.cust_id    
        from 
           #sales_details as sales
           inner join #products as prod on sales.product_id=prod.product_id
        where
           prod.product_name = 'Air conditioner'
    ) as sales on cust.cust_id=sales.cust_id
    where sales.cust_id is null
    

    I would like to see your testing results

    commented on Jul 17 2013 12:57PM
    Madhivanan
    3 · 40% · 12912
  • That is a perfectly reasonable query, too. Testing is in order, as usual.

    commented on Jul 17 2013 1:44PM
    ErikEckhardt
    65 · 3% · 898

Your Comment


Sign Up or Login to post a comment.

"Efficient filtering with HAVING Clause" rated 5 out of 5 by 4 readers
Efficient filtering with HAVING Clause , 5.0 out of 5 based on 4 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]