Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

Nakul Vachhrajani is a Technical Specialist & Systems development professional with iGATE. He holds a MCTS (SQL Server 2008: Implementation & Maintenance)
Browse by Tags · View All
#SQLServer 234
SQL Server 233
Administration 200
DBA 189
Tips 178
Development 178
T-SQL 173
#TSQL 171
Guidance 115
Tools and Utilities 112

Archive · View All
April 2011 14
March 2012 11
December 2011 11
March 2011 11
December 2012 10
October 2011 10
January 2011 10
January 2013 9
November 2012 9
October 2012 9

#0189-SQL Server-Primary Keys without clustered indexes

Aug 23 2012 12:00AM by Nakul Vachhrajani   

One of my favourite interview questions at an intermediate level has always been: “Can I have a primary key on a table that does not contribute to the clustered index?”. Of course, it’s not always the same question - there are multiple variants, such as :

  • Can a table have a primary key that does not participate in a clustered index?
  • Does designating a column(s) as a primary key automatically create a clustered index on the table based on the key column(s)?
  • Can a clustered index be created using a column(s) other than the primary key of the table?

The answer that most candidates come up with is that there is a direct dependency between the primary key and the clustered index – the primary key has to be the set of columns on which one wishes to create a clustered index.

Contrary to popular belief, the correct answer is “Yes, a table may have a clustered index  defined on a column other than the primary key of the table”. It’s just that most developers are so used to the default design and coding mechanisms that generally the thought of going against the flow may not come to the mind.

Please note that this post is not about the recommended practices for creation of clustered index and primary keys, but is intended to achieve an understanding of the fact that having a primary key does not automatically imply a clustered index in all situations.

Demo

As supporting proof, allow me to create a simple scenario. As part of the demonstration, I would be executing the following steps:

  1. Create a test table without any keys or indexes defined
  2. Define a primary key on the test table, with an explicit NONCLUSTERED keyword added to the definition
  3. Define a CLUSTERED index on the table on a column that’s not the primary key
  4. Check the details of the indexes created using sys.indexes catalog view
USE tempdb
GO

--Safety Check
IF OBJECT_ID('pkTest') IS NOT NULL
    DROP TABLE pkTest
GO

--Create test table definition
CREATE TABLE pkTest (Id INT NOT NULL IDENTITY(1,1),
                     DummyId INT NOT NULL,
                     KeyName VARCHAR(20)
                    )
GO

--Notice the use of NONCLUSTERED 
ALTER TABLE pkTest
    ADD CONSTRAINT pk_pkTest_pkTestId PRIMARY KEY NONCLUSTERED (Id)
GO

--Now create a clustered index on a column other than the Primary Key
CREATE CLUSTERED INDEX idx_pkTest_DummyId ON pkTest (DummyId)
GO

As you can see, we created the Primary Key with a keyword – NONCLUSTERED. This should create a non-clustered index for the primary key and a clustered index for the DummyId column. Let’s run the following query to check:

--Check the types of Indexes that have been created
SELECT si.object_id,
       OBJECT_NAME(si.object_id) AS ObjectName,
       si.name,
       si.index_id,
       si.type,
       si.type_desc,
       si.is_unique,
       si.is_unique_constraint,
       si.is_primary_key
FROM sys.indexes AS si
WHERE si.object_id = OBJECT_ID('pkTest')
GO

Query execution yields the following results:

object_id ObjectName name index_id type type_desc is_unique is_unique_constraint is_primary_key
853578079 pkTest idx_pkTest_DummyId 1 1 CLUSTERED 0 0 0
853578079 pkTest pk_pkTest_pkTestId 2 2 NONCLUSTERED 1 0 1

Conclusion

A clustered index can be defined on any column of a table as long as the column satisfies the criteria for a clustered index. Having a primary key does not imply that clustered index will exist on the primary key under all situations.

Reference

Until we meet next time,

Be courteous. Drive responsibly.

Tags: #SQLServer, SQL Server, Administration, DBA, T-SQL, #TSQL, Myths


Nakul Vachhrajani
4 · 33% · 10587
10
 
0
Lifesaver
 
 
0
Learned
 
0
Incorrect



Submit

32  Comments  

  • Even I ask the same question during interview.

    Primary Key and Clustered Index are two different things. Developers get confused because they usually misunderstand the following line "On a table when a Primary Key defined by default SQL Server creates a Unique Clustered Index on those column(s) where primary key is defined"

    As we know that we can't have 2 clustered index on a table, When we create a Primary key where a Clustered index is already exist in that case , SQL Server creates a Unique Non Clustered Index.

    Sometime I ask a different version of same topic question? i.e. What will happen if I wanted to create a Primary on a column on which already a clustered index exist?

    Thanks & Regards, Sandip Pani

    commented on Aug 23 2012 1:00AM
    sqlcommitted
    1059 · 0% · 24
  • Thank-you for sharing your views, Sandip. I do like the nice twist to the question. Thank-you for taking the time out and reading this post.

    commented on Aug 23 2012 1:24AM
    Nakul Vachhrajani
    4 · 33% · 10587
  • However there is a difference. When you create a table with column with Primary key defined, by default the column which is defined as Primary key will have clustered index. Now see what happens when you create this table

    CREATE TABLE pkTest (Id INT NOT NULL IDENTITY(1,1) PRIMARY KEY,
                         DummyId INT NOT NULL,
                         KeyName VARCHAR(20)
                        )
    

    Column Id is automatically associated with a clustered index. Your case is valid only when you alter the table to have primary key constraint

    commented on Aug 23 2012 6:03AM
    Madhivanan
    3 · 39% · 12441
  • @Madhivanan, You are right,What you mentioned here is the default behavior.

    Kindly refer the Conclusion section, Here It is mentioned that "Having a primary key does not imply that clustered index will exist on the primary key under all situations".

    Thanks & Regards, Sandip Pani link text

    commented on Aug 23 2012 6:21AM
    sqlcommitted
    1059 · 0% · 24
  • @Madhivanan try this:

    CREATE TABLE pkTest
       (
          [id] [int], 
          [DummyId] [int],
          [KeyName] [varchar](20), 
          CONSTRAINT [PK__pkTest__Id] PRIMARY KEY NONCLUSTERED ( [id] )
       )
    

    There is no need for an ALTER TABLE, you can create a non-clustered primary key directly in the CREATE TABLE statement.

    commented on Aug 23 2012 8:38AM
    Marc Jellinek
    97 · 2% · 546
  • Here's the real question: Why have a clustered index on a column or columns OTHER than the Primary Key?

    commented on Aug 23 2012 8:39AM
    Marc Jellinek
    97 · 2% · 546
  • A clustered index can be defined on any column of a table as long as the column satisfies the criteria for a clustered index

    What exactly are the critieria for clustered indexes?

    commented on Aug 23 2012 1:56PM
    Marc Jellinek
    97 · 2% · 546
  • I have posted similar stuff in 2006 @ http://vadivel.blogspot.in/2006/03/primary-keys-without-clustered-index.html ...

    commented on Aug 23 2012 9:32PM
    Vadivel
    474 · 0% · 79
  • Thank-you for sharing @Vadivel!

    commented on Aug 24 2012 12:38PM
    Nakul Vachhrajani
    4 · 33% · 10587
  • @Marc: Criteria for clustered indexes can be found in the references section of my post. The link is repeated here for your kind reference: http://msdn.microsoft.com/en-us/library/ms190639(v=sql.105).aspx

    commented on Aug 25 2012 9:15AM
    Nakul Vachhrajani
    4 · 33% · 10587
  • @Nakul, there are no "criteria" listed, only recommendations or situations where a clustered index should be considered.

    commented on Aug 25 2012 9:30AM
    Marc Jellinek
    97 · 2% · 546
  • @Marc

    Here's the real question: Why have a clustered index on a column or columns OTHER than the Primary Key?

    Well, for example, because primary key is multicolumn and large, so you risk inefficient INSERTs, many page splits, and having large, inefficient clustering key in every nonclustering index on the table. Acquaint yourself with the concept of surrogate keys.

    commented on Aug 27 2012 2:10AM
    chojrak11
    1009 · 0% · 25
  • @chojrak: After working with SQL Server for the past 18 years, including 5 years at Microsoft, I'm very familiar with the concepts (and worst practices) around primary keys, candidate keys and surrogate keys. I've spent a good portion of my career improving the performance of people who blindly create an [int] IDENTITY(1, 1) based prirmary key on every table

    If your primary key is clustered, the risks you list are valid. If the primary key is nonclustered, none of the risks you list are valid.

    Create your clustered index to satisfy your most common sorting requirements. Create your primary key to satisfy your most relevant uniqueness constraints. If they are the same column (or columns), use a clusterer primary key. If they are not, the most common sorting requirement with the least volitile data should be the clustered index.

    commented on Aug 27 2012 8:52AM
    Marc Jellinek
    97 · 2% · 546
  • @Marc

    thus you just answered your very question. Offtopic - if possible, I'd like to know more about problems with those "[int] IDENTITY(1, 1) based prirmary key on every table". I have to confess that, although not blindly, I use them from time to time, and until now successfully :-)

    commented on Aug 27 2012 9:49AM
    chojrak11
    1009 · 0% · 25
  • @Chojrak11: my question was rhetorical... a question designed to get the audience to think. Here are some more rhetorical questions:

    Problems with primary keys based on [int] IDENTITY(1,1)

    Lets's say that you have a customer list, which is most often sorted by name.

    Ask yourself:

    A) What does a sorting operation on [CustomerName] look like when the primary key is clustered and based on [CustomerId]? B) What does a sorting operation on [CustomerName] look like when the primary key is nonclustered and based on [CustomerID] and there is a clustered index on [CustomerName]?

    Both A and B will work. Which will work faster?

    Which will work faster when retrieving records based on a customer name (a lookup)?

    Now let's say that you have a link table that links Vendors to the Products they sell. Products can be provided by multiple Vendors and Vendors sell multiple Products.

    CREATE TABLE [Products]
       (
          [ProductId] [int] IDENTITY (1, 1), 
          [Description] [varchar](max)
       )
    
    CREATE TABLE [Vendors]
       (
          [VendorId] [int] IDENTITY (1,1), 
          [Name] [varchar](100)
       )
    
    CREATE TABLE [lnk_Vendors_Products]
       (
          [recordId] [int] IDENTITY(1,1), 
          [fkVendorId] [int], 
          [fkProductId] [int], 
          [StartDate] [date], -- date vendor starting offering product
          [EndDate] [date]   -- date vendor stopped offering product
       )
    

    Is it going to be best to create a clustered primary key on [lnk_Vendors_Products].[recordId] with a uniqueness constraint on ([fkVendorId], [fkProductId]) or just have a clustered primary key directly on ([fkVendorId], [fkProductId])?

    In order to keep key proliferation to a minimum (any children of [lnk_Vendors_Products], should reference the primary key [recordId]. But the clustered index should go on ([fkVendorId], [fkProductId]). This allows children to reference only the [recordId], but allow for fast retrieval based on [fkVendorId].

    Note: if you are searching for Products more often than Vendors, the clustered index should go on [fkProductId], [fkVendorId]; rather than the other way around.

    commented on Aug 27 2012 10:47AM
    Marc Jellinek
    97 · 2% · 546
  • @Marc Jellinek: Thank-you very much for the questions. They do get me to think hard and think more - which is always exciting. Also, thank-you very much for sharing some of the nuances around key design.

    Regarding your note:

    Note: if you are searching for Products more often than Vendors, the clustered index should go on [fkProductId], [fkVendorId]; rather than the other way around.

    I am sure you remember, but for the benefit of others I would like to refer a previous post here which would highlight one of the many reasons behind the note: SQL Server – T-SQL - Best Practice - Composite Primary Keys – Performance (and accuracy) impact

    This has been a very, very interesting discussion. Thank-you, all.

    commented on Aug 27 2012 12:26PM
    Nakul Vachhrajani
    4 · 33% · 10587
  • @Nakul, I'm not sure I understand the reference to the blog post about composite primary keys. For that matter, I don't understand what best practice it recommends (other than referencing ALL columns in a primary key when making a join)

    commented on Aug 27 2012 1:20PM
    Marc Jellinek
    97 · 2% · 546
  • @Marc: Yes, referencing ALL columns of the composite key is definitely a take-away and the other (I agree, it's not so obvious) point was that switching the order of the columns does have an impact on the way SQL Server would use the index defined on the key.

    commented on Aug 27 2012 2:03PM
    Nakul Vachhrajani
    4 · 33% · 10587
  • @Marc thank you for detailed answer. A lot of additional rhetorical questions in it :-)

    So, this is an additional thing to consider while designing a table. In case of clustered keys on most often sorted columns that aren't ever-increasing on INSERT, we gain performance on /some/ queries, but one must be sure that amount of INSERTs (and associated splits) will not kill overall performance and not cause locks on high concurrency. Some performance can be gained by properly selecting fill factor, but then we increasing # of I/Os for reads. That said such a key may possibly be better for larger tables that are mostly read and sometimes updated, but for smaller transactional tables heavily INSERTed/DELETEd I'd use small, monotonic clustering key. But as always in SQL world, it'd be best to try and measure.

    I'll do my homework and analyze this subject matter thoroughly.

    Thanks again, I'd upvote you more times, but it doesn't let me.

    commented on Aug 28 2012 1:37AM
    chojrak11
    1009 · 0% · 25
  • Returning to the orignal theme of "default behaviour" for a moment... Adding a primary key to a table which is already clustered creates a new non-clustered index by default.

    -- The NONCLUSTERED keyword is optional in the DDL statement:
    ALTER TABLE pkTest
      ADD CONSTRAINT pk_pkTest_pkTestId PRIMARY KEY /* NONCLUSTERED */ (Id)
    GO
    

    This is true even when the primary key is added to the clustering key:

    -- A redundant nonclustered index is created on the clustering key, DummyId:
    ALTER TABLE pkTest
      ADD CONSTRAINT pk_pkTest_pkTestId PRIMARY KEY /* NONCLUSTERED */ (DummyId)
    GO
    

    As you'd expect, you can't "promote" an already-existing clustering key to a primary key without dropping it first. eg:

    ALTER TABLE pkTest
      ADD CONSTRAINT pk_pkTest_pkTestId PRIMARY KEY CLUSTERED (DummyId)
    GO
    

    Fails with error:

    Msg 1902, Level 16, State 3, Line 3
    Cannot create more than one clustered index on table 'pkTest'. Drop the existing clustered index 'idx_pkTest_DummyId' before creating another.
    Msg 1750, Level 16, State 0, Line 3
    Could not create constraint. See previous errors.
    
    commented on Aug 28 2012 3:57AM
    a.diniz
    310 · 0% · 137
Previous 1 | 2 Next

Your Comment


Sign Up or Login to post a comment.

"#0189-SQL Server-Primary Keys without clustered indexes" rated 5 out of 5 by 10 readers
#0189-SQL Server-Primary Keys without clustered indexes , 5.0 out of 5 based on 10 ratings
    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]