By this time everyone are aware that the next version of SQL Server i.e DENALI comes with ColumnStore Indexes which is a new concept in SQL Server. If you haven’t tried this feature yet then its about time to try it out on CTP3. I am pretty sure you will like it and this may be one of the few features that gets implemented in production quickly and get good returns for the time invested.
I am NOT going to regurgitate the basic details which are already covered and if you are NOT familiar with the concept then I would encourage to follow these articles listed at the bottom of this post.
In this post, we will create few columnstore indexes and look at some differences and some internals data. Using the example on BOL, let’s create 2 tables, a partitioned and a non-partitioned table based on dbo.FactResellerSales from AdventureWorksDWDenali database from codeplex. And add some data to highlight a specific case and create column store indexes on both the partitioned and non-partitioned table. For complete script, look at the bottom of this post.
INSERT INTO dbo.FactResellerSalesPtnd
(ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey,SalesOrderNumber
,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct,DiscountAmount,ProductStandardCost
,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber)
SELECT ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey,SalesOrderNumber
,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct,DiscountAmount,ProductStandardCost
,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber
FROM dbo.FactResellerSales
WHERE OrderDateKey IN ( '20030901', '20030801', '20031201', '20040601', '20020801')
GO 400
INSERT INTO dbo.FactResellerSalesNonPtnd
(ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey,SalesOrderNumber
,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct,DiscountAmount,ProductStandardCost
,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber)
SELECT ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey,SalesOrderNumber
,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct,DiscountAmount,ProductStandardCost
,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber
FROM dbo.FactResellerSales
WHERE OrderDateKey IN ( '20030901', '20030801', '20031201', '20040601', '20020801')
GO 400
-- Create the columnstore index
CREATE NONCLUSTERED COLUMNSTORE INDEX csindx_FactResellerSalesPtnd
ON FactResellerSalesPtnd
( ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey
,SalesOrderNumber,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct
,DiscountAmount,ProductStandardCost,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber
);
CREATE NONCLUSTERED COLUMNSTORE INDEX csindx_FactResellerSalesNonPtnd
ON FactResellerSalesNonPtnd
( ProductKey,OrderDateKey,DueDateKey,ShipDateKey,ResellerKey,EmployeeKey,PromotionKey,CurrencyKey,SalesTerritoryKey
,SalesOrderNumber,SalesOrderLineNumber,RevisionNumber,OrderQuantity,UnitPrice,ExtendedAmount,UnitPriceDiscountPct
,DiscountAmount,ProductStandardCost,TotalProductCost,SalesAmount,TaxAmt,Freight,CarrierTrackingNumber,CustomerPONumber
);
GO
Now, let’s take a look at the some DMV’s and see the behavior for the partitioned and non-partitioned tables. For the same amount of data, and the same schema with & without partitioning the number of segments as well as the row counts in the segments are different.
SELECT object_name(p.object_id) as TableName, i.name, type_desc, rows, partition_number, css.hobt_id, column_id, segment_id, row_count, min_data_id, max_data_id, on_disk_size, *
FROM sys.column_store_segments css JOIN sys.partitions p
ON css.hobt_id = p.hobt_id
JOIN sys.indexes i
ON p.object_id = i.object_id
AND p.index_id = i.index_id
WHERE p.object_id = object_id('FactResellerSalesPtnd')
SELECT object_name(p.object_id) as TableName, i.name, type_desc, rows, partition_number, css.hobt_id, column_id, segment_id, row_count, min_data_id, max_data_id, on_disk_size, *
FROM sys.column_store_segments css JOIN sys.partitions p
ON css.hobt_id = p.hobt_id
JOIN sys.indexes i
ON p.object_id = i.object_id
AND p.index_id = i.index_id
WHERE p.object_id = object_id('FactResellerSalesNonPtnd')

Look at number of segments for partitioned table

Look at number of segments for Non-partitioned table
What is a Segment?
The columnstore index consists of small chunks known as segments, which are stored as a LOB and can span multiple pages. A Segment contains about 1 million rows and will have data from only one column. i.e Column Segments are NOT shared between columns. From the pictures above, take a note that the number of segments and the row counts within the segments are different even with the same schema and with same amount of data also. Additional details on the segments will be shared in the follow-up posts. For now, make a note that this is very important in understanding how the columnstore indexes work.
Using DBCC PAGE and DBCC IND on Columnstore indexes
Yes, it is possible to use DBCC IND as well as the DBCC PAGE on the columnstore indexes. From DBCC IND, we get a bunch of LOB pages as columnstore is stored in LOB data.
DBCC IND ('AdventureWorksDWDENALI', 'FactResellerSalesPtnd', 3)
DBCC IND ('AdventureWorksDWDENALI', 'FactResellerSalesNonPtnd', 3)

DBCC IND output for ColumnStore
From the above results, use one of the pages and look at them using the below. Since the data in columnstores is compressed heavily the results from the below seems like mumbo jumbo apart from the page header.
DBCC PAGE (AdventureWorksDWDENALI, 1, 363776, 3);
GO
DBCC PAGE (AdventureWorksDWDENALI, 1, 379041, 3);
GO

DBCC PAGE output for ColumnStore
A bug in ColumnStore CTP3?
When creating the columnstore index on the non-partitioned table in this post above, a message is displayed like below. I have been told this is already identified and fixed in the next builds.
Column 24 was omitted from VertiPaq during column index build
That's it for this short first post on this series. In the follow-up posts, we will delve into Extended events and get some additional insights into the columnstores.
References
http://msdn.microsoft.com/en-us/library/gg492088(v=SQL.110).aspx
http://download.microsoft.com/download/8/C/1/8C1CE06B-DE2F-40D1-9C5C-3EE521C25CE9/Columnstore%20Indexes%20for%20Fast%20DW%20QP%20SQL%20Server%2011.pdf
http://channel9.msdn.com/Events/TechEd/NorthAmerica/2011/DBI312
http://social.technet.microsoft.com/wiki/contents/articles/sql-server-columnstore-index-faq.aspx
http://channel9.msdn.com/posts/SQLServerColumnStoreIndex
http://rusanu.com/2011/07/13/how-to-use-columnstore-indexes-in-sql-server/
Acknowledgments
Eric Hanson and Susan Price from MSFT SQL team.
Republished from Sankar Reddy [39 clicks].
Read the original version here [32134 clicks].