As DBAs, why do we rebuild indexes? Fragmentation of course, both logical and
physical, causing performance degradation as the fragmentation increases (often
significant, but generally only an issue on systems where index scans exist –
singleton lookups should not be affected much, if at all). We slow the fragmentation by setting our
FILLFACTOR to an appropriate level, minimising page splits, but ultimately we
must implement a ‘fix’. Often, we set thresholds to determine how we reduce the
index fragmentation; for example we may choose to REORGANIZE the index if
fragmentation is between 5 and 30%, and REBUILD for anything over 30%. I have even been known to drop and recreate
indexes on occasion where REBUILDs have not been effective (FYI if you’re
interested in learning more about index fragmentation, there are many good
articles out there explaining the details in depth, as well as the differences
between REORGANIZE and REBUILD, so I’ll not go into any depth in this post).
In SQL Azure the architecture is very different, and Microsoft
has attempted to abstract the physical operations away from us as users. For the DBA this has made life both interesting
and frustrating. Many DMVs are hidden
from us, and the way we approach the monitoring and investigation of database
issues has had to change. With indexes
we have lost various options, the key ones are:
- FILLFACTOR (it is defaulted to 100%, and cannot
be changed)
- REORGANIZE (we have REBUILD only)
- SORT_IN_TEMPDB
- DATA_COMPRESSION
There are also some other important factors to take into
account. Firstly, each session is
limited to 1GB of transaction log space, where actions are fully logged. So, creating, altering and dropping indexes
must fit within the 1GB transaction space or you will get a severe error,
40552, meaning you have reached your transaction log limit. Secondly, we have no control over the
architecture on Azure, therefore your database could share a node with any
number of other systems. The activity
generated by those systems can have a direct effect on the success/failure of
your transactions, as throttling can occur.
Dealing With Index
Fragmentation in SQL Azure
Microsoft has made it possible to look at index
fragmentation in SQL Azure, by providing us with the Dynamic Management Function
sys.dm_db_index_physical_stats. It wasn’t always available, but has recently been
added to the list of allowable DMVs & DMFs.
This enables us to monitor the fragmentation levels of our indexes in
the same way as our on-premise systems, an example query is shown below:
SELECT
DB_NAME() AS DBName
--,CONVERT(bigint,DATABASEPROPERTYEX(DB_Name(), 'MaxSizeInBytes')) AS DBMAXSizeBytes
,OBJECT_NAME(ps.object_id) AS TableName
,i.name AS IndexName
,ips.index_type_desc
,ps.reserved_page_count*8.0*1024 AS IndexSpaceUsedBytes
,ips.avg_fragmentation_in_percent
,ips.fragment_count
,ips.avg_fragment_size_in_pages
,ips.page_count
FROM sys.dm_db_partition_stats ps
INNER JOIN sys.indexes i
ON ps.object_id = i.object_id
AND ps.index_id = i.index_id
CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), ps.object_id, ps.index_id, null, 'LIMITED') ips
ORDER BY ps.object_id, ps.index_id
Note, cross-database queries cannot be run in SQL Azure so
the database ID is generated using the DB_ID() function.
In theory then, a similar mechanism to the on-premise
solution could be constructed in SQL Azure, but looking at REBUILDs only. However, in reality we must consider the
impact of the limited 1GB transaction log.
We have one of two options; firstly, we can avoid REBUILDs for indexes
over 1GB in size thus removing the issue of the transaction log size, or
secondly we can perform an online REBUILD:
ALTER INDEX <index> ON
<table> REBUILD WITH (ONLINE = ON)
An online REBUILD breaks down the task into the smallest transactions
possible. An index of any reasonable
size will likely fail if you try and REBUILD it OFFLINE, whereas using the
ONLINE option breaks up the task into smaller transactions giving a higher
likelihood of success.
However, an online rebuild does not guarantee success! When dealing with large indexes it is still
very possible to fill the allocated transaction log space even when using the
ONLINE option. Also, as with on-premise (pre-2012),
BLOBs are not supported in online rebuilds.
So if your index includes any columns of type [n]varchar(max), image, [n]text,
xml or varbinary(max), an online rebuild just isn’t possible.
Microsoft are recommending against index REBUILDs in SQL
Azure – this was a verbal recommendation, with the reasoning that in a
multi-tenant environment there is little likelihood of seeing a performance
improvement, as things like sequential disk access and disk fragmentation
cannot be controlled. Coupled with the
potential issues around performing a REBUILD, there seems little point in going
the extra mile to set up index defrag routines – and with a lack of SQL Agent
in the cloud this is more difficult than it first seems.
So, with reduced index options, limited transaction log
space, and little guarantee of improved performance, what is the
conclusion? Ignore index rebuilds in SQL Azure.
If you have to rebuild an index, perform it ONLINE, it’s slower but has
more guarantees of successful completion.
Otherwise, accept your fragmentation levels and concentrate on fixing
other “problems”.