Some time ago, I was asked by a client if they could use indexed views in a staging environment and what are the things that they should consider before doing so? Would that affect data loads? It was an interesting question and it took me a few seconds to respond as I had never created indexed views on staging databases in a data warehousing environment. Also, another question that was asked by them, will the index on the view get updated when the data in the underlying table gets updated? That was easy to answer as we all know indexes right? And an Indexed View is no different in how it works. The answer is YES, the index will get updated. Here is the proof of that
Look at the Actual Execution Plan. There is only one table insert. So far so good.
Create a View on Test_Table1, then create a unique clustered index on the newly created view and then insert two rows as shown below.
The execution plan will show an insert in the table (Test_Table1) and another insert in the index (idx_View1_Test_Table1). This holds true for update and delete
There is a whole list of requirements a view must meet before we go about creating indexes on the views. You can read all about it here – http://msdn.microsoft.com/en-us/library/ms191432.aspx
So here is what I have answered so far
- Can Indexed views be created on staging databases? Answer – Ofcourse, if there is a need for it. There are no rules againt indexed views in staging databases.
- Do indexes on the views get updated when the data is updated? Answer – Yes
Lets look at the original question – Would having indexed views affect data refresh?
Answer – It depends on how the data load process (OR ETL) is written. The load process would definitely slow down íf we are talking about millions of rows in each load. The impact will be inevitable. I don’t have any numbers at the moment to prove this but based on my experience, a few hundred or a few thousand rows would not be that big a problem. What’s interesting to know here is that if you are doing parallel inserts (by the same session or different sessions) to a table with an indexed view with an index on it you will experience locking issues.
Let’s use the test table, test view and the index we created previously. Do the following insert in the same SSMS tab as before
Open another SSMS tab and perform the following update statement
You will notice that this insert statement does not complete because it is blocked by the previous insert (which is not committed yet) which is updating the corresponding entry in the indexed view and the insert acquires a lock on it.
The point here is not to highlight that we should refrain from creating indexed views, the point is that we should create them when we really need them so they don’t become COUNTERPRODUCTIVE.
To avoid locking as shown above, a simple alternative is to either drop the indexes before the load process and re-create them after the load process (assuming we are talking reasonable size tables) OR disable the indexes before the load process and enable them after the load process (this would be extremely helpful for large indexes). Please remember enabling an index here means an Index Rebuild. If you are not clear about how enabling/disabling indexes work, here is a great post which explains this
Also, another important point here to remember is that disabling a clustered index on a TABLE (pay attention here TABLE) makes the table inaccessible as the clustered index is a table itself whereas disabling a clustered index on a view DOES NOT make the view inaccessible because a clustered index on a view is the view’s data materialized – stored on disk. When the clustered index on a view is diabled the view still returns data by executing the query defning the view.
To understand it better, take a look at the difference in the execution plans with and without the index on the view.
Select * from [V1_Test_Table1]
If the index on the is active or enabled, the optimizer directly accesses the data from the index.
If the index on the view is disabled, the optimizer takes a longer route in fetching the data, accessing the primary key index, sorts the data and then aggregates it as the view has an aggregate calculation.
This excellent post by Tony Rogerson should give you a great insight on using views carefully
VIEWS – they offer no optimisation benefits; they are simply inline macros – use sparingly
Hope this helps!
Republished from Simran Jindal's Blog [4 clicks].
Read the original version here [1 clicks].