SQL Server Indexing strategies part 2 – clustered index

Oopps! Upgrade your browser pretty please.Oopps! Upgrade your browser pretty please.

Indexing strategy

There are some different types of indexes in SQL server:

– Clustered index
– Non clustered index

    • Covering
    • Filtered

– Indexed views
– Column Store (SQL server 2012)

The most important index is the clustered index and you should choose your clustered index carefully. If you make the wrong decision you may end up with some terrible performance and maintenance issues. What is a wrong decision and what’s a right decision? Well, there is no right answer, but you should not create the clustered index without a thought. By creating a primary key with the table designer in SQL server, you automatically create a clustered index.

Is the primary key the best clustered index? Well, it depends! It depends on the datatypes, number of columns and how it’s being used.

It’s a common practice to use a surrogate key that doesn’t have any relation to the data as the primary key, like an IDENTITY column. Is the surrogate key a good candidate? It’s only one column, ever increasing and use 4byte for integer or 8byte for bigint. I keep repeating, “It depends”

How often do you search a range on the surrogate key? Do you keep getting singleton lookups on the surrogate key because you search on other columns, like a date range?

You can look at how your indexes are used with the following code:

SELECT * FROM sys.dm_db_index_usage_stats

SELECT * FROM sys.dm_db_index_operational_stats(DB_ID(),null,null,null)

 

This will tell you how many user seeks, user scans, user lookups and user updates. If you have a lot of lookups and few seeks/scans on the clustered index and a lot seek/scans on another index that might be a more useful index.

The other candidate for clustered index might not be ever increasing, but does it have to be? Well no, but you may experience more fragmentation then with an identity column. BUT using the identity column might give you some other headaches like:

– The identity column has no “real” meaning and you probably have to create an alternate key to avoid duplicate records. That gives you two indexes just to keep database consistency.
– If you need to insert data into an identity column, you need to use “set identity insert on” and that gives you a table lock.
– Indexes on increasing columns might give you issues with statistics, which in turn can give you a plan that is terrible from a performance perspective.

The main reason to keep the index column as narrow as possible is because the index columns of the clustered index will be stored in the leaf pages of the non clustered index to create a pointer to the rest of the columns, which is stored in the leaf pages of the clustered index.  If you use many and/or wide columns in the clustered index, you will increase the size of the non clustered indexes, the number of IO operations and memory usage.

As you can see, there is a lot to think of when you design your tables, and you should test the best design for your database.

If you want to know more about indexing strategies for SQL server, you are welcome back next week or contact any of our SQL server consultants.