A SQL Azure tip a day (7) – Rebuild index

Oopps! Upgrade your browser pretty please. Oopps! Upgrade your browser pretty please.

As I mentioned in an earlier blog post, In SQL Azure you can’t set the Fill factor of the indexes and you get the server default for every index in all your databases. Let’s take a look by executing the following statement: SELECT * FROM sys.indexes

As you can see, the fill factor is 0, and that is the same as fill factor 100! But wait a minute, that means that the indexes are filled to 100% and the inserts will cause page splits. Well if the IO operations of the page splits doesn’t cause any performance issues, is this a problem? Yes of course! The page splits causes fragmentations of the indexes and that will cause performance issues. Well this can’t be an issue! Rebuild the index and the fragmentation is gone.

Of course, but there are some other issues with this fragmentation:

– You can’t see how much fragmenati0n you’ve got, because you don’t have the DMV:s to do that
– You can’t schedule the index maintenance in SQL Agent, because you don’t have one.
– The fragmentation consumes the database space and when you run out of space you can’t add more data
– You can’t rebuild in tempdb
– Every table must have  clustered index and you can’t drop that index

But wait a minute! If you run out of space and you can’t rebuild in your database because you ran out of space, can’t rebuild in tempdb and can’t drop the clustered index, how do you solve that? Well you have to increase the database size.  BUT if I already have use the largest database in SQL Azure, what then! Sorry, but then you have to make some space in your database, that is, delete something! What?!! Is this true? Well try for your self. Create a table with the clustered index on a guid column, to create the worst fragmentation of all, and populate the table with the maximum possible number of records. If you use fixed column sizes you should be able to calculate how many records that should fit in your table (if its the only table). Assume you have a 1Gb database, then you should be able to use the formula ((databasesize_in_gb*1024*1024)/8)*records_per_page, where records_per_page = 8kb/recordsize rounded down to the nearest integer. You probably can’t add that number of records because of the fragmentation. Why? You’ll get page splits in the insert, meaning that a new 8kb page is created and half of the records are moved to the new page, but because of the random insert of GUID:s, you have no guarantee that the two pages will be filled to 100%. And then you get some half filled pages, but you can’t create more pages than databasesize_in_gb*1024*1024)/8) before you run out of space.

Well, you now have successfully filled your db to the limit so that your db becames “read only”, and you want to reclaim your space from the half filled pages by doing a rebuild index. Well, thats not possible, you don’t have space for a rebuild index, can’t sort in tempdb and you have to increase db size or delete data. As this was only a sample you may decide to delete the records, but the db will not be availible until all the pages are deallocated. How do you prevent this? Select the clustered index wisely to prevent fragmentation of the clustered index, use as small data types as possible, keep track of the space required for your tables, rebuild indexes before its too late. If you can use filtered index, an excellent feature to create small specialized indexes.