SQL Server performance – death by bad statistics


It has happened before and it will happen again. I have said it before and I will probably say it again! We have all seen it before and we will see it again! Your procedures used to work well, execute fast, and everybody was happy, but one day one of your procedures runs forever and you think your SQL Server database has passed on into eternity. You start digging in the server to find the performance killer. You find the procedure quickly, but what is wrong?

The code is the same, the tables are the same and the indexes are the same, so why is the procedure killing the server? Maybe you take the code from the procedure, execute the statements one by one, and realize that the code runs fast and the execution plan looks fine. But why? You execute the procedure once again and it is still extremely slow. But wait a minute! If the code executes fast and the procedure is slow, do they execute in the same way? Well, the answer is NO!
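A quick way to confirm that the procedure is reusing a cached plan while your ad-hoc code compiled a fresh one is to check the use counts in the plan cache. A minimal sketch ('procedurename' is just a placeholder for your own procedure name):

SELECT cp.usecounts,     -- how many times this plan has been reused
       cp.objtype,       -- 'Proc' for the procedure, 'Adhoc' for your pasted code
       cp.cacheobjtype,
       st.[text]
FROM sys.dm_exec_cached_plans AS cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
WHERE st.[text] LIKE '%procedurename%'
ORDER BY cp.usecounts DESC;

A high usecounts on the 'Proc' row tells you the procedure never recompiled, while your ad-hoc statement shows up as a separate 'Adhoc' entry with its own, newer plan.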

By digging in the procedure cache you finally find an execution plan that doesn't look the same as the execution plan from the code you just executed. You can find the execution plan for your procedure by using the DMVs that were introduced in SQL Server 2005:

;WITH ctePS AS (
    SELECT *
    FROM sys.[dm_exec_procedure_stats] AS DEPS
    WHERE [DEPS].[object_id] = OBJECT_ID('databasename.schemaname.procedurename')
      AND [DEPS].[database_id] = DB_ID('databasename')
)
SELECT database_name = DB_NAME(ps.database_id),
       objname = OBJECT_NAME(ps.[object_id], ps.[database_id]),
       SUBSTRING(qt.text, qs.statement_start_offset / 2 + 1,
                 (CASE WHEN qs.statement_end_offset = -1
                       THEN LEN(CONVERT(nvarchar(max), qt.text)) * 2
                       ELSE qs.statement_end_offset
                  END - qs.statement_start_offset) / 2) AS query_text,
       avg_worker_time_s = qs.[total_worker_time] / qs.[execution_count] / 1000000,
       qs.*, ps.*, ph.*
FROM ctePS AS ps
INNER JOIN sys.[dm_exec_query_stats] AS qs
        ON ps.[sql_handle] = qs.[sql_handle]
CROSS APPLY sys.[dm_exec_sql_text](qs.[sql_handle]) AS qt
CROSS APPLY sys.[dm_exec_query_plan](qs.[plan_handle]) AS ph
ORDER BY ps.[last_worker_time] DESC,
         qs.[statement_start_offset] ASC;
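The last CROSS APPLY returns the plan itself: in SQL Server Management Studio you can click the query_plan XML in the result grid to open the graphical plan and inspect the row estimates the optimizer worked with.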

The procedure is using a cached plan, but your ad-hoc code got a new plan based on the actual statistics. When you look into the cached execution plan you may find an index seek or scan that is estimated to return far fewer rows than it actually returns. The index seek/scan is followed by a nested loop join to some other tables, and the estimated number of executions is far lower than the actual number. As you probably already know, a nested loop join is like a hidden cursor and is very slow when the number of executions is high; in that case a merge or hash join is much more efficient.
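You can see the mismatch for yourself by capturing the actual execution plan, which records both the estimated and the actual row count for every operator. A minimal sketch, using the placeholder names from the query above:

SET STATISTICS XML ON;   -- return the actual plan as XML with each result set
EXEC databasename.schemaname.procedurename;
SET STATISTICS XML OFF;

In Management Studio you get the same thing with Include Actual Execution Plan (Ctrl+M); compare the estimated rows with the actual rows on the operators that feed the nested loop.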

This bad execution plan is caused by bad statistics combined with parameter sniffing! When the procedure was recompiled for some reason, the statistics were not accurate, and the sniffed parameters tricked the optimizer into thinking you had far fewer rows than you really had. The optimizer gave you an inefficient execution plan and placed it in the cache, to be reused for every subsequent call to the procedure.
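If you need to get rid of the bad plan right away, you can mark the procedure for recompilation, and if the parameter values vary wildly between calls you can tell the optimizer not to trust a single sniffed value. A hedged sketch; the procedure, table and parameter names below are made up for illustration:

-- Evict the cached plan so the next call compiles fresh:
EXEC sp_recompile N'schemaname.procedurename';

-- Inside the procedure, a troublesome statement can be recompiled per call:
CREATE PROCEDURE schemaname.FindOrders   -- hypothetical example procedure
    @CustomerID int
AS
SELECT o.*
FROM dbo.Orders AS o                     -- made-up table
WHERE o.CustomerID = @CustomerID
OPTION (RECOMPILE);                      -- plan with the current values, at some CPU cost
-- Alternative on SQL Server 2008 and later: OPTION (OPTIMIZE FOR UNKNOWN),
-- which plans for an average parameter value instead of the sniffed one.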

The solution is to make sure your statistics are updated on a regular basis. But wait a minute, I have auto update statistics activated on my indexes! Well, it doesn't help you in all cases, because auto update statistics is only triggered after roughly 20% of the rows in the table (plus 500 rows) have been modified. If you have large tables, that threshold may never be reached between recompiles, so you need to trigger an update manually on a regular basis, either with ALTER INDEX indexname ON tablename REBUILD or with UPDATE STATISTICS tablename.
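A minimal maintenance sketch (dbo.Orders and IX_Orders_CustomerID are made-up names; FULLSCAN gives the most accurate statistics but reads every row):

-- Rebuilding the index also rebuilds its statistics with a full scan:
ALTER INDEX IX_Orders_CustomerID ON dbo.Orders REBUILD;

-- Updating statistics alone is cheaper than a rebuild:
UPDATE STATISTICS dbo.Orders WITH FULLSCAN;           -- scan every row
UPDATE STATISTICS dbo.Orders WITH SAMPLE 25 PERCENT;  -- or sample a fraction

-- To see how stale the statistics on a table are:
SELECT name, STATS_DATE([object_id], stats_id) AS last_updated
FROM sys.stats
WHERE [object_id] = OBJECT_ID('dbo.Orders');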

If you need help optimizing your Microsoft SQL Server database, you can call any of our SQL Server experts and we will help you. We take assignments from all of Scandinavia.