In the last couple of weeks I’ve been helping one of our clients with some performance issues in a stored procedure. Some days it performs well and some days it doesn’t. After some investigation I found two different execution plans, and the inefficient plan estimated that only one record was returned from some of the tables, when in reality they return a couple of hundred thousand records.
This is a common problem when the statistics are out of date, and you can read more about it in my earlier blog post. But what about auto update stats, doesn’t it work? Well, it does: the statistics are updated when more than 20% of the data has changed. In this case, however, we load a data warehouse with new data every day, and the procedure that performs badly only operates on the new data. Some of the tables contain more than 300,000,000 records, and 20% of that is 60,000,000 records, which means the statistics are not updated every day.
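If you want to see how close a table is to that threshold, a query along these lines can help. This is a sketch, assuming SQL Server 2008 R2 SP2 / 2012 SP1 or later, where sys.dm_db_stats_properties is available; dbo.statsSample is just a placeholder table name:

```sql
-- Sketch: compare row modifications against the classic 500 rows + 20% threshold.
-- Assumes sys.dm_db_stats_properties is available (2008 R2 SP2 / 2012 SP1 or later).
SELECT  OBJECT_NAME(s.object_id)  AS table_name,
        s.name                    AS stats_name,
        sp.last_updated,
        sp.rows,
        sp.modification_counter,
        CASE WHEN sp.modification_counter > 500 + 0.20 * sp.rows
             THEN 'stale - will auto update on next use'
             ELSE 'still considered fresh'
        END                       AS stats_state
FROM    sys.stats AS s
CROSS APPLY sys.dm_db_stats_properties(s.object_id, s.stats_id) AS sp
WHERE   s.object_id = OBJECT_ID('dbo.statsSample');
```

On a 300,000,000-row table the modification counter has to pass roughly 60,000,000 before the optimizer considers the statistics stale, which is exactly the problem described above.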
The problem is that the procedure recompiles every time and uses out-of-date statistics to create its execution plan. If you want to learn more about auto update stats, you can try the script below.
CREATE FUNCTION [dbo].[fnNumsTable]
(
    @pStartValue BIGINT = 1,
    @pEndValue   BIGINT = 1000000,
    @pIncrement  BIGINT = 1
)
RETURNS TABLE
AS
-- +--------------------------------------------------------------------
-- ! Objective: Return a list of all numbers from start value to end value.
-- !            The table may be used to avoid cursors, generate a list of dates etc.
-- +--------------------------------------------------------------------
-- Select n from dbo.[fnNumsTable](1000,2000,1)
RETURN
(
    WITH BaseNum(N) AS
    (
        SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
        SELECT 1 UNION ALL SELECT 1
    ),
    L1(N) AS ( SELECT bn1.N FROM BaseNum bn1 CROSS JOIN BaseNum bn2 ),
    L2(N) AS ( SELECT a1.N FROM L1 a1 CROSS JOIN L1 a2 ),
    L3(N) AS
    (
        SELECT TOP ((ABS(CASE WHEN @pStartValue < @pEndValue THEN @pEndValue ELSE @pStartValue END
                       - CASE WHEN @pStartValue < @pEndValue THEN @pStartValue ELSE @pEndValue END))
                    / ABS(@pIncrement) + 1) a1.N
        FROM L2 a1 CROSS JOIN L2 a2
    ),
    Tally(N) AS ( SELECT ROW_NUMBER() OVER (ORDER BY a1.N) FROM L3 a1 )
    SELECT ((N - 1) * @pIncrement) + @pStartValue AS N
    FROM Tally
);
GO

CREATE TABLE dbo.statsSample
(
    ID      INT IDENTITY(1,1) NOT NULL,
    repDate DATE NOT NULL
);
GO

CREATE CLUSTERED INDEX ix ON dbo.statsSample (ID)
    WITH (FILLFACTOR = 100, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON);
GO

CREATE PROCEDURE dbo.spStatsTest
    @records   INT,
    @count     INT OUTPUT,
    @statsdate DATETIME OUTPUT
--WITH RECOMPILE
AS
SET NOCOUNT ON;

INSERT INTO dbo.statsSample (repDate)
SELECT DATEADD(D, N, GETDATE())
FROM dbo.fnNumsTable(1, @records, -1);

SELECT @count = COUNT(1)
FROM dbo.statsSample
WHERE ID > 0;

SELECT @statsdate = STATS_DATE(i.object_id, i.index_id)
FROM sys.indexes i
WHERE i.object_id = OBJECT_ID('dbo.statsSample');
GO

DECLARE @x INT = 0;
DECLARE @initialrecords INT = 10000;
DECLARE @increment INT = 2000;
DECLARE @STATSDATE DATETIME;
DECLARE @COUNT INT;
DECLARE @PREVSTATSDATE DATETIME;

SET NOCOUNT ON;

EXEC dbo.spStatsTest @initialrecords, @COUNT OUT, @STATSDATE OUT;
SELECT @PREVSTATSDATE = ISNULL(@STATSDATE, GETDATE());

WHILE @x < @initialrecords
BEGIN
    EXEC dbo.spStatsTest @increment, @COUNT OUT, @STATSDATE OUT;

    IF @PREVSTATSDATE <> @STATSDATE
    BEGIN
        -- Return the record count and the time of the stats update
        SELECT @COUNT, @STATSDATE;
        SELECT @PREVSTATSDATE = @STATSDATE;
    END

    SET @x += @increment / 10;
END

DROP TABLE dbo.statsSample;
DROP PROCEDURE dbo.spStatsTest;
As you can see in this test, the statistics are not updated inside the procedure as long as the procedure doesn’t recompile. If you add WITH RECOMPILE to the procedure, the statistics are auto updated when you select from the table and more than 20% of the data has been updated or inserted. The more records you add, the less frequent the statistics updates become. The worst part of this test is the bad execution plan: take a look at the execution plan when you don’t recompile the procedure, and you’ll notice that SQL Server reports an estimated number of records of 1 when, in the worst case, there are 100,000 records. Should you depend on AUTO UPDATE STATS? Well, come back next week, and I’ll post a new blog post about auto update stats.
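Until then, one workaround is to refresh the statistics yourself as part of the nightly load instead of waiting for the 20% threshold. A minimal sketch, reusing the dbo.statsSample table and dbo.spStatsTest procedure from the test script above as stand-ins for your own warehouse objects:

```sql
-- Sketch of a post-load maintenance step, assuming dbo.statsSample stands in
-- for the table loaded each night. Refreshing statistics explicitly means the
-- procedure no longer depends on the 20% auto-update threshold being reached.
UPDATE STATISTICS dbo.statsSample WITH FULLSCAN;

-- Alternatively, mark the procedure for recompilation so its next execution
-- builds a fresh plan against the updated statistics.
EXEC sp_recompile N'dbo.spStatsTest';
```

WITH FULLSCAN gives the most accurate histogram at the cost of reading the whole table; a sampled update may be enough if the load job window is tight.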