SQL Server statistics – local variables

Oopps! Upgrade your browser pretty please. Oopps! Upgrade your browser pretty please.

If you have been following my blog serie about SQL server statistics, you have seen how statistics is used by the sql optimizer to create an execution plan. When the statistics is used in the wrong way, you’ll end up with a bad execution plan.

This blog post is about another scenario when the statistics is not used correct. You can try this code in the AdventureWorks database:

SET STATISTICS IO ON;

SELECT
soh.[SalesOrderID],
[sod].[SalesOrderID],
[sod].[SalesOrderDetailID],
[sod].[CarrierTrackingNumber],
[sod].[OrderQty],
[sod].[ProductID],
[sod].[SpecialOfferID]
FROM [Sales].[SalesOrderHeader] AS soh
INNER JOIN [Sales].[SalesOrderDetail] AS sod
ON [soh].[SalesOrderID]=[sod].[SalesOrderID]
WHERE soh.OrderDate=‘2008-07-31’

DECLARE @orderDate DATE=‘2008-07-31’

SELECT
soh.[SalesOrderID],
[sod].[SalesOrderID],
[sod].[SalesOrderDetailID],
[sod].[CarrierTrackingNumber],
[sod].[OrderQty],
[sod].[ProductID],
[sod].[SpecialOfferID]
FROM [Sales].[SalesOrderHeader] AS soh
INNER JOIN [Sales].[SalesOrderDetail] AS sod
ON [soh].[SalesOrderID]=[sod].[SalesOrderID]
WHERE soh.[OrderDate]=@orderDate

SET STATISTICS IO OFF;

If you look at the execution plan for these queries, they look exactly the same, but the estimated cost for the first one is higher than the second one. BUT if you look closer, you’ll see that there is a difference in “estimated number of records”. In the first execution plan actual and estimated number of records are equal, but not in the second execution plan where estimated are less than actual.

Is this a problem? Well…. YES it is! If the table would have been big, you might have got a bad execution plan and SQL engine wouldn’t have aquired enough memory for the query.

Is the first query actually more expensive than the second one? In this case, no. The cost is based on the estimated number of records and the first one did a much better estimate than the second on.

Why is the estimated incorrect in the second query? Values of local variables are not known in compile time when the execution plan is created, only in runtime and that leaves SQL server guessing.