Joins, CAST and UDFs are a bad mix
Transactional data is usually stored with some sort of time stamp. This time stamp is more often than not of type DATETIME, the data will look like this: ‘2016-02-05 15:22:15.167’
When you want to sum or group this data per day you run into problem because ‘2016-02-05 15:22:15.167’ and ‘2016-02-05 15:23:44.543’ is the same day but it’s not the same time.
A quick fix is to CAST the data to DATE, CAST(@MyTimeStamp as DATE) or you could define a function that does it. The problem with this approach is that the query optimizer can not use the indexes but has to resort to a table scan which is VERY slow.
Solution
One solution is to define another column of type DATE where the date is stored without the time part. In the example in the code below the same query time goes from 2200 ms to 33 ms on my machine. That is 67 times faster.
A solution with a cache
If the data arrives in batches, usually early in the morning, and the same query is run over and over during the day in various reports, the result of the query can be cached.
In the example below the query time goes from 33ms to 0 ms. Yep, ZERO ms.
IF OBJECT_ID( 'dbo.Dates1' , 'U' ) IS NOT NULL |
IF OBJECT_ID( 'dbo.Dates2' , 'U' ) IS NOT NULL |
CREATE TABLE Dates1(CalendarDateTime DATETIME,CalendarDate DATE ); |
CREATE TABLE Dates2(CalendarDateTime DATETIME,CalendarDate DATE ); |
INSERT INTO Dates1(CalendarDateTime) |
SELECT DATEADD( SECOND ,-10 * 365 * 24 * 60 * 60 * RAND(),GETDATE()); |
INSERT INTO Dates2(CalendarDateTime) |
SELECT DATEADD( SECOND ,-10 * 365 * 24 * 60 * 60 * RAND(),GETDATE()); |
SET CalendarDate = CalendarDateTime; |
SET CalendarDate = CalendarDateTime; |
CREATE NONCLUSTERED INDEX IX_Caldt ON dbo.Dates1(CalendarDateTime); |
CREATE NONCLUSTERED INDEX IX_Caldt ON dbo.Dates2(CalendarDateTime); |
CREATE NONCLUSTERED INDEX IX_Cald ON dbo.Dates1(CalendarDate); |
CREATE NONCLUSTERED INDEX IX_Cald ON dbo.Dates2(CalendarDate); |
IF OBJECT_ID( 'dbo.DateOnlySlow' , 'FN' ) IS NOT NULL |
DROP FUNCTION dbo.DateOnlySlow; |
CREATE FUNCTION dbo.DateOnlySlow(@CalendarDateTime DATETIME) |
RETURN @CalendarDateTime; |
SELECT TOP 10 dbo.DateOnlySlow(s1.CalendarDateTime), COUNT (*) |
JOIN Dates2 AS s2 ON dbo.DateOnlySlow |
(s2.CalendarDateTime) = dbo.DateOnlySlow(s1.CalendarDateTime) |
GROUP BY dbo.DateOnlySlow(s1.CalendarDateTime) |
ORDER BY COUNT (*) DESC ,dbo.DateOnlySlow(s1.CalendarDateTime); |
SELECT TOP 10 CAST (s1.CalendarDateTime AS DATE ), COUNT (*) |
JOIN Dates2 AS s2 ON CAST (s2.CalendarDateTime AS DATE ) = CAST (s1.CalendarDateTime AS DATE ) |
GROUP BY CAST (s1.CalendarDateTime AS DATE ) |
ORDER BY COUNT (*) DESC , CAST (s1.CalendarDateTime AS DATE ); |
SELECT TOP 10 CAST (s1.CalendarDateTime AS DATE ), COUNT (*) |
JOIN Dates2 AS s2 ON s2.CalendarDate = CAST (s1.CalendarDateTime AS DATE ) |
GROUP BY CAST (s1.CalendarDateTime AS DATE ) |
ORDER BY COUNT (*) DESC , CAST (s1.CalendarDateTime AS DATE ); |
SELECT TOP 10 s1.CalendarDate, COUNT (*) |
JOIN Dates2 AS s2 ON s2.CalendarDate = s1.CalendarDate |
ORDER BY COUNT (*) DESC ,s1.CalendarDate; |
IF OBJECT_ID('tempdb..##IncludedSalesDates ',' U') IS NOT NULL |
DROP TABLE ##IncludedSalesDates; |
SELECT s1.CalendarDate, COUNT (*) AS num |
INTO ##IncludedSalesDates |
JOIN Dates2 AS s2 ON s2.CalendarDate = s1.CalendarDate |
GROUP BY s1.CalendarDate; |
CREATE CLUSTERED INDEX IX_Cald ON dbo.##IncludedSalesDates(num DESC ,CalendarDate); |
FROM ##IncludedSalesDates |
ORDER BY num DESC ,CalendarDate; |
Läs liknande inlägg
SQL Server Query StoreBy looking at Extended Events i found some interesting Events in a category called Query…
-
-