SELECT Id, MyDate FROM #CompareDates WHERE MyDate = GETDATE()
Although this is a very simple syntax, I will show you how it can be a performance hit.
We'll start by filling a table with some random dates(and one row of a specific date just to make sure it exists) and add a clustered index on this column.
The data we are looking for is all the rows with today's date.
DECLARE @myNow DATETIME SET @myNow = '2010-09-06 22:47' SELECT * FROM #CompareDates WHERE MyDate = @myNow
This query will work very fast, but, because the column 'MyDate' contains time and not just date - the result won't be as expected, only rows with the exact date & time will appear.
We can see in the execution plan that it uses 'clustered index seek' and results with one row which is the row we added manually with exact time.
We can try using DATEDIFF to eliminate the time in the column:
SELECT * FROM #CompareDates WHERE DATEDIFF(day,MyDate,@myNow)=0
This will get the exact result, but, using a function over the 'MyDate' column causes the sql server to scan the entire index!!
If the column does not contain time, we can eliminate the time in the '@myNow' parameter:
DECLARE @myNowNoTime DATETIME SET @myNowNoTime = dateadd(dd,datediff(dd,0,GETDATE()),0) SELECT * FROM #CompareDates WHERE MyDate = @myNowNoTime
This will also perform well as the 1st query, but again...what if 'myDate' column contains time??
The solution is to avoid doing functions (like DATEDIFF, DATEADD or any user defined function) over 'MyDate' column.
The solution is very simple - date range.
First I've created two functions that one returns the begining of the day (time will be '00:00:00') and the other the end of the day (time will be 23:59:59:997)
CREATE FUNCTION [dbo].[fnGetBeginOfDay] (@myDate DATETIME) RETURNS DATETIME BEGIN RETURN CAST(FLOOR(CAST(@myDate AS DECIMAL(12, 5))) AS DATETIME) END GO CREATE FUNCTION [dbo].[fnGetEndOfDay] (@myDate DATETIME ) RETURNS DATETIME BEGIN RETURN DATEADD(ms, -3, DATEADD(day, 1, (dbo.fnGetBeginOfDay(@myDate)))) END GO
Than the query will look like this:
DECLARE @beginOfDay DATETIME, @endOfDay DATETIME SET @beginOfDay = dbo.fnGetBeginOfDay(GETDATE()) SET @endOfDay = dbo.fnGetEndOfDay(GETDATE()) SELECT * FROM #CompareDates with (index(idx_MyDate)) WHERE MyDate BETWEEN @beginOfDay AND @endOfDay
And the execution plan looks like this, back to the wonderfull 'index seek' and returns the expected amount of rows:
Additional facts:
1. Testing the same queries with NONCLUSTERED index shows that the date range solution is as bad as DATEDIFF - both result with table scan
2. Adding to (1) a table hint (with(index(idx_MyDate))) to both queries shows that while datediff uses 'index scan', date range uses 'index seek' (which is obviously better).
Conclusions:
1. When querying using a date column which is included in an index - use 'date range' solution.
2. If the index is nonclustered - check if a table hint makes the difference.
Till next time....
Diego
Full sample test source code:
--fill table with random dates CREATE TABLE #CompareDates (Id int identity(1,1), MyDate DATETIME) DECLARE @myNow DATETIME SET @myNow = '2010-09-06 22:47' DECLARE @i int, @myDate DATETIME SET @i = 0 WHILE @i < 10000 BEGIN --random date (10x to Ben Nadel for the useful & short function :-) SET @myDate = CAST(CAST( @myNow AS INT ) -5 * RAND( CAST( CAST( NEWID() AS BINARY(8) ) AS INT ) )AS DATETIME) INSERT INTO #CompareDates (MyDate) SELECT @myDate SET @i=@i+1 IF @i=500 INSERT INTO #CompareDates (MyDate) VALUES(@myNow) END CREATE CLUSTERED INDEX idx_MyDate ON #CompareDates(MyDate) SELECT * FROM #CompareDates -----TESTING: --fast, looks for specific date including time, not quite we are looking for.. SELECT * FROM #CompareDates WHERE MyDate = @myNow --exact results, scans entire index SELECT * FROM #CompareDates WHERE DATEDIFF(day,MyDate,@myNow)=0 --if the column does not contain time, you can eliminate time in the parameter DECLARE @myNowNoTime DATETIME SET @myNowNoTime = dateadd(dd,datediff(dd,0,@myNow),0) SELECT * FROM #CompareDates WHERE MyDate = @myNowNoTime --if column contains time, use range DECLARE @beginOfDay DATETIME, @endOfDay DATETIME SET @beginOfDay = dbo.fnGetBeginOfDay(@myNow) SET @endOfDay = dbo.fnGetEndOfDay(@myNow) SELECT * FROM #CompareDates with (index(idx_MyDate)) WHERE MyDate BETWEEN @beginOfDay AND @endOfDay
One other solution is to create a computed column and an index, but that is usually an overhead...
ReplyDeletehttp://blog.sqlauthority.com/2010/08/22/sql-server-computed-columns-index-and-performance/
Gil.
Detailed explanation and elegant solution - thank you.
ReplyDeleteAlso refer this post for more examples and shortcuts
ReplyDeletehttp://beyondrelational.com/modules/2/blogs/70/posts/10899/understanding-datetime-column-part-iii.aspx
Madhivanan, thank you for your contribution
Delete