TSQL - Solve it YOUR Way - Finding the MAX Value from Multiple Columns
As part of the blog series TSQL - Solve it YOUR Way, today's topic will cover a question recently asked in the Transact-SQL MSDN Forums here, followed by four different solutions from three of the more helpful and creative contributors in the TSQL MSDN forums, Jingyang Li, Steven Wang, and Kent Waldrop.
Topic: I have a table with three datetime columns and want to return the most recent date for each row.
In the following image, we see an example of our three datetime columns, Date1, Date2, and Date3, followed by a "MaxDate" column containing the latest date of the three. The following solutions will show us various ways to calculate this "MaxDate" value.
This is a great question for our blog series because there are so many ways to tackle this question. In fact, the forum post included his original solution using a CASE statement, but he was looking for additional ways to solve the problem. Let's take a look at four additional ways to accomplish this.
Solution #1 and Solution #2 - Provided by Jingyang Li
create table Dates (id int identity(1,1), date1 datetime, date2 datetime, date3 datetime)
insert into Dates (date1,date2,date3)
--Solution #1 - UNPIVOT approach
SELECT a.id, b.latestdate FROM Dates a LEFT JOIN (SELECT id, Max(date123) AS latestdate
FROM (SELECT id,date1,date2, date3 FROMDates) src
FOR dates IN ([date1],
[date3])) unpvt GROUP BY id ) b on a.id=b.id
--Solution #2 - Using the new SQL Server 2012 IIF function
SELECT id,Isnull(date1, '1/1/1900') date1, Isnull(date2, '1/1/1900') date2, Isnull(date3, '1/1/1900') date3 FROM Dates) t
Explanation of Jingyang's solutions:
Solution #1: My first thought was to unpivot these columns and use max function to get the latest date for each row. Since you cannot easily take the max value of multiple columns in a table, we can use the unpivot function to accomplish this. To include all NULL dates, we can use a LEFT JOIN to generate the result set.
Solution #2: Later on I found that we can use SQL Server 2012 inline IIF function to compare all cases to get the latest date. Within the query, we use ISNULL and NULLIF to handle NULL values.
Solution #3 - Provided by Steven Wang
--Test Data Preparation
Declare @MyTable Table (ID Int Identity(1,1), Date1 Date, Date2 Date, Date3 Date);
Insert Into @MyTable
Select ID, Date1, Date2, Date3, MaxDate
from @MyTable A
( Select Max(MyDate) As MaxDate
From (Values (Date1), (Date2), (Date3)) B(MyDate)
Explanation of Steven's solution:
As we need to find the latest date among 3 columns, the method using the value comparison between 2 different columns is a little clumsy. How about if we need to do the same when we have
more than 3 columns to compare?
One logical way to perform this comparison is to UNPIVOT the 3 columns’ values into one column and then use the MAX aggregation function to get the latest date value.
As Jingyang showed, it is easy to use the UNPIVOT clause to transpose the multiple column data into one row. However, UNPIVOT normally has performance issues so we will look for a different solution.
Here, I will use a table value constructor to perform the UNPIVOT. In SQL Server 2008 or later, we can use VALUES clause in a standard way as a table value constructor to create a derived table.
The idea is to pass three base table date columns into a derived table constructed by VALUES clause by using the CROSS APPLY clause. Each column passed into the derived table creates a row. We can then perform an MAX aggregation to get the latest date for each row from the base table inside the derived table.
This technically is fun and performance-wise will be better than using the UNPIVOT clause to transpose columns and perform aggregation.
Solution #4 - Provided by Kent Waldrop
declare @test table
(test_Id int, date1 date, date2 date, date3 date);
insert into @test
select 1, null, null, null union all
select 2, null, '20120102', null union all
select 3, '20120101', '20120102', '20120103' union all
select 4, '20130322', '20120603', '20130311' union all
select 5, '20120912', '20120725', '20120606'
case when date12 >= coalesce(date3, date12) then date12 else date3
end as greatest_Date
( select case when date1 >= coalesce(date2, date1) then date1 else date2
end as date12
) as xa1;
Explanation of Kent's solution:
My solution uses a CROSS APPLY to provide re-use of a derived column "date12". Date12 is the greater of date1 and date2.
A key trick that is involved with this query is the following two comparisons:
date1 >= coalesce(date2, date1)
date12 >= coalesce(date3, date12)
These two comparisons above are logically equivalent to these two comparisons:
date1 >= date2 or date1 is not null and date2 is null
date12 >= date3 or date12 is not null and date3 is null
The greatest date is selected by fetching the greater of date1 and date2 in the cross apply and then selecting the greater of date3 and the results of the date that was selected in the cross apply.
The main ideas driving this particular form of the query are (1) to make my additional operator a "Compute Scalar" operation rather than something else and (2) aim at code that is small in size to make the code easier to understand.
• Use a CTE or a derived table instead of the APPLY operator since the APPLY operator is not ANSI standard
• Use the expanded OR / AND comparisons rather than COALESCE
As you can see, all four of the above solutions provide the result we were looking for, but do so in creatively different styles. Each of these solutions leverages different SQL Server language constructs and includes different considerations in the final solutions, including one solution that leverage new constructs from SQL Server 2012. I hope that you are able to learn a lot by trying out the problem yourself and then reading through the additional solutions.
Special thanks to Jingyang, Steven, and Kent for their valuable forums contribution and for contributing to this series!
Hope that helps,
Sam Lester (MSFT)
Steven Wang has worked with SQL server for more than 10 years. Steven is an active SQL server community participant and speaks at events such as TechEd, CodeCamp, SQL User Group etc.
Jingyang Li has been working with SQL Server since he began his IT career as an ASP.NET developer in 2001. He enjoys working with T-SQL and recently took a full time job as a DBA. He has been participating in the Microsoft forums with alias Limno.
Kent Waldrop started working with Sybase Transact SQL in 1989 as an application developer and continued working with Sybase until 1995 when SQL Server 6 was released. At that time, he became a Microsoft SQL Server database administrator and has continued to work with Microsoft SQL Server ever since. Currently he is a database architect working with Microsoft SQL Server, Oracle and UDB/DB2.