Generated SQL Improvements for TPT Queries (June CTP)

 


The information in this post is out of date.

Visit msdn.com/data/ef for the latest information on current and past releases of EF.


 

Last year we wrote about some performance considerations when using TPT inheritance in the Entity Framework.  We are pleased to announce that with the Microsoft Entity Framework June 2011 CTP we have released the first round of improvements, resulting in dramatic improvements in queries against TPT hierarchies.  While there is still room for improvement, we have completed some of the most difficult changes.  In many cases we are able to exclude tables from the generated query that do not contribute to the result.  Furthermore, many of the UNION ALL occurrences have been replaced with LEFT OUTER JOINs or eliminated completely, resulting in generated SQL that closely resembles what a developer might handcraft.

For example, this model was used in the original blog post:

 

 

 

Two sample queries were used in that blog post – a simple projection against properties exclusively on the base type and an even simpler query that selects all of the guests.  Let’s look at the effect of the improvements on those two queries:

 

 

var query = from g in db.Guests
select new { Id = g.GuestId, Name = g.Name };

 

Before

SELECT

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name]

FROM  [dbo].[Guests] AS [Extent1]

LEFT OUTER JOIN  (SELECT

       [Extent2].[GuestId] AS [GuestId]

       FROM [dbo].[Guests_USGuest] AS [Extent2]

UNION ALL

       SELECT

       [Extent3].[GuestId] AS [GuestId]

       FROM [dbo].[Guests_UKGuest] AS [Extent3]) AS [UnionAll1] ON [Extent1].[GuestId] = [UnionAll1].[GuestId]

After

SELECT

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name]

FROM [dbo].[Guests] AS [Extent1]

 

In this query we removed an unnecessary left outer join and a union all, resulting in a query that is 33% of the original size and much easier to read.

 

var query = db.Guests;

Before

SELECT

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN '0X' WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1],

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name],

[Extent1].[Address] AS [Address],

[Extent1].[City] AS [City],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[State] END AS [C2],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[Zip] END AS [C3],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN [UnionAll1].[PhoneNumber] END AS [C4],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C1] END AS [C5],

CASE WHEN (( NOT (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL))) AND ( NOT (([UnionAll1].[C4] = 1) AND ([UnionAll1].[C4] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([UnionAll1].[C3] = 1) AND ([UnionAll1].[C3] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [UnionAll1].[C2] END AS [C6]

FROM  [dbo].[Guests] AS [Extent1]

LEFT OUTER JOIN  (SELECT

      [Extent2].[GuestId] AS [GuestId],

      [Extent2].[State] AS [State],

      [Extent2].[Zip] AS [Zip],

      [Extent2].[PhoneNumber] AS [PhoneNumber],

      CAST(NULL AS varchar(1)) AS [C1],

      CAST(NULL AS varchar(1)) AS [C2],

      cast(1 as bit) AS [C3],

      cast(0 as bit) AS [C4]

      FROM [dbo].[Guests_USGuest] AS [Extent2]

UNION ALL

      SELECT

      [Extent3].[GuestId] AS [GuestId],

      CAST(NULL AS varchar(1)) AS [C1],

      CAST(NULL AS varchar(1)) AS [C2],

      CAST(NULL AS varchar(1)) AS [C3],

      [Extent3].[PostalCode] AS [PostalCode],

      [Extent3].[PhoneNumber] AS [PhoneNumber],

      cast(0 as bit) AS [C4],

      cast(1 as bit) AS [C5]

      FROM [dbo].[Guests_UKGuest] AS [Extent3]) AS [UnionAll1] ON [Extent1].[GuestId] = [UnionAll1].[GuestId]

After

SELECT

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN '0X' WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN '0X0X' ELSE '0X1X' END AS [C1],

[Extent1].[GuestId] AS [GuestId],

[Extent1].[Name] AS [Name],

[Extent1].[Address] AS [Address],

[Extent1].[City] AS [City],

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN [Project2].[PostalCode] END AS [C2],

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN [Project2].[PhoneNumber] END AS [C3],

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project1].[State] END AS [C4],

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project1].[Zip] END AS [C5],

CASE WHEN (( NOT (([Project1].[C1] = 1) AND ([Project1].[C1] IS NOT NULL))) AND ( NOT (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)))) THEN CAST(NULL AS varchar(1)) WHEN (([Project2].[C1] = 1) AND ([Project2].[C1] IS NOT NULL)) THEN CAST(NULL AS varchar(1)) ELSE [Project1].[PhoneNumber] END AS [C6]

FROM   [dbo].[Guests] AS [Extent1]

LEFT OUTER JOIN  (SELECT

      [Extent2].[State] AS [State],

      [Extent2].[Zip] AS [Zip],

      [Extent2].[PhoneNumber] AS [PhoneNumber],

      [Extent2].[GuestId] AS [GuestId],

      cast(1 as bit) AS [C1]

      FROM [dbo].[Guests_USGuest] AS [Extent2] ) AS [Project1] ON [Extent1].[GuestId] = [Project1].[GuestId]

LEFT OUTER JOIN  (SELECT

      [Extent3].[PostalCode] AS [PostalCode],

      [Extent3].[PhoneNumber] AS [PhoneNumber],

      [Extent3].[GuestId] AS [GuestId],

      cast(1 as bit) AS [C1]

      FROM [dbo].[Guests_UKGuest] AS [Extent3] ) AS [Project2] ON [Extent1].[GuestId] = [Project2].[GuestId]

 

At first glance this query may not show much improvement, however, the UNION ALL has been replaced by a LEFT OUTER JOIN.  Using a LEFT OUTER JOIN rather than a UNION ALL allows tables that don’t contribute to the result to be removed from the query entirely (as in the first example).

 

Query Improvements Summary

In this round of improvements we focused on restricting generated SQL to tables that actually contribute to the query and replacing UNION ALL with LEFT OUTER JOIN.  As mentioned earlier these were some of the most difficult challenges.

 

Conclusion

We are excited to be able to deliver these improvements as part of the Microsoft Entity Framework June 2011 CTP and we look forward to continued efforts in improving generated SQL.  As always we encourage and appreciate your feedback on these features.  What can we do better?  Are there other notable areas where SQL optimization could significantly improve the Entity Framework?  Please leave your thoughts and comments below.

 

ADO.NET Entity Framework Team