Using LINQ to SharePoint

The LINQ to SharePoint provider is a new feature in SharePoint 2010 that allows you to use a strongly-typed entity model and the language integrated query (LINQ) query syntax to query list data. Essentially, LINQ to SharePoint hides the complexity of developing CAML queries from developers, which can reduce development time and make code more readable. The LINQ to SharePoint provider converts the LINQ expressions into CAML queries at run time.

Using LINQ to SharePoint in your own solutions consists of three main steps:

  • Generate the entity classes. Before you can start writing LINQ queries against your SharePoint lists, you must create or generate the strongly-typed entity classes that represent your list data and lookup column relationships.
  • Develop the solution. After you add the entity classes to your Visual Studio 2010 project, you can write LINQ queries against the strongly-typed entities that represent your data model.
  • Run the solution. At run time, the LINQ to SharePoint provider dynamically converts your LINQ expressions into CAML queries, executes the CAML, and then maps the returned items to your strongly-typed data entities.

The LINQ to SharePoint Process

Although you can manually develop your entity classes, in most cases, you will want to use the SPMetal command line tool. This is included in SharePoint Foundation 2010 and can be found in the BIN folder in the SharePoint root. The SPMetal tool targets an individual SharePoint site and, by default, generates the following code resources:

  • A data context class that derives from DataContext. This is the top-level entity class. It represents the content of your site and provides methods that allow you to retrieve list entities. The data context class uses the EntityList<TEntity> class to represent the lists in your site, where TEntity is a class that represents a content type.
  • Classes that represent content types. These are marked with the ContentTypeAttribute. Content type classes are generated for implicit content types as well as content types that are explicitly defined on the site. For example, if a user adds a column to an existing list, the user is creating an implicit content type and a representative class will be generated.
  • Classes and properties that represent relationships between lists. SPMetal can detect relationships based on lookup columns. Within the entity class that represents a content type, SPMetal uses the EntityRef<TEntity> class to represent the singleton side of a one-to-many relationship and the EntitySet<TEntity> class to represent the "many" side of one-to-many or many-to-many relationships (known as a reverse lookup). Properties that are mapped to a field in a related list are decorated with the AssociationAttribute.

Note

You can configure the SPMetal tool to generate entity classes for specific lists, instead of for all the content in your site, by creating a parameters file. For more information, see Overriding SPMetal Defaults with a Parameters XML File on MSDN.

After the entity classes are generated, you can write LINQ queries against strongly-typed entities instead of creating CAML queries. Under the covers, the LINQ to SharePoint provider converts your LINQ queries into CAML at run time and executes the CAML against your SharePoint lists. For more information about the entity classes of the LINQ to SharePoint provider, see Entity Classes on MSDN.

The following code example, adapted from the sandbox reference implementation, illustrates some of the key aspects of using LINQ with entity classes.

using (ManufacturingSiteDataContext context = new 
         ManufacturingSiteDataContext(SPContext.Current.Web.Url))
{
  string sponsor = "David Pelton";

  var results = from projectItem in context.PriorityProjects
                  where projectItem.ExecutiveSponsor == sponsor
                  select projectItem;

  foreach (var proj in results)
  {
      output.AppendFormat("Title: {0}  Sponsor: {1}  Leader: {2} \n", 
      proj.Title, proj.ExecutiveSponsor, proj.Project.Leader);
  }
}

All the entity classes in this example were generated by the SPMetal tool. The example illustrates the following key points:

  • The query uses a data context class. The ManufacturingSiteDataContext class inherits from the DataContext class and includes strongly-typed properties for each list on the manufacturing site, such as the PriorityProjects list.
  • The content type class that represents the entities within the list includes strongly-typed properties for each column value, such as Title and ExecutiveSponsor.
  • The entity classes understand the relationships defined by lookup columns—the Project.Leader property retrieves a Leader column value from a related Project entity.
  • You should always dispose of the data context instance after use. The DataContext base class implements the IDisposable interface, and thereby ensures that the data context instance is released when execution passes beyond the scope of the using statement.

For more information about using LINQ to SharePoint, see Managing Data with LINQ to SharePoint on MSDN.

Note

You can extend the entity classes produced by the SPMetal command-line tool in order to expose additional functionality to the LINQ to SharePoint provider, for example, to handle custom field data types. This guidance does not explore this area. For more information, see Extending the Object-Relational Mapping on MSDN.

Execution of LINQ to SharePoint Queries

LINQ to SharePoint uses deferred loading—commonly known as lazy loading—of result sets to improve query efficiency. If you create a query that returns a collection of entities, the query won't actually execute until you commence an action that uses the result set—such as iterating over the results or converting the result set to an array. In the preceding code example, the LINQ query is only converted to CAML and executed when the foreach statement starts enumerating the result set.

LINQ to SharePoint also uses the deferred loading approach for related entities. Any related entities are only loaded when the entity is actually accessed, in order to reduce unnecessary calls to the content database. In the preceding code example, the Project entity is only loaded when the foreach statement reads the Project.Leader property.

When a query is executed in the context of an HTTP request, LINQ to SharePoint uses the SPContext.Current property to load the data context. This makes the process of loading the data context relatively efficient. However, if you use a LINQ query outside the context of an HTTP request, such as in a command line application or a PowerShell script, the LINQ to SharePoint provider must construct context objects, such as the SPWeb and the SPSite, in order to build the data context instance. In this case, the process becomes more resource intensive. Any create, update, or delete operations within your LINQ queries are automatically batched by the data context instance and applied when the DataContext.SubmitChanges method is called by your code. For more information, see How to: Write to Content Databases Using LINQ to SharePoint.

Generating Entities for Content Types

The SPMetal command line tool generates entity classes for the content types defined in a SharePoint site. Content types have various characteristics that can make this process difficult to understand:

  • Content types support inheritance.
  • Content types can be defined at the site level or at the list level. When a content type is added to a list, SharePoint creates a local copy of the content type which can be modified.
  • A list can have multiple content types associated with it.

SPMetal uses the following rules when it generates content types:

  • An entity class is generated for every content type on the site (SPWeb).
  • If a content type inherits from another content type, the entity class that represents the child content type will inherit from the entity class that represents the parent content type. For example, in the sandbox reference implementation, the SOW content type inherits from the built-in Document content type, which in turn inherits from the built-in Item content type. SPMetal generates entity classes for SOW, Document, and Item, and builds an inheritance relationship between the classes.
  • If a list content type has been modified from the corresponding site content type, SPMetal will generate a new entity class for the list content type. If the list content type is identical to the corresponding site content type, SPMetal will simply use the entity class for the site content type instead. Entities created from list content types are named by preceding the content type name with the list name. For example, if you add a StartDate column to the SOW content type in the Estimates list, an entity class named EstimatesSOW will be generated to represent the list content type. Conversely, if you have not modified the SOW content type in the Estimates list, an entity class named SOW will be generated to represent the site content type.
  • If a column is removed from a list content type, the corresponding property is made virtual in the entity class that represents the site content type. The entity class that represents the list content type overrides this method and will throw an InvalidOperationException if you attempt to access the property. For example, if you remove the VendorID column from the SOW content type in the Estimates list, the VendorID property is made virtual in the SOW entity class, and the EstimatesSOW entity will throw an exception if you attempt to access the property.
  • If a list contains a single content type, the EntityList<TEntity> class that represents that list in the data context class will use that content type entity as its type parameter. For example, if the Estimates list contained only documents based on the SOW content type, the list would be represented by an EntityList<SOW> instance.
  • If a list contains more than one content type, the EntityList<TEntity> class that represents that list will use the closest matching base content type as its type parameter. For example, the Estimates list actually contains the SOW content type and the Estimate content type, which both inherit from the built-in Document content type. In this case, the list is represented by an EntityList<Document> instance. Because SOW entities and Estimate entities both inherit from the Document entity, the list can contain entities of both types.

Modeling Associations in Entity Classes

When you use the SPMetal command-line tool to generate entity classes, it automatically detects relationships between lists based on lookup columns, and it adds properties to the entity classes to enable you to navigate these relationships. For example, in the SharePoint List Data Models ReferenceIimplementation, the Inventory Locations list includes a lookup column named Part that retrieves values from the Parts list. In the InventoryLocation class, this is reflected by the inclusion of a Part property that allows you to navigate to the associated entity instance in the Parts list.

private Microsoft.SharePoint.Linq.EntityRef<Part> _part;

[Microsoft.SharePoint.Linq.AssociationAttribute(Name="PartLookup",  
  Storage="_part", 
  MultivalueType=Microsoft.SharePoint.Linq.AssociationType.Single, 
  List="Parts")]
public Part Part 
{
  get { return this._part.GetEntity(); }
  set { this._part.SetEntity(value); }
} 

Note

The InventoryLocation class also includes event handlers that ensure the Part reference remains up to date if the associated entity instance is changed.

The SPMetal tool also adds properties to the Parts list that enable you to navigate to the Inventory Locations list. This is known as a reverse lookup association. The Parts class includes an InventoryLocation property that returns the set of inventory locations that are associated with a specific part—in other words, each InventoryLocation instance that links to the specified part through its Part lookup column.

private Microsoft.SharePoint.Linq.EntitySet<InventoryLocation> _inventoryLocation;

[Microsoft.SharePoint.Linq.AssociationAttribute(Name="PartLookup", 
  Storage="_inventoryLocation", ReadOnly=true, 
  MultivalueType=Microsoft.SharePoint.Linq.AssociationType.Backward, 
  List="Inventory Locations")]
public Microsoft.SharePoint.Linq.EntitySet<InventoryLocation> InventoryLocation   
{
  get { return this._inventoryLocation; }
  set { this._inventoryLocation.Assign(value); }
}

Note

The Part class also includes event handlers that ensure the InventoryLocation references remain up to date if the associated entity instance is changed.

However, there is a limitation in the way the current version of SPMetal builds reverse lookups:

  • If a site lookup column is used by one list, SPMetal will generate a reverse lookup association for the relationship.
  • If a site lookup column is used by more than one list, SPMetal will not generate reverse lookup associations for any of the relationships based on that lookup column.

In many scenarios, you will want to use a lookup column in more than one list. For example, in the reference implementation, there are three lists that use lookup columns to retrieve values from the Parts list. In some cases, depending on how you intend to query your data, you may not require reverse lookup associations. However, if you do need to traverse the relationship in the reverse direction, your LINQ to SharePoint queries will be far less efficient if you proceed without a reverse lookup association in place. Consider the relationship between Parts and Inventory Locations. If you need to find all the inventory locations associated with a specified part, you would need to retrieve every inventory location instance, check the value of the Part lookup column, and build a collection of inventory locations. In this case, the reverse lookup association simplifies the LINQ expressions and reduces the processing overhead.

There are various approaches you can use to work around this limitation of SPMetal, each of which has drawbacks:

  1. Create a new site column for each list that requires a lookup column for a particular list. This results in multiple site columns that retrieve information from the same list—the columns are duplicates in everything but name. This has several negative consequences:
    • If a developer uses a site lookup column that is already in use, reverse lookups will not be generated for that column the next time you use SPMetal, and some existing code will break.
    • Site administrators will need to manage multiple site columns for the same value, which will be confusing. This drawback can be mitigated by hiding the duplicate lookup fields.
    • The site columns are not really reusable, which is the main purpose of using site columns in the first place.
  2. Create lookup columns at the list level. This eliminates the problems associated with duplicate site columns. This has the following negative consequences:
    • Your content types will no longer represent your data model, because the lookup columns are now pushed into individual lists. This makes information management more challenging. It also reduces the effectiveness of search and queries that retrieve items from different lists, because the information from the lookup column is not included in the content type.
  3. Create duplicate site columns and use them in content types or list definitions to generate the entity classes with SPMetal, as in option 1. After you generate the entity classes, delete the duplicate site lookup columns and manually edit the entity classes to use a single lookup column. This keeps your data model clean because you do not need to maintain duplicate site columns, and it avoids the problems associated with option 2 because the lookup column is included in the relevant content types. This is the preferred approach in most scenarios. However, it has the following negative consequences:
    • Extra effort is required to create the duplicate site columns, create the content type definitions, remove the duplicate site columns, and edit the entity classes.
    • Manual editing of the entity classes can be error-prone and difficult to debug. However, the edit should only involve straightforward renaming of properties.
  4. Avoid using reverse lookup associations in cases where more than one list or content type uses a particular site lookup column. Although this approach is simple, you will need to use more complex and less efficient LINQ queries if you need to navigate the association in the reverse direction without reverse lookup properties.

Query Efficiency with LINQ to SharePoint

Although LINQ to SharePoint makes it quick and easy to query SharePoint lists, you still need to consider whether your LINQ expressions will translate into efficient CAML queries. If your LINQ code translates into efficient CAML queries, the performance overhead of the LINQ to SharePoint provider can be considered negligible in all but the most extreme cases—in fact, you may actually see better performance with LINQ to SharePoint because it can be difficult to manually create efficient CAML queries. This section describes how nuances in your LINQ expressions can have substantial effects on the efficiency of the generated queries.

In some cases, LINQ to SharePoint prevents you from executing queries that contain certain inefficiencies. The LINQ to SharePoint provider is not always able to convert a LINQ expression into a single CAML query—for example, if you use a join predicate to query across two lists that are not connected by a lookup column, the LINQ to SharePoint provider would actually need to submit two queries in order to return a result set. In cases like this where LINQ to SharePoint cannot perform an operation using a single CAML query, the runtime will throw a NotSupportedException. In other cases, the LINQ to SharePoint provider cannot translate the entire LINQ code into an efficient CAML query. In these cases the provider will first execute a CAML query to retrieve items from the list and then perform a LINQ to Objects query on the list item collection results to satisfy the portions of the LINQ query that could not be translated to CAML. For more information see Unsupported LINQ Queries and Two-stage Queries.

As an example, suppose you want to review orders for every customer. You might use the following LINQ expression.

dataContext.Customers.Select(c=>c.Orders).ToArray();

In this example, the LINQ to SharePoint provider would need to submit an additional query for every customer in order to retrieve their orders. As a result, the runtime would throw an exception. Similarly, suppose you want to aggregate data from two different lists of customers. You might use the following LINQ expression.

dataContext.Customers.Union(dataContext.MoreCustomers).ToArray();

In this case, the LINQ to SharePoint provider would need to submit two queries—one for each list. Again, the runtime would throw an exception. The remainder of this section describes ways in which you can perform this type of query and other common operations without compromising on efficiency.

Reviewing the CAML Output

In many cases, it can be useful to review the CAML output that is generated by your LINQ queries. The DataContext class includes a Log property that exposes a TextWriter object. You can use this property to log the generated CAML query to a text file or to the user interface. For example, the following code shows how you can modify the previous example to view the generated CAML query. In this example, the CAML query is appended to the query results in a Literal control named displayArea.

using (ManufacturingSiteDataContext context = new 
         ManufacturingSiteDataContext(SPContext.Current.Web.Url))
{
  var sb = new StringBuilder();
  var writer = new StringWriter(sb);
  context.Log = writer;

  string sponsor = "David Pelton";
  var results = from projectItem in context.PriorityProjects
                where projectItem.ExecutiveSponsor == sponsor
                select projectItem;

  foreach (var proj in results)
  {
    output.AppendFormat("Title: {0}  Sponsor: {1}  Leader: {2}", 
      proj.Title, proj.ExecutiveSponsor, proj.ProjectsLookup.Leader);
  }

  output.Append("\n Query: " + sb.ToString());
  displayArea.Mode = LiteralMode.Encode;
  displayArea.Text = output.ToString();
}

After you set the Log property to a TextWriter implementation, the DataContext class will write the CAML query to the underlying stream or string as the LINQ expression is executed. You can then view the CAML query that is generated by the LINQ to SharePoint provider.

<View>
  <Query>
    <Where>
      <And>
        <BeginsWith>
          <FieldRef Name="ContentTypeId" />
          <Value Type="ContentTypeId">0x0100</Value>
        </BeginsWith>
        <Eq>
          <FieldRef Name="Executive_x0020_Sponsor" />
          <Value Type="Text">David Pelton</Value>
        </Eq>
      </And>
    </Where>
  </Query>
  <ViewFields>
    <FieldRef Name="Executive_x0020_Sponsor" />
    <FieldRef Name="ProjectsLookup" LookupId="TRUE" />
    <FieldRef Name="ID" />
    <FieldRef Name="owshiddenversion" />
    <FieldRef Name="FileDirRef" />
    <FieldRef Name="Title" />
  </ViewFields>
  <RowLimit Paged="TRUE">2147483647</RowLimit>
</View>

There are several interesting observations about the automatically generated CAML query:

  • Notice the BeginsWith element in the Where clause. This stipulates that the content type ID of the items returned must begin with 0x0100. Effectively, this means that the content type of the items returned must be a custom content type that inherits from the built-in Item content type—which is true of the Project content type. The LINQ to SharePoint provider includes this provision in addition to the where clause specified by the LINQ query.
  • The CAML query returns a view that contains all the fields in the PriorityProjects list, including fields that aren't required by the LINQ expression.
  • The query returns a lookup field for the Projects list, instead of an entity. The LookupId attribute indicates that the referenced item in the Projects list will be retrieved by its internal ID value.

During the development process, you should take time to examine the CAML that is generated by your LINQ queries, in order to proactively identify poorly performing queries. This is especially important when you query lists that you expect to be sizeable. For example, you should take care to catch the obviously offending cases where the LINQ to SharePoint provider is unable to translate some or all of the query into CAML and must resort to LINQ to Objects.

In the preceding example, PriorityProjects is a simple list, and returning the complete set of fields causes little adverse effect on performance. As the number of items returned increases and the items grow in complexity, the performance overheads can become more substantial. For more information about how to constrain the view fields returned by a query, see the section, "Using Anonymous Types."

Where Clause Efficiency

When you create a LINQ expression, you typically use operators within a where clause to constrain your result set. However, the LINQ to SharePoint provider is unable to translate every LINQ operator into CAML. For example, the Equals operator and the HasValue operator have no CAML equivalent. The LINQ to SharePoint provider will translate as many where clause operators as possible into CAML, and then it will use LINQ to Objects to fulfill the remaining criteria.

The following table shows the operators that are supported by the LINQ to SharePoint provider and their equivalent expressions in CAML.

LINQ Operator

CAML Translation

&&

And

||

Or

==

Eq

>=

Geq

>

Gt

<=

Leq

<

Lt

!=

Neq

== null

IsNull

!= null

IsNotNull

String.Contains

Contains

String.StartsWith

BeginsWith

You should avoid using operators that are not listed in this table in your LINQ to SharePoint queries. Using unsupported operators causes the LINQ to SharePoint provider to return a larger result set and then process the outstanding where clauses on the client by using LINQ to Objects. This can create substantial performance overheads. For example, consider the following LINQ expression. The where clause includes an Equals operator and a StartsWith operator.

var results = from projectItem in context.PriorityProjects
              where projectItem.ExecutiveSponsor.Equals(sponsor)  && 
                    projectItem.Title.StartsWith("Over")
              select projectItem;

The resulting CAML query includes a Where clause that reflects the StartsWith operator. However, it makes no mention of the unsupported Equals operator.

<View>
  <Query>
    <Where>
      <And>
        <BeginsWith>
          <FieldRef Name="ContentTypeId" />
          <Value Type="ContentTypeId">0x0100</Value>
        </BeginsWith>
        <BeginsWith>
          <FieldRef Name="Title" />
          <Value Type="Text">Over</Value>
        </BeginsWith>
      </And>
    </Where>
  </Query>
  <ViewFields>
    …
  </ViewFields>
  <RowLimit Paged="TRUE">2147483647</RowLimit>
</View>

In this case, the LINQ to SharePoint provider would return a results set that includes project items with a Title field that begins with "Over," as defined by the CAML query. It would then use LINQ to Objects on the client to query the results set for project items with a matching ExecutiveSponsor field, as defined by the unsupported Equals operator.

The following XML shows what it looks like if you rewrite the LINQ expression to use the supported == operator instead of the unsupported Equals operator.

var results = from projectItem in context.PriorityProjects
              where projectItem.ExecutiveSponsor == sponsor && 
                    projectItem.Title.StartsWith("Over")
              select projectItem;

This time, the resulting CAML query reflects the LINQ expression in its entirety.

<View>
  <Query>
    <Where>
      <And>
        <BeginsWith>
          <FieldRef Name="ContentTypeId" />
          <Value Type="ContentTypeId">0x0100</Value>
        </BeginsWith>
        <And>
          <Eq>
            <FieldRef Name="Executive_x0020_Sponsor" />
            <Value Type="Text">David Pelton</Value>
          </Eq>
          <BeginsWith>
            <FieldRef Name="Title" />
            <Value Type="Text">Over</Value>
          </BeginsWith>
        </And>
      </And>
    </Where>
  </Query>
  <ViewFields>
    …
  </ViewFields>

In this case, the LINQ to SharePoint provider returns only relevant results to the client; no post-processing steps are required.

Using View Projections

In many cases, you can substantially improve query efficiency by using view projections. A view projection queries a specific set of fields from one or more entities. When you want to retrieve a read-only view of a set of data, using a view projection restricts the number of fields returned by the query and ensures that joins are added to the CAML query instead of performed as a post-processing step. You can create a view projection in various ways:

  • You can select a single field, such as projectItem.Title.
  • You can build an anonymous type by selecting a specific set of fields from one or more entities.
  • You can instantiate a known type and set the property values in your LINQ expression.

View projections are limited to certain field types. Valid field types for projections are Text (single line of text only), DateTime, Counter (internal Ids), Number, and ContentTypeId. All remaining field types are not supported; an InvalidOperationException will be thrown if a column of that field type is used in the projection. For a list of all field types, see SPFieldType.

In the following example, the new keyword in the LINQ expression creates an anonymous type that contains fields named Title, ExecutiveSponsor, and Leader.

using (ManufacturingSiteDataContext context = new ManufacturingSiteDataContext(SPContext.Current.Web.Url))
{
    string sponsor = "David Pelton";

    var results = from projectItem in context.PriorityProjects
                  where projectItem.ExecutiveSponsor == sponsor
                  select new { projectItem.Title, 
                               projectItem.ExecutiveSponsor, 
                               projectItem.Project.Leader };

    foreach (var proj in results)
    {
        output.AppendFormat("Title: {0}  Sponsor: {1}  Leader: {2}", 
                  proj.Title, proj.ExecutiveSponsor, proj.Leader);
    }
}

In this case, the LINQ to SharePoint provider creates a view that contains only the columns that correspond to the fields in the anonymous type.

<View>
  <Query>
    <Where>
      <And>
        <BeginsWith>
          <FieldRef Name="ContentTypeId" />
          <Value Type="ContentTypeId">0x0100</Value>
        </BeginsWith>
        <Eq>
          <FieldRef Name="Executive_x0020_Sponsor" />
          <Value Type="Text">David Pelton</Value>
        </Eq>
      </And>
    </Where>
  </Query>
  <ViewFields>
    <FieldRef Name="Title" />
    <FieldRef Name="Executive_x0020_Sponsor" />
    <FieldRef Name="ProjectLeader" />
  </ViewFields>
  <ProjectedFields>
    <Field Name="ProjectLeader" Type="Lookup" List="Project" ShowField="Leader" />
  </ProjectedFields>
  <Joins>
    <Join Type="LEFT" ListAlias="Project">
      <!--List Name: Projects-->
      <Eq>
        <FieldRef Name="Project" RefType="ID" />
        <FieldRef List="Project" Name="ID" />
      </Eq>
    </Join>
  </Joins>
  <RowLimit Paged="TRUE">2147483647</RowLimit>
</View>

The alternative approach, in which you instantiate a known type and set property value in your LINQ expression, is illustrated by the following example.

public class PriorityProjectView
{
  public string Title { get; set; }
  public string ExecutiveSponsor { get; set; }
  public string Leader { get; set; }
}

using (ManufacturingSiteDataContext context = new 
         ManufacturingSiteDataContext(SPContext.Current.Web.Url))
{
  IEnumerable<PriorityProjectView> proirityProjects = 
      from projectItem in context.PriorityProjects
      where projectItem.ExecutiveSponsor == sponsor
      select new PriorityProjectView
      {
        Title = projectItem.Title,
        ExecutiveSponsor = projectItem.ExecutiveSponsor,
        Leader = projectItem.Project.Leader
      };
}
...

Retrieving only the columns that you actually require will clearly improve the efficiency of your queries; in this regard, the use of view projections can provide a significant performance boost. This example also illustrates how the use of view projections forces the LINQ to SharePoint provider to perform the list join within the CAML query instead of retrieving a lookup column and using the deferred loading approach described earlier. The LINQ to SharePoint provider will only generate CAML joins when you use view projections. This is a more efficient approach when you know in advance that you will need to display data from two or more entities, because it reduces the number of round trips to the content database.

Note

View projections cannot be used for create, update, or delete operations. You must retrieve the full entity instances if you want use LINQ to SharePoint to perform create, update, or delete operations.

LINQ to SharePoint can only generate CAML joins from join projections for a limited number of data types. An InvalidOperationException will be thrown if the projection contains a disallowed data type. The permitted data types are Text, Number, DateTime, Count, and Content Type ID. All remaining field types cannot be projected, including Boolean, multi-line text, choice, currency, and calculated fields.

On a final note for view projections, recall that the LINQ to SharePoint provider will block certain LINQ expressions because they cannot be translated into a single CAML query. For example, the following LINQ expression attempts to retrieve a collection of orders for each customer. However, LINQ to SharePoint is unable to translate the LINQ expression into a single CAML query.

dataContext.Customers.Select(c=>c.Orders).ToArray();

Suppose you modify the expression to use anonymous types, as shown in the following example.

var results = dataContext.Customers.Select(c => new { Description = 
                c.Order.Description, CustomerId = c.Order.CustomerId }).ToArray();

In this case, the LINQ to SharePoint provider is able to translate the expression into a single CAML query, and the runtime will not throw an exception. As you can see, view projections can provide a valuable resource when you develop LINQ to SharePoint expressions.

Using List Joins across Sites

In many common SharePoint scenarios, a list will include a lookup column that retrieves data from another list in a parent site within the site collection. However, the SPMetal command-line tool generates entity classes for a single site, and LINQ to SharePoint expressions operate within a data context that represents a single site. By default, when a list includes a lookup column that refers to a list on another site, SPMetal will generate an ID value for the item in the related list instead of constructing the related entity itself. If you were to write queries against this data model, you would need to retrieve the related entity yourself in a post-processing step. As a result, if you want to use LINQ to SharePoint to query cross-site list relationships effectively, you must perform some additional steps:

  1. Temporarily move every list onto a single site before you run the SPMetal tool, so that SPMetal generates a full set of entity classes.
  2. When you create a LINQ expression, use the DataContext.RegisterList method to inform the runtime of the location of lists that are not on the current site.

Consider the earlier example of a PriorityProjects list. The list includes a lookup column that retrieves information from a central Projects list on a parent site, as shown in the following illustration.

Lookup column relationship across sites in a site collection

Ff798478.4f016986-e422-4fd4-b637-abf57427c3ac(en-us,PandP.10).png

In order to generate entities for both lists using the SPMetal tool, you should create a copy of the PriorityProjects list on the root site, as shown in the following illustration.

Temporary list to build entity classes

Ff798478.0b687e1a-73f0-4c80-a840-3be2e69fbcf1(en-us,PandP.10).png

SPMetal will now build a full set of entities and entity relationships. After you finish building entity classes, you can remove the duplicate lists from the site. When you run a query in the context of the Construction team site, you must use the RegisterList method to tell the runtime where to find the Projects list. The following code example shows this.

using (ManufacturingSiteDataContext context = new ManufacturingSiteDataContext("https://localhost/sites/manufacturing/construction"))
{
  context.RegisterList<Construction.ProjectsItem>("Projects",   
    "/sites/Manufacturing", "Projects");

  var results = from projectItem in context.PriorityProjects
                select new { projectItem.Title, 
                             projectItem.ExecutiveSponsor, 
                             projectItem.Project.Leader };

  foreach (var item in results)
  {
    output.AppendFormat("Title: {0}  Sponsor: {1}  Leader: {2}", 
                            item.Title, item.ExecutiveSponsor, item.Leader);
  }
}

There are various ways in which you could approach setting up your lists for entity generation. In most cases, you will want to generate your entity classes from the site that contains any lists that are referenced by lookup columns, because lookup columns reference a specific list on a specific site. In other words, if a lookup column retrieves data from a list on the root site, you should move all your lists onto the root site and build the entity model from there. If you build an entity model that uses a lookup column to retrieve data from a list on one site and then move that list to another site, you will need to manually update your entity classes.

The key options for building your entity model are as follows:

  • Create copies of all your lists on the root site, and use SPMetal to build a single, comprehensive entity model from the root site. This approach is recommended for most scenarios, because it is usually the simplest and does not require you to modify the entity classes after creation.
  • Use SPMetal with a parameters file to build a specialized entity model for one specific entity relationship. For example, suppose you have a lookup column that retrieves data from a specific team site instead of the root site. In this case, you should consider replicating all related lists on that specific team site and building your entity model from there, in order to avoid having to manually edit the lookup relationship in your entity classes. You might also consider this approach if you have a large number of lists in your site collection, because it may not be worth the extra effort involves in replicating and maintaining every single list on the root site.
  • When you have a list on a subsite that includes a lookup column that retrieves values from a list on the root site, you may be tempted to reproduce the root site list on the subsite and generate entity classes from there. However, this approach should generally be avoided. First, you would need to generate temporary lookup columns, because the actual lookup columns you want to use are associated with the specific list instance on the root site. Second, you would need to manually edit the associations in the entity classes in order to use the actual lookup columns instead of the temporary lookup columns.

Finally, remember that the SPQuery class supports CAML-based list joins. LINQ to SharePoint is primarily designed to expedite the development process. If the time it takes to set up and maintain copies of lists in order to build a representative entity model outweighs the time savings you derive from writing LINQ expressions instead of CAML queries, you might want to consider whether SPQuery is a better choice for your application scenario.

Additional Performance Considerations

LINQ expressions define a generic IEnumerable<T> collection of objects. The Enumerable class provides a set of extension methods that you can use to query and manipulate this collection. These methods have varying efficiency when you use them in your LINQ to SharePoint expressions. The following table briefly describes some of the performance issues for the operations that have not already been described. Operations that are marked as efficient are translated into CAML and do not require post-processing steps after the list data is retrieved.

Operation

Performance and behavior

Contains

Efficient

OrderBy

Efficient

OrderByDescending

Efficient

ThenBy

Efficient

ThenByDescending

Efficient

GroupBy

Efficient when used in conjunction with OrderBy

Sum

Returns all elements that satisfy the where clause and then uses LINQ to Objects to compute the sum of the elements

Aggregate

Returns all elements that satisfy the where clause and then uses LINQ to Objects to apply an accumulator function to the elements

Average

Returns all elements that satisfy the where clause and then uses LINQ to Objects to calculate the average value

Max

Returns all elements that satisfy the where clause and uses LINQ to Objects to calculate the maximum value

Min

Returns all elements that satisfy the where clause and then uses LINQ to Objects to calculate the minimum value

Skip

Returns all elements that satisfy the where clause and then uses LINQ to Objects to perform the Skip operation

SkipWhile

Returns all elements that satisfy the where clause and then uses LINQ to Objects to perform the SkipWhile operation

ElementAt

Unsupported; use the Take method instead

ElementAtOrDefault

Unsupported; use the Take method instead

Last

Returns all items that satisfy the where clause, and then gets the last

LastOrDefault

Returns all items that satisfy the where clause, and then gets the last or returns default if no items

All

Returns all elements that satisfy the where clause, and then uses LINQ to Objects to evaluate the condition

Any

Returns all elements that satisfy the where clause, and then uses LINQ to Objects to evaluate the condition

AsQueryable

Efficient

Cast

Efficient

Concat

Efficient

DefaultIfEmpty

Efficient

Distinct

Performed across two collections; returns all elements that satisfy the where clause and then uses LINQ to Objects to filter out duplicates

Except

Performed across two collections; returns all elements that satisfy the where clause and then uses LINQ to Objects to calculate the set difference

First

Efficient

FirstOrDefault

Efficient

GroupJoin

Efficient

Intersect

Performed across two collections; returns all elements that satisfy the where clause and then uses LINQ to Objects to calculate the set intersection

OfType

Efficient

Reverse

Returns all elements that satisfy the where clause, and then uses LINQ to Objects to reverse the order of the sequence

SelectMany

Efficient

SequenceEqual

Performed across two collections; returns all elements that satisfy the where clause and then uses LINQ to Objects to calculate whether the two sets are equal

Single

Efficient

SingleOrDefault

Efficient

Take

Efficient

TakeWhile

Efficient

Union

Efficient

The Repository Pattern and LINQ to SharePoint

The Repository pattern is an application design pattern that provides a centralized, isolated data access layer. The repository retrieves and updates data from an underlying data source and maps the data to your entity model. This approach allows you to separate your data access logic from your business logic.

In some ways, the advent of LINQ to SharePoint may appear to obviate the need for repositories. However, there are many good reasons to continue to use the Repository pattern with LINQ to SharePoint:

  • Query optimization. LINQ to SharePoint substantially reduces the effort involved in developing queries, when compared to creating CAML queries directly. However, it is still easy to write LINQ to SharePoint queries that perform poorly, as described earlier in this topic. Developing your queries in a central repository means that there are fewer queries to optimize and there is one place to look if you do encounter issues.
  • Maintainability. If you use LINQ to SharePoint directly from your business logic, you will need to update your code in multiple places if your data model changes. The Repository pattern decouples the consumer of the data from the provider of the data, which means you can update your queries in response to data model changes without impacting business logic throughout your code.
  • Testability. The repository provides a substitution point at which you can insert fake objects for unit testing.
  • Flexibility. The repository pattern promotes layering and decoupling, which leads to more flexible, reusable code.

In practice, you will encounter tradeoffs between the advantages of the Repository pattern and the practicalities of implementing a solution. In the SharePoint Guidance reference implementations, the following practices were established:

  • Encapsulate all LINQ to SharePoint queries in a repository. This provides a central point of management for the queries.
  • Configure the repository class to return the entity types generated by the SPMetal command-line tool. This avoids the additional overhead of creating custom business entities and mapping them to the SPMetal entity classes, which would be the purist approach to implementing the Repository pattern. However, on the negative side, this results in a tighter coupling between the data model and the data consumers.
  • Add view objects to the repository in order to return composite projections of entities. A view object combines fields from more than one entity, and using view projections can make LINQ to SharePoint queries across multiple entities more efficient, as described earlier in this topic. This approach was used in the reference implementations, even though it deviates from the Repository pattern, because the views are relatively simple and the entities that are represented in the views are all owned by the same repository. If the views were more complex, or the entities involved spanned multiple repositories, the developers would have implemented a separate class to manage views to provide a cleaner division of responsibilities.

To view this repository implementation in action, see the SharePoint List Data Models Reference Implementation.