Query expression basics

This article introduces the basic concepts related to query expressions in C#.

What is a query and what does it do?

A query is a set of instructions that describes what data to retrieve from a given data source (or sources) and what shape and organization the returned data should have. A query is distinct from the results that it produces.

Generally, the source data is organized logically as a sequence of elements of the same kind. For example, a SQL database table contains a sequence of rows. In an XML file, there is a "sequence" of XML elements (although these are organized hierarchically in a tree structure). An in-memory collection contains a sequence of objects.

From an application's viewpoint, the specific type and structure of the original source data is not important. The application always sees the source data as an IEnumerable<T> or IQueryable<T> collection. For example, in LINQ to XML, the source data is made visible as an IEnumerable<XElement>.

Given this source sequence, a query may do one of three things:

  • Retrieve a subset of the elements to produce a new sequence without modifying the individual elements. The query may then sort or group the returned sequence in various ways, as shown in the following example (assume scores is an int[]):

    IEnumerable<int> highScoresQuery =
        from score in scores
        where score > 80
        orderby score descending
        select score;
    
  • Retrieve a sequence of elements as in the previous example, but transform them to a new type of object. For example, a query may retrieve only the last names from certain customer records in a data source. Or it may retrieve the complete record and then use it to construct another in-memory object type or even XML data before generating the final result sequence. The following example shows a projection from an int to a string. Note that highScoresQuery2 has a new type, IEnumerable<string>.

    IEnumerable<string> highScoresQuery2 =
        from score in scores
        where score > 80
        orderby score descending
        select $"The score is {score}";
    
  • Retrieve a singleton value about the source data, such as:

    • The number of elements that match a certain condition.

    • The element that has the greatest or least value.

    • The first element that matches a condition, or the sum of particular values in a specified set of elements. For example, the following query returns the number of scores greater than 80 from the scores integer array:

      int highScoreCount =
          (from score in scores
           where score > 80
           select score)
           .Count();
      

      In the previous example, note the use of parentheses around the query expression before the call to the Count method. You can also express this by using a new variable to store the concrete result. This technique is more readable because it keeps the variable that stores the query separate from the variable that stores the result.

      IEnumerable<int> highScoresQuery3 =
          from score in scores
          where score > 80
          select score;
      
      int scoreCount = highScoresQuery3.Count();
      

In the previous example, the query is executed in the call to Count, because Count must iterate over the results in order to determine the number of elements returned by highScoresQuery3.
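The other singleton values listed above (greatest value, first match, sum) can be retrieved the same way. The following is a minimal sketch, not part of the original samples: the class name SingletonValuesSketch and the helper HighScores are hypothetical, and the sample array reuses the values from the scoreQuery example later in this article. Each call (Count, Max, First, Sum) executes the stored query again.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SingletonValuesSketch
{
    // Hypothetical helper: returns the query itself, not its results.
    public static IEnumerable<int> HighScores(int[] scores) =>
        from score in scores
        where score > 80
        select score;

    public static void Main()
    {
        int[] scores = { 90, 71, 82, 93, 75, 82 };
        IEnumerable<int> highScores = HighScores(scores);

        // Each of these calls iterates the query and returns a single value.
        int count = highScores.Count();   // number of scores above 80
        int max = highScores.Max();       // greatest matching score
        int first = highScores.First();   // first match in source order
        int sum = highScores.Sum();       // total of the matching scores

        Console.WriteLine($"{count} {max} {first} {sum}");
        // Outputs: 4 93 90 347
    }
}
```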

What is a query expression?

A query expression is a query expressed in query syntax. A query expression is a first-class language construct. It is just like any other expression and can be used in any context in which a C# expression is valid. A query expression consists of a set of clauses written in a declarative syntax similar to SQL or XQuery. Each clause in turn contains one or more C# expressions, and these expressions may themselves be either a query expression or contain a query expression.

A query expression must begin with a from clause and must end with a select or group clause. Between the first from clause and the last select or group clause, it can contain one or more of these optional clauses: where, orderby, join, let, and even additional from clauses. You can also use the into keyword to enable the result of a join or group clause to serve as the source for additional query clauses in the same query expression.

Query variable

In LINQ, a query variable is any variable that stores a query instead of the results of a query. More specifically, a query variable is always an enumerable type that will produce a sequence of elements when it is iterated over in a foreach statement or a direct call to its IEnumerator.MoveNext method.

The following code example shows a simple query expression with one data source, one filtering clause, one ordering clause, and no transformation of the source elements. The select clause ends the query.

static void Main()
{
    // Data source.
    int[] scores = { 90, 71, 82, 93, 75, 82 };

    // Query Expression.
    IEnumerable<int> scoreQuery = //query variable
        from score in scores //required
        where score > 80 // optional
        orderby score descending // optional
        select score; //must end with select or group

    // Execute the query to produce the results
    foreach (int testScore in scoreQuery)
    {
        Console.WriteLine(testScore);
    }                  
}
// Outputs: 93 90 82 82      

In the previous example, scoreQuery is a query variable, which is sometimes referred to as just a query. The query variable stores no actual result data; the results are produced in the foreach loop. When the foreach statement executes, the query results are not returned through the query variable scoreQuery. Rather, they are returned through the iteration variable testScore. The scoreQuery variable can be iterated in a second foreach loop. It produces the same results as long as neither it nor the data source has been modified.
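The condition "as long as neither it nor the data source has been modified" can be illustrated with a short sketch. This is an illustration under assumptions rather than one of the original samples: the class name DeferredExecutionSketch and the helper BuildQuery are hypothetical. It iterates the same query variable twice, changing one element of the source array in between, and then takes a snapshot with ToList to show the difference between storing a query and storing results.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredExecutionSketch
{
    // Hypothetical helper: builds the query but does not execute it.
    public static IEnumerable<int> BuildQuery(int[] scores) =>
        from score in scores
        where score > 80
        orderby score descending
        select score;

    public static void Main()
    {
        int[] scores = { 90, 71, 82, 93, 75, 82 };
        IEnumerable<int> scoreQuery = BuildQuery(scores);

        // First iteration executes the query against the current data.
        Console.WriteLine(string.Join(" ", scoreQuery)); // 93 90 82 82

        // Modify the data source; the query variable itself is unchanged.
        scores[1] = 99;

        // Second iteration re-executes the query and sees the new value.
        Console.WriteLine(string.Join(" ", scoreQuery)); // 99 93 90 82 82

        // ToList stores results, so later changes no longer affect it.
        List<int> snapshot = scoreQuery.ToList();
        scores[1] = 71;
        Console.WriteLine(string.Join(" ", snapshot));   // 99 93 90 82 82
    }
}
```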

A query variable may store a query that is expressed in query syntax or method syntax, or a combination of the two. In the following examples, both queryMajorCities and queryMajorCities2 are query variables:

//Query syntax
IEnumerable<City> queryMajorCities =
    from city in cities
    where city.Population > 100000
    select city;


// Method-based syntax
IEnumerable<City> queryMajorCities2 = cities.Where(c => c.Population > 100000);

On the other hand, the following two examples show variables that are not query variables even though each is initialized with a query. They are not query variables because they store results:

int highestScore =
    (from score in scores
     select score)
    .Max();

// or split the expression
IEnumerable<int> scoreQuery =
    from score in scores
    select score;

int highScore = scoreQuery.Max();
// the following returns the same result
int highScore2 = scores.Max();

List<City> largeCitiesList =
    (from country in countries
     from city in country.Cities
     where city.Population > 10000
     select city)
       .ToList();

// or split the expression
IEnumerable<City> largeCitiesQuery =
    from country in countries
    from city in country.Cities
    where city.Population > 10000
    select city;

List<City> largeCitiesList2 = largeCitiesQuery.ToList();

For more information about the different ways to express queries, see Query syntax and method syntax in LINQ.

Explicit and implicit typing of query variables

This documentation usually provides the explicit type of the query variable in order to show the type relationship between the query variable and the select clause. However, you can also use the var keyword to instruct the compiler to infer the type of a query variable (or any other local variable) at compile time. For example, the query example that was shown previously in this topic can also be expressed by using implicit typing:

// Use of var is optional here and in all queries.
// queryCities is an IEnumerable<City> just as 
// when it is explicitly typed.
var queryCities =
    from city in cities
    where city.Population > 100000
    select city;

For more information, see Implicitly typed local variables and Type relationships in LINQ query operations.

Starting a query expression

A query expression must begin with a from clause. It specifies a data source together with a range variable. The range variable represents each successive element in the source sequence as the source sequence is being traversed. The range variable is strongly typed based on the type of elements in the data source. In the following example, because countries is an array of Country objects, the range variable is also typed as Country. Because the range variable is strongly typed, you can use the dot operator to access any available members of the type.

IEnumerable<Country> countryAreaQuery =
    from country in countries
    where country.Area > 500000 //sq km
    select country;

The range variable is in scope until the query is exited either with a semicolon or with a continuation clause.

A query expression may contain multiple from clauses. Use additional from clauses when each element in the source sequence is itself a collection or contains a collection. For example, assume that you have a collection of Country objects, each of which contains a collection of City objects named Cities. To query the City objects in each Country, use two from clauses as shown here:

IEnumerable<City> cityQuery =
    from country in countries
    from city in country.Cities
    where city.Population > 10000
    select city;

For more information, see from clause.
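The two-from query above can be tried with a small runnable sketch. The City and Country classes below are hypothetical minimal definitions (the article does not show the real ones), and the place names and populations are made-up sample data. The method-syntax version uses SelectMany to flatten the nested collections; it is written here as a roughly equivalent form that produces the same results, not as the exact code the compiler generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal types, assumed for illustration only.
public class City
{
    public string Name { get; set; } = "";
    public int Population { get; set; }
}

public class Country
{
    public string Name { get; set; } = "";
    public List<City> Cities { get; set; } = new List<City>();
}

public static class MultipleFromSketch
{
    // Query syntax: the second from clause walks each country's Cities.
    public static IEnumerable<City> QuerySyntax(IEnumerable<Country> countries) =>
        from country in countries
        from city in country.Cities
        where city.Population > 10000
        select city;

    // Method syntax: SelectMany flattens the nested collections.
    public static IEnumerable<City> MethodSyntax(IEnumerable<Country> countries) =>
        countries
            .SelectMany(country => country.Cities)
            .Where(city => city.Population > 10000);

    public static void Main()
    {
        // Made-up sample data.
        var countries = new List<Country>
        {
            new Country
            {
                Name = "A",
                Cities =
                {
                    new City { Name = "Alphaville", Population = 50000 },
                    new City { Name = "Smalltown", Population = 800 }
                }
            },
            new Country
            {
                Name = "B",
                Cities = { new City { Name = "Betaburg", Population = 12000 } }
            }
        };

        Console.WriteLine(string.Join(", ", QuerySyntax(countries).Select(c => c.Name)));
        Console.WriteLine(string.Join(", ", MethodSyntax(countries).Select(c => c.Name)));
        // Both lines output: Alphaville, Betaburg
    }
}
```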

Ending a query expression

A query expression must end with either a group clause or a select clause.

group clause

Use the group clause to produce a sequence of groups organized by a key that you specify. The key can be any data type. For example, the following query creates a sequence of groups, each of which contains one or more Country objects and whose key is a char value.

var queryCountryGroups =
    from country in countries
    group country by country.Name[0];

For more information about grouping, see group clause.
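To see what a sequence of groups looks like when it is consumed, the following sketch groups plain strings instead of Country objects so that it stays self-contained. The class name GroupClauseSketch and the helper GroupByInitial are hypothetical. Each element of the result is an IGrouping<char, string>, which exposes the Key and can itself be iterated.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupClauseSketch
{
    // Hypothetical helper: groups names by their first character.
    public static IEnumerable<IGrouping<char, string>> GroupByInitial(IEnumerable<string> names) =>
        from name in names
        group name by name[0];

    public static void Main()
    {
        string[] names = { "Brazil", "Belgium", "Chile", "Canada", "Denmark" };

        foreach (IGrouping<char, string> nameGroup in GroupByInitial(names))
        {
            // Key is the char used for grouping; the group holds its members.
            Console.WriteLine($"{nameGroup.Key}: {string.Join(", ", nameGroup)}");
        }
        // Outputs:
        // B: Brazil, Belgium
        // C: Chile, Canada
        // D: Denmark
    }
}
```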

select clause

Use the select clause to produce all other types of sequences. A simple select clause just produces a sequence of the same type of objects as the objects that are contained in the data source. In this example, the data source contains Country objects. The orderby clause just sorts the elements into a new order and the select clause produces a sequence of the reordered Country objects.

IEnumerable<Country> sortedQuery =
    from country in countries
    orderby country.Area
    select country;

The select clause can be used to transform source data into sequences of new types. This transformation is also named a projection. In the following example, the select clause projects a sequence of anonymous types that contain only a subset of the fields in the original element. Note that the new objects are initialized by using an object initializer.

// Here var is required because the query
// produces an anonymous type.
var queryNameAndPop =
    from country in countries
    select new { Name = country.Name, Pop = country.Population };

For more information about all the ways that a select clause can be used to transform source data, see select clause.

Continuations with "into"

You can use the into keyword in a select or group clause to create a temporary identifier that stores a query. Do this when you must perform additional query operations on a query after a grouping or select operation. In the following example, countries are grouped according to population in ranges of 10 million. After these groups are created, additional clauses filter out some groups and then sort the groups in ascending order. To perform those additional operations, the continuation represented by countryGroup is required.

// percentileQuery is an IEnumerable<IGrouping<int, Country>>
var percentileQuery =
    from country in countries
    let percentile = (int) country.Population / 10_000_000
    group country by percentile into countryGroup
    where countryGroup.Key >= 20
    orderby countryGroup.Key
    select countryGroup;

// grouping is an IGrouping<int, Country>
foreach (var grouping in percentileQuery)
{
    Console.WriteLine(grouping.Key);
    foreach (var country in grouping)
        Console.WriteLine(country.Name + ":" + country.Population);
}

For more information, see into.

Filtering, ordering, and joining

Between the starting from clause and the ending select or group clause, all other clauses (where, join, orderby, from, let) are optional. Any of the optional clauses may be used zero times or multiple times in a query body.

where clause

Use the where clause to filter out elements from the source data based on one or more predicate expressions. The where clause in the following example has one predicate with two conditions.

IEnumerable<City> queryCityPop =
    from city in cities
    where city.Population < 200000 && city.Population > 100000
    select city;

For more information, see where clause.

orderby clause

Use the orderby clause to sort the results in either ascending or descending order. You can also specify secondary sort orders. The following example performs a primary sort on the country objects by using the Area property. It then performs a secondary sort by using the Population property.

IEnumerable<Country> querySortedCountries =
    from country in countries
    orderby country.Area, country.Population descending
    select country;

The ascending keyword is optional; it is the default sort order if no order is specified. For more information, see orderby clause.

join clause

Use the join clause to associate and/or combine elements from one data source with elements from another data source based on an equality comparison between specified keys in each element. In LINQ, join operations are performed on sequences of objects whose elements are different types. After you have joined two sequences, you must use a select or group clause to specify which element to store in the output sequence. You can also use an anonymous type to combine properties from each set of associated elements into a new type for the output sequence. The following example associates prod objects whose Category property matches one of the categories in the categories string array. Products whose Category does not match any string in categories are filtered out. The select clause projects a new type whose properties are taken from both cat and prod.

var categoryQuery =
    from cat in categories
    join prod in products on cat equals prod.Category
    select new { Category = cat, Name = prod.Name };

You can also perform a group join by using the into keyword to store the results of the join operation in a temporary variable. For more information, see join clause.
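A group join can be sketched as follows. This is an illustration under assumptions: the Product class, the CountPerCategory helper, and the sample data are hypothetical and not taken from the original samples. The into keyword collects the matching products for each category into prodGroup; unlike the inner join shown earlier, a category with no matching products still appears in the output, with an empty group.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical minimal type, assumed for illustration only.
public class Product
{
    public string Name { get; set; } = "";
    public string Category { get; set; } = "";
}

public static class GroupJoinSketch
{
    // Hypothetical helper: one output element per category,
    // paired with the number of products that matched it.
    public static IEnumerable<string> CountPerCategory(
        IEnumerable<string> categories, IEnumerable<Product> products) =>
        from cat in categories
        join prod in products on cat equals prod.Category into prodGroup
        select $"{cat}: {prodGroup.Count()}";

    public static void Main()
    {
        string[] categories = { "Fruit", "Dairy", "Tools" };
        var products = new List<Product>
        {
            new Product { Name = "Apple", Category = "Fruit" },
            new Product { Name = "Pear", Category = "Fruit" },
            new Product { Name = "Milk", Category = "Dairy" },
            new Product { Name = "Rope", Category = "Camping" } // no matching category
        };

        foreach (string line in CountPerCategory(categories, products))
        {
            Console.WriteLine(line);
        }
        // Outputs:
        // Fruit: 2
        // Dairy: 1
        // Tools: 0
    }
}
```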

let clause

Use the let clause to store the result of an expression, such as a method call, in a new range variable. In the following example, the range variable firstName stores the first element of the array of strings that is returned by Split.

string[] names = { "Svetlana Omelchenko", "Claire O'Donnell", "Sven Mortensen", "Cesar Garcia" };
IEnumerable<string> queryFirstNames =
    from name in names
    let firstName = name.Split(' ')[0]
    select firstName;

foreach (string s in queryFirstNames)
    Console.Write(s + " ");
//Output: Svetlana Claire Sven Cesar

For more information, see let clause.

Subqueries in a query expression

A query clause may itself contain a query expression, which is sometimes referred to as a subquery. Each subquery starts with its own from clause that does not necessarily point to the same data source in the first from clause. For example, the following query shows a query expression that is used in the select clause to retrieve the results of a grouping operation.

var queryGroupMax =
    from student in students
    group student by student.GradeLevel into studentGroup
    select new
    {
        Level = studentGroup.Key,
        HighestScore =
            (from student2 in studentGroup
             select student2.Scores.Average())
             .Max()
    };

For more information, see How to: perform a subquery on a grouping operation.

See also