全文搜索入门Get Started with Full-Text Search

适用于: 是SQL Server是Azure SQL 数据库否Azure SQL 数据仓库否并行数据仓库APPLIES TO: yesSQL Server yesAzure SQL Database noAzure SQL Data Warehouse noParallel Data Warehouse

SQL Server 数据库默认已启用全文搜索。SQL Server databases are full-text enabled by default. 但是,在运行全文查询之前,必须先创建全文目录,然后在要搜索的表或索引视图中创建全文索引。Before you can run full-text queries, however, you must create a full text catalog and create a full-text index on the tables or indexed views you want to search.

通过两个步骤设置全文搜索Set up full-text search in two steps

可通过两个基本步骤来设置全文搜索:There are two basic steps to set up full-text search:

  1. 创建全文目录。Create a full-text catalog.
  2. 在要搜索的表或索引视图上创建全文索引。Create a full-text index on tables or indexed view you want to search.

期待您的反馈 :如果在本文中发现过时或不正确的内容(如步骤或代码示例),请告诉我们。We are listening: If you find something outdated or incorrect in this article, such as a step or a code example, please tell us. 可以单击此页底部的“反馈” 部分中的“本页” 按钮。You can click the This page button in the Feedback section at the bottom of this page. 我们通常在第二天阅读有关 SQL 的每项反馈。We read every item of feedback about SQL, typically the next day. 谢谢。Thanks.

每个全文索引必须属于一个全文目录。Each full-text index must belong to a full-text catalog. 您可以为每个全文索引创建一个单独的文本目录,也可以将多个全文索引与给定目录关联起来。You can create a separate text catalog for each full-text index, or you can associate multiple full-text indexes with a given catalog. 全文目录是虚拟对象,不属于任何文件组。A full-text catalog is a virtual object and does not belong to any filegroup. 该目录是表示一组全文索引的逻辑概念。The catalog is a logical concept that refers to a group of full-text indexes.

备注

这些步骤假设安装 SQL Server 时已安装可选的全文搜索组件。These steps assume that you installed the optional Full-Text Search components when you installed SQL Server. 如果未安装,必须再次运行 SQL Server 安装程序来添加该组件。If not, you have to run SQL Server Setup again to add them.

使用向导设置全文搜索Set up full-text search with a wizard

若要使用向导设置全文搜索,请参阅使用全文索引向导To set up full-text search by using a wizard, see Use the Full-Text Indexing Wizard.

使用 Transact-SQL 设置全文搜索Set up full-text search with Transact-SQL

下面的示例由两个部分组成,首先在 AdventureWorks 示例数据库中创建名为 AdvWksDocFTCat 的全文目录,然后在示例数据库中的 Document 表内创建全文索引。The following two-part example creates a full-text catalog named AdvWksDocFTCat on the AdventureWorks sample database and then creates a full-text index on the Document table in the sample database. 此语句在安装 SQL Server 期间指定的默认目录中创建全文目录。This statement creates the full-text catalog in the default directory specified during SQL Server setup. 将在默认目录下创建名为 AdvWksDocFTCat 的文件夹。The folder named AdvWksDocFTCat is in the default directory.

  1. 为了创建名为 AdvWksDocFTCat的全文目录,此示例使用了 CREATE FULLTEXT CATALOG 语句:To create a full-text catalog named AdvWksDocFTCat, the example uses a CREATE FULLTEXT CATALOG statement:

    USE AdventureWorks;  
    GO  
    CREATE FULLTEXT CATALOG AdvWksDocFTCat;  
    

    有关详细信息,请参阅创建和管理全文目录For more info, see Create and Manage Full-Text Catalogs.

  2. 在对 Document 表创建全文索引之前,请确保该表具有唯一的、不可为 Null 的单列索引。Before you can create a full-text index on the Document table, ensure that the table has a unique, single-column, non-nullable index. 下面的 CREATE INDEX 语句可对 Document 表的 DocumentID 列创建唯一索引 ui_ukDocThe following CREATE INDEX statement creates a unique index, ui_ukDoc, on the DocumentID column of the Document table:

    CREATE UNIQUE INDEX ui_ukDoc ON Production.Document(DocumentID);  
    
  3. 具有唯一键后,即可使用下面的 Document CREATE FULLTEXT INDEX 语句对 表创建全文索引。After you have a unique key, you can create a full-text index on the Document table by using the following CREATE FULLTEXT INDEX statement.

    CREATE FULLTEXT INDEX ON Production.Document  
    (  
        Document                         --Full-text index column name   
            TYPE COLUMN FileExtension    --Name of column that contains file type information  
            Language 2057                 --2057 is the LCID for British English  
    )  
    KEY INDEX ui_ukDoc ON AdvWksDocFTCat --Unique index  
    WITH CHANGE_TRACKING AUTO            --Population type;  
    GO  
    
    

    此示例中定义的 TYPE COLUMN 指定表中的类型列,该列包含“Document”列(为二进制类型)的每一行中文档的类型。The TYPE COLUMN defined in this example specifies the type column in the table that contains the type of the document in each row of the column 'Document' (which is of binary type). 此类型列存储给定行中文档的、由用户提供的文件扩展名 -“.doc”、“.xls”等等。The type column stores the user-supplied file extension - ".doc", ".xls", and so forth - of the document in a given row. 全文引擎使用给定行中的文件扩展名调用正确的筛选器,以用于分析该行中的数据。The Full-Text Engine uses the file extension in a given row to invoke the correct filter to use for parsing the data in that row. 筛选器分析行的二进制数据后,指定的分词系统将分析内容。After the filter has parsed the binary data of the row, the specified word breaker parses the content. (此示例中使用了英国英语的分词系统。)有关详细信息,请参阅 配置和管理搜索筛选器(In this example, the word breaker for British English is used.) For more information, see Configure and Manage Filters for Search.

    有关详细信息,请参阅创建和管理全文索引For more info, see Create and Manage Full-Text Indexes.

选择全文检索的选项Choose options for a full-text index

选择语言Choose a language

有关选择列语言的信息,请参阅 创建全文索引时选择语言For information about choosing the column language, see Choose a Language When Creating a Full-Text Index.

选择文件组Choose a filegroup

生成全文本索引的过程会消耗相当多的 I/O 资源。The process of building a full-text index is fairly I/O intensive. 简而言之,该过程包括从 SQL ServerSQL Server 读取数据,然后将筛选的数据传播到全文索引。In summary, it consists of reading data from SQL ServerSQL Server, and then propagating the filtered data to the full-text index. 最佳做法是将全文索引置于最适于最大程度地提高 I/O 性能的数据库文件组中,或者将全文索引置于另一个卷的其他文件组中。As a best practice, locate a full-text index in the database filegroup that is best for maximizing I/O performance or locate the full-text indexes in a different filegroup on another volume.

选择全文目录Choose a full-text catalog

建议将具有相同更新特征的表(如更改次数少的与更改次数多的,或者在一天中某个特定时段内频繁更改的表)关联在一起,并置于同一全文目录下。We recommend associating tables with the same update characteristics (such as small number of changes versus large number of changes, or tables that change frequently during a particular time of day) together under the same full-text catalog. 通过设置全文目录填充计划,会使全文索引与表保持同步,且在数据库活动较多时不会对数据库服务器的资源使用产生负面影响。By setting up full-text catalog population schedules, full-text indexes stay synchronous with the tables without adversely affecting the resource usage of the database server during periods of high database activity.

请考虑以下原则:Consider the following guidelines:

  • 如果创建索引的表有数百万行,请将该表分配到其自身的全文目录。If you are indexing a table with millions of rows, assign the table to its own full-text catalog.

  • 考虑要进行全文索引的表中发生的更改量以及总行数。Consider the amount of change occurring in the tables being full-text indexed, as well as the total number of rows. 如果要更改的总行数与上次全文填充期间表中出现的行数合起来达到了数百万行,请将该表分配到其自身的全文目录。If the total number of rows being changed, together with the number of rows in the table present during the last full-text population, represents millions of rows, assign the table to its own full-text catalog.

关联唯一索引Associate a unique index

始终选择可用于全文唯一键的最小唯一索引。Always select the smallest unique index available for your full-text unique key. (最好是 4 个字节、基于整数的索引。)这将显著减少文件系统中 MicrosoftMicrosoft Search 服务所需要的资源。(A 4-byte, integer-based index is optimal.) This significantly reduces the resources required by MicrosoftMicrosoft Search service in the file system. 如果主键较大(超过 100 个字节),可以考虑选择表中的另一个唯一索引(或创建另一个唯一索引)来作为全文唯一键。If the primary key is large (over 100 bytes), consider choosing another unique index in the table (or creating another unique index) as the full-text unique key. 否则,如果全文唯一键的大小超过所允许的最大值(900 个字节),全文填充将无法继续进行。Otherwise, if the full-text unique key size exceeds the maximum size allowed (900 bytes), full-text population will not be able to proceed.

关联非索引字表Associate a stoplist

“非索引字表” 是非索引字(也称为“干扰词”)的列表。A stoplist is a list of stopwords, also known as noise words. 非索引字表与每个全文索引相关联,因而该非索引字表中的词会应用于对该索引的全文查询。A stoplist is associated with each full-text index, and the words in that stoplist are applied to full-text queries on that index. 默认情况下,系统非索引字表与新的全文索引相关联。By default, the system stoplist is associated with a new full-text index. 也可以创建和使用你自己的非索引字表。You can create and use your own stoplist too.

例如,下面的 CREATE FULLTEXT STOPLIST Transact-SQLTransact-SQL 语句可通过从系统非索引字表进行复制来创建名为 myStoplist 的新全文非索引字表:For example, the following CREATE FULLTEXT STOPLIST Transact-SQLTransact-SQL statement creates a new full-text stoplist named myStoplist by copying from the system stoplist:

CREATE FULLTEXT STOPLIST myStoplist FROM SYSTEM STOPLIST;  
GO  

下面的 ALTER FULLTEXT STOPLIST Transact-SQLTransact-SQL 语句可更改名为 myStoplist 的非索引字表,首先为西班牙语添加词“en”,再为法语添加词“en”:The following ALTER FULLTEXT STOPLIST Transact-SQLTransact-SQL statement alters a stoplist named myStoplist, adding the word 'en', first for Spanish and then for French:

ALTER FULLTEXT STOPLIST myStoplist ADD 'en' LANGUAGE 'Spanish';  
ALTER FULLTEXT STOPLIST myStoplist ADD 'en' LANGUAGE 'French';  
GO  

有关详细信息,请参阅 为全文搜索配置和管理非索引字和非索引字表For more information, see Configure and Manage Stopwords and Stoplists for Full-Text Search.

更新全文索引Update a full-text index

与普通 SQL ServerSQL Server 索引一样,全文索引可以在相关表中的数据修改时自动更新。Like regular SQL ServerSQL Server indexes, full-text indexes can be automatically updated as data is modified in the associated tables. 这是默认行为。This is the default behavior. 另外,还可以手动或在预定的间隔更新全文索引。Alternatively, you can keep your full-text indexes up-to-date manually, or at specified scheduled intervals. 填充全文索引可能很耗时且要占用大量资源。Populating a full-text index can be time-consuming and resource-intensive. 因此,索引更新通常作为异步进程执行,该进程在后台运行,在基表中进行修改后使全文索引保持最新。Therefore, index updating is usually performed as an asynchronous process that runs in the background and keeps the full-text index up to date after modifications in the base table.

在基表中进行每次更改后立即更新全文索引也可能会占用大量的资源。Updating a full-text index immediately after each change in the base table is also resource-intensive. 因此,如果更新/插入/删除操作非常频繁,你可能会发现查询性能有所降低。Therefore, if you have a high update/insert/delete rate, you may experience some degradation in query performance. 如果出现这种情况,可以考虑制定一个手动的更改跟踪更新计划,以便按一定的间隔更新大量的更改,从而避免与查询争用资源。If this occurs, consider scheduling manual change tracking updates to keep up with the numerous changes from time to time, rather than competing with queries for resources.

有关详细信息,请参阅填充全文索引For more info, see Populate Full-Text Indexes.

后续步骤Next steps

设置 SQL Server 全文搜索后,便可以开始运行全文查询。After you set up SQL Server Full-Text Search, you're ready to run full-text queries. 有关详细信息,请参阅使用全文搜索查询For more info, see Query with Full-Text Search.