Share via


Behavior Changes to Full-Text Search

This topic describes behavior changes in full-text search. Behavior changes affect how features work or interact in SQL Server 2014 as compared to earlier versions of SQL Server.

Behavior Changes in Full-Text Search in SQL Server 2014

Information to come later.

Behavior Changes in Full-Text Search in SQL Server 2012

SQL Server 2012 installs a new version of the word breakers and stemmers for US English (LCID 1033) and UK English (LCID 2057). However you can switch to the previous version of these components if you want to retain the previous behavior. For more information, see Change the Word Breaker Used for US English and UK English.

New Word Breakers and Stemmers Installed

SQL Server 2012 updates all the word breakers and stemmers used by Full-Text Search and Semantic Search. For consistency between the contents of indexes and the results of queries, we recommend that you repopulate existing full-text indexes.

  1. There are new word breakers for English. If you have to retain the previous behavior, see Change the Word Breaker Used for US English and UK English.

  2. The third-party word breakers for Danish, Polish, and Turkish that were included with previous releases of SQL Server have been replaced with Microsoft components. The new components are enabled by default.

  3. There are new word breakers for Czech and Greek. Previous releases of SQL Server Full-Text Search did not include support for these two languages.

Behavior Changes of New Word Breakers and Stemmers

The new components might return different results than the older components when you populate and query full-text indexes. The following tables demonstrate some of the differences that can be expected in English results.

If you have to retain the previous behavior of the word breakers and stemmers, see the following topics:

In some cases, the new components return more results:

Term Results with previous word breaker and stemmer Results with new word breaker and stemmer
cat-dog cat

dog
cat

cat-dog

dog
cat@dog.com cat

com

dog
cat

cat@dog.com

com

dog
12/11/2011

(where the term is a date)
12/11/2011

dd20111211
11

12

12/11/2011

2011

dd20111211

In some cases, the new components return similar results:

Term Results with previous word breaker and stemmer Results with new word breaker and stemmer
100$ 100$

nn100$
100$

nn100usd
022 022

nn022
022

nn22
10:49AM

(where the term is a time)
10:49am

tt1049
10:49am

tt24104900

In some cases the new components return fewer results or results that may be unexpected by applications:

Term Results with previous word breaker and stemmer Results with new word breaker and stemmer
jěˊÿqℭžl

(where the terms are not valid English characters)
'jěˊÿqℭžl' je yq zl
table's table's

table
table's
cat- cat

cat-
cat
v-z*(where v and z are noise words)* (no results) v-z
$100 000 USD $100

000

nn000

nn100$

usd
$100 000 usd

nn100000usd
beautiful U.S land beautiful

land

u.s

us
beautiful

land
Mt. Kent and Mt Challenger challenger

kent

mt

mt.
mt

kent

challenger

Behavior Changes in Full-Text Search in SQL Server 2008

In SQL Server 2008 and later versions, the Full-Text Engine is integrated as a database service into the relational database as part of the server query and storage engine infrastructure. The new full-text search architecture achieves the following goals:

  • Integrated storage and management-Full-text search is now integrated directly with the inherent storage and management features of SQL Server, and the MSFTESQL service no longer exists.

    • Full-text indexes are stored inside the database filegroups, rather than in the file system. Administrative operations on a database, such as creating a backup, automatically affect its full-text indexes.

    • A full-text catalog is now a virtual object that does not belong to any filegroup; it is a logical concept that refers to a group of full-text indexes. Therefore, many catalog-management features have been deprecated, and deprecation has created breaking changes for some features. For more information, see Deprecated Database Engine Features in SQL Server 2014 and Breaking Changes to Full-Text Search.

      Note

      SQL Server 2005 Transact-SQL DDL statements that specify full-text catalogs work correctly.

  • Integrated query processing-The new full-text search query processor is part of the Database Engine and is fully integrated with the SQL Server Query processor. This means that, the query optimizer recognizes full-text query predicates and automatically executes them as efficiently as possible.

  • Enhanced administration and troubleshooting-Integrated full-text search provides tools to help you analyze search structures such as the full-text index, the output of a given word breaker, stopword configuration, and so forth.

  • Stopwords and stoplists have replaced noise words and noise-word files. A stoplist is a database object that facilitates manageability tasks for stopwords and improves the integrity between different server instances and environments. For more information, see Configure and Manage Stopwords and Stoplists for Full-Text Search.

  • SQL Server 2008 and later versions include new word breakers for many of the languages that exist in SQL Server 2005. Only the word breakers for English, Korean, Thai, and Chinese (all forms) remain the same. For other languages, if a full-text catalog was imported when a SQL Server 2005 database was upgraded to SQL Server 2008 or a later version, one or more languages used by the full-text indexes in full-text catalog might now be associated with new word breakers that might behave slightly differently from the imported word breakers. For more information about how to ensure consistency between queries and the full-text index content, see Upgrade Full-Text Search.

  • A new FDHOST Launcher (MSSQLFDLauncher) service has been added. For more information, see Get Started with Full-Text Search.

  • Full-text indexing works with a FILESTREAM column in the same way that it does with a varbinary(max) column. The FILESTREAM table must have a column that contains the file name extension for each FILESTREAM BLOB. For more information, see Query with Full-Text Search,Configure and Manage Filters for Search, and sys.fulltext_document_types (Transact-SQL).

    The full-text engine indexes the contents of the FILESTREAM BLOBs. Indexing files such as images might not be useful. When a FILESTREAM BLOB is updated it is reindexed.

See Also

Full-Text Search
Full-Text Search Backward Compatibility
Upgrade Full-Text Search
Get Started with Full-Text Search