Creating stoplist on database of compatibility 80/90 in SQL server 2008

Introduction:

In this blog we will see how to add a stopword in a stoplist and also look at one of the common scenario which DBA face when they upgrade from SQL 2000\2005 to 2008 keeping compatibility level 80.

So to introduce the topic lets know few terms:

Stopwords: These are the common words in various languages which we generally do not want to search because of their common usage. So SQL FTS ignores these words while searching.

Stoplist: A collection of stopwords is called as stoplists. It varies with language. FTS searches for stopwords in stoplists when searching for a string.

Noise Words: These are another term of stopwords used till SQL 2005.

With this information lets discuss the scenario.

Scenario:

You are running on SQL 2008 which you have recently upgraded from SQL 2000. But due to application constraint you can’t update your database compatibility level to 100. And, you want to add stopwords on such databases.

When you try to do this you end up with this error:

Cannot execute changes.

------------------------------

ADDITIONAL INFORMATION:

An exception occurred while executing a Transact-SQL statement or batch. (Microsoft.SqlServer.ConnectionInfo)

------------------------------

Incorrect syntax near 'STOPLIST'. (Microsoft SQL Server, Error: 102)

For help, click: https://go.microsoft.com/fwlink?ProdName=Microsoft+SQL+Server\&ProdVer=10.50.1600\&EvtSrc=MSSQLServer\&EvtID=102\&LinkId=20476

------------------------------

So we will have to follow some workaround to fulfill our criteria and be able to create stoplists.

Problem:

Let say I am designing a website (for example: Job Portal). As you would be aware there are some default stopwords which SQL does not uses it in FTS.

For example: "It" is a commonly used word in English so SQL FTS stores this word in DMV: sys.fulltext_stopwords so that it does not search such commonly used word. But that could be a problem since in a job portal website we would not be able to search for "IT" related jobs as this word would be blocked by SQL.

To read more about stoplist and stopwords please refer:
https://msdn.microsoft.com/en-us/library/ms142551.aspx

Till 2005 this could be easily changed by modifying them in text file located in:

C:\Program Files\Microsoft SQL Server\MSSQL.<Instance ID>\MSSQL\FTData

In 2008 we can create a stoplist using the management studio or using the commands. I am using the GUI which is what makes SQL so easy but those who want to try commands please follow this link:

https://msdn.microsoft.com/en-us/library/cc280405.aspx

But in this scenario we were not able to do that and were getting the error messages mentioned in the scenario section.

Workaround:

Step 1) Since our database is in 80 we will not be able to add stoplist but we will follow a workaround. First in database properties option we will change the compatibility level to 100

Step 2) Then Expand Database, Go to storage and right click on Full-Text Stoplist to create a new stoplist

Step 3) In our case we will use system stoplist to get the pre-determined set of stopwords and modify it as per our convenience

clip_image001

Step 4) So now you would see a "test" stoplist present in Full-Text Stoplist.

To find the words present in this stoplist we again use the DMV:

Select * from sys.fulltext_stopwords where language_id=1033

To know more about language id:

https://msdn.microsoft.com/en-us/library/ms190303.aspx

Now as we remember we are working in a Job Portal so we may want to remove the word IT so that it is enable in our search. And we may also want to add some words which we don’t want in our search say "Fast" so we will add in Stopwords.

This can be done by double clicking on our stoplist and adding a stopword:

clip_image002

You can also delete a stopword from changing the Action and removing the stopword "IT"

This way we will make the changes that we want and after we have done that query the stoplist to have a final check if all stopwords are as required.

select * from sys.fulltext_stopwords where language_id=1033

Step 5) At the end we can change the compatibility level to 80 as required. But every time whenever I make changes I will have to again make those changes by repeating all the steps

For understanding SQL more about stoplist and stopwords you can use this MSDN link: https://msdn.microsoft.com/en-us/library/cc721269(v=sql.100).aspx#_Toc202506235

Written By : - Divya Jotwani,Support Engineer,Microsoft SQL server support

Reviewed By: – Robert Beene,Sr. Support Escalation Engineer,Microsoft SQL server support