question

VENKATRAMALINGAM-2212 avatar image
0 Votes"
VENKATRAMALINGAM-2212 asked VENKATRAMALINGAM-2212 commented

Fulltext Search conflicting with currency Keyword

We have a situation where fulltext search index column has one of the thailand currency word 'THB' conflicting with the regular search and not resulting in the query outcome ,when the command CONTAINS is used.Instead of treating as regular word , it treats it as currency.But when we change the word to lower case in the database, it finds it.I want the search to work as regular word search instead of sensing it as a currency.
Any help is appreciated to resolve this issue.
VR

sql-server-general
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Cathyji-msft avatar image
0 Votes"
Cathyji-msft answered

Hi @VENKATRAMALINGAM-2212,

Full-text queries are not case-sensitive.

What is the data type of this column? A full-text index includes one or more character-based columns in a table. These columns can have any of the following data types: char, varchar, nchar, nvarchar, text, ntext, image, xml, or varbinary(max) and FILESTREAM.

Refer to MS document Full-Text Search.


If the response is helpful, please click "Accept Answer" and upvote it, as this could help other community members looking for similar queries.




5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

VENKATRAMALINGAM-2212 avatar image
0 Votes"
VENKATRAMALINGAM-2212 answered

The fulltext search indexed column is of varchar datatype.I understand Full text queries aren't case sensitive, but when I change the word case from 'TBH' to 'tbh' I get the search results using CONTAINS command.Basically,I think it invalidates the inbuilt currency recognition and acts like a regular string search when changed to lower case.


5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

ErlandSommarskog avatar image
0 Votes"
ErlandSommarskog answered

Hm, this seems to return THB:

CREATE FULLTEXT CATALOG katha

CREATE TABLE slaskis(id int CONSTRAINT pk_slaskis PRIMARY KEY (id), mytext varchar(200))
INSERT slaskis (id, mytext)
   VALUES(1, 'Detta är en text med THB'), (2, 'Detta är en text med SEK')

CREATE FULLTEXT INDEX ON slaskis (mytextLANGUAGE 1053) KEY INDEX pk_slaskis  ON Katha

SELECT *FROM slaskis WHERE CONTAINS(mytext, 'THB')

Do you have a similar repro to demonstrate your issue?

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

VENKATRAMALINGAM-2212 avatar image
0 Votes"
VENKATRAMALINGAM-2212 answered

Thanks for the example :)
please try the same with below changes. I have numbers added before the currency symbol and query the numbers and see what you get.
And try the same with inserting lower case.

CREATE FULLTEXT CATALOG katha
CREATE TABLE slaskis(id int CONSTRAINT pk_slaskis PRIMARY KEY (id), mytext varchar(200))
SELECT * FROM slaskis
INSERT slaskis (id, mytext)
VALUES(1, '1111 THB'), (2, '2222 SEK')
CREATE FULLTEXT INDEX ON slaskis ([mytext] LANGUAGE 'English') KEY INDEX pk_slaskis ON Katha
SELECT *FROM slaskis WHERE CONTAINS(mytext, '"1111"')

5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Cathyji-msft avatar image
0 Votes"
Cathyji-msft answered Viorel-1 commented

Hi @VENKATRAMALINGAM-2212,

It seems you misunderstood the CONTAINS function.

Quote from MS document CONTAINS (Transact-SQL).

Searches for precise or fuzzy (less precise) matches to single words and phrases, words within a certain distance of one another, or weighted matches in SQL Server.

CONTAINS can search for:

• A word or phrase.
• The prefix of a word or phrase.
• A word near another word.
• A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
• A word that is a synonym of another word using a thesaurus (for example, the word "metal" can have synonyms such as "aluminum" and "steel").

Below is a test in my environment.

108010-screenshot-2021-06-22-172855.jpg


If the response is helpful, please click "Accept Answer" and upvote it, as this could help other community members looking for similar queries.





· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.


Does not seem to work when the row is (6, '1 THB'). The word is not found.

0 Votes 0 ·
ErlandSommarskog avatar image
0 Votes"
ErlandSommarskog answered

I don't think there is much can be done about this. If you run:

SELECT * FROM sys.dm_fts_index_keywords_by_document(db_id(), object_id('slaskis'))

You will see that 1111 THB and 2222 SEK is considered to be "words", whereas 1111 ABC is not.


5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Viorel-1 avatar image
1 Vote"
Viorel-1 answered Viorel-1 edited

Maybe you can use your finding as a workaround. Add a calculable column like this:

alter table slaskis add mytext2 as lower(mytext)

It does not have to be populated. Then add the full-text index to mytext2 instead of mytext. Probably this will not require more space.

Then use mytext2 for searches and mytext for displaying the results.

Both seem to work:

SELECT * FROM slaskis WHERE CONTAINS(mytext2, 'THB')
SELECT * FROM slaskis WHERE CONTAINS(mytext2, '1111')


· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Good thinking, Viorel!

0 Votes 0 ·
VENKATRAMALINGAM-2212 avatar image
0 Votes"
VENKATRAMALINGAM-2212 answered VENKATRAMALINGAM-2212 commented

Thanks for the reply Viorel and love the approach!
But there following challenges I have on this solution:
The table is huge , atleast 400 million records and has space constraints
We basically have a indexed view created on top of this base table on which the Full text Index has been created and searches are based on a combined field( concatenate multiple columns into one search column on the indexes view)

· 2
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

So if the column you have full-text indexed is a computed column in a view, can't you change the formula to include lower()?

Or are you displaying that concatenated column? In that case, I guess you need two of them.

I think Viorel's suggestion is the best one. I doubt that there is a secret switch to change this behaviour.

0 Votes 0 ·

Apparently , looks like there is no other way other than changing the indexes view and rebuilding the indexes. Which we were trying to avoid with this huge table size.
But thanks for all your inputs and appreciate the feedback so far.

0 Votes 0 ·