Thomas-4898 asked ·

How to filter any field that contains a # or does not end with .htm

Hi all,

I have a URL field in my index and would like to filter out any entries where the field either contains a '#' or does not end with '.htm'.

I'm not sure what the $filter query string needs to look like.

https://xxx.search.windows.net/indexes/xxx/docs?api-version=2020-06-30-Preview&$filter=???&search=*

Any suggestions?

Thomas

azure-cognitive-search

1 Answer

brtrachMSFT-0711 answered ·

@Thomas-4898 Thank you for your question.

First, any # that appears in the request URL needs to be encoded as %23, because # is considered an unsafe URL character. More information on encoding unsafe URL characters can be found here.
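To illustrate the encoding step, here is a minimal Python sketch that percent-encodes a filter expression before placing it in the query string. The helper name build_query_url and its parameters are made up for this example, and the filter text is only there to show how a literal # is encoded; it is not meant as a working filter for this question.

```python
from urllib.parse import quote


def build_query_url(service: str, index: str, filter_expr: str, search: str = "*") -> str:
    """Build a docs query URL with the $filter and search values percent-encoded.

    Hypothetical helper: the URL shape is copied from the question above.
    """
    # safe="" forces every reserved character, including '#', to be encoded.
    encoded_filter = quote(filter_expr, safe="")
    encoded_search = quote(search, safe="")
    return (
        f"https://{service}.search.windows.net/indexes/{index}/docs"
        f"?api-version=2020-06-30-Preview"
        f"&$filter={encoded_filter}&search={encoded_search}"
    )


# A filter expression containing a literal '#', used only to show the encoding.
url = build_query_url("xxx", "xxx", "url ne '#'")
print(url)
```

Without the encoding, everything from the # onward would be treated as a URL fragment and would not be sent to the service.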

Since it sounds like you want this applied to every query, regardless of the full-text search terms, I'd suggest using an OData filter with an operator such as not, so that results whose field contains %23 or does not end with .htm are excluded. More information on OData filters can be found here.

Please review this content and let us know if there are any further questions on the matter.


@brtrachMSFT-0711 thanks for your suggestion.

Unfortunately, I cannot change the data in the index, so the field 'url' will always contain values like these:

34700.htm#o34790
34300.htm#o32340
etc.

Those are the ones I would like to ignore.

I tried the following with no success:

$filter=url ne '%23'
$filter=url ne '#'
$filter=url eq '*.htm'
$filter=endswith(url,'.htm')

And yes, this is independent of the search term.

Thanks for your help!
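A note on the attempts above: in OData, ne and eq compare the whole field value, so url ne '#' only excludes documents whose url is exactly '#', and the * in url eq '*.htm' is treated as a literal character rather than a wildcard. As far as I know, endswith is also not among the functions that Azure Cognitive Search accepts in $filter. One possible fallback, sketched below in Python, is to apply the rule on the client after the results come back. The function names and the sample values without a # are hypothetical, and this is a workaround rather than a server-side filter.

```python
def keep_url(url: str) -> bool:
    """Return True if the value has no '#' and ends with '.htm'."""
    return "#" not in url and url.endswith(".htm")


def filter_docs(docs, field="url"):
    """Keep only the documents whose `field` value passes keep_url.

    `docs` is assumed to be a list of dicts, e.g. the parsed "value"
    array of a search response.
    """
    return [d for d in docs if keep_url(d.get(field, ""))]


# Sample data: two values from this thread plus two hypothetical ones.
docs = [
    {"url": "34700.htm"},         # hypothetical, should be kept
    {"url": "34700.htm#o34790"},  # from the thread, contains '#'
    {"url": "34300.htm#o32340"},  # from the thread, contains '#'
    {"url": "index.html"},        # hypothetical, does not end with '.htm'
]
kept = filter_docs(docs)
print(kept)
```

Filtering on the client means paging and result counts from the service no longer match what is displayed, so this only fits cases where the result sets are small.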
