Getting a custom IFilter working with SQL Server 2008/R2 (IFilterSample)

In this post I will cover how to get a custom IFilter working on SQL Server 2008/R2 (64-bit or 32-bit). For the purpose of this demo, I am using the IFilterSample that is available with Windows Server 2008 SDK. Once you have installed the SDK you can find the sample here -> C:\Program Files\Microsoft SDKs\Windows\v6.1\Samples\winui\wsearch\IFilterSample

The IFilterSample implements a sample IFilter for a fictitious file format called “ .filtersample” which is actually an XML file. This sample will work on Windows Vista and Windows 7. 

Developers out there might already be familiar with using the IFilter interface to develop the filters. For those who aren’t you can start from here,

Windows Search Interfaces
https://msdn.microsoft.com/en-us/library/ee872063(v=VS.85).aspx

Data Add-in Interfaces
https://msdn.microsoft.com/en-us/library/ee872090(v=VS.85).aspx

IFilter Interface
https://msdn.microsoft.com/en-us/library/bb266451(v=vs.85).aspx

I am going to use IFilterSample as reference and must warn you this post is NOT a how-to-write-custom-IFilter lesson :).
The IFilterSample is a native C++ COM based DLL and as part of its registration provides a CLSID that can be registered on the system. Here are the step-by-step instructions you came here for:-

  1. Download and install the Windows Server 2008 SDK on your machine.
  2. Copy the sample from C:\Program Files\Microsoft SDKs\Windows\v6.1\Samples\winui\wsearch\FilterSample, onto your desktop which is the copy we will use to build the project.
  3. Open the FilterSample.sln file and  in Additional Dependencies in Linker Properties, add xmllite.lib

                 
     
  4. Build the project for x64 or Win32 as appropriate (this would be dependent on the architecture of your SQL Server this IFilter will be used on).
  5. Once the project is built you will have FilterSample.DLL in your Debug or Release folder. Make sure that the SQL Server service account has FULL CONTROL permissions on this folder.
  6. From an elevated command prompt browse to the location of the DLL and register it using regsvr32 FilterSample.dll

     
     
     
  7. Now we need to create some registry keys in SQL Server to make it use this filter DLL.
    1. The keys I created below are for a default instance of SQL Server 2008, so change the InstanceID as appropriate.

    2. The CLSID given below is specific to this IFilterSample only.

    3. Notice that the file extension I provided is called “.filtersample”.

    4. The path provided has to match the path from where it was registered in step #6.

      [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSearch\Filters\.filtersample]
      @="{6C337B26-3E38-4F98-813B-FBA18BAB64F5}"

      [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSearch\CLSID\{6C337B26-3E38-4F98-813B-FBA18BAB64F5}]
      @="C:\\Users\\sudarn\\Desktop\\FilterSample\\Release\\FilterSample.dll"
      "ThreadingModel"="Both"
      "Flags"=dword:00000001

 

8. Now, connect to the SQL instance as a sysadmin and run the following sequence of commands

  EXEC sp_fulltext_service @action='load_os_resources', @value=1;
 EXEC sp_fulltext_service 'verify_signature', 0
 EXEC sp_fulltext_service 'update_languages'; 
 EXEC sp_fulltext_service 'restart_all_fdhosts';
 

 9. To verify the FilterSample.DLL has been successfully loaded and recognised by SQL Server, you can check the DMV sys.fulltext_document_types or use sp_help_fulltext_system_components 'filter'

 
That was all, folks! You can now do a test to verify that the search is working as expected. There are 2 ways of testing the IFilter.

 1.       Using SQL Server

    1. Create a test table and enable it for full-text indexing.
 CREATE TABLE [dbo].[testTBL](
 [doctype] [nvarchar](50) NOT NULL,
 [document] [varbinary](max) NULL,
 [docname] [varchar](50) NOT NULL,
 CONSTRAINT [PK_testTBL] PRIMARY KEY CLUSTERED 
 (
 [docname] ASC
 )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
 ) ON [PRIMARY]
 
 GO
 CREATE FULLTEXT CATALOG [CatalogTEST] WITH ACCENT_SENSITIVITY = OFF
 AUTHORIZATION [dbo]
 
 GO
 CREATE FULLTEXT INDEX ON [dbo].[testTBL](
 [docname] LANGUAGE [English], 
 [doctype] LANGUAGE [English], 
 [document] TYPE COLUMN [doctype] LANGUAGE [English])
 KEY INDEX [PK_testTBL]ON ([CatalogTEST], FILEGROUP [PRIMARY])
 WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM)
 GO
 

  b. Insert a sample document into the table. The IFilterSample comes with a file called “SampleFile.filtersample”, which can be used for test purposes. 

 INSERT INTO dbo.testTBL(doctype,docname,document) 
 SELECT '.filtersample' AS doctype, 'SampleFile.filtersample' AS docname, 
 * FROM OPENROWSET(BULK N'C:\Users\sudarn\Desktop\FilterSample\SampleFile.filtersample', SINGLE_BLOB) AS Document

c. Perform a test search

 select * from dbo.testTBL where contains(document, 'emergency')

d. You can also check to see the document was indexed by querying DMV,

 SELECT * FROM sys.dm_fts_index_keywords(db_id('test'), object_id('dbo.testTBL));

 
2.       Using FILTDUMP.exe

Filtdump.exe is a sample tool that is available again in WinSDK and can be used to test a particular filter by loading it and then getting the text chunk out of any sample document. It is available in both 32-bit and 64-bit versions. It can be found here once you install WinSDK,

C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\x64\filtdump.exe
C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\filtdump.exe

For example,
C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\x64>filtdump.exe -b "C:\Users\sudarn\Desktop\FilterSample\SampleFile.filtersample"
C:\Program Files\Microsoft SDKs\Windows\v6.1\Bin\x64>filtdump.exe "C:\Users\sudarn\Desktop\FilterSample\SampleFile.filtersample"

<snippet of the output>

----------------------------------------------------------------------

Type = 31 (0x1f): AuthorXYZ

----------------------------------------------------------------------

                                Attribute = B725F130-47EF-101A-A5F102608C9EEBAC\19

                                idChunk = 4

                                BreakType = 0 (No Break)

                                Flags(chunkstate) = 0x1 (Text)

                                Locale = 1033 (0x409)

                                IdChunkSource = 0

                                cwcStartSource = 0

                                cwcLenSource = 0

----------------------------------------------------------------------

 this is a test of the emergency broadcasting system.

 This is the content of the document.

 I'm most pleased that it is the content of the document.

 </snippet>

If you see the above then the IFilter is working fine as you are able to read it and get the text chunks out.

Possible Issue(s)

You might receive this error message in the SQLFT logs when trying to index a newly inserted row. This indicates that the IFilter was not recognised and there was an issue in loading it.

“Warning: No appropriate filter was found during full-text index population for table or indexed view '[test].[dbo].[Documents]' (table or indexed view ID '1003150619', database ID '6'), full-text key value 'SampleFile.filtersample'. Some columns of the row were not indexed.”

 (OR)

Error '0x8004170c: The document format is not recognized by the filter.' occurred during full-text index population for table or indexed view '[test].[dbo].[Documents]' (table or indexed view ID '1003150619', database ID '6'), full-text key value 'SampleFile.filtersample'. Failed to index the row.

 The possible things to check for above error are:-

  1. Ensure that the registration and other steps given above are done in the same sequence as indicated.
  2. If you are using the Debug build ensure that the VCRuntime Debug dependency DLL is present (msvcr90d.dll or msvcr100d.dll).
  3. Use Process Monitor to ensure that your DLL is getting loaded
  4. Use Dependency Walker to ensure there are no other missing dependencies for your DLL.
  5. Verify that the DLL architecture (x64/x86) matches your SQL Server architecture. You cannot mix & match this.

 

I must point out one important thing here if you are going to use this DLL on other machines. You will need to have the Visual Studio Redistributable package installed on the target machine where you plan to deploy this. You will need the corresponding redistributable package for the VS version you used to build the DLL.

Have fun IFiltering and stay tuned for more on this blog.