Developing Protocol Handler Add-ins

You can extend Windows Search to include new data stores by implementing a custom protocol handler.

Indexing Data Stores with Protocol Handlers

A data store is a content source (a database system, a directory, a file system) where data is stored and can be crawled by the Windows Search Indexer. The store can be hierarchical (like a database) or link-based (like a Web site). A protocol handler enables indexing applications like Windows Search to systematically crawl the nodes of a data store and extract relevant information to include in the index. Each protocol handler is used to index a specific type of data store. Windows Search ships with protocol handlers for file system stores and for both Microsoft Outlook and Outlook Express data stores (email stores, .PST files, and so on). When indexing Outlook email, for example, the protocol handler crawls all messages in a set of Outlook folders, extracting information from each message and attachment. This information is passed to the Indexer to include in the Windows Search catalog.
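The crawl described above can be illustrated with a minimal Python sketch. It is a conceptual model only, not the Windows Search API: a real protocol handler is a COM component, and the `STORE` dictionary, the `mystore://` URLs, and the `crawl` and `search` functions are all hypothetical names invented for this example. The sketch visits every node of a small hierarchical store breadth-first and collects the content of each leaf item into an in-memory index.

```python
from collections import deque

# Hypothetical hierarchical store: container URLs map to a list of child URLs,
# leaf URLs map to their text content.
STORE = {
    "mystore://root/": ["mystore://root/inbox/", "mystore://root/note1"],
    "mystore://root/inbox/": ["mystore://root/inbox/msg1", "mystore://root/inbox/msg2"],
    "mystore://root/note1": "quarterly budget draft",
    "mystore://root/inbox/msg1": "lunch on friday",
    "mystore://root/inbox/msg2": "budget review meeting",
}


def crawl(start_url, store):
    """Visit every node reachable from start_url, breadth-first.

    Returns a dict mapping each leaf URL to its content, standing in for
    the information a protocol handler passes to the indexer.
    """
    index = {}
    seen = {start_url}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        node = store[url]
        if isinstance(node, list):
            # Container: enqueue children that have not been visited yet.
            for child in node:
                if child not in seen:
                    seen.add(child)
                    queue.append(child)
        else:
            # Leaf item: record its content for the index.
            index[url] = node
    return index


def search(index, term):
    """Return the URLs whose content contains term, sorted for stable output."""
    return sorted(url for url, text in index.items() if term in text)
```

With this toy store, `search(crawl("mystore://root/", STORE), "budget")` returns the URLs of the note and the second inbox message. The `seen` set matters mainly for link-based stores, where the same item can be reached by more than one path.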

Often users need to search other data stores such as legacy databases, email stores or data structures not supported by Windows Search. You can extend Windows Search to crawl a new data store by implementing a protocol handler specifically for that data store. First, you should determine if a protocol handler already exists for your data store, perhaps for use with another application like SharePoint Services. If so, you can install that protocol handler on the system. Windows Search protocol handlers use design specifications similar to SharePoint Services, and they can often be used interchangeably.

Furthermore, if the data store contains data or file types other than the 200 file types supported by Windows Search, you also need to implement a filter to access and index the contents of items in the store. Windows Search uses protocol handler and IFilter technology similar to that used by SharePoint Services. If you already have filters for a specific store and file type installed on the system being indexed, Windows Search may be able to use the existing interfaces to index this data.

Roadmap to Adding New Data Stores

To extend Windows Search to crawl new data stores, you can create a protocol handler and one or more of the following add-ins: a context menu handler, an icon handler, and a SearchProtocolOptions add-in.

  1. Create and register a multithreaded protocol handler for the data store:
    • ISearchProtocol - This interface accesses a protocol and maps a URL to an IUrlAccessor.
    • IUrlAccessor - This is the main interface used for accessing items from the content source and binding the content to the appropriate filter.
    • IFilter - This interface returns the URL of each item in a folder as value properties for processing.
    Note: The minimum add-in functionality needed to return search results from a non-hierarchical data store is an implementation of the ISearchProtocol and IUrlAccessor interfaces.
  2. Create a shell namespace extension (IShellFolder), if the URL defined by your protocol handler is not already handled by the Shell, to provide user interface elements like context menus and file-specific icons. Refer to the Shell documentation for further instructions.
  3. Optionally, implement a mechanism to notify the Indexer of changes to your data store:
    • ISearchItemsChangedSink or ISearchPersistentItemsChangedSink - These interfaces enable your protocol handler to notify the Indexer of changes to your data store. This improves performance because the Indexer does not have to crawl the entire store during incremental indexing.
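
The division of labor in the roadmap above can be sketched in Python. This is a conceptual model under stated assumptions, not the COM API: the classes `SearchProtocol`, `UrlAccessor`, and `Indexer`, their methods, and the `mystore` scheme are hypothetical stand-ins that only mirror the roles of ISearchProtocol, IUrlAccessor, and a change sink. The sketch shows a handler being selected by URL scheme, a URL being mapped to an accessor, and change notifications letting the indexer revisit only the items that changed.

```python
from urllib.parse import urlsplit


class UrlAccessor:
    """Hypothetical stand-in for the IUrlAccessor role: exposes one item."""

    def __init__(self, url, content):
        self.url = url
        self.content = content


class SearchProtocol:
    """Hypothetical stand-in for the ISearchProtocol role for one URL scheme."""

    def __init__(self, scheme, items):
        self.scheme = scheme
        self.items = items  # url -> content, the pretend data store

    def create_accessor(self, url):
        # Map a URL to an accessor for the item it names.
        return UrlAccessor(url, self.items[url])


class Indexer:
    """Toy indexer that picks a protocol handler by URL scheme."""

    def __init__(self):
        self.handlers = {}  # scheme -> SearchProtocol
        self.catalog = {}   # url -> indexed content
        self.pending = []   # URLs reported as changed

    def register(self, handler):
        self.handlers[handler.scheme] = handler

    def index_url(self, url):
        handler = self.handlers[urlsplit(url).scheme]
        accessor = handler.create_accessor(url)
        self.catalog[accessor.url] = accessor.content

    def full_crawl(self, scheme):
        # Without change notifications, every item must be revisited.
        urls = list(self.handlers[scheme].items)
        for url in urls:
            self.index_url(url)
        return len(urls)

    def on_items_changed(self, urls):
        # Plays the role of a change sink: the store reports what changed.
        self.pending.extend(urls)

    def incremental_index(self):
        # Re-index only the URLs reported since the last pass.
        changed, self.pending = self.pending, []
        for url in changed:
            self.index_url(url)
        return len(changed)
```

In this model, after an initial `full_crawl`, a store that calls `on_items_changed` with a single URL causes `incremental_index` to touch one item rather than the whole store. The sketch does not handle deleted items or hierarchical containers.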