IWordBreaker Interface

The IWordBreaker interface is implemented by a language-specific language resource component called a word breaker. The word breaker parses text and identifies individual words and phrases. Because the word breaker runs in background processes, it must be optimized for both high throughput and minimal use of resources.

IWordBreaker Members

Method             Description
BreakText          Parses text to identify words and phrases and provides the results to the WordSink and PhraseSink objects.
ComposePhrase      Not currently supported.
GetLicenseToUse    Gets a pointer to the license information for this implementation of the IWordBreaker interface.
Init               Initializes the IWordBreaker implementation and indicates the mode in which the component operates.

Remarks

When to Implement

Implement this interface to create a custom word breaker for a language. Windows Search calls the methods of this interface when it builds content indexes and runs queries.
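
The following is a minimal, portable sketch of the kind of scanning loop a BreakText implementation might perform. It is not the COM interface itself: SimpleWordSink and BreakTextSketch are hypothetical names introduced for illustration, the sink is only a stand-in for the WordSink object, and the rule used here (a word is a run of alphanumeric characters) is an assumption that a real language-specific word breaker would replace.

```cpp
#include <cassert>
#include <cstddef>
#include <cwctype>
#include <string>
#include <vector>

// Hypothetical stand-in for the WordSink object: records each word
// together with its offset in the source text.
struct SimpleWordSink {
    struct Entry {
        std::wstring word;
        std::size_t  sourcePos;
    };
    std::vector<Entry> entries;

    void PutWord(const wchar_t* text, std::size_t length, std::size_t sourcePos) {
        entries.push_back({std::wstring(text, length), sourcePos});
    }
};

// Hypothetical helper: reports every run of alphanumeric characters in
// buffer[0..length) to the sink. Returns false if an argument is invalid.
bool BreakTextSketch(const wchar_t* buffer, std::size_t length, SimpleWordSink* sink) {
    if (buffer == nullptr || sink == nullptr) {
        return false;
    }
    std::size_t pos = 0;
    while (pos < length) {
        // Skip separators (anything that is not a letter or digit).
        while (pos < length && !std::iswalnum(static_cast<std::wint_t>(buffer[pos]))) {
            ++pos;
        }
        if (pos == length) {
            break;  // only separators remained
        }
        // Consume one word.
        const std::size_t start = pos;
        while (pos < length && std::iswalnum(static_cast<std::wint_t>(buffer[pos]))) {
            ++pos;
        }
        assert(pos > start);  // at least one character was consumed
        sink->PutWord(buffer + start, pos - start, start);
    }
    return true;
}
```

A caller would pass the text buffer, its length in characters, and a sink, and then read the recorded words and source offsets from sink.entries. The loop never reads past the supplied length and does not rely on a terminating null character.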

Word breaker components for Windows Search run in the Local System security context. They should be written to manage buffers and the stack correctly. Every string copy must include an explicit check to guard against buffer overruns: always verify the allocated size of the destination buffer and test the size of the data against it before copying.
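
The buffer check described above can be sketched as follows. CopyChecked is a hypothetical helper, not part of any Windows API; it assumes that the caller knows the allocated size of the destination in characters and that the source and destination do not overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies srcLen characters plus a terminating null
// into dest only if they fit. destCapacity is the allocated size of dest,
// in characters. Returns false, without overrunning dest, when the data
// does not fit or an argument is invalid.
bool CopyChecked(wchar_t* dest, std::size_t destCapacity,
                 const wchar_t* src, std::size_t srcLen) {
    if (dest == nullptr || src == nullptr || destCapacity == 0) {
        return false;
    }
    // Room is needed for srcLen characters and the terminating null.
    if (srcLen >= destCapacity) {
        dest[0] = L'\0';  // leave dest as a valid empty string
        return false;
    }
    assert(srcLen + 1 <= destCapacity);  // invariant established by the check above
    std::memcpy(dest, src, srcLen * sizeof(wchar_t));
    dest[srcLen] = L'\0';
    return true;
}
```

The size test comes before any write to the destination, so a string that is too long is rejected rather than truncated silently or copied past the end of the buffer.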

Interface Information

Inherits from                IUnknown
Header                       indexsrv.h
Import library               User-defined
Minimum operating systems    Windows NT 4.0 with the Windows NT 4.0 Option Pack, Windows 2000

See Also

Implementing a Word Breaker, Language Resource Samples, PhraseSink, Secure Code Practices, WordSink