What’s new in Search in SharePoint 2010

Search is one of the key features of Microsoft Office SharePoint Server (MOSS), and SharePoint 2010 brings a lot of improvements in this area. In this post I would like to focus on the scalability improvements in this release compared to MOSS.

The major bottlenecks for search in MOSS are:

1. There is only one Search database, which is part of the SSP (Shared Services Provider), and sharing it between crawling and querying limits the system. This affects both crawl speed and query latency as the number of items in the index increases.

2. The single flat-file index on the query servers does not scale.

3. The indexer is a single point of failure for the search subsystem.

4. SQL Server carries the load of both crawling and querying, because the crawl and query tables are in the SSP Search database.

In SharePoint 2010, the search system can be split into multiple independently scalable components:

Crawl Components (Indexer)

If the crawl process is the bottleneck, add additional crawler machines.

Crawl history databases (SQL) and Metadata databases (SQL)

If the SQL database is the bottleneck, add additional databases.

Index Partitions (Query server)

If the flat-file index is the bottleneck, split it into multiple flat files.

Admin Component (not scaled-out)

Includes the associated search admin database, which stores configuration information.
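To make the idea of splitting one flat index into several partitions more concrete, here is a minimal Python sketch. It is not SharePoint code: the function names `partition_for` and `split_index` are hypothetical, and hashing the document ID to choose a partition is an assumption made for illustration only.

```python
import hashlib


def partition_for(doc_id: str, partition_count: int) -> int:
    """Map a document ID to a partition number (hypothetical hashing scheme)."""
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:4], "big") % partition_count


def split_index(doc_ids, partition_count):
    """Distribute document IDs across partition_count index partitions."""
    partitions = [[] for _ in range(partition_count)]
    for doc_id in doc_ids:
        partitions[partition_for(doc_id, partition_count)].append(doc_id)
    return partitions


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(10)]
    for number, partition in enumerate(split_index(docs, 3)):
        print(number, partition)
```

Each partition holds only a subset of the documents, so in a model like this a query server hosting one partition handles a smaller index, and adding partitions spreads the same content over more servers.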

One interesting point to note is that the crawler machine is a stateless worker, meaning it does not store any index on its hard drive. It completes indexing and propagates the content to the query servers.
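The stateless-crawler idea can be illustrated with a small Python model. This is only a sketch under stated assumptions: `QueryServer` and `crawl_and_propagate` are hypothetical names, the "index" is a simple term-to-documents dictionary, and the choice of target server by CRC32 of the document ID is an illustrative assumption, not how SharePoint actually routes content.

```python
import zlib


class QueryServer:
    """Hypothetical query server that owns its part of the index."""

    def __init__(self, name):
        self.name = name
        self.index = {}  # term -> set of document IDs

    def receive(self, term, doc_id):
        self.index.setdefault(term, set()).add(doc_id)


def crawl_and_propagate(documents, query_servers):
    """Index each document and push the result to a query server.

    The function keeps no index of its own: once it returns, all
    indexed content lives on the query servers only.
    """
    for doc_id, text in documents.items():
        # Illustrative routing: pick a server from a checksum of the ID.
        target = query_servers[zlib.crc32(doc_id.encode("utf-8")) % len(query_servers)]
        for term in set(text.lower().split()):
            target.receive(term, doc_id)


if __name__ == "__main__":
    servers = [QueryServer("query1"), QueryServer("query2")]
    crawl_and_propagate({"a": "hello world", "b": "hello sharepoint"}, servers)
    for server in servers:
        print(server.name, server.index)
```

Because the crawl step holds no state in this model, losing or adding a crawler does not lose any indexed content, which is the property the paragraph above describes.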