Gather information about the current search environment (Search Server 2010)

 

Applies to: Search Server 2010

Topic Last Modified: 2010-11-02

An important step in planning the enterprise search solution is to gather information about the current environment. Collect the following types of information and reports:

  • Information about the organization

  • Information about the topology

  • Current settings for search

  • Performance and usage reports

You will need this information to plan the search topology, crawling and federation, and the end-user search experience.

Organization information

Gather the following information about the organization:

  • User, business, and functional requirements for the enterprise search solution, along with any service level agreements (SLAs). This information will help you to design and build the search solution and, during testing, to verify that the solution meets the requirements.

  • Contact information for existing farm administrators, search administrators, site collection administrators, site owners, and any other stakeholders for the enterprise search solution. This information will help you to plan the enterprise search team, and it also provides a contact list for any communications that occur during planning, deployment, and operations.

Topology information

Gather the following information about the topology:

  • Current topology diagrams. You will refer to these while planning the topology.

  • Locations of content repositories that should be included in search results, including SharePoint sites, Web sites, file shares, Exchange public folders, business data sources, user profile stores, Lotus Notes databases, and external sites.

  • Locations of users.

Current search settings

If you are starting from a previous version of SharePoint Products and Technologies, gather the following information about the current settings for search:

  • Default content access account

  • Content source settings, including the following settings for each content source:

    • Content source name

    • Content source type

    • Start addresses

    • Crawl settings

    • Full crawl schedule

    • Incremental crawl schedule

  • Crawler impact rules, including the following settings for each crawler impact rule:

    • Site (URL)

    • Request frequency

  • Crawl rules, including the following settings for each crawl rule:

    • Path

    • Crawl configuration (excluded or included items)

    • Content access account

  • Third-party or custom connectors (called protocol handlers in prior versions)

  • File types included in the file-type inclusions list, and whether each file type requires an additional IFilter

  • File types removed from the file-type inclusions list

  • Languages for which word breakers and stemmers are installed

  • Farm-level search settings, including the following information:

    • Contact e-mail address

    • Proxy server settings (address, port, whether to bypass for local addresses, and addresses for which you do not want to use a proxy server)

    • Crawler time-out settings (connection time and request acknowledgement time)

    • SSL certificate warning configuration

  • Scope settings

  • Crawl settings

  • The following additional settings:

    • Federated locations

    • Server name mappings

    • Indexer performance settings

    • Crawled properties

    • Managed properties

    • Search result removal

    • Alerts

    • Keywords

    • Best Bets

    • Authoritative pages
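One way to keep the gathered settings consistent across content sources and rules is to record them in a small structured inventory rather than in free-form notes. The following Python sketch is only an illustration under stated assumptions: the class names, field names, account name, and URL are hypothetical, it covers only a subset of the settings listed above, and it does not call any Search Server API. The values are entered by hand from the existing administration pages.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class ContentSource:
    """Settings recorded for one content source (hypothetical field names)."""
    name: str
    source_type: str
    start_addresses: List[str]
    crawl_settings: str
    full_crawl_schedule: str = ""          # empty means "not recorded yet"
    incremental_crawl_schedule: str = ""   # empty means "not recorded yet"


@dataclass
class CrawlerImpactRule:
    site: str
    request_frequency: str


@dataclass
class CrawlRule:
    path: str
    include: bool                          # True = included, False = excluded
    content_access_account: str = ""


@dataclass
class SearchSettingsInventory:
    default_content_access_account: str
    content_sources: List[ContentSource] = field(default_factory=list)
    crawler_impact_rules: List[CrawlerImpactRule] = field(default_factory=list)
    crawl_rules: List[CrawlRule] = field(default_factory=list)

    def missing_schedules(self) -> List[str]:
        """Return names of content sources that lack a full or an incremental schedule."""
        return [
            cs.name
            for cs in self.content_sources
            if not cs.full_crawl_schedule or not cs.incremental_crawl_schedule
        ]

    def to_json(self) -> str:
        """Serialize the whole inventory so it can be shared with the planning team."""
        return json.dumps(asdict(self), indent=2)


# Example values are placeholders, not settings from a real farm.
inventory = SearchSettingsInventory(
    default_content_access_account="EXAMPLE\\svc-crawl",
)
inventory.content_sources.append(
    ContentSource(
        name="Local SharePoint sites",
        source_type="SharePoint sites",
        start_addresses=["http://intranet.example.com"],
        crawl_settings="Crawl everything under the host name",
        full_crawl_schedule="Weekly",
    )
)
inventory.crawl_rules.append(
    CrawlRule(path="http://intranet.example.com/archive/*", include=False)
)

print(inventory.missing_schedules())
```

In this sketch, missing_schedules flags the sample content source because no incremental crawl schedule was recorded for it, which is the kind of gap that is easier to find before planning begins than during deployment.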

Performance and usage reports

Gather the following performance and usage data:

  • Performance metrics from search administration reports, if available. You will use this information when you plan the topology. For more information, see Use search administration reports (Search Server 2010).

  • Usage metrics from Web Analytics reports. You will use this information when you design the end-user experience for search.