crawlerdbtool.exe reference

 

Applies to: FAST Search Server 2010

Use crawlerdbtool to view entries in the crawler databases, including the information stored about the URIs that the FAST Search Web crawler has crawled. The tool can also check, repair, and compact these databases.

Note

To use a command-line tool, you must be a member of the FASTSearchAdministrators local group on the computer where Microsoft FAST Search Server 2010 for SharePoint is installed.

Syntax

<FASTSearchFolder>\bin\crawlerdbtool [options]

Parameters

Parameter           Description

<FASTSearchFolder>  The path of the folder where you have installed FAST Search Server 2010 for SharePoint, for example C:\FASTSearch.

crawlerdbtool general options

Option  Value  Required  Description

-h             No        Displays help information.

-v             No        Displays version information.

crawlerdbtool mode options

Option Value Required Description

-m

<mode>

No

Specifies the mode for running the crawlerdbtool:

  • check - Reports corrupted databases only.

  • repair - Attempts to repair corrupted databases by copying elements to new databases. New databases are verified before replacing corrupted databases.

  • delete - Deletes corrupted databases.

  • compact - Compacts a database (specify file name or directory) or item store cluster (specify cluster directory).

  • list - Outputs all keys in a database.

  • count - Counts the number of keys in a database.

  • view - Displays an entry in a database based on the key that is specified with -k. If no key is specified, all keys are output.

  • viewraw - Displays the same data as the view mode, but without any formatting.

  • export - Exports a database to marshalled data.

  • import - Imports a database from marshalled data.

  • align - Aligns a routing database with a duplicate server database. Only the routing database is altered. It must be manually copied to all node schedulers afterward.

  • reroute - Creates a new routing database based on the contents of the metastore databases. Run this mode on all nodes with the same routetab.hashdb, and then distribute the resulting routing database to all nodes.

  • analyze - Analyzes a meta database.

  • bloomfix - Recreates the bloom filter for the specified database. The -b option must be specified.

  • pphl2gb - Converts a postprocess checksum database from hashlog to gigabase format.

Default mode: check

-d

<directory|file>

Yes

Specifies a directory or file to process.

Required for all modes except align mode.

The -f option is ignored if a file is specified.

-f

<filemask>

No

Specifies the file mask (wildcard pattern) that selects the files to process. To specify multiple file masks, repeat the -f option, as in the following example:

-f *.metadb -f *.hashdb

Default: *

-c

<cachesize>

No

Specifies the cache size (in bytes) to be used when you open databases.

Default: 8388608

-s

<frequency>

No

Specifies the frequency of database synchronizations during repair. Enter the number of operations between synchronizations. A value of 1 will synchronize after each operation.

Default: 10

-t

<time-out>

No

Specifies a time-out in seconds after which a database check/repair process is ended. The database will be considered corrupted and beyond repair, and will be deleted.

Warning

Use with caution.

Default: none

-k

<key>

No

Specifies the database key to view.

Applies to view mode only.

-K

<key>

No

Same as -k, but assumes that the key is a repr()-formatted string and calls eval() on it before use.

Use for viewing entries in postprocess or duplicate server databases.

-S

<site>

No

Specifies a crawl site to apply the current mode to. Use for inspecting meta databases.

If <site> is "all", all sites will be traversed.

If <site> is "list", all sites will be listed.

-p

<duplicate server db directory>

or

<metastore cluster directory>

No

Applies to align and reroute modes only.

In align mode, specifies the duplicate server database directory.

In reroute mode, specifies the metastore database cluster directory.

-r

<routing database file>

No

Applies to align and reroute modes only.

Specifies the routing database file. For example:

<FASTSearchFolder>\data\crawler\config\MyCollection\routetab.hashdb

-M

<node1,node2,...,nodeN>

or

<current node scheduler identifier>

No

Applies to align and reroute modes only.

In align mode, specifies the node schedulers active in the routing scheme in a comma-separated list (no spaces allowed). For example:

-M node1,node2,node3

In reroute mode, specifies the node scheduler identifier of the current node.

-D

No

Applies to align and reroute modes only.

Enables domain-based routing instead of host name-based routing.

-i

<intermediate format>

No

Applies to import and export modes only.

Exports to or imports from the specified format.

Valid formats:

  • marshal - fast space-efficient format

  • pickle - version and platform independent format

Default: marshal

-e

<maximum entries>

No

Applies to export mode only.

Specifies an upper limit to the number of elements stored per exported file.

Default: 10000

-b

<bits>

No

Applies to bloomfix mode only.

Specifies the number of bits to use in the bloom filter.
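The option rules above (the set of valid modes, -d being required for every mode except align, -f being repeatable, and -b being needed for bloomfix) can be sketched as a small command builder. This is only an illustrative sketch in Python: the function name build_args and its parameters are hypothetical, it covers only a few of the documented options, and it composes an argument list without running the tool.

```python
# Hypothetical helper: composes a crawlerdbtool argument list from a few of
# the options documented above. It does not run the tool.
MODES = {
    "check", "repair", "delete", "compact", "list", "count", "view",
    "viewraw", "export", "import", "align", "reroute", "analyze",
    "bloomfix", "pphl2gb",
}


def build_args(tool, mode="check", target=None, filemasks=(), bits=None):
    """Return an argv list for crawlerdbtool; raise ValueError on misuse."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if target is None and mode != "align":
        # -d is required for every mode except align.
        raise ValueError("-d is required for this mode")
    if mode == "bloomfix" and bits is None:
        # bloomfix needs the number of bloom filter bits (-b).
        raise ValueError("bloomfix requires -b")
    args = [tool, "-m", mode]
    if target is not None:
        args += ["-d", target]
    for mask in filemasks:
        # Repeat -f once per file mask.
        args += ["-f", mask]
    if mode == "bloomfix":
        args += ["-b", str(bits)]
    return args
```

For example, build_args("crawlerdbtool", "check", "db", ["*.metadb", "*.hashdb"]) yields an argument list that repeats -f once for each file mask.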

Examples

The following example lists all URIs crawled from the site www.contoso.com for the crawl collection MyCollection.

<FASTSearchFolder>\bin\crawlerdbtool -m list -d <FASTSearchFolder>\data\crawler\store\MyCollection\db -S www.contoso.com

The following example lists all known sites crawled for the collection MyCollection. The "list" value of -S lists the sites, whereas "all" traverses every site.

<FASTSearchFolder>\bin\crawlerdbtool -m list -d <FASTSearchFolder>\data\crawler\store\MyCollection\db -S list

The following example shows statistics for all URIs crawled for the collection MyCollection.

<FASTSearchFolder>\bin\crawlerdbtool -m analyze -d <FASTSearchFolder>\data\crawler\store\MyCollection\db -S all
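The per-site examples above share one pattern: the collection's database directory is passed with -d and the site selector with -S. A minimal Python sketch of that pattern follows. The helper names collection_db_dir and site_command are hypothetical, and the directory layout is taken from the examples in this section rather than from a separate specification.

```python
import ntpath  # applies Windows path rules on any platform


def collection_db_dir(fast_folder, collection):
    """Return <FASTSearchFolder>\\data\\crawler\\store\\<collection>\\db."""
    return ntpath.join(fast_folder, "data", "crawler", "store", collection, "db")


def site_command(fast_folder, collection, mode, site):
    """Build an argv list that applies a mode to one site, "all", or "list"."""
    tool = ntpath.join(fast_folder, "bin", "crawlerdbtool")
    db_dir = collection_db_dir(fast_folder, collection)
    return [tool, "-m", mode, "-d", db_dir, "-S", site]
```

Calling site_command with the installation folder, "MyCollection", "analyze", and "all" builds the same command line as the statistics example, and it only composes the arguments without starting the tool.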

The following example lists all URIs (keys) in a database for the crawl collection MyCollection.

<FASTSearchFolder>\bin\crawlerdbtool -m list -d <FASTSearchFolder>\data\crawler\store\MyCollection\db\1\0.metadb2

The following example displays the record for a specific URI (database key) in a database for the crawl collection MyCollection.

<FASTSearchFolder>\bin\crawlerdbtool -m view -d <FASTSearchFolder>\data\crawler\store\MyCollection\db\1\0.metadb2 -k "https://www.contoso.com/example.html"
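When view mode is scripted, passing the arguments as a list avoids manual quoting of the URI. The following Python sketch shows one way to do that. It passes the database file with -d, as the option table describes, and it assumes that the tool writes the entry to standard output and returns a nonzero exit code on failure, neither of which this reference confirms. The function names view_command and run_view are hypothetical.

```python
import subprocess


def view_command(tool, db_file, key):
    """Build an argv list for view mode on one database file and one key."""
    return [tool, "-m", "view", "-d", db_file, "-k", key]


def run_view(tool, db_file, key):
    """Run the tool and return whatever it writes to standard output.

    Passing a list (not a single string) lets subprocess handle quoting,
    so a URI that contains characters such as & needs no manual escaping.
    """
    result = subprocess.run(
        view_command(tool, db_file, key),
        capture_output=True,
        text=True,
        check=True,  # assumption: a failed run exits with a nonzero code
    )
    return result.stdout
```

Only view_command is free of side effects; run_view starts the tool, so it must run under an account that is a member of the FASTSearchAdministrators group.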