1.1 Glossary

This document uses the following terms:

ASCII: The American Standard Code for Information Interchange (ASCII) is a 7-bit character-encoding scheme based on the English alphabet. ASCII codes represent text in computers, communications equipment, and other devices that work with text. ASCII refers to a single 8-bit ASCII character or an array of 8-bit ASCII characters with the high bit of each character set to zero.
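
The high-bit rule in the ASCII definition can be illustrated with a minimal Python sketch. The function name `is_ascii_bytes` is hypothetical and is not defined by this document; it only checks that every 8-bit value in an array has its high bit set to zero.

```python
def is_ascii_bytes(data: bytes) -> bool:
    """Return True if every 8-bit value has its high bit (0x80) set to zero."""
    return all((byte & 0x80) == 0 for byte in data)


# Plain English letters fit in 7 bits, so the high bit is always zero.
print(is_ascii_bytes(b"query"))
# "é" is encoded in UTF-8 as two bytes that both have the high bit set.
print(is_ascii_bytes("café".encode("utf-8")))
```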

Augmented Backus-Naur Form (ABNF): A modified version of Backus-Naur Form (BNF), commonly used by Internet specifications. ABNF notation balances compactness and simplicity with reasonable representational power. ABNF differs from standard BNF in its definitions and uses of naming rules, repetition, alternatives, order-independence, and value ranges. For more information, see [RFC5234].

best bet: A URL that a site collection administrator assigns to a keyword as being relevant for that keyword. See also visual best bet.

binary large object (BLOB): A discrete packet of data that is stored in a database and is treated as a sequence of uninterpreted bytes.

bucket: A collection of items that were requested by a search application during a crawl. An item can be a person, a document, or any other type of item that can be crawled.

child element: In an XML document, an element that is subordinate to and is contained by another element, which is referred to as the parent element.

document vector: A set of name/value pairs that stores the most important terms and corresponding relevance weights for an indexed item.

duplicate: A search result that is identified as having identical or near identical content.

duplicate result removal: An operation to compare the similarity of items and remove duplicates from search results.

file extension: The sequence of characters in a file's name between the end of the file's name and the last "." character. Vendors of applications choose such sequences for the applications to uniquely identify files that were created by those applications. This allows file management software to determine which application is to be used to open a file.
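
As a sketch of the file extension definition, the following Python returns the characters that follow the last "." in a file's name. The function name `file_extension` is hypothetical, and returning an empty string for names without a "." is an assumption made for this illustration.

```python
def file_extension(file_name: str) -> str:
    """Return the characters after the last "." in file_name, or "" if there is none."""
    _, dot, extension = file_name.rpartition(".")
    # rpartition yields an empty separator when "." does not occur at all.
    return extension if dot else ""


print(file_extension("report.final.docx"))  # docx
print(file_extension("README"))             # (empty string)
```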

folder: A file system construct. File systems organize a volume's data by providing a hierarchy of objects, which are referred to as folders or directories, that contain files and can also contain other folders.

globally unique identifier (GUID): A term used interchangeably with universally unique identifier (UUID) in Microsoft protocol technical documents (TDs). Interchanging the usage of these terms does not imply or require a specific algorithm or mechanism to generate the value. Specifically, the use of this term does not imply or require that the algorithms described in [RFC4122] or [C706] must be used for generating the GUID. See also universally unique identifier (UUID).

Graphics Interchange Format (GIF): A compression format that supports device-independent transmission and interchange of bitmapped image data. The format uses a palette of up to 256 distinct colors from the 24-bit RGB color space. It also supports animation and a separate palette of 256 colors for each frame. The color limitation makes the GIF format unsuitable for reproducing color photographs and other images with gradients of color, but it is well-suited for simpler images such as graphics with solid areas of color.

hit highlighted summary: A summary that appears on the search results page for each query result. It displays an excerpt from the item that contains the query text and applies highlight formatting to that query text.
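
The following Python is a minimal sketch of applying highlight formatting to query text inside an excerpt. The function name `highlight` and the bracket markers are hypothetical; this document does not define the markup that an implementation uses, and case-insensitive matching is an assumption.

```python
import re


def highlight(excerpt: str, query_text: str,
              open_mark: str = "[", close_mark: str = "]") -> str:
    """Wrap each occurrence of query_text in the excerpt with the given markers."""
    if not query_text:
        return excerpt
    # Escape the query text so that it is matched literally, not as a pattern.
    pattern = re.compile(re.escape(query_text), re.IGNORECASE)
    return pattern.sub(lambda m: f"{open_mark}{m.group(0)}{close_mark}", excerpt)


print(highlight("Cats and other cat breeds", "cat"))
# [Cat]s and other [cat] breeds
```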

Hypertext Transfer Protocol (HTTP): An application-level protocol for distributed, collaborative, hypermedia information systems (text, graphic images, sound, video, and other multimedia files) on the World Wide Web.

Hypertext Transfer Protocol Secure (HTTPS): An extension of HTTP that securely encrypts and decrypts web page requests. In some older protocols, "Hypertext Transfer Protocol over Secure Sockets Layer" is still used (Secure Sockets Layer has been deprecated). For more information, see [SSL3] and [RFC5246].

inflectional form: A variant of a root token that has been modified according to the linguistic rules of a given language. For example, inflections of the verb "swim" in English include "swim," "swims," "swimming," and "swam."

item: A unit of content that can be indexed and searched by a search application.

Joint Photographic Experts Group (JPEG): A raster graphics file format for displaying high-resolution color graphics. JPEG graphics apply a user-specified compression scheme that can significantly reduce the file sizes of photo-realistic color graphics. A higher level of compression results in lower quality, whereas a lower level of compression results in higher quality. JPEG-format files have a .jpg or .jpeg file name extension.

keyword consumer: A site collection that uses a specific set of keywords, synonyms, and best bets.

language code identifier (LCID): A 32-bit number that identifies the user interface human language dialect or variation that is supported by an application or a client computer.

list: A container within a SharePoint site that stores list items. A list has a customizable schema that is composed of one or more fields.

list item: An individual entry within a SharePoint list. Each list item has a schema that maps to fields in the list that contains the item, depending on the content type of the item.

managed property: A specific property that is part of a metadata schema. It can be exposed for use in search queries that are executed from the user interface.

multivalue property: A property that can contain multiple values of the same type.

noise word: See stop word.

Office SharePoint Server Search service: A farm-wide service that either responds to query requests from front-end web servers or crawls items.

Portable Network Graphics (PNG): A bitmap graphics file format that uses lossless data compression and supports variable transparency of images (alpha channels) and control of image brightness on different computers (gamma correction). PNG-format files have a .png file name extension.

post-query suggestions: An alternative search query that is related to the search query that was executed.

predicate: A statement that is associated with a crawled item and is used to determine whether a document is returned in query results. Its value depends on the state of the full-text index catalog.

pre-query suggestions: A search query that is related to the search query that the user is typing.

property identifier: A unique integer or a 16-bit, numeric identifier that is used to identify a specific attribute or property.

query: A formalized instruction to a data source to either extract data or perform a specified action. A query can be in the form of a query expression, a method-based query, or a combination of the two. The data source can be in different forms, such as a relational database, XML document, or in-memory object. See also search query.

query context: A component of a promotion that specifies the contexts in which a promotion is applied. Examples include the site where the query originates and a user's role or location.

query refinement: A process that is used to drill into query results by using aggregated statistical data, such as the distribution of managed property values in query results.

query result: A result that is returned for a query. It contains the title and URL of the item, and can also contain other managed properties and a hit highlighted summary.

query text: The textual, string portion of a query.

rank: An integer that represents the relevance of a specific item for a search query. It can be a combination of static rank and dynamic rank. See also static rank and dynamic rank.

ranking model: In a search query, a set of weights and numerical parameters that are used to compute a ranking score for each item. All items share the same ranking model for a specific set of search results. See also rank.

refinement bin: A set of data that is returned with query results and represents a statistical distribution of those results. The data is based on values of the managed property with which a refiner is associated.

refinement token: A Base-64 encoded string that represents a single refinement modifier that can be used to refine a search query. The string includes the name of the refiner, refinement name, and refinement value.

refiner: A configuration that is used for query refinement and is associated with one managed property.

result provider: A component or application that serves a query to a search provider and translates the resulting data into a result set.

retrievable property: A managed property that is stored in a metadata index.

root element: The top-level element in an XML document. It contains all other elements and is not contained by any other element, as described in [XML].

search application: A unique group of search settings that is associated, one-to-one, with a shared service provider.

search query: A complete set of conditions that are used to generate search results, including query text, sort order, and ranking parameters.

search scope: A list of attributes that define a collection of items.

search setting context: An administrative setting that is used to specify when a search setting for a keyword is applied to a search query, based on the query context.

securable object: An object that can have unique security permissions associated with it.

shallow refinement: A type of query refinement that is based on the aggregation of managed property statistics for only some results of a search query. The number of refined results varies according to implementation. See also deep refinement.

site: A group of related pages and data within a SharePoint site collection. The structure and content of a site is based on a site definition. Also referred to as SharePoint site and web site.

site collection: A set of websites that are in the same content database, have the same owner, and share administration settings. A site collection can be identified by a GUID or the URL of the top-level site for the site collection. Each site collection contains a top-level site, can contain one or more subsites, and can have a shared navigational structure.

site collection administrator: A user who has administrative permissions for a site collection.

SOAP: A lightweight protocol for exchanging structured information in a decentralized, distributed environment. SOAP uses XML technologies to define an extensible messaging framework, which provides a message construct that can be exchanged over a variety of underlying protocols. The framework has been designed to be independent of any particular programming model and other implementation-specific semantics. SOAP 1.2 supersedes SOAP 1.1. See [SOAP1.2-1/2003].

SOAP action: The HTTP request header field used to indicate the intent of the SOAP request, using a URI value. See [SOAP1.1] section 6.1.1 for more information.

SOAP body: A container for the payload data being delivered by a SOAP message to its recipient. See [SOAP1.2-1/2007] section 5.3 for more information.

SOAP fault: A container for error and status information within a SOAP message. See [SOAP1.2-1/2007] section 5.4 for more information.

SOAP message: An XML document consisting of a mandatory SOAP envelope, an optional SOAP header, and a mandatory SOAP body. See [SOAP1.2-1/2007] section 5 for more information.
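
The structure of a SOAP message can be sketched with the Python standard library. The envelope namespace shown is the SOAP 1.2 namespace; the function name `build_soap_message` and the payload element in the `urn:example:search` namespace are hypothetical and are not defined by this document.

```python
import xml.etree.ElementTree as ET

SOAP_ENV_NS = "http://www.w3.org/2003/05/soap-envelope"  # SOAP 1.2 envelope namespace


def build_soap_message(payload: ET.Element, with_header: bool = False) -> ET.Element:
    """Build an envelope with an optional header and a mandatory body."""
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    if with_header:
        # The header is optional and, when present, precedes the body.
        ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    body.append(payload)  # the body carries the payload data
    return envelope


payload = ET.Element("{urn:example:search}Query")  # hypothetical payload element
payload.text = "cat"
message = build_soap_message(payload)
print(ET.tostring(message, encoding="unicode"))
```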

sort order: A set of rules in a search query that defines the ordering of rows in the search result. Each rule consists of a managed property, such as modified date or size, and a direction for order, such as ascending or descending. Multiple rules are applied sequentially.
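
The following Python is a minimal sketch of a sort order with multiple rules. It assumes that "applied sequentially" means the first rule has the highest priority and each later rule breaks ties left by the earlier ones. The function name `apply_sort_order`, the property names, and the sample rows are hypothetical.

```python
from operator import itemgetter


def apply_sort_order(rows, rules):
    """Order rows by a list of (managed_property, direction) rules.

    direction is "ascending" or "descending"; the first rule has the
    highest priority (an assumption made for this sketch).
    """
    ordered = list(rows)
    # Python's sort is stable, so sorting by the lowest-priority rule first
    # and the highest-priority rule last yields the combined ordering.
    for managed_property, direction in reversed(rules):
        ordered.sort(key=itemgetter(managed_property),
                     reverse=(direction == "descending"))
    return ordered


rows = [
    {"title": "a", "size": 10, "modified": "2020-01-02"},
    {"title": "b", "size": 20, "modified": "2020-01-01"},
    {"title": "c", "size": 10, "modified": "2020-01-03"},
]
rules = [("size", "descending"), ("modified", "ascending")]
print([row["title"] for row in apply_sort_order(rows, rules)])  # ['b', 'a', 'c']
```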

subsite: A complete website that is stored in a named subdirectory of another website. The parent website can be the top-level site of a site collection or another subsite. Also referred to as subweb.

thesaurus: A file that contains a list of synonym sets. Each synonym set contains two or more terms that have the same meaning. When a search query is processed, the search is expanded to include the synonyms if the query text matches a term in the thesaurus. For example, with a synonym set of "cat, feline," a search for "cat" will retrieve items that contain either "cat" or "feline."
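
Thesaurus expansion can be sketched in Python as follows. The function name `expand_query` is hypothetical, and the sketch assumes a single-term query and lowercase synonym sets held in memory; it does not represent the thesaurus file format.

```python
def expand_query(query_text, synonym_sets):
    """Return the query term together with every synonym from matching sets."""
    term = query_text.strip().lower()
    expanded = {term}
    for synonym_set in synonym_sets:
        if term in synonym_set:
            # The query text matches a term in this set, so add the whole set.
            expanded |= synonym_set
    return expanded


synonym_sets = [{"cat", "feline"}, {"car", "automobile"}]
print(sorted(expand_query("cat", synonym_sets)))  # ['cat', 'feline']
print(sorted(expand_query("dog", synonym_sets)))  # ['dog']
```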

token: A word in an item or a search query that translates into a meaningful word or number in written text. A token is the smallest textual unit that can be matched in a search query. Examples include "cat", "AB14", and "42".

Unicode: A character encoding standard developed by the Unicode Consortium that represents almost all of the written languages of the world. The Unicode standard [UNICODE5.0.0/2007] provides three forms (UTF-8, UTF-16, and UTF-32) and seven schemes (UTF-8, UTF-16, UTF-16 BE, UTF-16 LE, UTF-32, UTF-32 LE, and UTF-32 BE).
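
As an illustration of how the encoding schemes differ, the following Python encodes the same text in several of them by using the codecs built into the language. The function name `encode_in_schemes` is hypothetical. The UTF-16 and UTF-32 schemes without a byte-order suffix are omitted because their output can begin with a byte order mark.

```python
def encode_in_schemes(text: str) -> dict:
    """Encode text with the byte-order-specific schemes and UTF-8."""
    schemes = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]
    return {name: text.encode(name) for name in schemes}


# U+20AC (euro sign) takes three bytes in UTF-8, two in UTF-16, four in UTF-32.
for name, encoded in encode_in_schemes("\u20ac").items():
    print(name, encoded.hex())
```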

Unicode code point: Any value in the Unicode codespace, which is a range of integers from 0 to 10FFFF (hexadecimal). Each code point is a unique non-negative integer that maps to a specific character.
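
The bounds of the Unicode codespace can be checked with a short Python sketch. The names `MAX_CODE_POINT` and `is_code_point` are hypothetical; `ord` and `chr` are Python built-ins that convert between a character and its code point.

```python
MAX_CODE_POINT = 0x10FFFF  # upper bound of the Unicode codespace


def is_code_point(value: int) -> bool:
    """Return True if value lies in the Unicode codespace."""
    return 0 <= value <= MAX_CODE_POINT


print(hex(ord("A")))            # 0x41
print(is_code_point(0x10FFFF))  # True
print(is_code_point(0x110000))  # False
```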

Uniform Resource Locator (URL): A string of characters in a standardized format that identifies a document or resource on the World Wide Web. The format is as specified in [RFC1738].

visual best bet: A URL that specifies the address of an image and is assigned to a keyword by a site collection administrator as being relevant for that keyword. See also best bet.

Web Services Description Language (WSDL): An XML format for describing network services as a set of endpoints that operate on messages that contain either document-oriented or procedure-oriented information. The operations and messages are described abstractly and are bound to a concrete network protocol and message format in order to define an endpoint. Related concrete endpoints are combined into abstract endpoints, which describe a network service. WSDL is extensible, which allows the description of endpoints and their messages regardless of the message formats or network protocols that are used.

WSDL operation: A single action or function of a web service. The execution of a WSDL operation typically requires the exchange of messages between the service requestor and the service provider.

XML: The Extensible Markup Language, as described in [XML1.0].

XML namespace: A collection of names that is used to identify elements, types, and attributes in XML documents identified in a URI reference [RFC3986]. A combination of XML namespace and local name allows XML documents to use elements, types, and attributes that have the same names but come from different sources. For more information, see [XMLNS-2ED].

XML namespace prefix: An abbreviated form of an XML namespace, as described in [XML].

XML schema: A description of a type of XML document that is typically expressed in terms of constraints on the structure and content of documents of that type, in addition to the basic syntax constraints that are imposed by XML itself. An XML schema provides a view of a document type at a relatively high level of abstraction.

MAY, SHOULD, MUST, SHOULD NOT, MUST NOT: These terms (in all caps) are used as defined in [RFC2119]. All statements of optional behavior use either MAY, SHOULD, or SHOULD NOT.