Docs Search API

Objective

The objective of the search API is to provide an easy, scalable way to identify the technical documentation articles that are relevant to a given context or search phrase.

Potential applications of the API

  • The user is on a particular artifact (e.g. a page, screen, or form) within a product and needs further help with it, so the user clicks the help icon or presses a hot key (e.g. F1). The client (i.e. the product) needs a search API that returns documentation results based on the user's current context.
  • The user opens the help screen and enters free-form text to get help results for those search terms. The client (i.e. the product) needs a search API that returns documentation results based on the search terms.
  • The user needs help with a piece of functionality and opens a help ticket. The client (i.e. the product) needs a search API that returns documentation results based on the ticket information.
  • Any combination of the above scenarios.

API Endpoint

https://docs.microsoft.com/api/search

API Parameters

search [Mandatory]

This attribute provides the search phrase that the client wants results for. If any content on docs has this search phrase in (1) the title, (2) the description, or (3) the body of the article, the content is included in the search results. If * is provided, all content is included in the search results.

Important

The search phrase is not compared with any other metadata (e.g. keyword).

locale [Mandatory]

This attribute specifies the customer's locale preference. The search runs across all languages in the fallback language chain. If the content is available in the preferred language, it is included in that language. If the content is not available in the preferred language(s), the fallback behavior determines which version of the content is included.

Note

We support the same locale codes that the docs site does.
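
The fallback behavior described above can be pictured with a minimal sketch. It assumes the fallback chain is an ordered list with the preferred locale first; the chain shown and the function name pick_locale are hypothetical, not the service's actual configuration or code.

```python
from typing import Optional, Sequence


def pick_locale(available: Sequence[str], fallback_chain: Sequence[str]) -> Optional[str]:
    """Return the first locale in the chain for which the article exists, else None."""
    available_set = {loc.lower() for loc in available}
    for locale in fallback_chain:
        if locale.lower() in available_set:
            return locale
    return None


# Hypothetical chain: preferred locale first, then its fallbacks.
chain = ["fr-ca", "fr-fr", "en-us"]

print(pick_locale(["en-us", "fr-fr"], chain))  # fr-fr
print(pick_locale(["en-us"], chain))           # en-us
```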

API Version [Optional]

To request a specific version, pass the api_version query string parameter. If no version parameter is passed, the latest version is used. At present, the version is 0.1 (Beta). The API version changes only when a change is incompatible with the existing version.

Important

We recommend passing the version.
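
As a rough illustration, the sketch below assembles a request URL from the two mandatory parameters and api_version. The endpoint and parameter names come from this document; the helper name build_search_url and the exact version string "0.1" are assumptions.

```python
from urllib.parse import urlencode

# Endpoint as given in this document.
ENDPOINT = "https://docs.microsoft.com/api/search"


def build_search_url(search: str, locale: str, api_version: str = "0.1") -> str:
    """Build a search URL from the mandatory parameters plus api_version."""
    if not search:
        raise ValueError("search is mandatory; pass '*' to match all content")
    if not locale:
        raise ValueError("locale is mandatory")
    # Assumption: the version is passed as the literal string "0.1".
    params = {"search": search, "locale": locale, "api_version": api_version}
    return f"{ENDPOINT}?{urlencode(params)}"


url = build_search_url("console", "en-us")
print(url)
```

Passing api_version explicitly, as recommended above, keeps the client pinned to a known contract when a newer, incompatible version is released.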

Filters

These attributes further filter the search to specific content. The product can pass any metadata used by the content team. Clients can combine multiple filters using logical ‘and’ and ‘or’.

The following metadata are currently crawled and indexed:

  • title
  • description
  • body of the article
  • scopes
  • products
  • region
  • industry
  • form
  • validFrom
  • validTo

Note

Here is additional documentation on the 'scope' metadata.

Important

Filter values are case-sensitive.

Important

We strongly recommend using one of the metadata listed above for filtering. If you still want to use other metadata, please give your Docs onboarding PM the metadata names so that they can be added to the model.

You can use the following operators to filter on metadata tags:

  • eq:
    • If the metadata tag is present and its value is equal to the value passed, the content is included in the search results.
    • If the metadata tag is not present, the content is NOT included in the search results.
  • ne:
    • If the metadata tag is present and its value is not equal to the value passed, the content is included in the search results.
    • If the metadata tag is not present, the content is included in the search results.
  • lt:
    • If the metadata tag is present and its value is less than or equal to the value passed, the content is included in the search results.
    • If the metadata tag is not present, the content is included in the search results.
  • gt:
    • If the metadata tag is present and its value is greater than or equal to the value passed, the content is included in the search results.
    • If the metadata tag is not present, the content is included in the search results.

Important

Dates must be in ISO 8601 format.
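
The operator rules can be modelled locally, which may help when reasoning about which articles a filter will return. The sketch below is not the service's implementation. The function name passes_filter and the sample metadata are hypothetical, and it assumes single-valued tags, with dates given as ISO 8601 strings in the same layout so that plain string comparison orders them chronologically.

```python
from typing import Any, Mapping

_OPERATORS = ("eq", "ne", "lt", "gt")


def passes_filter(metadata: Mapping[str, Any], tag: str, op: str, value: Any) -> bool:
    """Apply one filter to one article's metadata, following the documented rules."""
    if op not in _OPERATORS:
        raise ValueError(f"unknown operator: {op}")
    if tag not in metadata:
        # Only eq excludes content that lacks the tag; ne, lt and gt include it.
        return op != "eq"
    actual = metadata[tag]
    if op == "eq":
        return actual == value  # case-sensitive for strings
    if op == "ne":
        return actual != value
    if op == "lt":
        return actual <= value  # "less than or equal", as documented
    return actual >= value      # gt: "greater than or equal", as documented


# Hypothetical article metadata.
article = {"products": "azure", "validTo": "2020-12-31"}

print(passes_filter(article, "products", "eq", "azure"))  # True
print(passes_filter(article, "products", "eq", "Azure"))  # False (case-sensitive)
print(passes_filter(article, "region", "eq", "us"))       # False (tag missing)
print(passes_filter(article, "region", "ne", "us"))       # True (tag missing)
```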

Search Results

Data and Format

The following fields are returned in JSON format:

  • results
    • title
    • url
    • description – If a description for the topic is not present, the first paragraph is used as the description
    • lastUpdatedDate
    • breadcrumbs
      • url
      • name
  • count
  • @nextLink

Results order

The following algorithm is currently used to weight the search results:

  1. If the title contains the search phrase, 10 weighting points are given.
  2. If the description contains the search phrase, 5 weighting points are given.
  3. If the topic is available in the preferred language, a 10x multiplier is applied.
  4. The points are added and then multiplied by the preferred-language multiplier. For example, if the title and description both contain the search phrase and the topic is available in the preferred language, the weight is (10 + 5) x 10 = 150.
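
The weighting steps above can be sketched as a small function. Two details are assumptions because this document does not specify them: matching is treated as case-insensitive substring matching, and a topic that is not in the preferred language keeps a multiplier of 1. The function name weight is hypothetical.

```python
def weight(title: str, description: str, phrase: str, in_preferred_language: bool) -> int:
    """Compute the weight of one result from the documented point values."""
    needle = phrase.lower()  # assumption: case-insensitive matching
    points = 0
    if needle in title.lower():
        points += 10  # title match
    if needle in description.lower():
        points += 5   # description match
    # Assumption: without the preferred language the multiplier is 1.
    multiplier = 10 if in_preferred_language else 1
    return points * multiplier


print(weight("Console", "The Console option starts ...", "console", True))   # 150
print(weight("Console", "Starts the application", "console", False))         # 10
```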

Pagination

  1. By default, the first 25 search results are returned.
  2. If the client needs a specific number of results, the ‘top’ attribute must be passed to the service. If the client passes a value greater than 25, an error is returned.
  3. If the client needs the next set of results, the ‘skip’ attribute must be passed to the service.
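
A client can compute the paging parameters itself and reject an oversized page before calling the service. In this sketch the parameter names are written as $top and $skip, following the @nextLink value in the sample response in this document; the helper name paging_params and the 1-based page argument are assumptions.

```python
MAX_TOP = 25  # documented maximum; larger values make the service return an error


def paging_params(page: int = 1, top: int = MAX_TOP) -> dict:
    """Return the query parameters for a 1-based page of `top` results."""
    if not 1 <= top <= MAX_TOP:
        raise ValueError(f"top must be between 1 and {MAX_TOP}")
    if page < 1:
        raise ValueError("page must be 1 or greater")
    return {"$top": top, "$skip": (page - 1) * top}


print(paging_params())                # {'$top': 25, '$skip': 0}
print(paging_params(page=2))          # {'$top': 25, '$skip': 25}
print(paging_params(page=3, top=10))  # {'$top': 10, '$skip': 20}
```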

Sample JSON file

{
    "results": [
        {
            "title": "Console",
            "url": "https://docs.microsoft.com/en-us/visualstudio/profiling/console",
            "description": "The VSPerfCmd.exe Console option starts the specified application in a new command prompt window. Console can only be used with the VSPerfCmd Launch option. If the application is not a command-line ap...",
            "lastUpdatedDate": "2016-11-04T00:00:00+00:00",
            "iconType": "Article",
            "breadcrumbs": [
                {
                    "url": "https://docs.microsoft.com/en-us/visualstudio/profiling/profiling-tools",
                    "name": "Profiling Tools"
                },
                {
                    "url": "https://docs.microsoft.com/en-us/visualstudio/profiling/performance-explorer",
                    "name": "Performance Explorer"
                }
            ]
        }
    ],
    "count": 5073,
    "@nextLink": "https://docs.microsoft.com/api/Search?locale=en-us&search=console&$skip=25&$top=25"
}
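
As an illustration of consuming this payload, the sketch below parses a trimmed copy of the sample and flattens each result into a title, URL, and breadcrumb trail. The helper name summarize is hypothetical, and fields that are not needed here are omitted from the copy.

```python
import json

# Trimmed copy of the sample response in this document.
SAMPLE = """
{
    "results": [
        {
            "title": "Console",
            "url": "https://docs.microsoft.com/en-us/visualstudio/profiling/console",
            "lastUpdatedDate": "2016-11-04T00:00:00+00:00",
            "breadcrumbs": [
                {"url": "https://docs.microsoft.com/en-us/visualstudio/profiling/profiling-tools",
                 "name": "Profiling Tools"},
                {"url": "https://docs.microsoft.com/en-us/visualstudio/profiling/performance-explorer",
                 "name": "Performance Explorer"}
            ]
        }
    ],
    "count": 5073,
    "@nextLink": "https://docs.microsoft.com/api/Search?locale=en-us&search=console&$skip=25&$top=25"
}
"""


def summarize(payload: dict) -> list:
    """Return (title, url, breadcrumb trail) for every result in the payload."""
    rows = []
    for item in payload.get("results", []):
        trail = " > ".join(crumb["name"] for crumb in item.get("breadcrumbs", []))
        rows.append((item["title"], item["url"], trail))
    return rows


payload = json.loads(SAMPLE)
rows = summarize(payload)
print(rows[0][0], "|", rows[0][2])  # Console | Profiling Tools > Performance Explorer
print(payload["count"])             # 5073
print(payload.get("@nextLink"))     # URL of the next page, if any
```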

Service Level Agreement

  • Response Time: TBD
  • Uptime: TBD
  • Time-out: TBD

Interest Area Tags aka scope aka searchScope

docs.microsoft.com contains a mix of documentation from widely disparate products. A user who is viewing content within a single product area is unlikely to be interested in the entire set of documentation available on the site. For example, users viewing the .NET Core API reference are not necessarily the same people who want to view the Intune documentation. Given this, when a user decides to do a search from one of your pages, that search should be limited to the area of documentation that they are currently viewing. This is what Scoped Search enables you to do.

The intention behind this tag is to identify the interest areas applicable to a certain article or group of articles. The tag is declared as searchScope at docset level and scope at article level. Scope can have multiple values because an article may belong to multiple interest areas. The first value is treated as the default scope. Scope is independent of other metadata and it doesn’t work in conjunction with other metadata.

Warning

The strings that are declared for this tag are used by the site as is. There is no mapping to alternative representations, such as a Display Name. There is no master list of scopes either. As such, you need to ensure that a well-known set of human-readable scope names is used consistently across your content.

Note

  • The value of searchScope is always an array, even if you only have one scope.
  • The same searchScope value can be applied in multiple repositories and docsets. All of them are considered under this tag.

Site features built on scope

The following site experiences are built around scope; those marked [Upcoming] are not yet available. Here is additional documentation on how to define scope.

  • If the content has defined scopes, we display the default scope (i.e. the first scope for the content) in the search bar. When the customer searches from the search bar, the search results include only content that has that scope. If the user doesn't want to scope their searches to your area, the UX enables them to remove the scope and broaden their search.
  • [Upcoming] The user can filter the search results by the available scopes on the search results screen.
  • [Upcoming] If the content has defined scopes, we will display all of those scopes as RSS feed options. When the customer uses one of these RSS feeds, the feed will cover all content that has that scope.

Defining Scope

Docset level

The first step is to update the search scopes for each docset. In the docfx.json file, add a new property called searchScope to globalMetadata. Declaring the searchScope metadata at the docset level ensures that all searches initiated from one of your docset pages display only results within the same scope, i.e. documents inside your docset or in another docset with the same scope configuration. Here's an example:

"globalMetadata": {
  "breadcrumb_path": "/openpublishing/test/breadcrumb.json",
  "_mockServerUrl": "https://apiexproxy.azurewebsites.net/svc",
  "_mockServerAuthorization": "Bearer {token:https://graph.windows.net/}",
  "is_e2e_test": true,
  "brand": "azure",
  "searchScope": ["Azure", ".NET"]
}
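
Because searchScope must always be an array and its first value is the default scope, a small check can catch a misdeclared value before publishing. The sketch below works on the globalMetadata object only; the helper name default_search_scope is hypothetical and this is not part of the docfx tooling.

```python
import json
from typing import Optional


def default_search_scope(global_metadata: dict) -> Optional[str]:
    """Validate searchScope and return the default (first) scope, if any."""
    scopes = global_metadata.get("searchScope")
    if scopes is None:
        return None
    if not isinstance(scopes, list) or not all(isinstance(s, str) for s in scopes):
        raise TypeError("searchScope must be an array of strings")
    return scopes[0] if scopes else None


# Reduced version of the globalMetadata fragment shown above.
fragment = json.loads('{"globalMetadata": {"brand": "azure", "searchScope": ["Azure", ".NET"]}}')
print(default_search_scope(fragment["globalMetadata"]))  # Azure
```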

Folder level

You can override the docset search scope by specifying scope at folder level. Here's an example:

"fileMetadata": { 
    "searchScope": { "articles/uwp/**.md": ["Windows","UWP"] } 
}

Article level

In addition to docfx.json metadata, you can specify the same searchScope metadata at the document level. This overrides the global value for that document only. Here's an example of what the metadata header might look like in an individual document:

ms.date: na
searchScope:
  - Azure
  - .NET
ms.prod: identity-ata

Note

Remember, this will overwrite the docset-level metadata. It should only be used if the metadata at the docset level isn't appropriate or specific enough. This is not a general tagging mechanism. It is for broad search area scopes.
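
The three levels can be read as a precedence order: article-level metadata first, then a matching folder-level pattern, then the docset-level value. The sketch below illustrates that order under stated assumptions. It uses Python's fnmatch as a stand-in for the docfx glob matcher and takes the first matching pattern, neither of which is confirmed by this document, and the function name effective_scopes is hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


def effective_scopes(
    path: str,
    docset_scopes: List[str],
    folder_scopes: Dict[str, List[str]],
    article_scopes: Optional[List[str]] = None,
) -> List[str]:
    """Resolve the scopes of one article: article level, then folder, then docset."""
    if article_scopes:
        return list(article_scopes)
    for pattern, scopes in folder_scopes.items():
        # Assumption: fnmatch approximates the docfx glob; the first match wins.
        if fnmatchcase(path, pattern):
            return list(scopes)
    return list(docset_scopes)


folder_scopes = {"articles/uwp/**.md": ["Windows", "UWP"]}

print(effective_scopes("articles/uwp/article2.md", ["Windows"], folder_scopes))
# ['Windows', 'UWP']
print(effective_scopes("articles/iot/article3.md", ["Windows"], folder_scopes))
# ['Windows']
```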

Examples

Example 1

The configuration is as follows:

  • Repository: Windows
    • docfx.json
      • "globalMetadata": { "searchScope": ["Windows"] }
      • "fileMetadata": { "searchScope": { "articles/uwp/**.md": ["Windows","UWP"] } }
    • articles/uwp/article1.md
      • <meta name="scope" content=["Windows","UWP","Windows Fluent Design"] />
    • articles/uwp/article2.md
      • No scope defined.
    • articles/iot/article3.md
      • No scope defined.

The resulting behavior is:

  • If I search within the “Windows” scope, I see article 1, article 2, and article 3.
  • If I search within the “UWP” scope, I see article 1 and article 2.
  • If I search within the “Windows Fluent Design” scope, I see article 1.
  • When I go to article 1, article 2, or article 3, “Windows” is the default scope in the search bar.
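
The behavior in Example 1 can be checked with a short sketch. It starts from the scopes each article ends up with under the configuration above and models a scoped search as a membership test. The names ARTICLE_SCOPES, articles_in_scope, and default_scope are hypothetical.

```python
# Scopes each article ends up with in Example 1.
ARTICLE_SCOPES = {
    "article1": ["Windows", "UWP", "Windows Fluent Design"],  # article-level scope
    "article2": ["Windows", "UWP"],                           # folder-level scope
    "article3": ["Windows"],                                  # docset-level scope
}


def articles_in_scope(scope: str) -> list:
    """Return the articles a search limited to `scope` can show (case-sensitive)."""
    return sorted(name for name, scopes in ARTICLE_SCOPES.items() if scope in scopes)


def default_scope(article: str) -> str:
    """The first scope of an article is the one shown in the search bar."""
    return ARTICLE_SCOPES[article][0]


print(articles_in_scope("Windows"))                # ['article1', 'article2', 'article3']
print(articles_in_scope("UWP"))                    # ['article1', 'article2']
print(articles_in_scope("Windows Fluent Design"))  # ['article1']
print(default_scope("article3"))                   # Windows
```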

Example 2

The configuration is as follows:

  • Repository: UWP Apps
    • docfx.json
      • "globalMetadata": { "searchScope": ["Windows", "UWP"] }
  • Repository: Desktop
    • docfx.json
      • "globalMetadata": { "searchScope": ["Windows", "Desktop"] }
  • Repository: Microsoft Edge
    • docfx.json
      • "globalMetadata": { "searchScope": ["Windows", "Edge"] }

The resulting behavior is:

  • The default scope is “Windows”, so the customer sees Windows in the search bar.
  • When the customer searches within the Windows scope, they see results from all three repositories.