What is the Web Language Model API? (Preview)

Important

The Web Language Model preview was decommissioned on August 9, 2018. We recommend using Azure Machine Learning text analytics modules for text processing and analysis.

The Microsoft Web Language Model API is a REST-based cloud service that provides tools for natural language processing. Using this API, your application can query language models trained on web-scale corpora that Bing collected in the en-US market.

These smoothed backoff N-gram language models, supporting up to fifth-order Markov chains, are trained on the following corpora:

  • Web page body text
  • Web page title text
  • Web page anchor text
  • Web search query text
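To make the term "backoff" concrete, here is a minimal local sketch of a bigram scorer that falls back to a unigram estimate when a word pair was never observed. It is an illustration only, not the service's implementation: the toy corpus, the fixed 0.4 penalty (in the style of "stupid backoff"), the add-one smoothing, and the names `score`, `CORPUS`, and `BACKOFF` are all assumptions made for this example. The values it returns are relative scores rather than normalized probabilities.

```python
import math
from collections import Counter

# Hypothetical toy corpus; the real models were trained on web-scale text.
CORPUS = "the cat sat on the mat the cat ate".split()

unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
TOTAL = len(CORPUS)
BACKOFF = 0.4  # assumed fixed penalty applied when backing off


def score(word, prev):
    """Return a log10 score for `word` following `prev`."""
    seen = bigrams[(prev, word)]
    if seen > 0:
        # The pair was observed: use its relative frequency.
        return math.log10(seen / unigrams[prev])
    # Back off to a unigram estimate; add-one smoothing keeps
    # the score finite for words that never appeared at all.
    smoothed = (unigrams[word] + 1) / (TOTAL + len(unigrams) + 1)
    return math.log10(BACKOFF * smoothed)


print(score("cat", "the"))  # pair seen in the corpus
print(score("sat", "the"))  # pair unseen, backs off to the unigram
```

A higher-order model applies the same idea repeatedly, dropping the oldest context word each time the longer history has not been observed.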

The Web Language Model API supports four lookup operations:

  1. Joint (log10) probability of a sequence of words.
  2. Conditional (log10) probability of one word given a sequence of preceding words.
  3. List of words (completions) most likely to follow a given sequence of words.
  4. Word breaking of strings that contain no spaces.
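The four operations can be illustrated locally with a toy bigram model. The sketch below is not the service's API or algorithm: the corpus, the add-one smoothing, and the function names `conditional_log10`, `joint_log10`, `completions`, and `break_words` are hypothetical and chosen only for this example. It shows how a joint log10 probability is the sum of conditional log10 probabilities (the chain rule), and how word breaking can be framed as finding the segmentation with the highest total score.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for web-scale training text.
CORPUS = "the cat sat on the mat the cat ate the fish".split()
unigrams = Counter(CORPUS)
bigrams = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB = len(unigrams)


def unigram_log10(word):
    """log10 P(word), add-one smoothed with one slot for unknown words."""
    return math.log10((unigrams[word] + 1) / (len(CORPUS) + VOCAB + 1))


def conditional_log10(word, prev):
    """Operation 2: log10 P(word | prev), add-one smoothed bigram."""
    return math.log10((bigrams[(prev, word)] + 1) / (unigrams[prev] + VOCAB))


def joint_log10(words):
    """Operation 1: chain rule, summing conditional log10 probabilities."""
    total = unigram_log10(words[0])
    for prev, word in zip(words, words[1:]):
        total += conditional_log10(word, prev)
    return total


def completions(prev, k=3):
    """Operation 3: the k vocabulary words most likely to follow `prev`."""
    ranked = sorted(unigrams, key=lambda w: conditional_log10(w, prev), reverse=True)
    return ranked[:k]


def break_words(text, max_len=10):
    """Operation 4: split unspaced text into known words by dynamic
    programming, maximizing the summed unigram log10 score."""
    # best[i] = (best score for text[:i], start index of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            piece = text[start:end]
            if piece in unigrams:
                candidate = best[start][0] + unigram_log10(piece)
                if candidate > best[end][0]:
                    best[end] = (candidate, start)
    if best[-1][0] == -math.inf:
        return [text]  # no segmentation into known words exists
    words, end = [], len(text)
    while end > 0:
        start = best[end][1]
        words.append(text[start:end])
        end = start
    return words[::-1]


print(joint_log10(["the", "cat"]))
print(completions("the", k=1))
print(break_words("thecatsat"))
```

Working in log10 space, as the operations above do, turns products of small probabilities into sums, which avoids numeric underflow on long word sequences.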

Getting Started

  1. Subscribe to the service.
  2. Download the SDK.
  3. Run the SDK sample code.
  4. Refer to the API Reference for full details of the endpoints, including code snippets in a variety of languages.

Underlying Technology

The following paper provides details on the development of these language models, and should be cited in research publications that use this service:

A current list of papers citing this work is also available.