What is the Web Language Model API? (Preview)
The Web Language Model preview was decommissioned on August 9, 2018. We recommend using Azure Machine Learning text analytics modules for text processing and analysis.
The Microsoft Web Language Model API is a REST-based cloud service providing state-of-the-art tools for natural language processing. Using this API, your application can leverage the power of big data through language models trained on web-scale corpora collected by Bing in the en-US market.
These smoothed backoff N-gram language models, supporting up to fifth-order Markov chains, are trained on the following corpora:
- Web page body text
- Web page title text
- Web page anchor text
- Web search query text
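To make "smoothed backoff N-gram" concrete, here is a minimal sketch of the general idea. It is not the service's implementation: it uses bigrams only (the service describes models up to fifth order), a tiny made-up corpus, and the simple "stupid backoff" scheme with an assumed fixed penalty of 0.4 and add-one smoothing on the unigram fallback. The names `CORPUS`, `BACKOFF_PENALTY`, and `log10_score` are hypothetical, and the resulting scores are not guaranteed to form a normalized probability distribution.

```python
import math
from collections import Counter

# Toy corpus for illustration; the real models were trained on web-scale text.
CORPUS = "the cat sat on the mat the cat ate".split()

UNIGRAMS = Counter(CORPUS)
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))
TOTAL = sum(UNIGRAMS.values())
BACKOFF_PENALTY = 0.4  # assumed constant, borrowed from "stupid backoff"


def log10_score(word, previous):
    """Return a log10 score for `word` following `previous`."""
    pair_count = BIGRAMS[(previous, word)]
    if pair_count > 0:
        # Seen bigram: use its relative frequency.
        return math.log10(pair_count / UNIGRAMS[previous])
    # Unseen bigram: back off to an add-one smoothed unigram estimate,
    # scaled down by the penalty. One extra slot covers unknown words.
    vocab = len(UNIGRAMS)
    unigram = (UNIGRAMS[word] + 1) / (TOTAL + vocab + 1)
    return math.log10(BACKOFF_PENALTY * unigram)


if __name__ == "__main__":
    print(log10_score("cat", "the"))  # seen bigram
    print(log10_score("mat", "cat"))  # unseen bigram, backs off
```

Backing off means an unseen word sequence still receives a finite score instead of a probability of zero, which is what lets such a model score arbitrary input text.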
The Web Language Model API supports four lookup operations:
- Joint (log10) probability of a sequence of words.
- Conditional (log10) probability of one word given a sequence of preceding words.
- List of words (completions) most likely to follow a given sequence of words.
- Word breaking of strings that contain no spaces.
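The four lookups can be illustrated locally with a toy model. The sketch below does not call the service, and every probability in it is invented for illustration. The function names (`conditional_log10`, `joint_log10`, `completions`, `break_words`) are hypothetical and are not the API's operation names. It assumes a bigram table with a unigram fallback, and the word breaker uses unigram probabilities only.

```python
import math

# Hypothetical toy probabilities; not values returned by the service.
UNIGRAM = {"hash": 0.1, "tag": 0.2, "tags": 0.1,
           "hashtags": 0.05, "are": 0.3, "fun": 0.2}
# BIGRAM[(previous, word)] = P(word | previous)
BIGRAM = {("hashtags", "are"): 0.6, ("are", "fun"): 0.5, ("are", "tags"): 0.2}
UNSEEN = 1e-6  # assumed floor for words missing from the toy tables


def conditional_log10(word, previous=None):
    """log10 P(word | previous), falling back to the unigram table."""
    if previous is not None and (previous, word) in BIGRAM:
        return math.log10(BIGRAM[(previous, word)])
    return math.log10(UNIGRAM.get(word, UNSEEN))


def joint_log10(words):
    """log10 probability of the whole sequence, via the chain rule."""
    total, previous = 0.0, None
    for word in words:
        total += conditional_log10(word, previous)
        previous = word
    return total


def completions(previous, limit=3):
    """Words most likely to follow `previous`, most probable first."""
    candidates = [(p, w) for (head, w), p in BIGRAM.items() if head == previous]
    return [w for p, w in sorted(candidates, reverse=True)[:limit]]


def break_words(text):
    """Split a space-free string into its most probable known words.

    Returns an empty list if no segmentation into known words exists.
    """
    # best[i] holds (log10 score, words) for the prefix text[:i].
    best = [(0.0, [])] + [(-math.inf, [])] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(end):
            piece = text[start:end]
            if piece in UNIGRAM and best[start][0] > -math.inf:
                score = best[start][0] + math.log10(UNIGRAM[piece])
                if score > best[end][0]:
                    best[end] = (score, best[start][1] + [piece])
    return best[len(text)][1]


if __name__ == "__main__":
    print(joint_log10(["hashtags", "are", "fun"]))
    print(conditional_log10("fun", "are"))
    print(completions("are"))
    print(break_words("hashtagsarefun"))
```

Working in log10 space turns the product of conditional probabilities into a sum, which avoids floating-point underflow on long sequences. The word breaker is a standard dynamic program over string prefixes, so it considers every possible split without enumerating them one by one.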
To get started with the Web Language Model API:
- Subscribe to the service.
- Download the SDK.
- Run the SDK sample code.
- Refer to the API Reference for full details of the endpoints, including code snippets in a variety of languages.
The following paper provides details on the development of these language models, and should be cited in research publications that use this service:
- An Overview of Microsoft Web N-gram Corpus and Applications, NAACL-HLT 2010
Click here for a current list of papers citing this work.