Microsoft Cognitive Services lets you tap into an ever-growing collection of powerful AI algorithms developed by experts in the fields of computer vision, speech, natural language processing, knowledge extraction, and web search. The services simplify a variety of AI-based tasks, giving you a quick way to add state-of-the-art intelligence technologies to your bots with just a few lines of code. The APIs integrate into most modern languages and platforms. The APIs are also constantly improving, learning, and getting smarter, so experiences are always up to date.
Intelligent bots respond as if they can see the world as people see it. They discover information and extract knowledge from different sources to provide useful answers, and, best of all, they learn as they acquire more experience to continuously improve their capabilities.
Language understanding
The interaction between users and bots is mostly free-form, so bots need to understand language naturally and contextually. The Cognitive Services Language APIs provide powerful language models to determine what users want, to identify concepts and entities in a given sentence, and ultimately to allow your bots to respond with the appropriate action. The five APIs support several text analytics capabilities, such as spell checking, sentiment detection, language modeling, and extraction of accurate and rich insights from text.
Cognitive Services provides five APIs for language understanding:
- The Language Understanding Intelligent Service (LUIS) is able to process natural language using pre-built or custom-trained language models.
- The Text Analytics API detects sentiment, key phrases, topics, and language from text.
- The Bing Spell Check API provides powerful spell check capabilities, and is able to recognize the difference between names, brand names, and slang.
- The Linguistic Analysis API uses advanced linguistic analysis algorithms to process text, and perform operations such as breaking down the structure of the text, or performing part-of-speech tagging and parsing.
- The Web Language Model (WebLM) API can be used to automate a variety of natural language processing tasks, such as word frequency or next-word prediction, using advanced language modeling algorithms.
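As a rough illustration of how a bot might turn a language-understanding result into an action, the sketch below routes on the top-scoring intent. The JSON shape (`topScoringIntent`, `entities`) is assumed to follow the LUIS v2 response format; the intent names, the `route_intent` helper, and the `MIN_SCORE` threshold are hypothetical and not part of any SDK.

```python
from typing import Any, Dict

# Hypothetical confidence threshold; tune it for your own model.
MIN_SCORE = 0.5


def route_intent(luis_result: Dict[str, Any]) -> str:
    """Pick a bot reply based on the top-scoring intent in a LUIS-style result."""
    top = luis_result.get("topScoringIntent") or {}
    intent = top.get("intent", "None")
    score = float(top.get("score", 0.0))

    # Collect entity text by type, e.g. {"Location": "seattle"}.
    entities = {e["type"]: e["entity"] for e in luis_result.get("entities", [])}

    if score < MIN_SCORE or intent == "None":
        return "Sorry, I did not understand that."
    if intent == "GetWeather":  # hypothetical intent name
        city = entities.get("Location", "your area")
        return f"Looking up the weather for {city}."
    if intent == "BookTaxi":  # hypothetical intent name
        return "Where should the taxi pick you up?"
    return f"I recognized '{intent}' but cannot handle it yet."


# Sample payload shaped like a LUIS response (all values are made up).
sample = {
    "query": "what's the weather in seattle",
    "topScoringIntent": {"intent": "GetWeather", "score": 0.93},
    "entities": [{"entity": "seattle", "type": "Location", "score": 0.88}],
}
print(route_intent(sample))
```

Falling back to a clarification message when the score is low keeps the bot from acting on a guess; the right threshold depends on your own model and data.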
Learn more about language understanding with Microsoft Cognitive Services.
Knowledge extraction
Cognitive Services provides five knowledge APIs that enable you to identify named entities or phrases in unstructured text, add personalized recommendations, provide auto-complete suggestions based on natural interpretation of user queries, search academic papers and other research, and answer user questions in the manner of a personalized FAQ service.
- The Entity Linking Intelligence Service annotates unstructured text with the relevant entities mentioned in the text. Depending on the context, the same word or phrase may refer to different things. This service understands the context of the supplied text and will identify each entity in your text.
- The Recommendations API provides "frequently bought together" recommendations for a product, as well as personalized recommendations based on a user's history. Use this service to build and train a model based on data that you provide, and then use this model to add recommendations to your application.
- The Knowledge Exploration Service provides natural language interpretation of user queries and returns annotated interpretations to enable rich search and auto-completion experiences that anticipate what the user is typing. Instant query completion suggestions and predictive query refinements are based on your own data and application-specific grammars to enable your users to perform fast queries.
- The Academic Knowledge API returns academic research papers, authors, journals, conferences, topics, and universities from the Microsoft Academic Graph. Built as a domain-specific example of the Knowledge Exploration Service, the Academic Knowledge API provides a knowledge base using a graph-like dialog with search capabilities over hundreds of millions of research-related entities. Search for a topic, a professor, a university, or a conference, and the API will provide relevant publications and related entities. The grammar also supports natural queries like "Papers by Michael Jordan about machine learning after 2010".
- The QnA Maker is a free, easy-to-use REST API and web-based service that trains AI to respond to users’ questions in a natural, conversational way. With optimized machine learning logic and the ability to integrate industry-leading language processing, QnA Maker distills semi-structured data like question and answer pairs into distinct, helpful answers.
Learn more about knowledge extraction with Microsoft Cognitive Services.
Speech recognition and conversion
Use the Speech APIs to add advanced speech skills to your bot that leverage industry-leading algorithms for speech-to-text and text-to-speech conversion, as well as speaker recognition. The Speech APIs use built-in language and acoustic models that cover a wide range of scenarios with high accuracy.
For applications that require further customization, you can use the Custom Recognition Intelligent Service (CRIS). This allows you to calibrate the language and acoustic models of the speech recognizer by tailoring them to the vocabulary of the application, or even to the speaking style of your users.
There are three Speech APIs available in Cognitive Services to process or synthesize speech:
- The Bing Speech API provides speech-to-text and text-to-speech conversion capabilities.
- The Custom Recognition Intelligent Service (CRIS) allows you to create custom speech recognition models to tailor the speech-to-text conversion to an application's vocabulary or user's speaking style.
- The Speaker Recognition API enables speaker identification and verification through voice.
Learn more about speech recognition and conversion with Microsoft Cognitive Services.
Web search
The Bing Search APIs enable you to add intelligent web search capabilities to your bots. With a few lines of code, you can access billions of webpages, images, videos, news, and other result types. You can configure the APIs to return results by geographical location, market, or language for better relevance. You can further customize your search using the supported search parameters, such as SafeSearch to filter out adult content, and Freshness to limit results to a specific date range.
There are five Bing Search APIs available in Cognitive Services.
- The Web Search API provides web, image, video, news, and related search results with a single API call.
- The Image Search API returns image results with enhanced metadata (dominant color, image kind, etc.) and supports several image filters to customize the results.
- The Video Search API retrieves video results with rich metadata (video size, quality, price, etc.) and video previews, and supports several video filters to customize the results.
- The News Search API finds news articles around the world that match your search query or are currently trending on the Internet.
- The Autosuggest API offers instant query completion suggestions to complete your search query faster and with less typing.
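To make the search parameters mentioned above more concrete, here is a minimal sketch that assembles a web search request without sending it. The endpoint URL and version are assumptions for illustration only (check your own subscription for the actual values), and `build_search_request` is a hypothetical helper; the query parameters `q`, `mkt`, `safeSearch`, `freshness`, and `count` and the `Ocp-Apim-Subscription-Key` header are used as they are commonly documented for the Bing Web Search API.

```python
from urllib.parse import urlencode

# Assumed endpoint for illustration; your subscription may use a different URL or version.
SEARCH_ENDPOINT = "https://api.cognitive.microsoft.com/bing/v7.0/search"


def build_search_request(query, subscription_key, market="en-US",
                         safe_search="Moderate", freshness=None, count=10):
    """Return (url, headers) for a web search request without sending it."""
    params = {"q": query, "mkt": market, "safeSearch": safe_search, "count": count}
    if freshness is not None:
        # Freshness is typically one of "Day", "Week", or "Month".
        params["freshness"] = freshness
    url = f"{SEARCH_ENDPOINT}?{urlencode(params)}"
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    return url, headers


url, headers = build_search_request(
    "pizza near me", "YOUR_KEY", safe_search="Strict", freshness="Week"
)
print(url)
```

The returned URL and headers can be passed to any HTTP client; keeping request construction separate from sending makes it easy to test without network access.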
Learn more about web search with Microsoft Cognitive Services.
Image and video understanding
The Vision APIs bring advanced image and video understanding skills to your bots. State-of-the-art algorithms allow you to process images or videos and get back information you can transform into actions. For example, you can use them to recognize objects, people's faces, age, gender or even feelings.
The Vision APIs support a variety of image understanding features. They can identify mature or explicit content, estimate dominant and accent colors, categorize the content of images, perform optical character recognition, and describe an image with complete English sentences. The Vision APIs also support several image and video processing capabilities, such as intelligently generating image or video thumbnails, or stabilizing the output of a video.
Cognitive Services provides four APIs you can use to process images or videos:
- The Computer Vision API extracts rich information about images (such as objects or people), determines if the image contains mature or explicit content, and processes text (using OCR) in images.
- The Emotion API analyzes human faces and recognizes their emotions across eight possible categories.
- The Face API detects human faces, compares them to similar faces, and can even organize people into groups according to visual similarity.
- The Video API analyzes and processes video to stabilize video output, detect motion, track faces, and generate a motion thumbnail summary of the video.
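The following sketch shows one way a bot could reduce an image-analysis result to something it can act on. The field names (`description.captions`, `adult.isAdultContent`, `color.dominantColors`, `color.accentColor`) are assumed to match the Computer Vision analyze response; the helper functions and the sample values are hypothetical.

```python
from typing import Any, Dict


def summarize_image_analysis(analysis: Dict[str, Any]) -> Dict[str, Any]:
    """Reduce an image-analysis result to the few fields a bot might act on."""
    captions = analysis.get("description", {}).get("captions", [])
    # Use the highest-confidence caption, if any were returned.
    best = max(captions, key=lambda c: c.get("confidence", 0.0), default=None)
    adult = analysis.get("adult", {})
    color = analysis.get("color", {})
    return {
        "caption": best["text"] if best else None,
        "is_adult": bool(adult.get("isAdultContent", False)),
        "dominant_colors": color.get("dominantColors", []),
        "accent_color": color.get("accentColor"),
    }


def bot_reply(summary: Dict[str, Any]) -> str:
    """Turn the summary into a short chat message."""
    if summary["is_adult"]:
        return "I can't describe that image."
    if summary["caption"]:
        return f"I think this is {summary['caption']}."
    return "I'm not sure what this image shows."


# Sample payload shaped like an analyze response (all values are made up).
sample_analysis = {
    "description": {"captions": [
        {"text": "a dog sitting on a couch", "confidence": 0.91},
        {"text": "a dog", "confidence": 0.55},
    ]},
    "adult": {"isAdultContent": False},
    "color": {"dominantColors": ["Brown", "White"], "accentColor": "873B59"},
}
print(bot_reply(summarize_image_analysis(sample_analysis)))
```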
Learn more about image and video understanding with Microsoft Cognitive Services.
Location control
Bots often need the user to input a location to complete a task. For example, a Taxi bot requires the user's pickup and destination address before requesting a ride. Similarly, a Pizza bot must know the user's delivery address to submit the order, and so on. Normally, bot developers need to use a combination of location or place APIs so that their bots engage in a multi-turn dialog with users to get their desired location and subsequently validate it. Unfortunately, the development steps are usually complicated and error-prone.
The Bing location control makes this process easy by abstracting away the tedious coding steps to let the user pick a location and reliably validate it. The control offers the following capabilities:
- Address lookup and validation using the Bing Maps REST services
- Address disambiguation when more than one address is found
- Support for declaring required location fields
- Support for Facebook Messenger's location picker GUI dialog
- Open-source code (C# and Node.js) with customizable dialog strings
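To give a feel for the kind of dialog logic the control abstracts away, the sketch below decides what a bot should ask next, given the candidate addresses returned by a lookup. It does not use the location control's actual API; the field names, `REQUIRED_FIELDS`, and both helper functions are hypothetical and serve only to illustrate disambiguation and required-field checks.

```python
from typing import Dict, List

# Hypothetical required fields; a real bot would declare its own.
REQUIRED_FIELDS = ["street_address", "locality", "postal_code"]


def missing_fields(address: Dict[str, str], required: List[str]) -> List[str]:
    """Return the required fields that are absent or empty in the address."""
    return [field for field in required if not address.get(field)]


def next_prompt(candidates: List[Dict[str, str]], required: List[str]) -> str:
    """Decide the bot's next message from the address lookup results."""
    if not candidates:
        return "I could not find that address. Please try again."
    if len(candidates) > 1:
        # Disambiguation: more than one address matched the user's input.
        options = "; ".join(
            f"{i}. {c.get('formatted', '?')}" for i, c in enumerate(candidates, 1)
        )
        return f"I found several matches. Which one did you mean? {options}"
    missing = missing_fields(candidates[0], required)
    if missing:
        return f"Please provide: {', '.join(missing)}."
    return f"Got it: {candidates[0].get('formatted', 'your address')}."


complete = {
    "formatted": "1 Main St, Seattle, 98101",
    "street_address": "1 Main St",
    "locality": "Seattle",
    "postal_code": "98101",
}
print(next_prompt([complete], REQUIRED_FIELDS))
```

Even this simplified version needs several branches, which hints at why a reusable control is attractive for multi-turn location prompts.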
Learn more about the location control with Microsoft Cognitive Services.
You can find comprehensive documentation for each product, along with its corresponding API reference, in the Cognitive Services documentation.