Language and region support for LUIS

LUIS has a variety of features within the service. Not all features are at the same language parity. Make sure the features you are interested in are supported in the language culture you are targeting. A LUIS app is culture-specific, and its culture cannot be changed once it is set.

Multi-language LUIS apps

If you need a multi-language LUIS client application such as a chatbot, you have a few options. If LUIS supports all the languages, develop a LUIS app for each language. Each LUIS app has a unique app ID and endpoint log. If you need to provide language understanding for a language LUIS does not support, you can use the Microsoft Translator API to translate the utterance into a supported language, submit the translated utterance to the LUIS endpoint, and receive the resulting scores.
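
The routing described above can be sketched as follows. This is a minimal illustration rather than a LUIS or Translator client: `APP_IDS`, `FALLBACK_CULTURE`, `route_utterance`, and the injected `translate` and `query_luis` callables are hypothetical names, the app IDs are placeholders, and translating into English is an assumption.

```python
from typing import Callable, Dict

# Hypothetical mapping from culture to the app ID of the LUIS app built for it.
APP_IDS: Dict[str, str] = {
    "en-us": "<english-app-id>",
    "fr-fr": "<french-app-id>",
}

# Assumption: utterances in unsupported languages are translated into English.
FALLBACK_CULTURE = "en-us"


def route_utterance(
    utterance: str,
    culture: str,
    translate: Callable[[str, str], str],
    query_luis: Callable[[str, str], dict],
) -> dict:
    """Send the utterance to the app for its culture, translating first if needed."""
    culture = culture.lower()
    if culture not in APP_IDS:
        # No app exists for this culture, so translate into a supported language.
        utterance = translate(utterance, FALLBACK_CULTURE)
        culture = FALLBACK_CULTURE
    return query_luis(APP_IDS[culture], utterance)
```

Passing the translation and query steps in as callables keeps the sketch independent of any particular SDK; in a real client they would wrap the Translator and LUIS endpoint requests.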

Languages supported

LUIS understands utterances in the following languages:

Language Locale Prebuilt domain Prebuilt entity Phrase list recommendations **Text analytics (Sentiment and Keywords)
American English en-US
Arabic (preview - modern standard Arabic) ar-AR - - - -
*Chinese zh-CN -
Dutch nl-NL - -
French (France) fr-FR
French (Canada) fr-CA - - -
German de-DE
Hindi hi-IN - - - -
Italian it-IT
*Japanese ja-JP Key phrase only
Korean ko-KR - - Key phrase only
Portuguese (Brazil) pt-BR not all sub-cultures
Spanish (Spain) es-ES
Spanish (Mexico) es-MX - -
Turkish tr-TR - - Sentiment only

Language support varies for prebuilt entities and prebuilt domains.

*Chinese support notes

  • In the zh-CN culture, LUIS expects the simplified Chinese character set instead of the traditional character set.
  • The names of intents, entities, features, and regular expressions may be in Chinese or Roman characters.
  • See the prebuilt domains reference for information on which prebuilt domains are supported in the zh-CN culture.

*Japanese support notes

  • Because LUIS does not provide syntactic analysis and will not understand the difference between Keigo and informal Japanese, you need to incorporate the different levels of formality as training examples for your applications.
    • でございます is not the same as です.
    • です is not the same as だ.

**Text analytics support notes

Text analytics includes the keyPhrase prebuilt entity and sentiment analysis. Only Portuguese is supported for subcultures: pt-PT and pt-BR. All other cultures are supported at the primary culture level. Learn more about Text Analytics supported languages.
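
As a rough illustration of the rule above, the following sketch reduces a culture code to the level at which text analytics support is described. The function name `text_analytics_culture` is hypothetical and the logic only mirrors the sentence above; it does not call any service.

```python
def text_analytics_culture(culture: str) -> str:
    """Return the culture level at which text analytics support is described."""
    primary, _, region = culture.partition("-")
    primary = primary.lower()
    region = region.upper()
    if primary == "pt" and region in {"PT", "BR"}:
        # Portuguese is the only language handled at the subculture level.
        return f"pt-{region}"
    # Every other culture is handled at the primary culture level.
    return primary
```

For example, `es-MX` and `es-ES` both reduce to `es`, while `pt-BR` and `pt-PT` stay distinct.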

Speech API supported languages

See Speech Supported languages for Speech dictation mode languages.

Bing Spell Check supported languages

See Bing Spell Check Supported languages for a list of supported languages and status.

Rare or foreign words in an application

In the en-us culture, LUIS learns to distinguish most English words, including slang. In the zh-cn culture, LUIS learns to distinguish most Chinese characters. If you use a rare word in en-us or a rare character in zh-cn and LUIS seems unable to distinguish it, you can add that word or character to a phrase-list feature. For example, words outside of the culture of the application (that is, foreign words) should be added to a phrase-list feature.

Hybrid languages

Hybrid languages combine words from two cultures, such as English and Chinese. These languages are not supported in LUIS because an app is based on a single culture.

Tokenization

To perform machine learning, LUIS breaks an utterance into tokens based on culture.

Language every space or special character character level compound words tokenized entity returned
Arabic
Chinese
Dutch
English (en-us)
French (fr-FR)
French (fr-CA)
German
Hindi - - - -
Italian
Japanese
Korean
Portuguese (Brazil)
Spanish (es-ES)
Spanish (es-MX)
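
To make two of the column headings concrete, here is a minimal sketch contrasting tokenization on every space or special character with character-level tokenization. Both functions are illustrative stand-ins with hypothetical names, not the tokenizers LUIS uses internally.

```python
import re
from typing import List


def tokenize_on_spaces_and_specials(utterance: str) -> List[str]:
    """Split on whitespace and emit each special character as its own token."""
    # \w+ matches a run of word characters; [^\w\s] matches one special character.
    return re.findall(r"\w+|[^\w\s]", utterance)


def tokenize_per_character(utterance: str) -> List[str]:
    """Treat every non-space character as its own token."""
    return [ch for ch in utterance if not ch.isspace()]
```

With these definitions, `tokenize_on_spaces_and_specials("Book a flight, please!")` separates the comma and the exclamation mark from the words, while `tokenize_per_character("你好 世界")` yields one token per character.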

Custom tokenizer versions

The following cultures have custom tokenizer versions:

Culture Version Purpose
German de-de 1.0.0 Tokenizes words by splitting them with a machine learning-based tokenizer that tries to break composite words into their single components. If a user enters Ich fahre einen krankenwagen as an utterance, it is turned into Ich fahre einen kranken wagen. This allows kranken and wagen to be marked independently as different entities.
German de-de 1.0.2 Tokenizes words by splitting them on spaces. If a user enters Ich fahre einen krankenwagen as an utterance, krankenwagen remains a single token and is therefore marked as a single entity.
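
The difference between the two German tokenizer versions can be sketched with the example utterance. This is a toy illustration: the `COMPOUND_PARTS` lookup table is a hypothetical stand-in for the machine learning-based splitter, which is not reproduced here, and the function names are assumptions.

```python
from typing import Dict, List

# Toy lookup standing in for the machine learning-based compound splitter (assumption).
COMPOUND_PARTS: Dict[str, List[str]] = {"krankenwagen": ["kranken", "wagen"]}


def tokenize_v1_0_0(utterance: str) -> List[str]:
    """Mimic version 1.0.0: break known compound words into their components."""
    tokens: List[str] = []
    for word in utterance.split():
        # Words that are not known compounds pass through unchanged.
        tokens.extend(COMPOUND_PARTS.get(word.lower(), [word]))
    return tokens


def tokenize_v1_0_2(utterance: str) -> List[str]:
    """Mimic version 1.0.2: split on spaces only, leaving compounds intact."""
    return utterance.split()
```

For `Ich fahre einen krankenwagen`, the first function yields separate `kranken` and `wagen` tokens, while the second keeps `krankenwagen` as one token.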

Migrating between tokenizer versions

Tokenization happens at the app level. There is no support for version-level tokenization.

To move to a different tokenizer version, import the app file as a new app instead of as a new version. The new app has a different app ID but uses the tokenizer version specified in the file.
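
Preparing the file for import might look like the sketch below. It assumes the exported app file is JSON and carries the tokenizer version in a top-level `tokenizerVersion` field; that field name and the helper `with_tokenizer_version` are assumptions to check against your own export.

```python
import copy


def with_tokenizer_version(app_definition: dict, version: str) -> dict:
    """Return a copy of an exported app definition that requests another tokenizer version."""
    migrated = copy.deepcopy(app_definition)
    # Assumed field name; confirm it against the exported app file.
    migrated["tokenizerVersion"] = version
    return migrated
```

The original definition is left untouched, so the exported file can be kept as a backup. The returned dictionary can be written out with `json.dump` and then imported as a new app.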