Speaker Recognition API

Welcome to the Azure Cognitive Services Speaker Recognition APIs. These cloud-based APIs provide advanced algorithms for the two categories of speaker recognition: speaker verification and speaker identification.

Speaker Verification

A voice has unique characteristics that can be used to identify a person, much like a fingerprint. Using voice as a signal for access control and authentication has emerged as a new tool that raises the level of security while simplifying the authentication experience for customers.

Speaker Verification APIs can automatically verify and authenticate users by their voice or speech.

Enrollment

Enrollment for speaker verification is text-dependent, which means that speakers must choose a specific passphrase and use it during both the enrollment and verification phases.

During enrollment, the speaker's voice is recorded saying a specific phrase. A number of features are then extracted from the recording, and the chosen phrase is recognized. Together, the extracted features and the chosen phrase form a unique voice signature.
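As a rough illustration of this idea (not the service's actual algorithm or API), enrollment can be pictured as combining voice features with the recognized phrase. The sketch below assumes each recording has already been reduced to a fixed-length feature vector. `VoiceSignature`, `normalize_phrase`, `enroll`, and the choice to average the vectors are all hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class VoiceSignature:
    """Hypothetical container: averaged voice features plus the chosen phrase."""
    features: Tuple[float, ...]
    phrase: str


def normalize_phrase(phrase: str) -> str:
    # Lowercase and collapse whitespace so phrases compare consistently.
    return " ".join(phrase.lower().split())


def enroll(recordings: Sequence[Sequence[float]], phrase: str) -> VoiceSignature:
    """Average per-recording feature vectors into one signature (toy model)."""
    if not recordings:
        raise ValueError("at least one recording is required")
    dim = len(recordings[0])
    if any(len(r) != dim for r in recordings):
        raise ValueError("all feature vectors must have the same length")
    mean = tuple(sum(r[i] for r in recordings) / len(recordings) for i in range(dim))
    return VoiceSignature(features=mean, phrase=normalize_phrase(phrase))


# Two made-up feature vectors standing in for two recordings of the same phrase.
signature = enroll([[0.9, 0.1, 0.4], [1.1, 0.3, 0.4]], "My voice is my passport")
print(signature)
```

Averaging is only one simple way to merge several recordings; the point is that the stored signature pairs voice features with the phrase.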

Verification

During verification, the input voice and phrase are compared against the enrolled voice signature and phrase to determine whether they come from the same person and whether the correct phrase was spoken.
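The two checks described here (same voice, correct phrase) can be sketched as follows. This is a minimal illustration, not the service's implementation: it assumes voices are already represented as feature vectors, and the cosine-similarity measure, the `0.8` threshold, and the function names are hypothetical choices.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def verify(enrolled_features: Sequence[float], enrolled_phrase: str,
           input_features: Sequence[float], input_phrase: str,
           threshold: float = 0.8) -> bool:
    """Accept only if BOTH the phrase and the voice match (toy model)."""
    same_phrase = enrolled_phrase.strip().lower() == input_phrase.strip().lower()
    same_voice = cosine_similarity(enrolled_features, input_features) >= threshold
    return same_phrase and same_voice


enrolled = [1.0, 0.2, 0.4]          # made-up enrolled features
phrase = "my voice is my passport"  # made-up enrolled phrase

accepted = verify(enrolled, phrase, [0.95, 0.25, 0.38], "My voice is my passport")
wrong_phrase = verify(enrolled, phrase, [0.95, 0.25, 0.38], "open sesame")
wrong_voice = verify(enrolled, phrase, [0.1, 1.0, -0.5], "my voice is my passport")
print(accepted, wrong_phrase, wrong_voice)
```

Requiring both conditions is what makes the scheme text-dependent: a matching voice that says a different phrase is still rejected.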

For more details about speaker verification, refer to the Speaker - Verification API.

Speaker Identification

Speaker Identification APIs can automatically identify the person speaking in an audio file, given a group of prospective speakers. The input audio is compared against the provided group of speakers, and if a match is found, the speaker's identity is returned.

All speakers must first go through an enrollment process, which registers their voice with the system and creates a voice print.

Enrollment

Enrollment for speaker identification is text-independent, which means that there are no restrictions on what the speaker says in the audio. The speaker's voice is recorded, and a number of features are extracted to form a unique voice signature.

Recognition

During recognition, the audio of the unknown speaker is provided together with the group of prospective speakers. The input voice is compared against every speaker in the group to determine whose voice it is, and if a match is found, the identity of that speaker is returned.
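The comparison against a group of candidates can be pictured as a best-match search with a rejection threshold. The sketch below is illustrative only and does not call or reproduce the service: it assumes each enrolled speaker is represented by a feature vector, and the `identify` function, the speaker IDs, the cosine-similarity score, and the `0.8` threshold are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(input_features: Sequence[float],
             candidates: Dict[str, Sequence[float]],
             threshold: float = 0.8) -> Optional[str]:
    """Return the best-matching speaker ID, or None if no score reaches the threshold."""
    best_id, best_score = None, threshold
    for speaker_id, features in candidates.items():
        score = cosine_similarity(input_features, features)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id


# Made-up voice prints for three enrolled speakers.
group = {
    "alice": [1.0, 0.2, 0.4],
    "bob": [0.1, 1.0, -0.5],
    "carol": [0.3, 0.3, 1.0],
}

match = identify([0.95, 0.25, 0.38], group)
no_match = identify([-1.0, 0.0, 0.0], group)
print(match, no_match)
```

Returning `None` when no candidate is close enough mirrors the behavior described above, where an identity is returned only if a match is found.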

For more details about speaker identification, refer to the Speaker - Identification API.