
Speaker Recognition API

Welcome to the Azure Cognitive Services Speaker Recognition APIs. These cloud-based APIs provide advanced algorithms for the two categories of speaker recognition: speaker verification and speaker identification.

Speaker Verification

A voice has unique characteristics that can be used to identify a person, much like a fingerprint. Using voice as a signal for access control and authentication scenarios has emerged as a new tool that raises the level of security while simplifying the authentication experience for customers.

Speaker Verification APIs can automatically verify and authenticate users by their voice or speech.

Enrollment

Enrollment for speaker verification is text-dependent, which means that speakers must choose a specific pass phrase and use it during both the enrollment and verification phases.

During enrollment, the speaker's voice is recorded saying the chosen phrase, a number of features are extracted from the recording, and the phrase is recognized. Together, the extracted features and the chosen phrase form a unique voice signature.

Verification

During verification, an input voice and phrase are compared against the enrollment's voice signature and phrase to verify whether they come from the same person and whether that person is saying the correct phrase.
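The two checks described above (same voice, same phrase) can be illustrated with a small local sketch. This is not the Speaker Verification API and does not reflect how the service works internally: the `VoiceSignature` class, the `verify` function, the use of cosine similarity, and the 0.8 threshold are all assumptions made up for illustration, and the feature vectors are hand-written placeholders rather than features extracted from audio.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceSignature:
    """Hypothetical stand-in for an enrolled voice signature."""
    phrase: str      # pass phrase chosen at enrollment
    features: tuple  # placeholder feature vector (not real audio features)


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def normalize(phrase):
    """Lowercase and collapse whitespace so trivial differences are ignored."""
    return " ".join(phrase.lower().split())


def verify(signature, spoken_phrase, spoken_features, threshold=0.8):
    """Accept only if both the phrase and the voice match the enrollment.

    The threshold is an arbitrary illustrative value.
    """
    same_phrase = normalize(spoken_phrase) == normalize(signature.phrase)
    same_voice = cosine_similarity(signature.features, spoken_features) >= threshold
    return same_phrase and same_voice


enrolled = VoiceSignature("my voice is my passport", (0.9, 0.1, 0.4))
print(verify(enrolled, "My voice is my passport", (0.88, 0.12, 0.41)))
```

In this sketch a request is rejected when either check fails, which mirrors the text-dependent behavior described above: a matching voice with the wrong phrase is not accepted.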

For more details about speaker verification, refer to the API Speaker - Verification.

Speaker Identification

Speaker Identification APIs can automatically identify the person speaking in an audio file, given a group of prospective speakers. The input audio is compared against the provided group of speakers, and if a match is found, the speaker's identity is returned.

All speakers should first go through an enrollment process to register their voice with the system and have a voice print created.

Enrollment

Enrollment for speaker identification is text-independent, which means that there are no restrictions on what the speaker says in the audio. The speaker's voice is recorded, and a number of features are extracted to form a unique voice signature.

Recognition

During recognition, the audio of the unknown speaker is provided together with the group of prospective speakers. The input voice is compared against all speakers in the group to determine whose voice it is, and if a match is found, the identity of the speaker is returned.
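The comparison against a group of prospective speakers can be sketched locally as a best-match search. This is an illustration only, not the Speaker Identification API: the `identify` function, the cosine-similarity scoring, the 0.8 threshold, and the speaker names and feature vectors are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def identify(unknown_features, candidates, threshold=0.8):
    """Return the ID of the best-matching candidate, or None if no match.

    `candidates` maps a speaker ID to that speaker's enrolled feature vector.
    A candidate only counts as a match if its score reaches the threshold,
    which is an arbitrary illustrative value.
    """
    best_id, best_score = None, threshold
    for speaker_id, features in candidates.items():
        score = cosine_similarity(unknown_features, features)
        if score >= best_score:
            best_id, best_score = speaker_id, score
    return best_id


# Hypothetical enrolled speakers with placeholder feature vectors.
group = {"alice": (1.0, 0.0), "bob": (0.0, 1.0)}
print(identify((0.9, 0.1), group))  # closest to "alice"
print(identify((0.7, 0.7), group))  # no candidate is close enough
```

Returning `None` when no score reaches the threshold corresponds to the case where no match is found among the prospective speakers.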

For more details about speaker identification, refer to the API Speaker - Identification.