Text Dependent - Create Enrollment

Reference

Service: Speaker Recognition

API Version: 2021-09-05

Enroll Profile
Adds an enrollment to an existing profile. Once the minimum number of enrollment audios is reached, a voice print is created. If a voice print already exists, it is recreated from all existing enrollment audios, including the new one.

Limitations:

Minimum audio input length per request is 1 second
Maximum audio input length per request is 10 seconds
Minimum number of enrollments for creating a voiceprint is 3
Maximum number of enrollments for creating a voiceprint is 50
Minimum audio signal-to-noise ratio (SNR) is 2 dB
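
The duration limits above can be checked on the client before uploading. The following is a minimal sketch using Python's standard `wave` module. The function name `check_enrollment_duration` and the two constants are illustrative and not part of the service API, and the sketch does not attempt to measure SNR.

```python
import io
import wave

MIN_AUDIO_SECONDS = 1.0   # minimum audio input length per request
MAX_AUDIO_SECONDS = 10.0  # maximum audio input length per request


def check_enrollment_duration(wav_bytes: bytes) -> float:
    """Return the WAV duration in seconds, or raise ValueError if out of range."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
    if not MIN_AUDIO_SECONDS <= duration <= MAX_AUDIO_SECONDS:
        raise ValueError(f"audio is {duration:.2f} s; expected 1 to 10 s")
    return duration
```

The service remains the authority on whether an audio file is accepted; a local check like this only avoids sending requests that are certain to be rejected for length.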

Constraints:

The first enrollment must match an existing passphrase.
All enrollments after the first one must use the same passphrase as the first enrollment.

POST {endpoint}/speaker-recognition/verification/text-dependent/profiles/{profileId}/enrollments?api-version=2021-09-05

URI Parameters

Name	In	Required	Type	Description
endpoint	path	True	string	Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com).
profileId	path	True	string uuid	Unique identifier of the profile (GUID).
api-version	query	True	string	Specifies the version of the operation to use for this request.

Request Header

Media Types: "audio/wav; codecs=audio/pcm"

Name	Required	Type	Description
Ocp-Apim-Subscription-Key	True	string

Request Body

Media Types: "audio/wav; codecs=audio/pcm"

Name	Type	Description
audioData	object	Binary audio file. The supported format is audio/wav; codecs=audio/pcm. Supports audio up to 5 MB.
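
The URI, header, and body described above can be assembled as follows. This is a sketch using only the Python standard library. The function name `build_enrollment_request` is hypothetical, reading the 5 MB limit as 5 × 1024 × 1024 bytes is an assumption, and sending the media type as the `Content-Type` header is inferred from the media types listed above.

```python
import urllib.request

API_VERSION = "2021-09-05"
MAX_AUDIO_BYTES = 5 * 1024 * 1024  # assumption: "5 MB" read as 5 MiB


def build_enrollment_request(endpoint, profile_id, subscription_key, audio):
    """Build (but do not send) the POST request for one enrollment audio."""
    if len(audio) > MAX_AUDIO_BYTES:
        raise ValueError("audio exceeds the 5 MB request body limit")
    url = (
        f"{endpoint.rstrip('/')}/speaker-recognition/verification/text-dependent"
        f"/profiles/{profile_id}/enrollments?api-version={API_VERSION}"
    )
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "audio/wav; codecs=audio/pcm",
    }
    return urllib.request.Request(url, data=audio, headers=headers, method="POST")
```

Passing the returned object to `urllib.request.urlopen` sends the request. Building it separately keeps the URL and header logic easy to inspect without network access.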

Responses

Name	Type	Description
201 Created	TdEnrollmentInfo	Created
Other Status Codes	SpeakerErrorInfo	Failure. Headers: x-ms-error-code: string

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

Examples

Successful Query

Sample Request

HTTP

POST https://westus.api.cognitive.microsoft.com/speaker-recognition/verification/text-dependent/profiles/49a36324-fc4b-4387-aa06-090cfbf0064f/enrollments?api-version=2021-09-05


"{binary file data}"

Sample Response

Status code: 201

Content-Type: application/json

Response Body

{
  "profileId": "49a36324-fc4b-4387-aa06-090cfbf0064f",
  "enrollmentStatus": "Enrolling",
  "enrollmentsCount": 1,
  "enrollmentsLengthInSec": 1.83,
  "enrollmentsSpeechLengthInSec": 1.35,
  "remainingEnrollmentsCount": 2,
  "passPhrase": "my voice is my passport verify me",
  "audioLengthInSec": 1.83,
  "audioSpeechLengthInSec": 1.35
}

Status code: default

Content-Type: application/json
x-ms-error-code: Error Code

Response Body

{
  "error": {
    "code": "Error Code",
    "message": "Error Message"
  }
  }
}
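
A failure body of this shape can be reduced to its code and message as sketched below. The helper name `parse_speaker_error` is illustrative, and falling back to empty strings for missing fields is a design choice of this sketch rather than documented service behavior.

```python
import json


def parse_speaker_error(body: str) -> tuple[str, str]:
    """Extract (code, message) from a SpeakerErrorInfo JSON body."""
    error = json.loads(body).get("error", {})
    return error.get("code", ""), error.get("message", "")
```

The same error code is also reported in the `x-ms-error-code` response header, so a client can read either source.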

Definitions

Name	Description
Error
SpeakerErrorInfo	Speaker error message
TdEnrollmentInfo	Text-Dependent Speaker profile enrollment info
TrainingStatusType	Status representing the current state of the profile. Available values are: Enrolling: the profile has no voice print and is not ready for recognition requests. Training: the voice print of the profile is being created and cannot be used for recognition at the moment. Enrolled: the profile has a voice print and is ready for recognition requests.

Error

Name	Type	Description
code	string
message	string

SpeakerErrorInfo

Speaker error message

Name	Type	Description
error	Error

TdEnrollmentInfo

Text-Dependent Speaker profile enrollment info

Name	Type	Description
audioLengthInSec	number	This enrollment audio length in seconds.
audioSpeechLengthInSec	number	Pure speech length of this enrollment audio in seconds, that is, the audio remaining after silence and non-speech segments are removed.
enrollmentStatus	TrainingStatusType	Status representing the current state of the profile. Available values are: Enrolling: the profile has no voice print and is not ready for recognition requests. Training: the voice print of the profile is being created and cannot be used for recognition at the moment. Enrolled: the profile has a voice print and is ready for recognition requests.
enrollmentsCount	integer	Number of enrollment audios accepted for this profile.
enrollmentsLengthInSec	number	Total length of enrollment audios accepted for this profile in seconds.
enrollmentsSpeechLengthInSec	number	Total pure speech length across all enrollments of the profile in seconds, that is, the audio remaining after silence and non-speech segments are removed.
passPhrase	string	Passphrase associated with this enrollment.
profileId	string	Unique identifier of the profile (GUID).
remainingEnrollmentsCount	integer	Number of enrollment audios needed to complete profile enrollment.
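
The fields in the table above map naturally onto a typed record. The sketch below mirrors them in a Python dataclass; the class and its `from_json` helper are illustrative, not an official SDK type, and the constructor will raise `TypeError` if the service ever returns fields not listed here.

```python
import json
from dataclasses import dataclass


@dataclass
class TdEnrollmentInfo:
    profileId: str
    enrollmentStatus: str  # one of the TrainingStatusType values
    enrollmentsCount: int
    enrollmentsLengthInSec: float
    enrollmentsSpeechLengthInSec: float
    remainingEnrollmentsCount: int
    passPhrase: str
    audioLengthInSec: float
    audioSpeechLengthInSec: float

    @classmethod
    def from_json(cls, body: str) -> "TdEnrollmentInfo":
        """Parse a 201 Created response body into a typed record."""
        return cls(**json.loads(body))


# Body taken from the sample response in this reference.
SAMPLE_BODY = """{
  "profileId": "49a36324-fc4b-4387-aa06-090cfbf0064f",
  "enrollmentStatus": "Enrolling",
  "enrollmentsCount": 1,
  "enrollmentsLengthInSec": 1.83,
  "enrollmentsSpeechLengthInSec": 1.35,
  "remainingEnrollmentsCount": 2,
  "passPhrase": "my voice is my passport verify me",
  "audioLengthInSec": 1.83,
  "audioSpeechLengthInSec": 1.35
}"""

info = TdEnrollmentInfo.from_json(SAMPLE_BODY)
```

With the sample body, `info.remainingEnrollmentsCount` is 2, which tells the caller that two more enrollment audios are needed before a voice print is created.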

TrainingStatusType

Status representing the current state of the profile. Available values are:

Enrolling: the profile has no voice print and is not ready for recognition requests.
Training: the voice print of the profile is being created and cannot be used for recognition at the moment.
Enrolled: the profile has a voice print and is ready for recognition requests.

Name	Type	Description
Enrolled	string
Enrolling	string
Training	string
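
The three status values can be modeled as an enumeration so that client code does not compare raw strings. This is a sketch; the helper `is_ready_for_recognition` is a hypothetical name and simply encodes the rule above that only Enrolled profiles are ready for recognition requests.

```python
from enum import Enum


class TrainingStatusType(str, Enum):
    ENROLLING = "Enrolling"  # no voice print yet
    TRAINING = "Training"    # voice print is being created
    ENROLLED = "Enrolled"    # voice print exists


def is_ready_for_recognition(status: str) -> bool:
    """Return True only when the profile status is Enrolled."""
    return TrainingStatusType(status) is TrainingStatusType.ENROLLED
```

An unknown status string raises `ValueError` from the enum constructor, which surfaces unexpected values early instead of silently treating them as not ready.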
