Text Dependent - Create Enrollment

Enroll Profile
Adds an enrollment to an existing profile. Once the minimum number of required enrollment audios is reached, a voiceprint is created. If a voiceprint already exists, it is recreated from all existing enrollment audios, including the new one.

Limitations:

  • Minimum audio input length per request is 1 second
  • Maximum audio input length per request is 10 seconds
  • Minimum number of enrollments for creating a voiceprint is 3
  • Maximum number of enrollments for creating a voiceprint is 50
  • Minimum audio signal-to-noise ratio (SNR) is 10 dB
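
The length limits above can be checked on the client before uploading. The following is a minimal sketch using Python's standard `wave` module; the function name `check_enrollment_audio` and its messages are illustrative and not part of the service. It also checks the format named in the 400 error text (WAV, 16 kHz, 16-bit, mono PCM). The SNR limit is not checked, since it cannot be read from the file header.

```python
import io
import wave

MIN_SECONDS = 1.0   # minimum audio length per request
MAX_SECONDS = 10.0  # maximum audio length per request


def check_enrollment_audio(wav_bytes: bytes) -> list:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        duration = wav.getnframes() / wav.getframerate()
        if duration < MIN_SECONDS:
            problems.append(f"audio is {duration:.2f} s, below the {MIN_SECONDS:g} s minimum")
        if duration > MAX_SECONDS:
            problems.append(f"audio is {duration:.2f} s, above the {MAX_SECONDS:g} s maximum")
        # Format taken from the 400 error description: WAV 16 kHz 16-bit mono PCM.
        if wav.getframerate() != 16000:
            problems.append("sample rate is not 16 kHz")
        if wav.getsampwidth() != 2:
            problems.append("sample width is not 16-bit")
        if wav.getnchannels() != 1:
            problems.append("audio is not mono")
    return problems
```

A local check like this only avoids round trips for obviously invalid files; the service remains the authority on what it accepts.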

Constraints:

  • The first enrollment must match an existing passphrase.
  • All enrollments after the first must use the same passphrase as the first enrollment.

POST {Endpoint}/speaker/verification/v2.0/text-dependent/profiles/{profileId}/enrollments

URI Parameters

Name | In | Required | Type | Description
Endpoint | path | True | string | Supported Cognitive Services endpoints (protocol and hostname, for example: https://westus.api.cognitive.microsoft.com).
profileId | path | True | string (uuid) | Unique identifier of the profile (GUID). Regex pattern: ^([0-9a-fA-F]){8}-?([0-9a-fA-F]){4}-?([0-9a-fA-F]){4}-?([0-9a-fA-F]){4}-?([0-9a-fA-F]){12}$
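
As an illustration of how the two URI parameters combine, the sketch below validates a profile ID against the documented regex pattern and builds the request URL. The helper name `enrollment_url` is hypothetical; only the pattern and the path come from this page.

```python
import re

# Pattern copied from the profileId parameter description.
PROFILE_ID_PATTERN = re.compile(
    r"^([0-9a-fA-F]){8}-?([0-9a-fA-F]){4}-?([0-9a-fA-F]){4}"
    r"-?([0-9a-fA-F]){4}-?([0-9a-fA-F]){12}$"
)


def enrollment_url(endpoint: str, profile_id: str) -> str:
    """Build the Create Enrollment URL, rejecting malformed profile IDs."""
    if not PROFILE_ID_PATTERN.match(profile_id):
        raise ValueError(f"not a valid profile ID: {profile_id!r}")
    base = endpoint.rstrip("/")  # tolerate a trailing slash on the endpoint
    return (
        f"{base}/speaker/verification/v2.0/text-dependent"
        f"/profiles/{profile_id}/enrollments"
    )
```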

Request Header

Media Types: "audio/wav; codecs=audio/pcm"

Name | Required | Type | Description
Ocp-Apim-Subscription-Key | True | string |

Request Body

Media Types: "audio/wav; codecs=audio/pcm"

Name | Type | Description
audioData | object | Binary audio file. The supported format is audio/wav; codecs=audio/pcm. Audio up to 5 MB is supported.
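
To show how the header and body fit together, here is a minimal sketch that prepares the request with Python's standard `urllib.request`. The function name `build_enrollment_request` is hypothetical, and the request is only constructed, not sent.

```python
import urllib.request


def build_enrollment_request(
    url: str, subscription_key: str, wav_bytes: bytes
) -> urllib.request.Request:
    """Prepare (but do not send) the POST request carrying the WAV audio."""
    return urllib.request.Request(
        url,
        data=wav_bytes,  # raw binary audio is the whole request body
        method="POST",
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "audio/wav; codecs=audio/pcm",
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it. For 4xx and 5xx replies, `urlopen` raises `urllib.error.HTTPError`, and the JSON error body can be read from that exception.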

Responses

Name Type Description
201 Created

Created

400 Bad Request

Bad Request

  • InvalidRequest - Invalid audio length. Maximum allowed length is 10 seconds.
  • InvalidRequest - Invalid audio length. Minimum allowed length is 1 second.
  • InvalidRequest - Invalid audio format. Supported format is WAV 16Khz 16bit Mono PCM.
  • InvalidRequest - Invalid passphrase.
  • InvalidRequest - Audio is too noisy. The minimum allowed Signal-to-noise ratio (SNR) is 10dB.

401 Unauthorized

Request is not authorized. Make sure subscription key is included and valid.

403 Forbidden

Forbidden

  • InvalidOperation - Maximum allowed number of enrollments per profile is 50.

404 Not Found

NotFound - Requested profile doesn't exist

500 Internal Server Error

Internal Server Error.

Other Status Codes

Conflict

  • Conflict - Sending concurrent requests on same profile is not allowed.

Other Status Codes

UnsupportedMediaType - Unsupported media type. Only "audio/wav; codecs=audio/pcm" is accepted.

Other Status Codes

Rate limit is exceeded.

Security

Ocp-Apim-Subscription-Key

Type: apiKey
In: header

Examples

Successful Query

Sample Request

POST {Endpoint}/speaker/verification/v2.0/text-dependent/profiles/49a36324-fc4b-4387-aa06-090cfbf0064f/enrollments
Ocp-Apim-Subscription-Key: {API key}
"{binary file data}"

Sample Response

Content-Type: application/json
{
  "profileId": "49a36324-fc4b-4387-aa06-090cfbf0064f",
  "enrollmentStatus": "Enrolling",
  "enrollmentsCount": 1,
  "enrollmentsLength": 1.83,
  "enrollmentsSpeechLength": 1.35,
  "remainingEnrollmentsCount": 2,
  "passPhrase": "my voice is my passport verify me",
  "audioLength": 1.83,
  "audioSpeechLength": 1.35
}
Content-Type: application/json
{
  "error": {
    "code": "InvalidRequest",
    "message": "Audio is too noisy."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "Unauthorized",
    "message": "Request is not authorized. Make sure subscription key is included and valid."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "InvalidRequest",
    "message": "Maximum allowed length across all profile enrollments is 300 seconds."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "Not Found",
    "message": "Requested profile doesn't exist"
  }
}
Content-Type: application/json
{
  "error": {
    "code": "Conflict",
    "message": "Sending concurrent requests on same profile is not allowed."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "UnsupportedMediaType",
    "message": "Unsupported media type. Only 'audio/wav; codecs=audio/pcm' is accepted."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "RateLimit",
    "message": "Rate limit is exceeded."
  }
}
Content-Type: application/json
{
  "error": {
    "code": "InternalServerError",
    "message": "Internal Server Error."
  }
}
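
All error samples above share the same `error.code` / `error.message` shape, so a client can handle them uniformly. The sketch below extracts both fields and flags codes that may be worth retrying. The helper names are hypothetical, and the choice of `Conflict` and `RateLimit` as retryable is an assumption of this example, not service guidance.

```python
import json

# Assumption: these codes describe transient conditions that may clear on retry.
RETRYABLE_CODES = {"Conflict", "RateLimit"}


def parse_error(body: str) -> tuple:
    """Return (code, message) from a JSON error body."""
    error = json.loads(body)["error"]
    return error["code"], error["message"]


def is_retryable(body: str) -> bool:
    """True if the error code is one this sketch treats as transient."""
    code, _message = parse_error(body)
    return code in RETRYABLE_CODES
```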

Definitions

Error

Speaker error message

Name Type Description
error

TdEnrollmentInfo

Text-Dependent Speaker profile enrollment info

Name | Type | Description
audioLength | number | Length of this enrollment audio, in seconds.
audioSpeechLength | number | Length of pure speech in this enrollment audio, in seconds. Pure speech is the audio that remains after removing silence and non-speech segments.
enrollmentStatus | TrainingStatusType | Status representing the current state of the profile: Enrolling, Training, or Enrolled.
enrollmentsCount | integer | Number of enrollment audios accepted for this profile.
enrollmentsLength | number | Total length of the enrollment audios accepted for this profile, in seconds.
enrollmentsSpeechLength | number | Total pure speech across all enrollments of this profile, in seconds.
passPhrase | string | Passphrase associated with this enrollment.
profileId | string | Unique identifier of the profile (GUID).
remainingEnrollmentsCount | integer | Number of enrollment audios still needed to complete profile enrollment.
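
The fields above map naturally onto a small record type. The following sketch parses a success response body into a dataclass. The class mirrors the documented field names, while the helper `parse_enrollment_info` is hypothetical.

```python
import json
from dataclasses import dataclass


@dataclass
class TdEnrollmentInfo:
    profileId: str
    enrollmentStatus: str
    enrollmentsCount: int
    enrollmentsLength: float
    enrollmentsSpeechLength: float
    remainingEnrollmentsCount: int
    passPhrase: str
    audioLength: float
    audioSpeechLength: float


def parse_enrollment_info(body: str) -> TdEnrollmentInfo:
    """Parse a success response body; raises TypeError on missing or unknown fields."""
    return TdEnrollmentInfo(**json.loads(body))
```

A caller could keep submitting enrollment audios until `remainingEnrollmentsCount` reaches 0.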

TrainingStatusType

Status representing the current state of the profile. Available values are:

  • Enrolling: the profile has no voiceprint and is not ready for recognition requests.
  • Training: the profile's voiceprint is being created and can't be used for recognition at the moment.
  • Enrolled: the profile has a voiceprint and is ready for recognition requests.

Name Type Description
Enrolled
  • string
Enrolling
  • string
Training
  • string
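
The three status values can be modeled as an enumeration so that unexpected strings fail loudly. This is a minimal sketch; the helper `ready_for_recognition` is hypothetical and simply encodes the rule that only `Enrolled` profiles have a usable voiceprint.

```python
from enum import Enum


class TrainingStatusType(str, Enum):
    ENROLLING = "Enrolling"  # no voiceprint yet
    TRAINING = "Training"    # voiceprint is being created
    ENROLLED = "Enrolled"    # voiceprint exists


def ready_for_recognition(status: str) -> bool:
    """True only when the profile has a usable voiceprint.

    Raises ValueError if the status string is not one of the documented values.
    """
    return TrainingStatusType(status) is TrainingStatusType.ENROLLED
```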