Speech Services can transcribe every call your call center records. You can index the transcriptions for full-text search, or apply Text Analytics to detect sentiment, language, and key phrases. If your recordings involve specialized terminology, such as product names or IT jargon, create a custom language model to teach Speech Services the vocabulary. A custom acoustic model helps Speech Services understand speakers even with background noise or poor phone connections.
For more information, read how batch transcription works with Speech Services.
- Adapt a model for your domain and deploy that model
- Upload your recordings to a blob container
- Send a POST request to the batch transcription endpoint to create a transcription job
- Speech Services schedules the transcription job
  - Stereo files are split into two channels
  - Mono files undergo diarization to distinguish between speakers
- Download the transcription using the transcription ID
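
As a rough illustration of the request step above, the following Python sketch builds, but does not send, the POST request that creates a batch transcription job. Several details are assumptions rather than anything stated in this article: the endpoint path and `v3.1` version segment, the `Ocp-Apim-Subscription-Key` header, and the body fields `contentContainerUrl`, `channels`, `diarizationEnabled`, and `model`. Check them against the current Speech Services REST reference before relying on them. The function name, its parameters, and the job name are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint shape; verify host and version against the REST reference.
ENDPOINT_TEMPLATE = (
    "https://{region}.api.cognitive.microsoft.com"
    "/speechtotext/v3.1/transcriptions"
)


def build_transcription_request(region, subscription_key, container_sas_url,
                                locale="en-US", is_stereo=False,
                                custom_model_url=None):
    """Build (but do not send) a POST request that creates a transcription job."""
    properties = {}
    if is_stereo:
        # Stereo recordings: ask for both channels to be transcribed separately.
        properties["channels"] = [0, 1]
    else:
        # Mono recordings: ask the service to distinguish between speakers.
        properties["diarizationEnabled"] = True

    body = {
        "displayName": "call-center-batch",  # hypothetical job name
        "locale": locale,
        # SAS URL of the blob container that holds the uploaded recordings.
        "contentContainerUrl": container_sas_url,
        "properties": properties,
    }
    if custom_model_url:
        # Optional: point the job at a deployed custom model (assumed field).
        body["model"] = {"self": custom_model_url}

    return urllib.request.Request(
        ENDPOINT_TEMPLATE.format(region=region),
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Ocp-Apim-Subscription-Key": subscription_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would submit the job. The response is expected to identify the new transcription, which you can then use to check status and download the results once the job finishes. Building the request separately from sending it keeps the payload easy to inspect and test without network access.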