Guidance for integration and responsible use of Speech-to-Text


This article is provided for informational purposes only and not for the purpose of providing legal advice. We strongly recommend seeking specialist legal advice when implementing Speech Services.

Integration and responsible use

As Microsoft works to help customers responsibly develop and deploy solutions using Speech-to-Text, we are taking a principled approach to upholding personal agency and dignity by considering the AI systems' fairness, reliability & safety, privacy & security, inclusiveness, transparency, and human accountability. These considerations reflect our commitment to developing Responsible AI.

When getting ready to deploy AI-powered products or features, the following activities help to set you up for success:

  • Understand what it can do: Fully assess any AI system you are using to understand its capabilities and limitations. Understand how it will perform in your particular scenario and context by thoroughly testing it with real-life conditions and data.

  • Respect an individual's right to privacy: Only collect data and information from individuals for lawful and justifiable purposes. Only use data and information that you have consent to use for this purpose.

  • Legal review: Obtain appropriate legal advice to review your solution, particularly if you will use it in sensitive or high-risk applications. Understand what restrictions you might need to work within and your responsibility to resolve any issues that might come up in the future. Do not provide any legal advice or guidance.

  • Human-in-the-loop: Keep a human in the loop, and treat human oversight as a consistent pattern to explore. This means ensuring ongoing human oversight of the AI-powered product or feature and maintaining the role of humans in decision making. Ensure that a human can intervene in the solution in real time to prevent harm. This lets you manage situations in which the AI model does not perform as required.

  • Security: Ensure your solution is secure and has adequate controls to preserve the integrity of your content and prevent unauthorized access.

  • Build trust with affected stakeholders: Communicate the expected benefits and potential risks to affected stakeholders. Help people understand why the data is needed and how the use of the data will lead to their benefit. Describe data handling in an understandable way.

  • Customer feedback loop: Provide a feedback channel that allows users and individuals to report issues with the service once it's been deployed. A deployed AI-powered product or feature requires ongoing monitoring and improvement, so be ready to implement any feedback and suggestions for improvement. Establish channels to collect questions and concerns from affected stakeholders (people who may be directly or indirectly impacted by the system, including employees, visitors, and the general public). Examples of such channels are:

    • Feedback features built into app experiences,
    • An easy-to-remember email address for feedback,
    • Anonymous feedback boxes placed in semi-private spaces, and
    • Knowledgeable representatives on site.

  • Feedback: Seek out feedback from a diverse sampling of the community during the development and evaluation process (for example, historically marginalized groups, people with disabilities, and service workers). See: Community Jury.

  • User study: Any consent or disclosure recommendations should be framed in a user study. Evaluate the first-use and continuous-use experience with a representative sample of the community to validate that the design choices lead to effective disclosure. Conduct user research with 10-20 community members (affected stakeholders) to evaluate their comprehension of the information and to determine whether their expectations are met.
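
As a minimal sketch of the human-in-the-loop activity above, the following Python example routes low-confidence transcripts to a person for review instead of acting on them automatically. It assumes your application has a per-utterance confidence score between 0.0 and 1.0 available. The Transcript type, the route function, and the 0.80 threshold are hypothetical names and values chosen for illustration; they are not part of the Speech service.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical cutoff: choose your own value by testing with real-life audio.
REVIEW_THRESHOLD = 0.80


@dataclass
class Transcript:
    text: str
    confidence: float  # assumed to be a score in the range 0.0 to 1.0


def needs_human_review(transcript: Transcript,
                       threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when a person should check the transcript before use."""
    return transcript.confidence < threshold


def route(transcripts: List[Transcript],
          threshold: float = REVIEW_THRESHOLD
          ) -> Tuple[List[Transcript], List[Transcript]]:
    """Split transcripts into (accepted, review_queue), preserving order."""
    accepted: List[Transcript] = []
    review_queue: List[Transcript] = []
    for transcript in transcripts:
        if needs_human_review(transcript, threshold):
            review_queue.append(transcript)  # a human decides what happens next
        else:
            accepted.append(transcript)
    return accepted, review_queue
```

A threshold is only one possible trigger. In sensitive or high-risk applications you might send every transcript to the review queue, whatever its score.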

Recommendations for preserving privacy

A successful privacy approach empowers individuals with information and provides controls and protection to preserve their privacy.

Consent to process and store audio input: Be sure to have all necessary permissions from your end users before using the Speech-to-Text enabled features in your applications or devices. Also ensure that you have permission for Microsoft to process this data as your third-party cloud service processor. Note that the real-time API does not separately store any of the audio input or transcription output data. However, you may design your application or device to retain end-user data, such as transcription text. You can also turn on local data logging via the Speech SDK (see Enabling logging in the Speech SDK).
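
One way to apply this recommendation in application code is to check consent before any audio is sent for transcription, and to retain the text only when the end user has separately agreed to storage. The sketch below is illustrative only: ConsentRecord, ConsentRequired, and transcribe_with_consent are hypothetical names, and the transcribe argument stands in for whatever function your application uses to call the service.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ConsentRecord:
    may_process_audio: bool     # end user agreed to have audio transcribed
    may_store_transcript: bool  # end user agreed to retention of the text


class ConsentRequired(Exception):
    """Raised when audio is submitted without permission to process it."""


def transcribe_with_consent(audio: bytes,
                            consent: ConsentRecord,
                            transcribe: Callable[[bytes], str],
                            store: List[str]) -> str:
    """Transcribe only with permission; retain text only with permission."""
    if not consent.may_process_audio:
        # Stop before any audio leaves the application.
        raise ConsentRequired("No permission to process this audio.")
    text = transcribe(audio)  # hypothetical stand-in for your service call
    if consent.may_store_transcript:
        store.append(text)    # retention is an application-level choice
    return text
```

Keeping the two permissions separate reflects the distinction above between processing audio and retaining end-user data such as transcription text.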
