Integrate with a client application using the Speech SDK

Article
02/17/2024

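Important

Custom Commands will be retired on April 30, 2026. Since October 30, 2023, you can't create new Custom Commands applications in Speech Studio. Related to this change, LUIS will be retired on October 1, 2025. Since April 1, 2023, you can't create new LUIS resources.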

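In this article, you learn how to make requests to a published Custom Commands application from the Speech SDK running in a UWP application. To establish a connection to the Custom Commands application, you need to:

Publish a Custom Commands application and get its application identifier (App ID)
Create a Universal Windows Platform (UWP) client app that uses the Speech SDK to talk to your Custom Commands application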

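Prerequisites

A Custom Commands application is required to complete this article. Try the quickstart to create a Custom Commands application:

Create a Custom Commands application

You also need:

Visual Studio 2019 or later. This guide is based on Visual Studio 2019.
An Azure AI Speech resource key and region: create a Speech resource in the Azure portal. For more information, see Create a multi-service resource.
Enable your device for development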

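Step 1: Publish the Custom Commands application

Open the Custom Commands application you created previously.
Go to [Settings] and select [LUIS resource].
If no prediction resource is assigned, select a query prediction key or create a new one.

A query prediction key is always required before you publish an application. For more information about LUIS resources, see Create LUIS Resource.
Go back to editing commands and select [Publish].
Copy the App ID from the "Publish" notification for later use.
Copy the Speech resource key for later use.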

Step 2: Create a Visual Studio project

Create a Visual Studio project for UWP development and install the Speech SDK.

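Step 3: Add sample code

In this step, you add the XAML code that defines the application's user interface, and then add the C# code-behind implementation.

XAML code

Create the application's user interface by adding the XAML code.

In Solution Explorer, open MainPage.xaml.

In the designer's XAML view, replace the entire contents with the following code snippet: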

<Page
    x:Class="helloworld.MainPage"
    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:local="using:helloworld"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    mc:Ignorable="d"
    Background="{ThemeResource ApplicationPageBackgroundThemeBrush}">

    <Grid>
        <StackPanel Orientation="Vertical" HorizontalAlignment="Center"
                    Margin="20,50,0,0" VerticalAlignment="Center" Width="800">
            <Button x:Name="EnableMicrophoneButton" Content="Enable Microphone"
                    Margin="0,10,10,0" Click="EnableMicrophone_ButtonClicked"
                    Height="35"/>
            <Button x:Name="ListenButton" Content="Talk"
                    Margin="0,10,10,0" Click="ListenButton_ButtonClicked"
                    Height="35"/>
            <StackPanel x:Name="StatusPanel" Orientation="Vertical"
                        RelativePanel.AlignBottomWithPanel="True"
                        RelativePanel.AlignRightWithPanel="True"
                        RelativePanel.AlignLeftWithPanel="True">
                <TextBlock x:Name="StatusLabel" Margin="0,10,10,0"
                           TextWrapping="Wrap" Text="Status:" FontSize="20"/>
                <Border x:Name="StatusBorder" Margin="0,0,0,0">
                    <ScrollViewer VerticalScrollMode="Auto"
                                  VerticalScrollBarVisibility="Auto" MaxHeight="200">
                        <!-- Use LiveSetting to enable screen readers to announce
                             the status update. -->
                        <TextBlock
                            x:Name="StatusBlock" FontWeight="Bold"
                            AutomationProperties.LiveSetting="Assertive"
                            MaxWidth="{Binding ElementName=Splitter, Path=ActualWidth}"
                            Margin="10,10,10,20" TextWrapping="Wrap"  />
                    </ScrollViewer>
                </Border>
            </StackPanel>
        </StackPanel>
        <MediaElement x:Name="mediaElement"/>
    </Grid>
</Page>

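The [Design] view updates to show the application's user interface.

C# code-behind source

Add the code-behind source so that the application works as expected. The code-behind source includes:

The required using statements for the Speech and Speech.Dialog namespaces.
A simple implementation that ensures microphone access, wired to a button handler.
Basic UI helpers that present messages and errors in the application.
A landing point for the initialization code path.
A helper that plays back text to speech (without streaming support).
An empty button handler that starts listening.

Add the code-behind source as follows:

In Solution Explorer, open the code-behind source file MainPage.xaml.cs (grouped under MainPage.xaml).

Replace the contents of the file with the following code: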

using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Dialog;
using System;
using System.IO;
using System.Text;
using Windows.UI.Xaml;
using Windows.UI.Xaml.Controls;
using Windows.UI.Xaml.Media;

namespace helloworld
{
    public sealed partial class MainPage : Page
    {
        private DialogServiceConnector connector;

        private enum NotifyType
        {
            StatusMessage,
            ErrorMessage
        };

        public MainPage()
        {
            this.InitializeComponent();
        }

        private async void EnableMicrophone_ButtonClicked(
            object sender, RoutedEventArgs e)
        {
            bool isMicAvailable = true;
            try
            {
                var mediaCapture = new Windows.Media.Capture.MediaCapture();
                var settings =
                    new Windows.Media.Capture.MediaCaptureInitializationSettings();
                settings.StreamingCaptureMode =
                    Windows.Media.Capture.StreamingCaptureMode.Audio;
                await mediaCapture.InitializeAsync(settings);
            }
            catch (Exception)
            {
                isMicAvailable = false;
            }
            if (!isMicAvailable)
            {
                await Windows.System.Launcher.LaunchUriAsync(
                    new Uri("ms-settings:privacy-microphone"));
            }
            else
            {
                NotifyUser("Microphone was enabled", NotifyType.StatusMessage);
            }
        }

        private void NotifyUser(
            string strMessage, NotifyType type = NotifyType.StatusMessage)
        {
            // If called from the UI thread, then update immediately.
            // Otherwise, schedule a task on the UI thread to perform the update.
            if (Dispatcher.HasThreadAccess)
            {
                UpdateStatus(strMessage, type);
            }
            else
            {
                var task = Dispatcher.RunAsync(
                    Windows.UI.Core.CoreDispatcherPriority.Normal,
                    () => UpdateStatus(strMessage, type));
            }
        }

        private void UpdateStatus(string strMessage, NotifyType type)
        {
            switch (type)
            {
                case NotifyType.StatusMessage:
                    StatusBorder.Background = new SolidColorBrush(
                        Windows.UI.Colors.Green);
                    break;
                case NotifyType.ErrorMessage:
                    StatusBorder.Background = new SolidColorBrush(
                        Windows.UI.Colors.Red);
                    break;
            }
            StatusBlock.Text += string.IsNullOrEmpty(StatusBlock.Text)
                ? strMessage : "\n" + strMessage;

            if (!string.IsNullOrEmpty(StatusBlock.Text))
            {
                StatusBorder.Visibility = Visibility.Visible;
                StatusPanel.Visibility = Visibility.Visible;
            }
            else
            {
                StatusBorder.Visibility = Visibility.Collapsed;
                StatusPanel.Visibility = Visibility.Collapsed;
            }
            // Raise an event if necessary to enable a screen reader
            // to announce the status update.
            var peer = Windows.UI.Xaml.Automation.Peers.FrameworkElementAutomationPeer.FromElement(StatusBlock);
            if (peer != null)
            {
                peer.RaiseAutomationEvent(
                    Windows.UI.Xaml.Automation.Peers.AutomationEvents.LiveRegionChanged);
            }
        }

        // Waits for and accumulates all audio associated with a given
        // PullAudioOutputStream and then plays it to the MediaElement. Long spoken
        // audio will create extra latency and a streaming playback solution
        // (that plays audio while it continues to be received) should be used --
        // see the samples for examples of this.
        private void SynchronouslyPlayActivityAudio(
            PullAudioOutputStream activityAudio)
        {
            var playbackStreamWithHeader = new MemoryStream();
            playbackStreamWithHeader.Write(Encoding.ASCII.GetBytes("RIFF"), 0, 4); // ChunkID
            playbackStreamWithHeader.Write(BitConverter.GetBytes(UInt32.MaxValue), 0, 4); // ChunkSize: max
            playbackStreamWithHeader.Write(Encoding.ASCII.GetBytes("WAVE"), 0, 4); // Format
            playbackStreamWithHeader.Write(Encoding.ASCII.GetBytes("fmt "), 0, 4); // Subchunk1ID
            playbackStreamWithHeader.Write(BitConverter.GetBytes(16), 0, 4); // Subchunk1Size: PCM
            playbackStreamWithHeader.Write(BitConverter.GetBytes(1), 0, 2); // AudioFormat: PCM
            playbackStreamWithHeader.Write(BitConverter.GetBytes(1), 0, 2); // NumChannels: mono
            playbackStreamWithHeader.Write(BitConverter.GetBytes(16000), 0, 4); // SampleRate: 16kHz
            playbackStreamWithHeader.Write(BitConverter.GetBytes(32000), 0, 4); // ByteRate
            playbackStreamWithHeader.Write(BitConverter.GetBytes(2), 0, 2); // BlockAlign
            playbackStreamWithHeader.Write(BitConverter.GetBytes(16), 0, 2); // BitsPerSample: 16-bit
            playbackStreamWithHeader.Write(Encoding.ASCII.GetBytes("data"), 0, 4); // Subchunk2ID
            playbackStreamWithHeader.Write(BitConverter.GetBytes(UInt32.MaxValue), 0, 4); // Subchunk2Size

            byte[] pullBuffer = new byte[2056];

            uint lastRead = 0;
            do
            {
                lastRead = activityAudio.Read(pullBuffer);
                playbackStreamWithHeader.Write(pullBuffer, 0, (int)lastRead);
            }
            while (lastRead == pullBuffer.Length);

            var task = Dispatcher.RunAsync(
                Windows.UI.Core.CoreDispatcherPriority.Normal, () =>
            {
                mediaElement.SetSource(
                    playbackStreamWithHeader.AsRandomAccessStream(), "audio/wav");
                mediaElement.Play();
            });
        }

        private void InitializeDialogServiceConnector()
        {
            // New code will go here
        }

        private async void ListenButton_ButtonClicked(
            object sender, RoutedEventArgs e)
        {
            // New code will go here
        }
    }
}
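
The 44-byte header that SynchronouslyPlayActivityAudio writes field by field follows the standard RIFF/WAVE layout, in which BlockAlign and ByteRate are derived from the channel count, sample rate, and bit depth. The following is a minimal sketch of that relationship. The WavHeader class and its Build method are hypothetical names used only for illustration; they are not part of the Speech SDK or of the sample above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper for illustration; not part of the Speech SDK or the sample.
public static class WavHeader
{
    // Builds a 44-byte PCM WAV header. Both size fields are set to the maximum
    // value, as in the sample, because the audio length is not known up front.
    public static byte[] Build(int sampleRate = 16000, short channels = 1, short bitsPerSample = 16)
    {
        short blockAlign = (short)(channels * bitsPerSample / 8); // bytes per sample frame
        int byteRate = sampleRate * blockAlign;                   // bytes per second

        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream, Encoding.ASCII))
        {
            writer.Write(Encoding.ASCII.GetBytes("RIFF")); // ChunkID
            writer.Write(uint.MaxValue);                   // ChunkSize: unknown, use max
            writer.Write(Encoding.ASCII.GetBytes("WAVE")); // Format
            writer.Write(Encoding.ASCII.GetBytes("fmt ")); // Subchunk1ID
            writer.Write(16);                              // Subchunk1Size for PCM
            writer.Write((short)1);                        // AudioFormat: PCM
            writer.Write(channels);                        // NumChannels
            writer.Write(sampleRate);                      // SampleRate
            writer.Write(byteRate);                        // ByteRate
            writer.Write(blockAlign);                      // BlockAlign
            writer.Write(bitsPerSample);                   // BitsPerSample
            writer.Write(Encoding.ASCII.GetBytes("data")); // Subchunk2ID
            writer.Write(uint.MaxValue);                   // Subchunk2Size: unknown, use max
            writer.Flush();
            return stream.ToArray();
        }
    }
}
```

With the default arguments (16 kHz, mono, 16-bit), the derived values match the constants hard-coded in the sample: a BlockAlign of 2 and a ByteRate of 32000.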

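Note

If you see the error "The type 'Object' is defined in an assembly that is not referenced":

Right-click your solution.
Choose [Manage NuGet Packages for Solution], then select [Updates].
If Microsoft.NETCore.UniversalWindowsPlatform appears in the update list, update it to the latest version.

Add the following code to the method body of InitializeDialogServiceConnector: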

// This code creates the `DialogServiceConnector` with your resource information.
// create a DialogServiceConfig by providing a Custom Commands application id and Speech resource key
// The RecoLanguage property is optional (default en-US); note that only en-US is supported in Preview
const string speechCommandsApplicationId = "YourApplicationId"; // Your application id
const string speechSubscriptionKey = "YourSpeechSubscriptionKey"; // Your Speech resource key
const string region = "YourServiceRegion"; // The Speech resource region. 

var speechCommandsConfig = CustomCommandsConfig.FromSubscription(speechCommandsApplicationId, speechSubscriptionKey, region);
speechCommandsConfig.SetProperty(PropertyId.SpeechServiceConnection_RecoLanguage, "en-us");
connector = new DialogServiceConnector(speechCommandsConfig);

Replace the strings YourApplicationId, YourSpeechSubscriptionKey, and YourServiceRegion with your own values for your application ID, Speech resource key, and region.
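
A placeholder that is left unreplaced would otherwise only show up at run time, when the connection fails. One option is a small guard that runs before the configuration is created. The following is a minimal sketch; the ConfigGuard class and its EnsureReplaced method are hypothetical names introduced for illustration and are not part of the Speech SDK.

```csharp
using System;

// Hypothetical guard for illustration; not part of the Speech SDK.
public static class ConfigGuard
{
    // Throws if a setting is empty or still holds one of this article's
    // placeholder strings.
    public static void EnsureReplaced(string name, string value)
    {
        if (string.IsNullOrWhiteSpace(value) ||
            value == "YourApplicationId" ||
            value == "YourSpeechSubscriptionKey" ||
            value == "YourServiceRegion")
        {
            throw new InvalidOperationException(
                $"Replace the placeholder value for '{name}' before connecting.");
        }
    }
}
```

Such a guard could be called once per setting at the top of InitializeDialogServiceConnector, for example ConfigGuard.EnsureReplaced(nameof(region), region).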

Append the following code snippet to the end of the method body of InitializeDialogServiceConnector:

//
// This code sets up handlers for events relied on by `DialogServiceConnector` to communicate its activities,
// speech recognition results, and other information.
//
// ActivityReceived is the main way your client will receive messages, audio, and events
connector.ActivityReceived += (sender, activityReceivedEventArgs) =>
{
    NotifyUser(
        $"Activity received, hasAudio={activityReceivedEventArgs.HasAudio} activity={activityReceivedEventArgs.Activity}");

    if (activityReceivedEventArgs.HasAudio)
    {
        SynchronouslyPlayActivityAudio(activityReceivedEventArgs.Audio);
    }
};

// Canceled will be signaled when a turn is aborted or experiences an error condition
connector.Canceled += (sender, canceledEventArgs) =>
{
    NotifyUser($"Canceled, reason={canceledEventArgs.Reason}");
    if (canceledEventArgs.Reason == CancellationReason.Error)
    {
        NotifyUser(
            $"Error: code={canceledEventArgs.ErrorCode}, details={canceledEventArgs.ErrorDetails}");
    }
};

// Recognizing (not 'Recognized') will provide the intermediate recognized text
// while an audio stream is being processed
connector.Recognizing += (sender, recognitionEventArgs) =>
{
    NotifyUser($"Recognizing! in-progress text={recognitionEventArgs.Result.Text}");
};

// Recognized (not 'Recognizing') will provide the final recognized text
// once audio capture is completed
connector.Recognized += (sender, recognitionEventArgs) =>
{
    NotifyUser($"Final speech to text result: '{recognitionEventArgs.Result.Text}'");
};

// SessionStarted will notify when audio begins flowing to the service for a turn
connector.SessionStarted += (sender, sessionEventArgs) =>
{
    NotifyUser($"Now Listening! Session started, id={sessionEventArgs.SessionId}");
};

// SessionStopped will notify when a turn is complete and
// it's safe to begin listening again
connector.SessionStopped += (sender, sessionEventArgs) =>
{
    NotifyUser($"Listening complete. Session ended, id={sessionEventArgs.SessionId}");
};

Add the following code snippet to the body of the ListenButton_ButtonClicked method in the MainPage class:

// This code sets up `DialogServiceConnector` to listen, since you already established the configuration and
// registered the event handlers.
if (connector == null)
{
    InitializeDialogServiceConnector();
    // Optional step to speed up first interaction: if not called,
    // connection happens automatically on first use
    var connectTask = connector.ConnectAsync();
}

try
{
    // Start sending audio
    await connector.ListenOnceAsync();
}
catch (Exception ex)
{
    NotifyUser($"Exception: {ex.ToString()}", NotifyType.ErrorMessage);
}
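
The handler above starts ConnectAsync without awaiting it, so an exception raised while connecting is not caught by the try block around ListenOnceAsync. The following is a minimal sketch of one way to surface such a failure without blocking the handler. It assumes ConnectAsync returns a Task; the TaskFaultExtensions class and its ReportFaults method are hypothetical names and are not part of the Speech SDK.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the Speech SDK.
public static class TaskFaultExtensions
{
    // Awaits the task and passes the exception message to the callback if the
    // task fails. The callback may run on a thread-pool thread.
    public static async Task ReportFaults(this Task task, Action<string> report)
    {
        try
        {
            await task.ConfigureAwait(false);
        }
        catch (Exception ex)
        {
            report(ex.Message);
        }
    }
}
```

In the handler, this could be used as connector.ConnectAsync().ReportFaults(message => NotifyUser(message, NotifyType.ErrorMessage)). Because NotifyUser already dispatches to the UI thread when needed, the callback is safe to run off the UI thread.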

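From the menu bar, choose [File > Save All] to save your changes.

Try it out

From the menu bar, choose [Build > Build Solution] to build the application. The code should compile without errors.
Choose [Debug > Start Debugging] (or press F5) to start the application. The helloworld window appears.
Select [Enable Microphone]. If an access permission request pops up, select [Yes].
Select [Talk], and speak an English phrase or sentence into your device's microphone. Your speech is transmitted to the Direct Line Speech channel and transcribed to text, which appears in the window.

Next steps

How to: Send activity to client application (Preview)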
