Train a model for Custom Speech

Training a speech-to-text model can improve recognition accuracy over Microsoft's baseline model. A model is trained using human-labeled transcriptions and related text. These datasets, along with previously uploaded audio data, are used to refine and train the speech-to-text model.

Use training to resolve accuracy issues

If you're encountering recognition issues with your model, additional training with human-labeled transcripts and related data can help improve accuracy. Use this table to determine which dataset to use to address your issue:

| Use case | Data type |
|----------|-----------|
| Improve recognition accuracy on industry-specific vocabulary and grammar, such as medical terminology or IT jargon. | Related text (sentences/utterances) |
| Define the phonetic and displayed form of a word or term that has nonstandard pronunciation, such as product names or acronyms. | Related text (pronunciation) |
| Improve recognition accuracy on speaking styles, accents, or specific background noises. | Audio + human-labeled transcripts |

Important

If you haven't uploaded a dataset, see Prepare and test your data. That document provides instructions for uploading data and guidelines for creating high-quality datasets.

Train and evaluate a model

The first step in training a model is to upload training data. See Prepare and test your data for step-by-step instructions on preparing human-labeled transcriptions and related text (utterances and pronunciations). After you've uploaded training data, follow these instructions to start training your model:

  1. Sign in to the Custom Speech portal.
  2. Navigate to Speech-to-text > Custom Speech > [name of project] > Training.
  3. Click Train model.
  4. Next, give your training a Name and Description.
  5. From the Scenario and Baseline model drop-down menu, select the scenario that best fits your domain. If you're unsure of which scenario to choose, select General. The baseline model is the starting point for training. The latest model is usually the best choice.
  6. From the Select training data page, choose one or more audio + human-labeled transcription datasets that you'd like to use for training.
  7. Once the training is complete, you can choose to perform accuracy testing on the newly trained model. This step is optional.
  8. Select Create to build your custom model.
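The choices made in the steps above can be written down as a small record, which is handy for reviewing a training configuration before entering it in the portal. This is only a minimal sketch: `TrainingRequest`, its field names, and the validation rules (for example, treating the name as required) are hypothetical and do not correspond to any portal or service API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrainingRequest:
    """Hypothetical record of the choices made on the Train model form."""
    name: str
    description: str = ""
    scenario: str = "General"        # the fallback when unsure which scenario fits
    baseline_model: str = "latest"   # placeholder label; the latest model is usually best
    dataset_ids: List[str] = field(default_factory=list)
    run_accuracy_test: bool = False  # accuracy testing is optional

    def problems(self) -> List[str]:
        """Return human-readable issues; an empty list means the form looks complete."""
        issues = []
        if not self.name.strip():
            # Assumption: a non-empty name is required.
            issues.append("name is required")
        if not self.dataset_ids:
            issues.append("select at least one audio + human-labeled transcription dataset")
        if len(set(self.dataset_ids)) != len(self.dataset_ids):
            issues.append("dataset_ids contains duplicates")
        return issues


request = TrainingRequest(
    name="call-center-v1",
    description="Adapted to call-center audio",
    dataset_ids=["dataset-001"],  # hypothetical dataset identifier
)
print(request.problems())
```

An empty list from `problems()` means every field the steps ask for has been filled in; it says nothing about whether the training itself will succeed.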

The Training table displays a new entry that corresponds to this newly created model. The table also displays the status: Processing, Succeeded, or Failed.
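If the status is checked from a script rather than by watching the table, the three states suggest a simple polling loop: keep waiting while the status is Processing and stop on Succeeded or Failed. The sketch below assumes a caller-supplied `get_status` function, which is hypothetical; how the status is actually retrieved is outside the scope of this article.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"Succeeded", "Failed"}


def wait_for_training(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until the training entry leaves the Processing state.

    get_status is a hypothetical callable that returns one of
    "Processing", "Succeeded", or "Failed".
    """
    for attempt in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if status != "Processing":
            raise ValueError(f"unexpected status: {status!r}")
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError("training is still processing after the maximum number of polls")


# Usage with a fake status source standing in for a real lookup.
fake_statuses = iter(["Processing", "Processing", "Succeeded"])
print(wait_for_training(lambda: next(fake_statuses), sleep=lambda seconds: None))
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays.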

Evaluate the accuracy of a trained model

You can inspect the data and evaluate model accuracy using these documents:

If you chose to test accuracy, it's important to select an acoustic dataset that's different from the one you used to train your model to get a realistic sense of the model's performance.
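As an illustration of both points, the sketch below checks that the test datasets do not overlap the training datasets and computes word error rate (WER), one common way to quantify recognition accuracy. This article does not state which metric the accuracy test reports, so WER is an assumption here, and the function names and dataset identifiers are hypothetical.

```python
from typing import Iterable, Set


def overlapping_datasets(training_ids: Iterable[str], test_ids: Iterable[str]) -> Set[str]:
    """Return dataset IDs used for both training and testing (ideally empty)."""
    return set(training_ids) & set(test_ids)


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from the hypothesis
            insertion = curr[j - 1] + 1   # extra word in the hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


print(overlapping_datasets(["train-a", "train-b"], ["test-a"]))
print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
```

A lower WER on a held-out dataset suggests better recognition; a low WER on data the model was trained on says little about how it will perform on new audio.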
