Tutorial: Retrain a TensorFlow image classifier with transfer learning and ML.NET

Learn how to retrain an image classification TensorFlow model with transfer learning and ML.NET. The original model was trained to classify individual images. After retraining, the new model organizes the images into broad categories.

Training an image classification model from scratch requires setting millions of parameters, a large amount of labeled training data, and a vast amount of compute resources (hundreds of GPU hours). While not as effective as training a custom model from scratch, transfer learning lets you shortcut this process by working with thousands of images instead of millions of labeled images, and build a customized model fairly quickly (within an hour on a machine without a GPU).

In this tutorial, you learn how to:

  • Understand the problem
  • Reuse and tune the pre-trained model
  • Classify images

What is transfer learning?

What if you could reuse a model that's already been pre-trained to solve a similar problem, and retrain either all or some of the layers of that model to make it solve your problem? This technique of reusing part of an already trained model to build a new model is known as transfer learning.

Image classification sample overview

The sample is a console application that uses ML.NET to build an image classifier by reusing a pre-trained model to classify images with a small amount of training data.

You can find the source code for this tutorial at the dotnet/samples repository. Note that by default, the .NET project configuration for this tutorial targets .NET Core 2.2.

Prerequisites

Select the appropriate machine learning task

Deep learning is a subset of machine learning that is revolutionizing areas like computer vision and speech recognition.

Deep learning models are trained by using large sets of labeled data and neural networks that contain multiple learning layers. Deep learning:

  • Performs better on some tasks like computer vision.

  • Performs well on huge amounts of data.

Image classification is a common machine learning task that lets you automatically classify images into multiple categories, such as:

  • Detecting whether or not an image contains a human face.
  • Detecting cats vs. dogs.

Or, as in the following images, determining whether an image is a food, toy, or appliance:

pizza image, teddy bear image, toaster image

Note

The preceding images belong to Wikimedia Commons and are attributed as follows:

Transfer learning includes a few strategies, such as retrain all layers and penultimate layer. This tutorial explains and shows how to use the penultimate layer strategy. The penultimate layer strategy reuses a model that's already been pre-trained to solve a specific problem. The strategy then retrains the final layer of that model to make it solve a new problem. Reusing the pre-trained model as part of your new model saves significant time and resources.

Your image classification model reuses the Inception model, a popular image recognition model trained on the ImageNet dataset, where the TensorFlow model tries to classify entire images into a thousand classes, like "Umbrella", "Jersey", and "Dishwasher".

The Inception v1 model can be classified as a deep convolutional neural network and can achieve reasonable performance on hard visual recognition tasks, matching or exceeding human performance in some domains. The model/algorithm was developed by multiple researchers and based on the original paper: "Rethinking the Inception Architecture for Computer Vision" by Szegedy, et al.

Because the Inception model has already been pre-trained on thousands of different images, it contains the image features needed for image identification. The lower image feature layers recognize simple features (such as edges) and the higher layers recognize more complex features (such as shapes). The final layer is trained against a much smaller set of data because you're starting with a pre-trained model that already understands how to classify images. Because your model lets you classify images into more than two categories, this is an example of a multi-class classifier.

TensorFlow is a popular deep learning and machine learning toolkit that enables training deep neural networks (and general numeric computations), and is implemented as a transformer in ML.NET. In this tutorial, it's used to reuse the Inception model.

As shown in the following diagram, you add a reference to the ML.NET NuGet packages in your .NET Core or .NET Framework applications. Under the covers, ML.NET includes and references the native TensorFlow library, which lets you write code that loads an existing trained TensorFlow model file for scoring.

TensorFlow transform ML.NET architecture diagram

The Inception model is trained to classify images into a thousand categories, but you need to classify images into a smaller category set, and only those categories. Enter the transfer part of transfer learning. You can transfer the Inception model's ability to recognize and classify images to the new limited categories of your custom image classifier.

You're going to retrain the final layer of that model using a set of three categories:

  • Food
  • Toy
  • Appliance

Your layer uses a multinomial logistic regression algorithm to find the correct category as quickly as possible. This algorithm classifies using probabilities to determine the answer, giving a value of one to the correct category and a value of zero to the others.

Dataset

There are two data sources: the .tsv file and the image files. The tags.tsv file contains two columns: the first one is defined as ImagePath and the second one is the Label corresponding to the image. The following example file doesn't have a header row, and looks like this:

broccoli.jpg    food
pizza.jpg   food
pizza2.jpg  food
teddy2.jpg  toy
teddy3.jpg  toy
teddy4.jpg  toy
toaster.jpg appliance
toaster2.png    appliance

The training and testing images are located in the assets folders that you'll download in a zip file. These images belong to Wikimedia Commons.

Wikimedia Commons, the free media repository. Retrieved 10:48, October 17, 2018 from:
https://commons.wikimedia.org/wiki/Pizza
https://commons.wikimedia.org/wiki/Toaster
https://commons.wikimedia.org/wiki/Teddy_bear

Create a console application

Create a project

  1. Create a .NET Core Console Application called "TransferLearningTF".

  2. Install the Microsoft.ML NuGet Package:

    In Solution Explorer, right-click your project and select Manage NuGet Packages. Choose "nuget.org" as the Package source, select the Browse tab, and search for Microsoft.ML. Select the Version drop-down, select the 1.0.0 package in the list, and select the Install button. Select the OK button on the Preview Changes dialog, and then select the I Accept button on the License Acceptance dialog if you agree with the license terms for the packages listed. Repeat these steps for Microsoft.ML.ImageAnalytics v1.0.0 and Microsoft.ML.TensorFlow v0.12.0.

Prepare your data

  1. Download the project assets directory zip file, and unzip it.

  2. Copy the assets directory into your TransferLearningTF project directory. This directory and its subdirectories contain the data and support files needed for this tutorial (except for the Inception model, which you'll download and add in the next step).

  3. Download the Inception model, and unzip it.

  4. Copy the contents of the inception5h directory you just unzipped into your TransferLearningTF project's assets\inputs-train\inception directory. This directory contains the model and additional support files needed for this tutorial, as shown in the following image:

    Inception directory contents

  5. In Solution Explorer, right-click each of the files in the assets directory and its subdirectories and select Properties. Under Advanced, change the value of Copy to Output Directory to Copy if newer.

Create classes and define paths

Add the following additional using statements to the top of the Program.cs file:

using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Data.IO;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms.Image;

Create global fields to hold the paths to the various assets, and global variables for LabelTokey, ImageReal, and PredictedLabelValue:

  • _assetsPath has the path to the assets.
  • _trainTagsTsv has the path to the training image data tags tsv file.
  • _predictImageListTsv has the path to the prediction image data tags tsv file.
  • _trainImagesFolder has the path to the images used to train the model.
  • _predictImagesFolder has the path to the images to be classified by the trained model.
  • _predictSingleImage has the path to the single image to be classified by the trained model.
  • _inceptionPb has the path to the pre-trained Inception model to be reused to retrain your model.
  • _inputImageClassifierZip has the path where the trained model is loaded from.
  • _outputImageClassifierZip has the path where the trained model is saved.
  • LabelTokey is the Label value mapped to a key.
  • ImageReal is the column containing the predicted image value.
  • PredictedLabelValue is the column containing the predicted label value.

Add the following code to the line right above the Main method to specify those paths and the other variables:

static readonly string _assetsPath = Path.Combine(Environment.CurrentDirectory, "assets");
static readonly string _trainTagsTsv = Path.Combine(_assetsPath, "inputs-train", "data", "tags.tsv");
static readonly string _predictImageListTsv = Path.Combine(_assetsPath, "inputs-predict", "data", "image_list.tsv");
static readonly string _trainImagesFolder = Path.Combine(_assetsPath, "inputs-train", "data");
static readonly string _predictImagesFolder = Path.Combine(_assetsPath, "inputs-predict", "data");
static readonly string _predictSingleImage = Path.Combine(_assetsPath, "inputs-predict-single", "data", "toaster3.jpg");
static readonly string _inceptionPb = Path.Combine(_assetsPath, "inputs-train", "inception", "tensorflow_inception_graph.pb");
static readonly string _inputImageClassifierZip = Path.Combine(_assetsPath, "inputs-predict", "imageClassifier.zip");
static readonly string _outputImageClassifierZip = Path.Combine(_assetsPath, "outputs", "imageClassifier.zip");
private static string LabelTokey = nameof(LabelTokey);
private static string PredictedLabelValue = nameof(PredictedLabelValue);

Create some classes for your input data and predictions. Add a new class to your project:

  1. In Solution Explorer, right-click the project, and then select Add > New Item.

  2. In the Add New Item dialog box, select Class and change the Name field to ImageData.cs. Then, select the Add button.

    The ImageData.cs file opens in the code editor. Add the following using statement to the top of ImageData.cs:

using Microsoft.ML.Data;

Remove the existing class definition and add the following code for the ImageData class to the ImageData.cs file:

public class ImageData
{
    [LoadColumn(0)]
    public string ImagePath;

    [LoadColumn(1)]
    public string Label;
}

ImageData is the input image data class and has the following String fields:

  • ImagePath contains the image file name.
  • Label contains a value for the image label.

Add a new class to your project for ImagePrediction:

  1. In Solution Explorer, right-click the project, and then select Add > New Item.

  2. In the Add New Item dialog box, select Class and change the Name field to ImagePrediction.cs. Then, select the Add button.

    The ImagePrediction.cs file opens in the code editor. Remove both the System.Collections.Generic and the System.Text using statements at the top of ImagePrediction.cs.

Remove the existing class definition and add the following code, which has the ImagePrediction class, to the ImagePrediction.cs file:

public class ImagePrediction : ImageData
{
    public float[] Score;

    public string PredictedLabelValue;
}

ImagePrediction is the image prediction class and has the following fields:

  • Score contains the confidence percentage for a given image classification.
  • PredictedLabelValue contains a value for the predicted image classification label.

ImagePrediction is the class used for prediction after the model has been trained. It has a string (ImagePath) for the image path. The Label is used to reuse and retrain the model. The PredictedLabelValue is used during prediction and evaluation. For evaluation, an input with training data, the predicted values, and the model are used.

The MLContext class is a starting point for all ML.NET operations, and initializing mlContext creates a new ML.NET environment that can be shared across the model creation workflow objects. It's similar, conceptually, to DbContext in Entity Framework.

Initialize variables in Main

Initialize the mlContext variable with a new instance of MLContext. Replace the Console.WriteLine("Hello World!") line with the following code in the Main method:

MLContext mlContext = new MLContext(seed: 1);

Create a struct for default parameters

The Inception model has several default parameters you need to pass in. Create a struct to map the default parameter values to friendly names with the following code, just after the Main() method:

private struct InceptionSettings
{
    public const int ImageHeight = 224;
    public const int ImageWidth = 224;
    public const float Mean = 117;
    public const float Scale = 1;
    public const bool ChannelsLast = true;
}

Create a display utility method

Since you'll display the image data and the related predictions more than once, create a display utility method to handle displaying the image and prediction results.

The DisplayResults() method executes the following tasks:

  • Displays the predicted results.

Create the DisplayResults() method, just after the InceptionSettings struct, using the following code:

private static void DisplayResults(IEnumerable<ImagePrediction> imagePredictionData)
{

}

The Transform() method populates ImagePath in ImagePrediction along with the predicted fields. As the ML.NET process progresses, each component adds columns, which makes it easy to display the results. Add the following code to the DisplayResults() method:

foreach (ImagePrediction prediction in imagePredictionData)
{
    Console.WriteLine($"Image: {Path.GetFileName(prediction.ImagePath)} predicted as: {prediction.PredictedLabelValue} with score: {prediction.Score.Max()} ");
}
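The display loop reports `Score.Max()`, the largest entry of the score vector. As a minimal standalone sketch (C# top-level statements, with a hypothetical score vector that is not taken from the tutorial's output), the same idea can be extended to find which position in the vector holds that largest value:

```csharp
using System;
using System.Linq;

// Hypothetical score vector with one entry per category (illustrative values only).
float[] score = { 0.001f, 0.95f, 0.049f };

// The largest score is what the display loop prints as the prediction's score.
float best = score.Max();

// The position of that score identifies which category it belongs to.
int bestIndex = Array.IndexOf(score, best);

Console.WriteLine($"Highest score {best} at index {bestIndex}");
```

In the tutorial itself you don't need the index, because the PredictedLabelValue column already carries the label name.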

You'll call the DisplayResults() method in the two image classification methods.

Create a .tsv file utility method

The ReadFromTsv() method executes the following tasks:

  • Reads the image data tags.tsv file.
  • Adds the file path to the image file name.
  • Loads the file data into an IEnumerable<ImageData> object.

Create the ReadFromTsv() method, just after the DisplayResults() method, using the following code:

public static IEnumerable<ImageData> ReadFromTsv(string file, string folder)
{

}

The following code parses the .tsv file, adds the folder path to the image file name for the ImagePath property, and loads the result into an ImageData object. As written, it sets only ImagePath and leaves Label unset. Add it as the body of the ReadFromTsv() method. You need the fully qualified file path to display the prediction results.

return File.ReadAllLines(file)
 .Select(line => line.Split('\t'))
 .Select(line => new ImageData()
 {
     ImagePath = Path.Combine(folder, line[0])
 });
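To see the parsing step in isolation, here is a minimal standalone sketch (C# top-level statements) that applies the same split-and-combine approach to a few in-memory rows. The sample rows and the folder value are hypothetical, and unlike ReadFromTsv() this sketch also keeps the second column as a label, using a tuple instead of the ImageData class:

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical sample rows in the same tab-separated layout as tags.tsv.
string[] lines = { "broccoli.jpg\tfood", "teddy2.jpg\ttoy", "toaster.jpg\tappliance" };

// Hypothetical folder; in the tutorial this would be one of the image folder paths.
string folder = Path.Combine("assets", "inputs-train", "data");

// Split each row on the tab character, then prefix the file name with the folder.
var rows = lines
    .Select(line => line.Split('\t'))
    .Select(cols => (ImagePath: Path.Combine(folder, cols[0]), Label: cols[1]))
    .ToList();

foreach (var row in rows)
{
    Console.WriteLine($"{row.ImagePath} -> {row.Label}");
}
```

Path.Combine is used rather than string concatenation so that the directory separator matches the operating system the code runs on.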

There are three major concepts in ML.NET: Data, Transformers, and Estimators.

Reuse and tune the pre-trained model

Add the following call to the ReuseAndTuneInceptionModel() method as the next line of code in the Main() method:

var model = ReuseAndTuneInceptionModel(mlContext, _trainTagsTsv, _trainImagesFolder, _inceptionPb, _outputImageClassifierZip);

The ReuseAndTuneInceptionModel() method executes the following tasks:

  • Loads the data.
  • Extracts and transforms the data.
  • Scores the TensorFlow model.
  • Tunes (retrains) the model.
  • Displays model results.
  • Evaluates the model.
  • Returns the model.

Create the ReuseAndTuneInceptionModel() method, just after the InceptionSettings struct and just before the DisplayResults() method, using the following code:

public static ITransformer ReuseAndTuneInceptionModel(MLContext mlContext, string dataLocation, string imagesFolder, string inputModelLocation, string outputModelLocation)
{

}

Load the data

Data in ML.NET is represented as an IDataView class. IDataView is a flexible, efficient way of describing tabular data (numeric and text). Data can be loaded into an IDataView object from a text file or in real time (for example, from a SQL database or log files).

Load the data using the MLContext.Data.LoadFromTextFile wrapper. Add the following code as the next line in the ReuseAndTuneInceptionModel() method:

var data = mlContext.Data.LoadFromTextFile<ImageData>(path: dataLocation, hasHeader: false);

Extract features and transform the data

Pre-processing and cleaning data are important tasks that occur before a dataset is used effectively for machine learning. Using data without these modeling tasks can produce misleading results.

Machine learning algorithms understand featurized data, and when dealing with deep neural networks you must adapt the images to the format expected by the network. That format is a numeric vector.

After training and evaluation, you predict with the Label column values. Because you're using a pre-trained model, map fields to the new model with the MapValueToKey() method. This method transforms the Label into a numeric key type (LabelTokey) column and adds it as a new dataset column. Name this estimator, because you'll also add the trainer to it. Add the next line of code:

var estimator = mlContext.Transforms.Conversion.MapValueToKey(outputColumnName: LabelTokey, inputColumnName: "Label")
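The idea behind mapping label values to numeric keys can be illustrated without ML.NET. The following standalone sketch (C# top-level statements) is not the library's implementation; it assumes that keys are handed out in order of first occurrence and start at 1, and it uses hypothetical label values:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical label column values.
string[] labels = { "food", "food", "toy", "appliance", "toy" };

// Assumption: each distinct label gets the next key in order of first occurrence, starting at 1.
var keyOf = new Dictionary<string, uint>();
var keys = new List<uint>();

foreach (string label in labels)
{
    if (!keyOf.TryGetValue(label, out uint key))
    {
        key = (uint)keyOf.Count + 1;
        keyOf[label] = key;
    }
    keys.Add(key);
}

Console.WriteLine(string.Join(", ", keys));
```

The trainer works with these numeric keys, which is why the pipeline later maps the predicted key back to a readable label value.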

Your image processing estimator uses pre-trained Deep Neural Network (DNN) featurizers for feature extraction. When dealing with deep neural networks, you adapt the images to the expected network format. This is the reason you use several image transforms to get the image data into the model's expected form:

  1. The LoadImages transform loads the images into memory as a Bitmap type.
  2. The ResizeImages transform resizes the images, because the pre-trained model has a defined input image width and height.
  3. The ExtractPixels transform extracts the pixels from the input images and converts them into a numeric vector.

Add these image transforms as the next lines of code:

.Append(mlContext.Transforms.LoadImages(outputColumnName: "input", imageFolder: _trainImagesFolder, inputColumnName: nameof(ImageData.ImagePath)))
.Append(mlContext.Transforms.ResizeImages(outputColumnName: "input", imageWidth: InceptionSettings.ImageWidth, imageHeight: InceptionSettings.ImageHeight, inputColumnName: "input"))
.Append(mlContext.Transforms.ExtractPixels(outputColumnName: "input", interleavePixelColors: InceptionSettings.ChannelsLast, offsetImage: InceptionSettings.Mean))
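To make "converts them into a numeric vector" more concrete, the following standalone sketch (C# top-level statements) flattens a hypothetical two-pixel image into an interleaved float vector. It is an illustration only, not ML.NET's code, and it assumes that the offset (the Mean value) is subtracted from each channel before the scale is applied:

```csharp
using System;

// Hypothetical 2-pixel image, each pixel given as (R, G, B) byte values.
byte[][] pixels = { new byte[] { 255, 117, 0 }, new byte[] { 10, 20, 30 } };

const float mean = 117f;  // same value as InceptionSettings.Mean
const float scale = 1f;   // same value as InceptionSettings.Scale

// Interleaved ("channels last") layout: R, G, B of pixel 0, then R, G, B of pixel 1.
float[] vector = new float[pixels.Length * 3];
for (int p = 0; p < pixels.Length; p++)
{
    for (int c = 0; c < 3; c++)
    {
        // Assumption: subtract the offset first, then multiply by the scale.
        vector[p * 3 + c] = (pixels[p][c] - mean) * scale;
    }
}

Console.WriteLine(string.Join(", ", vector));
```

A real 224 x 224 image processed this way would produce a vector of 224 * 224 * 3 = 150,528 values.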

LoadTensorFlowModel is a convenience method that lets the TensorFlow model be loaded once, and then creates the TensorFlowEstimator using ScoreTensorFlowModel. ScoreTensorFlowModel extracts the specified outputs (the Inception model's image features, softmax2_pre_activation) and scores a dataset using the pre-trained TensorFlow model.

softmax2_pre_activation assists the model with determining which class the images belong to. softmax2_pre_activation returns a probability for each of the categories for an image, and all of those probabilities must add up to 1. It assumes that an image belongs to only one category, as shown in the following example:

Class       Probability
Food        0.001
Toy         0.95
Appliance   0.06
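As a general illustration of how a set of per-category scores can be turned into probabilities that add up to 1, here is a minimal softmax sketch (C# top-level statements). The raw scores are hypothetical, and the sketch is a generic textbook formulation rather than the code that ML.NET or TensorFlow runs:

```csharp
using System;
using System.Linq;

// Hypothetical raw scores for Food, Toy, and Appliance.
double[] scores = { 1.0, 4.0, 2.0 };

// Softmax: exponentiate each score (shifted by the maximum for numerical stability)
// and divide by the sum, so the results are positive and add up to 1.
double max = scores.Max();
double[] exps = scores.Select(s => Math.Exp(s - max)).ToArray();
double sum = exps.Sum();
double[] probabilities = exps.Select(e => e / sum).ToArray();

string[] labels = { "Food", "Toy", "Appliance" };
for (int i = 0; i < labels.Length; i++)
{
    Console.WriteLine($"{labels[i]}: {probabilities[i]:F3}");
}
```

The category with the largest raw score always receives the largest probability, so the ranking of categories is preserved.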

Append the TensorFlowTransform to the estimator with the following lines of code:

.Append(mlContext.Model.LoadTensorFlowModel(inputModelLocation).
    ScoreTensorFlowModel(outputColumnNames: new[] { "softmax2_pre_activation" }, inputColumnNames: new[] { "input" }, addBatchDimensionInput: true))

Choose a training algorithm

To add the training algorithm, call the mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy() wrapper method. LbfgsMaximumEntropy is appended to the estimator and accepts the Inception image features (softmax2_pre_activation) and the Label input parameters to learn from the historic data. Add the trainer with the following code:

.Append(mlContext.MulticlassClassification.Trainers.LbfgsMaximumEntropy(labelColumnName: LabelTokey, featureColumnName: "softmax2_pre_activation"))

You also need to map the PredictedLabel to the PredictedLabelValue:

.Append(mlContext.Transforms.Conversion.MapKeyToValue(PredictedLabelValue, "PredictedLabel"))
.AppendCacheCheckpoint(mlContext);

The Fit() method trains your model by transforming the dataset and applying the training. Fit the model to the training dataset and return the trained model by adding the following as the next line of code in the ReuseAndTuneInceptionModel() method:

ITransformer model = estimator.Fit(data);

The Transform() method makes predictions for multiple provided input rows of a test dataset. Transform the training data by adding the following code to ReuseAndTuneInceptionModel():

var predictions = model.Transform(data);

Convert your image data and prediction DataViews into strongly typed IEnumerables to pair for easier display. Use the MLContext.Data.CreateEnumerable() method to do that, using the following code:

var imageData = mlContext.Data.CreateEnumerable<ImageData>(data, false, true);
var imagePredictionData = mlContext.Data.CreateEnumerable<ImagePrediction>(predictions, false, true);

Call the DisplayResults() method to display your data and predictions as the next line in the ReuseAndTuneInceptionModel() method:

DisplayResults(imagePredictionData);

Once you have the prediction set, the Evaluate() method:

  • Assesses the model (compares the predicted values with the actual dataset Labels).

  • Returns the model performance metrics.

Add the following code to the ReuseAndTuneInceptionModel() method as the next lines:

var multiclassContext = mlContext.MulticlassClassification;
var metrics = multiclassContext.Evaluate(predictions, labelColumnName: LabelTokey, predictedLabelColumnName: "PredictedLabel");

The following metrics are evaluated for image classification:

  • Log-loss: see Log Loss. You want Log-loss to be as close to zero as possible.

  • Per class Log-loss. You want per class Log-loss to be as close to zero as possible.

Use the following code to display the metrics, share the results, and then act on them:

Console.WriteLine($"LogLoss is: {metrics.LogLoss}");
Console.WriteLine($"PerClassLogLoss is: {String.Join(" , ", metrics.PerClassLogLoss.Select(c => c.ToString()))}");
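To build intuition for why a Log-loss close to zero is desirable, the following standalone sketch (C# top-level statements) computes log-loss by hand using its common definition: the average negative natural logarithm of the probability assigned to the correct class. The probabilities are hypothetical, and the exact computation inside ML.NET's evaluator is assumed rather than confirmed here:

```csharp
using System;
using System.Linq;

// Hypothetical probabilities that a model assigned to the correct class of four images.
double[] probabilityOfTrueClass = { 0.95, 0.80, 0.99, 0.60 };

// Log-loss: average of -ln(p) over all examples. Confident correct predictions (p near 1)
// contribute almost nothing; unsure predictions (p near 0) are penalized heavily.
double logLoss = -probabilityOfTrueClass.Select(p => Math.Log(p)).Average();

Console.WriteLine($"LogLoss is: {logLoss}");
```

A model that assigned probability 1 to the correct class for every image would have a log-loss of exactly zero.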

Add the following code as the next line to return the trained model:

return model;

Classify images with a loaded model

Add the following call to the ClassifyImages() method as the next line of code in the Main method:

ClassifyImages(mlContext, _predictImageListTsv, _predictImagesFolder, _outputImageClassifierZip, model);

The ClassifyImages() method executes the following tasks:

  • Reads the .tsv file into an IEnumerable.
  • Predicts image classifications based on test data.

Create the ClassifyImages() method, just after the ReuseAndTuneInceptionModel() method and just before the DisplayResults() method, using the following code:

public static void ClassifyImages(MLContext mlContext, string dataLocation, string imagesFolder, string outputModelLocation, ITransformer model)
{

}

First, call the ReadFromTsv() method to create an IEnumerable<ImageData> collection that contains the fully qualified path for each ImagePath. You need that file path to pair your data and prediction results. You also need to convert the IEnumerable<ImageData> collection to an IDataView that you'll use to predict. Add the following code as the next two lines in the ClassifyImages() method:

var imageData = ReadFromTsv(dataLocation, imagesFolder);
var imageDataView = mlContext.Data.LoadFromEnumerable<ImageData>(imageData);
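As a reminder of what "fully qualified path for each ImagePath" means in practice, the following sketch reads a tab-separated file and combines each file name with an images folder. It is only an illustration: the `TsvSketch` and `ImageRow` names are hypothetical stand-ins for the tutorial's own types, and the file layout (one image per line, file name in the first column) is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class TsvSketch
{
    // Hypothetical stand-in for the tutorial's ImageData class.
    public sealed class ImageRow
    {
        public string ImagePath { get; set; } = string.Empty;
    }

    // Assumes one image per line, with the file name in the first
    // tab-separated column. Blank lines are skipped.
    public static IEnumerable<ImageRow> ReadRows(string tsvFile, string imagesFolder) =>
        File.ReadAllLines(tsvFile)
            .Where(line => !string.IsNullOrWhiteSpace(line))
            .Select(line => line.Split('\t'))
            .Select(columns => new ImageRow
            {
                // Combine folder and file name, then resolve to an absolute path.
                ImagePath = Path.GetFullPath(Path.Combine(imagesFolder, columns[0]))
            });

    public static void Main()
    {
        // Write a small hypothetical list to a temporary file and read it back.
        string tsv = Path.GetTempFileName();
        File.WriteAllLines(tsv, new[] { "broccoli.png\tfood", "teddy6.jpg\ttoy" });

        foreach (var row in ReadRows(tsv, "images"))
        {
            Console.WriteLine(row.ImagePath);
        }

        File.Delete(tsv);
    }
}
```

Keeping the full path on each row is what later lets the console output name the image next to its predicted category.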

As you did previously with the training image data, predict the category of the test image data using the Transform() method of the model passed in. Add the following code to the ClassifyImages() method to get the predictions and to convert the predictions IDataView into an IEnumerable for pairing and display:

var predictions = model.Transform(imageDataView);
var imagePredictionData = mlContext.Data.CreateEnumerable<ImagePrediction>(predictions, reuseRowObject: false, ignoreMissingColumns: true);

To pair and display your test image data and predictions, add the following code as the next line in the ClassifyImages() method to call the DisplayResults() method you created previously:

DisplayResults(imagePredictionData);

Classify a single image with a loaded model

Add the following call to the ClassifySingleImage() method as the next line of code in the Main method:

ClassifySingleImage(mlContext, _predictSingleImage, _outputImageClassifierZip, model);

The ClassifySingleImage() method executes the following tasks:

  • Loads an ImageData instance.
  • Predicts the image classification based on test data.

Create the ClassifySingleImage() method, just after the ClassifyImages() method and just before the PairAndDisplayResults() method, using the following code:

public static void ClassifySingleImage(MLContext mlContext, string imagePath, string outputModelLocation, ITransformer model)
{

}

First, create an ImageData instance that contains the fully qualified path and image file name for the single image. Add the following code as the next lines in the ClassifySingleImage() method:

var imageData = new ImageData()
{
    ImagePath = imagePath
};

The PredictionEngine class is a convenience API that performs a prediction on a single instance of data. The Predict() function makes a prediction on a single row of data. Pass imageData to the PredictionEngine to predict the image category by adding the following code to ClassifySingleImage():

// Make prediction function (input = ImageData, output = ImagePrediction)
var predictor = mlContext.Model.CreatePredictionEngine<ImageData, ImagePrediction>(model);
var prediction = predictor.Predict(imageData);

Display the prediction result as the next line of code in the ClassifySingleImage() method:

Console.WriteLine($"Image: {Path.GetFileName(imageData.ImagePath)} predicted as: {prediction.PredictedLabelValue} with score: {prediction.Score.Max()} ");
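The line above prints `prediction.Score.Max()`, the highest of the scores the model assigns to the candidate categories. The following sketch shows how the highest score relates to a category, assuming one score per category. The `ScoreSketch` class, the label order, and the score values are hypothetical; in the tutorial itself the predicted label comes from `PredictedLabelValue`, not from indexing into a label array.

```csharp
using System;

public static class ScoreSketch
{
    // Returns the index and value of the highest score in the array.
    public static (int Index, float Value) Best(float[] scores)
    {
        if (scores == null || scores.Length == 0)
        {
            throw new ArgumentException("At least one score is required.", nameof(scores));
        }

        int best = 0;
        for (int i = 1; i < scores.Length; i++)
        {
            if (scores[i] > scores[best])
            {
                best = i;
            }
        }

        return (best, scores[best]);
    }

    public static void Main()
    {
        // Hypothetical label order and scores, one score per category.
        string[] labels = { "food", "toy", "appliance" };
        float[] scores = { 0.02f, 0.0175f, 0.9625f };

        var (index, value) = Best(scores);
        Console.WriteLine($"predicted as: {labels[index]} with score: {value}");
    }
}
```

A score close to 1 means the model strongly favors one category; scores that are close to each other suggest the image is ambiguous for the model.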

Results

After following the previous steps, run your console app (Ctrl + F5). Your results should be similar to the following output. You may see warnings or processing messages, but these messages have been removed from the following results for clarity.

=============== Training classification model ===============
Image: broccoli.jpg predicted as: food with score: 0.976743
Image: pizza.jpg predicted as: food with score: 0.9751652
Image: pizza2.jpg predicted as: food with score: 0.9660203
Image: teddy2.jpg predicted as: toy with score: 0.9748783
Image: teddy3.jpg predicted as: toy with score: 0.9829691
Image: teddy4.jpg predicted as: toy with score: 0.9868168
Image: toaster.jpg predicted as: appliance with score: 0.9769174
Image: toaster2.png predicted as: appliance with score: 0.9800823
=============== Classification metrics ===============
LogLoss is: 0.0228266745633507
PerClassLogLoss is: 0.0277501705149937 , 0.0186303530571291 , 0.0217359128952187
=============== Making classifications ===============
Image: broccoli.png predicted as: food with score: 0.905548
Image: pizza3.jpg predicted as: food with score: 0.9709008
Image: teddy6.jpg predicted as: toy with score: 0.9750155
=============== Making single image classification ===============
Image: toaster3.jpg predicted as: appliance with score: 0.9625379

C:\Program Files\dotnet\dotnet.exe (process 4304) exited with code 0.
Press any key to close this window . . .

Congratulations! You've now successfully built a machine learning model for image classification by reusing a pre-trained TensorFlow model in ML.NET.

You can find the source code for this tutorial at the dotnet/samples repository.

In this tutorial, you learned how to:

  • Understand the problem
  • Reuse and tune the pre-trained model
  • Classify images with a loaded model

Check out the Machine Learning samples GitHub repository to explore an expanded image classification sample.