Tutorial: Analyze sentiment of website comments with binary classification in ML.NET

This tutorial shows you how to create a .NET Core console application that classifies sentiment from website comments and takes the appropriate action. The binary sentiment classifier is written in C# using Visual Studio 2017.

In this tutorial, you learn how to:

  • Create a console application
  • Prepare data
  • Load the data
  • Build and train the model
  • Evaluate the model
  • Use the model to make a prediction
  • See the results

You can find the source code for this tutorial at the dotnet/samples repository.

Prerequisites

Create a console application

  1. Create a .NET Core Console Application called "SentimentAnalysis".

  2. Create a directory named Data in your project to save your data set files.

  3. Install the Microsoft.ML NuGet Package:

    In Solution Explorer, right-click on your project and select Manage NuGet Packages. Choose "nuget.org" as the package source, and then select the Browse tab. Search for Microsoft.ML, select the package you want, and then select the Install button. Proceed with the installation by agreeing to the license terms for the package you choose. Do the same for the Microsoft.ML.FastTree NuGet package.

Prepare your data

Note

The datasets for this tutorial are from 'From Group to Individual Labels using Deep Features', Kotzias et al., KDD 2015, and hosted at the UCI Machine Learning Repository - Dua, D. and Karra Taniskidou, E. (2017). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

  1. Download the UCI Sentiment Labeled Sentences dataset ZIP file, and unzip it.

  2. Copy the yelp_labelled.txt file into the Data directory you created.

  3. In Solution Explorer, right-click the yelp_labelled.txt file and select Properties. Under Advanced, change the value of Copy to Output Directory to Copy if newer.

Create classes and define paths

  1. Add the following additional using statements to the top of the Program.cs file:

    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Linq;
    using Microsoft.ML;
    using Microsoft.ML.Data;
    using static Microsoft.ML.DataOperationsCatalog;
    using Microsoft.ML.Trainers;
    using Microsoft.ML.Transforms.Text;
    
  2. Add the following code to the line right above the Main method, to create a field that holds the path of the recently downloaded dataset file:

    static readonly string _dataPath = Path.Combine(Environment.CurrentDirectory, "Data", "yelp_labelled.txt");
    
  3. Next, create classes for your input data and predictions. Add a new class to your project:

    • In Solution Explorer, right-click the project, and then select Add > New Item.

    • In the Add New Item dialog box, select Class and change the Name field to SentimentData.cs. Then, select the Add button.

  4. The SentimentData.cs file opens in the code editor. Add the following using statement to the top of SentimentData.cs:

    using Microsoft.ML.Data;
    
  5. Remove the existing class definition and add the following code, which has two classes, SentimentData and SentimentPrediction, to the SentimentData.cs file:

    public class SentimentData
    {
        [LoadColumn(0)]
        public string SentimentText;
    
        [LoadColumn(1), ColumnName("Label")]
        public bool Sentiment;
    }
    
    public class SentimentPrediction : SentimentData
    {
    
        [ColumnName("PredictedLabel")]
        public bool Prediction { get; set; }
    
        public float Probability { get; set; }
    
        public float Score { get; set; }
    }
    

How the data was prepared

The input dataset class, SentimentData, has a string for user comments (SentimentText) and a bool (Sentiment) whose value is either 1 (positive) or 0 (negative). Both fields have LoadColumn attributes attached to them, which describe the position of each field in the data file. In addition, the Sentiment field has a ColumnName attribute to designate it as the Label column. The following example file doesn't have a header row, and looks like this:

SentimentText                             Sentiment (Label)
Waitress was a little slow in service.    0
Crust is not good.                        0
Wow... Loved this place.                  1
Service was very prompt.                  1
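
To make the file layout concrete, here is a minimal sketch that parses one line by hand. It assumes each line holds the comment text, a tab character, and then 0 or 1. The LabeledLineParser class and its Parse method are hypothetical names used only for illustration; in the tutorial itself, LoadFromTextFile() does this work based on the LoadColumn attributes.

```csharp
using System;

public static class LabeledLineParser
{
    // Splits "text<TAB>label" into the comment text and a boolean sentiment.
    // Assumption: the label is the last tab-separated field and is "0" or "1".
    public static (string Text, bool Sentiment) Parse(string line)
    {
        int tabIndex = line.LastIndexOf('\t');
        if (tabIndex < 0)
        {
            throw new FormatException("Expected a tab between the text and the label.");
        }

        string text = line.Substring(0, tabIndex).Trim();
        string label = line.Substring(tabIndex + 1).Trim();

        if (label == "1")
        {
            return (text, true);   // positive sentiment
        }
        if (label == "0")
        {
            return (text, false);  // negative sentiment
        }
        throw new FormatException("Unexpected label '" + label + "'.");
    }
}
```

Calling LabeledLineParser.Parse on a line such as "Wow... Loved this place." followed by a tab and 1 yields the text and a Sentiment of true.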

SentimentPrediction is the prediction class used after model training. It inherits from SentimentData so that the input SentimentText can be displayed along with the output prediction. The Prediction boolean is the value that the model predicts when supplied with new input SentimentText.

The output class SentimentPrediction contains two other properties calculated by the model: Score (the raw score calculated by the model) and Probability (the score calibrated to the likelihood of the text having positive sentiment).

For this tutorial, the most important property is Prediction.
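
The relationship between Score and Probability can be illustrated with the logistic (sigmoid) function, which maps any raw score to a value between 0 and 1. This is a sketch only: ScoreMath is a hypothetical helper, and it assumes a plain logistic mapping. The calibration that ML.NET applies for a given trainer may differ.

```csharp
using System;

public static class ScoreMath
{
    // Logistic function: maps a raw score in (-inf, +inf) to a value in (0, 1).
    public static double Sigmoid(double score)
    {
        return 1.0 / (1.0 + Math.Exp(-score));
    }

    // Under this assumption, a score above 0 gives a probability above 0.5,
    // which corresponds to a positive prediction.
    public static bool IsPositive(double score)
    {
        return Sigmoid(score) > 0.5;
    }
}
```

Under this assumption, a score of 0 maps to a probability of 0.5, the boundary between a Negative and a Positive prediction.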

Load the data

Data in ML.NET is represented by the IDataView interface. IDataView is a flexible, efficient way of describing tabular data (numeric and text). Data can be loaded into an IDataView object from a text file or from other sources (for example, a SQL database or log files).

The MLContext class is a starting point for all ML.NET operations. Initializing mlContext creates a new ML.NET environment that can be shared across the model creation workflow objects. It's similar, conceptually, to DbContext in Entity Framework.

You prepare the app, and then load data:

  1. Replace the Console.WriteLine("Hello World!") line in the Main method with the following code to declare and initialize the mlContext variable:

    MLContext mlContext = new MLContext();
    
  2. Add the following as the next line of code in the Main() method:

    TrainTestData splitDataView = LoadData(mlContext);
    
  3. Create the LoadData() method, just after the Main() method, using the following code:

    public static TrainTestData LoadData(MLContext mlContext)
    {
    
    }
    

    The LoadData() method executes the following tasks:

    • Loads the data.
    • Splits the loaded dataset into train and test datasets.
    • Returns the split train and test datasets.

  4. Add the following code as the first line of the LoadData() method:

    IDataView dataView = mlContext.Data.LoadFromTextFile<SentimentData>(_dataPath, hasHeader: false);
    

    The LoadFromTextFile() method defines the data schema and reads in the file. It takes in the data path variable and returns an IDataView.

Split the dataset for model training and testing

When preparing a model, you use part of the dataset to train it and part of the dataset to test the model's accuracy.

  1. To split the loaded data into the needed datasets, add the following code as the next line in the LoadData() method:

    TrainTestData splitDataView = mlContext.Data.TrainTestSplit(dataView, testFraction: 0.2);
    

    The previous code uses the TrainTestSplit() method to split the loaded dataset into train and test datasets and return them in the TrainTestData class. Specify the test set percentage of data with the testFraction parameter. The default is 10%; in this case, you use 20% to evaluate more data.

  2. Return the splitDataView at the end of the LoadData() method:

    return splitDataView;
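
Conceptually, a train/test split shuffles the rows and sets aside a fraction of them for testing. The sketch below shows that idea on an in-memory list. SplitSketch and its parameters are hypothetical, and this is not how TrainTestSplit() is implemented: ML.NET works on IDataView objects, and the sizes of the resulting sets may only approximate the requested fraction.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitSketch
{
    // Shuffles the rows with a seeded random generator, then reserves
    // the first testFraction of them for testing and the rest for training.
    public static (List<T> Train, List<T> Test) Split<T>(
        IEnumerable<T> rows, double testFraction, int seed)
    {
        if (testFraction <= 0 || testFraction >= 1)
        {
            throw new ArgumentOutOfRangeException(nameof(testFraction));
        }

        var random = new Random(seed);
        // Sort by a random key to shuffle; the key is computed once per row.
        List<T> shuffled = rows.OrderBy(row => random.Next()).ToList();

        int testCount = (int)Math.Round(shuffled.Count * testFraction);
        List<T> test = shuffled.Take(testCount).ToList();
        List<T> train = shuffled.Skip(testCount).ToList();
        return (train, test);
    }
}
```

With 1,000 rows and a testFraction of 0.2, this sketch puts 200 rows in the test set and 800 in the training set, and no row appears in both.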
    

Build and train the model

  1. Add the following call to the BuildAndTrainModel method as the next line of code in the Main() method:

    ITransformer model = BuildAndTrainModel(mlContext, splitDataView.TrainSet);
    

    The BuildAndTrainModel() method executes the following tasks:

    • Extracts and transforms the data.
    • Trains the model.
    • Predicts sentiment based on test data.
    • Returns the model.

  2. Create the BuildAndTrainModel() method, just after the Main() method, using the following code:

    public static ITransformer BuildAndTrainModel(MLContext mlContext, IDataView splitTrainSet)
    {
    
    }
    

Extract and transform the data

  1. Call FeaturizeText as the next line of code in the BuildAndTrainModel() method:

    var estimator = mlContext.Transforms.Text.FeaturizeText(outputColumnName: "Features", inputColumnName: nameof(SentimentData.SentimentText))
    

    The FeaturizeText() method in the previous code converts the text column (SentimentText) into a numeric Features column (a vector of floating-point values) used by the machine learning algorithm, and adds it as a new dataset column:

    SentimentText                             Sentiment    Features
    Waitress was a little slow in service.    0            [0.76, 0.65, 0.44, …]
    Crust is not good.                        0            [0.98, 0.43, 0.54, …]
    Wow... Loved this place.                  1            [0.35, 0.73, 0.46, …]
    Service was very prompt.                  1            [0.39, 0, 0.75, …]
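
To give a feel for what featurizing text means, the sketch below turns sentences into fixed-length count vectors over a small vocabulary (a bag-of-words encoding). This is a deliberate simplification for illustration: FeaturizeText() applies a richer pipeline, its output values will not match these counts, and the BagOfWords class is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BagOfWords
{
    // Lowercases the text and splits it into words on any non-letter character.
    public static string[] Tokenize(string text)
    {
        char[] cleaned = text.ToLowerInvariant()
            .Select(c => char.IsLetter(c) ? c : ' ')
            .ToArray();
        return new string(cleaned)
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries);
    }

    // Builds a sorted vocabulary of the distinct words in the corpus.
    public static List<string> BuildVocabulary(IEnumerable<string> corpus)
    {
        return corpus.SelectMany(text => Tokenize(text))
            .Distinct()
            .OrderBy(word => word, StringComparer.Ordinal)
            .ToList();
    }

    // Counts how often each vocabulary word occurs in the text.
    // Words that are not in the vocabulary are ignored.
    public static float[] Featurize(string text, IList<string> vocabulary)
    {
        var features = new float[vocabulary.Count];
        foreach (string word in Tokenize(text))
        {
            int index = vocabulary.IndexOf(word);
            if (index >= 0)
            {
                features[index] += 1f;
            }
        }
        return features;
    }
}
```

Every sentence maps to a vector of the same length (the vocabulary size), which is the property a learning algorithm needs from its input column.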

Add a learning algorithm

This app uses a classification algorithm that categorizes items or rows of data. The app categorizes website comments as either positive or negative, so use the binary classification task.

Append the machine learning task to the data transformation definitions by adding the following as the next line of code in BuildAndTrainModel():

.Append(mlContext.BinaryClassification.Trainers.SdcaLogisticRegression(labelColumnName: "Label", featureColumnName: "Features"));

The SdcaLogisticRegressionBinaryTrainer is your classification training algorithm. It is appended to the estimator and accepts the featurized SentimentText (Features) and the Label input parameters to learn from the historic data.

Train the model

Fit the model to the splitTrainSet data and return the trained model by adding the following as the next lines of code in the BuildAndTrainModel() method:

Console.WriteLine("=============== Create and Train the Model ===============");
var model = estimator.Fit(splitTrainSet);
Console.WriteLine("=============== End of training ===============");
Console.WriteLine();

The Fit() method trains your model by transforming the dataset and applying the training.

Return the trained model to use for evaluation

Return the model at the end of the BuildAndTrainModel() method:

return model;

Evaluate the model

After your model is trained, use your test data to validate the model's performance.

  1. Create the Evaluate() method, just after BuildAndTrainModel(), with the following code:

    public static void Evaluate(MLContext mlContext, ITransformer model, IDataView splitTestSet)
    {
    
    }
    

    The Evaluate() method executes the following tasks:

    • Loads the test dataset.
    • Creates the BinaryClassification evaluator.
    • Evaluates the model and creates metrics.
    • Displays the metrics.

  2. Add a call to the new method from the Main() method, right under the BuildAndTrainModel() method call, using the following code:

    Evaluate(mlContext, model, splitDataView.TestSet);
    
  3. Transform the splitTestSet data by adding the following code to Evaluate():

    Console.WriteLine("=============== Evaluating Model accuracy with Test data===============");
    IDataView predictions = model.Transform(splitTestSet);
    

    The previous code uses the Transform() method to make predictions for the multiple input rows of the test dataset.

  4. Evaluate the model by adding the following as the next line of code in the Evaluate() method:

    CalibratedBinaryClassificationMetrics metrics = mlContext.BinaryClassification.Evaluate(predictions, "Label");
    

Once you have the prediction set (predictions), the Evaluate() method assesses the model: it compares the predicted values with the actual Labels in the test dataset and returns a CalibratedBinaryClassificationMetrics object that describes how the model is performing.

Displaying the metrics for model validation

Use the following code to display the metrics:

Console.WriteLine();
Console.WriteLine("Model quality metrics evaluation");
Console.WriteLine("--------------------------------");
Console.WriteLine($"Accuracy: {metrics.Accuracy:P2}");
Console.WriteLine($"Auc: {metrics.AreaUnderRocCurve:P2}");
Console.WriteLine($"F1Score: {metrics.F1Score:P2}");
Console.WriteLine("=============== End of model evaluation ===============");

  • The Accuracy metric gets the accuracy of a model, which is the proportion of correct predictions in the test set.

  • The AreaUnderRocCurve metric indicates how confident the model is in correctly classifying the positive and negative classes. You want the AreaUnderRocCurve to be as close to one as possible.

  • The F1Score metric gets the model's F1 score, which is a measure of balance between precision and recall. You want the F1Score to be as close to one as possible.
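
The accuracy and F1 definitions above can be worked through from the four counts of a confusion matrix: true positives (tp), true negatives (tn), false positives (fp), and false negatives (fn). The sketch below uses the standard formulas; BinaryMetricsSketch is a hypothetical helper rather than part of ML.NET, and the counts in the note that follows are made-up numbers, not results from this tutorial.

```csharp
using System;

public static class BinaryMetricsSketch
{
    // Accuracy: share of all predictions that were correct.
    public static double Accuracy(int tp, int tn, int fp, int fn)
    {
        return (double)(tp + tn) / (tp + tn + fp + fn);
    }

    // Precision: of the comments predicted positive, the share that really were positive.
    public static double Precision(int tp, int fp)
    {
        return (double)tp / (tp + fp);
    }

    // Recall: of the comments that really were positive, the share the model found.
    public static double Recall(int tp, int fn)
    {
        return (double)tp / (tp + fn);
    }

    // F1: harmonic mean of precision and recall.
    public static double F1(int tp, int fp, int fn)
    {
        double precision = Precision(tp, fp);
        double recall = Recall(tp, fn);
        return 2 * precision * recall / (precision + recall);
    }
}
```

For example, with the hypothetical counts tp = 80, tn = 85, fp = 15, and fn = 20, accuracy is 165 / 200 = 0.825, precision is 80 / 95, recall is 80 / 100 = 0.8, and F1 is 160 / 195. The sketch does not guard against zero denominators, which yield NaN.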

Predict the test data outcome

  1. Create the UseModelWithSingleItem() method, just after the Evaluate() method, using the following code:

    private static void UseModelWithSingleItem(MLContext mlContext, ITransformer model)
    {
    
    }
    

    The UseModelWithSingleItem() method executes the following tasks:

    • Creates a single comment of test data.
    • Predicts sentiment based on test data.
    • Combines test data and predictions for reporting.
    • Displays the predicted results.

  2. Add a call to the new method from the Main() method, right under the Evaluate() method call, using the following code:

    UseModelWithSingleItem(mlContext, model);
    
  3. Add the following code as the first line of the UseModelWithSingleItem() method to create the PredictionEngine:

    PredictionEngine<SentimentData, SentimentPrediction> predictionFunction = mlContext.Model.CreatePredictionEngine<SentimentData, SentimentPrediction>(model);
    

    The PredictionEngine is a convenience API that lets you perform a prediction on a single instance of data. PredictionEngine is not thread-safe. It's acceptable to use in single-threaded or prototype environments. For improved performance and thread safety in production environments, use the PredictionEnginePool service, which creates an ObjectPool of PredictionEngine objects for use throughout your application. See the guide on how to use PredictionEnginePool in an ASP.NET Core Web API.

    Note

    PredictionEnginePool service extension is currently in preview.

  4. Add a comment to test the trained model's prediction in the UseModelWithSingleItem() method by creating an instance of SentimentData:

    SentimentData sampleStatement = new SentimentData
    {
        SentimentText = "This was a very bad steak"
    };
    
  5. Pass the test comment data to the PredictionEngine by adding the following as the next line of code in the UseModelWithSingleItem() method:

    var resultPrediction = predictionFunction.Predict(sampleStatement);
    

    The Predict() function makes a prediction on a single row of data.

  6. Display SentimentText and the corresponding sentiment prediction using the following code:

    Console.WriteLine();
    Console.WriteLine("=============== Prediction Test of model with a single sample and test dataset ===============");
    
    Console.WriteLine();
    Console.WriteLine($"Sentiment: {resultPrediction.SentimentText} | Prediction: {(Convert.ToBoolean(resultPrediction.Prediction) ? "Positive" : "Negative")} | Probability: {resultPrediction.Probability} ");
    
    Console.WriteLine("=============== End of Predictions ===============");
    Console.WriteLine();
    

Use the model for prediction

Deploy and predict batch items

  1. Create the UseModelWithBatchItems() method, just after the UseModelWithSingleItem() method, using the following code:

    public static void UseModelWithBatchItems(MLContext mlContext, ITransformer model)
    {
    
    }
    

    The UseModelWithBatchItems() method executes the following tasks:

    • Creates batch test data.
    • Predicts sentiment based on test data.
    • Combines test data and predictions for reporting.
    • Displays the predicted results.

  2. Add a call to the new method from the Main method, right under the UseModelWithSingleItem() method call, using the following code:

    UseModelWithBatchItems(mlContext, model);
    
  3. Add some comments to test the trained model's predictions in the UseModelWithBatchItems() method:

    IEnumerable<SentimentData> sentiments = new[]
    {
        new SentimentData
        {
            SentimentText = "This was a horrible meal"
        },
        new SentimentData
        {
            SentimentText = "I love this spaghetti."
        }
    };
    

Predict comment sentiment

Use the model to predict the comment data sentiment using the Transform() method:

IDataView batchComments = mlContext.Data.LoadFromEnumerable(sentiments);

IDataView predictions = model.Transform(batchComments);

// Use model to predict whether comment data is Positive (1) or Negative (0).
IEnumerable<SentimentPrediction> predictedResults = mlContext.Data.CreateEnumerable<SentimentPrediction>(predictions, reuseRowObject: false);

Combine and display the predictions

Create a header for the predictions using the following code:

Console.WriteLine();

Console.WriteLine("=============== Prediction Test of loaded model with multiple samples ===============");

Because SentimentPrediction inherits from SentimentData, the Transform() method populates SentimentText along with the predicted fields. As the ML.NET pipeline runs, each component adds columns, which makes it easy to display the results:

foreach (SentimentPrediction prediction  in predictedResults)
{
    Console.WriteLine($"Sentiment: {prediction.SentimentText} | Prediction: {(Convert.ToBoolean(prediction.Prediction) ? "Positive" : "Negative")} | Probability: {prediction.Probability} ");
}
Console.WriteLine("=============== End of predictions ===============");

Results

Your results should be similar to the following. During processing, messages are displayed. You may see warnings or processing messages. These have been removed from the following results for clarity.

Model quality metrics evaluation
--------------------------------
Accuracy: 83.96%
Auc: 90.51%
F1Score: 84.04%

=============== End of model evaluation ===============

=============== Prediction Test of model with a single sample and test dataset ===============

Sentiment: This was a very bad steak | Prediction: Negative | Probability: 0.1027377
=============== End of Predictions ===============

=============== Prediction Test of loaded model with multiple samples ===============

Sentiment: This was a horrible meal | Prediction: Negative | Probability: 0.1369192
Sentiment: I love this spaghetti. | Prediction: Positive | Probability: 0.9960636
=============== End of predictions ===============

=============== End of process ===============
Press any key to continue . . .

Congratulations! You've now successfully built a machine learning model for classifying and predicting message sentiment.

Building successful models is an iterative process. Because this tutorial uses small datasets to keep model training quick, the initial model quality is lower. If you aren't satisfied with the model quality, you can try to improve it by providing larger training datasets or by choosing different training algorithms with different hyper-parameters for each algorithm.

You can find the source code for this tutorial at the dotnet/samples repository.

Next steps

In this tutorial, you learned how to:

  • Create a console application
  • Prepare data
  • Load the data
  • Build and train the model
  • Evaluate the model
  • Use the model to make a prediction
  • See the results

Advance to the next tutorial to learn more