# Train and evaluate a model

## Split data for training and testing

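    using Microsoft.ML;
    using Microsoft.ML.Data;

    public class HousingData
    {
        public float Size { get; set; }

        // Fixed-size vector of three past prices
        [VectorType(3)]
        public float[] HistoricalPrices { get; set; }

        // Exposed to trainers under the default label column name
        [ColumnName("Label")]
        public float CurrentPrice { get; set; }
    }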


    HousingData[] housingData = new HousingData[]
    {
        new HousingData
        {
            Size = 600f,
            HistoricalPrices = new float[] { 100000f, 125000f, 122000f },
            CurrentPrice = 170000f
        },
        new HousingData
        {
            Size = 1000f,
            HistoricalPrices = new float[] { 200000f, 250000f, 230000f },
            CurrentPrice = 225000f
        },
        new HousingData
        {
            Size = 1000f,
            HistoricalPrices = new float[] { 126000f, 130000f, 200000f },
            CurrentPrice = 195000f
        },
        new HousingData
        {
            Size = 850f,
            HistoricalPrices = new float[] { 150000f, 175000f, 210000f },
            CurrentPrice = 205000f
        },
        new HousingData
        {
            Size = 900f,
            HistoricalPrices = new float[] { 155000f, 190000f, 220000f },
            CurrentPrice = 210000f
        },
        new HousingData
        {
            Size = 550f,
            HistoricalPrices = new float[] { 99000f, 98000f, 130000f },
            CurrentPrice = 180000f
        }
    };


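    // Create the ML.NET context, load the array into an IDataView, then hold out 20% for testing
    MLContext mlContext = new MLContext();
    IDataView data = mlContext.Data.LoadFromEnumerable(housingData);
    DataOperationsCatalog.TrainTestData dataSplit = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
    IDataView trainData = dataSplit.TrainSet;
    IDataView testData = dataSplit.TestSet;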
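
To make the idea of a train/test split concrete, here is a minimal sketch in plain C# that shuffles rows and holds out a fraction of them. `SplitSketch` and its `Split` method are hypothetical names used only for illustration. This is not how `TrainTestSplit` is implemented, and ML.NET may assign rows to each set differently, so the resulting set sizes need not match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SplitSketch
{
    // Hypothetical helper: shuffles the rows with a seeded RNG, then holds
    // out roughly testFraction of them as the test set.
    public static (List<T> Train, List<T> Test) Split<T>(
        IReadOnlyList<T> rows, double testFraction, int seed)
    {
        var rng = new Random(seed);

        // Fisher-Yates shuffle on a copy so the caller's data is untouched.
        var shuffled = rows.ToList();
        for (int i = shuffled.Count - 1; i > 0; i--)
        {
            int j = rng.Next(i + 1);
            (shuffled[i], shuffled[j]) = (shuffled[j], shuffled[i]);
        }

        // Number of held-out rows, rounded to the nearest whole row.
        int testCount = (int)Math.Round(rows.Count * testFraction);
        var test = shuffled.Take(testCount).ToList();
        var train = shuffled.Skip(testCount).ToList();
        return (train, test);
    }
}
```

With six rows and a test fraction of 0.2, this sketch holds out one row and trains on the other five. A fixed seed makes the split repeatable between runs.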


## Prepare the data

ML.NET algorithms have constraints on input column types. Additionally, default values are used for input and output column names when no values are specified.

### Working with expected column types

The machine learning algorithms in ML.NET expect a float vector of known size as input. Apply the `VectorType` attribute to your data model when all of the data is already in numerical format and is intended to be processed together (for example, image pixels).

    // Define data prep estimator
    // 1. Concatenate Size and HistoricalPrices into a single feature vector, output to a new column called Features
    // 2. Normalize the Features vector
    IEstimator<ITransformer> dataPrepEstimator =
        mlContext.Transforms.Concatenate("Features", "Size", "HistoricalPrices")
            .Append(mlContext.Transforms.NormalizeMinMax("Features"));

    // Create data prep transformer
    ITransformer dataPrepTransformer = dataPrepEstimator.Fit(trainData);

    // Apply transforms to training data
    IDataView transformedTrainingData = dataPrepTransformer.Transform(trainData);
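
As a rough illustration of what the two data prep steps do to each row, the following plain C# sketch joins `Size` and `HistoricalPrices` into one four-element vector and then rescales each slot. `FeaturePrepSketch` and its methods are hypothetical names. The scaling shown is the textbook min-max formula; `NormalizeMinMax` has its own options (such as `fixZero`) that change the exact arithmetic, so its output can differ from this sketch.

```csharp
using System;
using System.Linq;

public static class FeaturePrepSketch
{
    // Joins a scalar and a fixed-size vector into one feature vector,
    // mirroring the idea behind Concatenate("Features", "Size", "HistoricalPrices").
    public static float[] Concatenate(float size, float[] historicalPrices) =>
        new[] { size }.Concat(historicalPrices).ToArray();

    // Textbook min-max scaling applied independently to each slot (column)
    // of the feature vectors: (x - min) / (max - min).
    public static float[][] MinMaxScale(float[][] rows)
    {
        int width = rows[0].Length;
        var scaled = rows.Select(row => new float[width]).ToArray();
        for (int c = 0; c < width; c++)
        {
            float min = rows.Min(row => row[c]);
            float max = rows.Max(row => row[c]);
            float range = max - min;
            for (int i = 0; i < rows.Length; i++)
            {
                // A constant column carries no information; map it to 0.
                scaled[i][c] = range == 0f ? 0f : (rows[i][c] - min) / range;
            }
        }
        return scaled;
    }
}
```

Scaling matters here because `Size` is in the hundreds while the prices are in the hundreds of thousands; without it, the larger-valued slots would dominate training.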


### Working with default column names

ML.NET algorithms use default column names when none are specified. All trainers have a parameter called `featureColumnName` for the inputs of the algorithm and, when applicable, a parameter called `labelColumnName` for the expected value. By default, those values are `Features` and `Label` respectively.

    // Override the default column names when your data uses different ones
    var userDefinedColumnSdcaEstimator = mlContext.Regression.Trainers.Sdca(
        labelColumnName: "MyLabelColumnName",
        featureColumnName: "MyFeatureColumnName");


## Train the machine learning model

    // Define StochasticDualCoordinateAscent regression algorithm estimator
    var sdcaEstimator = mlContext.Regression.Trainers.Sdca();

    // Build machine learning model
    var trainedModel = sdcaEstimator.Fit(transformedTrainingData);


## Extract model parameters

    var trainedModelParameters = trainedModel.Model as LinearRegressionModelParameters;
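
`LinearRegressionModelParameters` exposes the learned `Bias` and `Weights`. As a sketch of how a linear regression model turns those parameters into a prediction, the plain C# helper below computes the bias plus the weighted sum of the features. `LinearScoreSketch` and `Score` are hypothetical names, and the numbers in the usage note are made up for illustration rather than taken from a trained model.

```csharp
using System;
using System.Collections.Generic;

public static class LinearScoreSketch
{
    // score = bias + sum(weights[i] * features[i])
    public static float Score(
        float bias, IReadOnlyList<float> weights, IReadOnlyList<float> features)
    {
        if (weights.Count != features.Count)
        {
            throw new ArgumentException("weights and features must have the same length.");
        }

        float score = bias;
        for (int i = 0; i < weights.Count; i++)
        {
            score += weights[i] * features[i];
        }
        return score;
    }
}
```

For example, `LinearScoreSketch.Score(1f, new[] { 2f, 3f }, new[] { 0.5f, 1f })` evaluates 1 + 2 × 0.5 + 3 × 1. With the model above, the weights would apply to the normalized `Features` vector, not to the raw `Size` and `HistoricalPrices` values.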


## Evaluate model quality

The `Evaluate` method produces different metrics depending on which machine learning task was performed. For more details, visit the `Microsoft.ML.Data` API documentation and look for classes that contain `Metrics` in their name.

    // Measure trained model performance
    // Apply data prep transformer to test data
    IDataView transformedTestData = dataPrepTransformer.Transform(testData);

    // Use trained model to make inferences on test data
    IDataView testDataPredictions = trainedModel.Transform(transformedTestData);

    // Extract model metrics and get RSquared
    RegressionMetrics trainedModelMetrics = mlContext.Regression.Evaluate(testDataPredictions);
    double rSquared = trainedModelMetrics.RSquared;


In the code above:

1. The test data set is pre-processed using the data preparation transforms previously defined.
2. The trained machine learning model is used to make predictions on the test data.
3. In the `Evaluate` method, the values in the `CurrentPrice` column of the test data set (exposed as `Label` through the `ColumnName` attribute) are compared against the `Score` column of the newly output predictions to calculate the metrics for the regression model. One of these, R-Squared, is stored in the `rSquared` variable.
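
To show what the comparison in step 3 amounts to, here is a minimal plain C# sketch of the textbook R-Squared definition, 1 minus the ratio of the residual sum of squares to the total sum of squares. `RSquaredSketch` is a hypothetical name, and this is an illustration of the formula, not ML.NET's implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RSquaredSketch
{
    // labels: actual values; scores: model predictions for the same rows.
    public static double Compute(IReadOnlyList<double> labels, IReadOnlyList<double> scores)
    {
        if (labels.Count != scores.Count)
        {
            throw new ArgumentException("labels and scores must have the same length.");
        }

        double mean = labels.Average();
        double residualSumOfSquares = 0.0;
        double totalSumOfSquares = 0.0;
        for (int i = 0; i < labels.Count; i++)
        {
            residualSumOfSquares += Math.Pow(labels[i] - scores[i], 2);
            totalSumOfSquares += Math.Pow(labels[i] - mean, 2);
        }

        // Undefined (NaN or infinity) when every label is identical.
        return 1.0 - residualSumOfSquares / totalSumOfSquares;
    }
}
```

A value of 1 means the predictions match the labels exactly, and a value of 0 means the model does no better than always predicting the mean label. Under this definition the result is undefined when the test set holds a single row, which is worth keeping in mind with a data set as small as the six-row example here.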