Executing multiple ML models in a chain

Article
07/11/2023

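Windows ML carefully optimizes its GPU path, which lets it load and execute model chains efficiently. A model chain consists of two or more models that execute sequentially, where the output of one model becomes the input of the next model in the chain.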
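As a rough illustration of that definition, the following sketch models a chain in plain C++ without any Windows ML types. The names `Tensor`, `Model`, `RunChain`, and `DoubleValues` are hypothetical stand-ins invented for this example; they are not part of the Windows ML API.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for a tensor: a flat list of floats.
using Tensor = std::vector<float>;
// Hypothetical stand-in for a model: any callable mapping one tensor to another.
using Model = std::function<Tensor(const Tensor&)>;

// Runs the models in order; each model's output becomes the next model's input.
Tensor RunChain(const std::vector<Model>& chain, Tensor input)
{
    assert(!chain.empty() && "a chain needs at least one model");
    for (const Model& model : chain)
    {
        input = model(input);
    }
    return input;
}

// Toy "model" used only for illustration: doubles every element.
Tensor DoubleValues(const Tensor& in)
{
    Tensor out;
    out.reserve(in.size());
    for (float v : in)
    {
        out.push_back(v * 2.0f);
    }
    return out;
}
```

Running `RunChain` with two copies of `DoubleValues` on the values 1, 2, 3 doubles them twice and yields 4, 8, 12, which mirrors the two-instance chain built in the rest of this article.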

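To show how to chain models efficiently with Windows ML, let's use an FNS-Candy style transfer ONNX model as a toy example. You can find this type of model in the FNS-Candy style transfer sample folder on GitHub.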

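Suppose we want to execute a chain made up of two instances of the same FNS-Candy model, here called mosaic.onnx. The application code passes an image to the first model in the chain, lets it compute the output, and then passes that transformed image to a second instance of FNS-Candy, which produces the final image.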

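The following steps show how to accomplish this with Windows ML. Each step shows the C++/WinRT code first, followed by the C# equivalent.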

Note

In a real scenario you would most likely use two different models, but this example should be enough to illustrate the concepts.

First, let's load the mosaic.onnx model so that we can use it.

std::wstring filePath = L"path\\to\\mosaic.onnx"; 
LearningModel model = LearningModel::LoadFromFilePath(filePath);

string filePath = "path\\to\\mosaic.onnx";
LearningModel model = LearningModel.LoadFromFilePath(filePath);

Next, let's create two identical sessions on the device's default GPU, passing the same model to both.

LearningModelSession session1(model, LearningModelDevice(LearningModelDeviceKind::DirectX));
LearningModelSession session2(model, LearningModelDevice(LearningModelDeviceKind::DirectX));

LearningModelSession session1 = 
  new LearningModelSession(model, new LearningModelDevice(LearningModelDeviceKind.DirectX));
LearningModelSession session2 = 
  new LearningModelSession(model, new LearningModelDevice(LearningModelDeviceKind.DirectX));

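Note

To get the performance benefits of chaining, you need to create identical GPU sessions for all of your models. Otherwise, additional data is moved out of the GPU and into the CPU, which reduces performance.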

The following lines of code create a binding for each session:

LearningModelBinding binding1(session1);
LearningModelBinding binding2(session2);

LearningModelBinding binding1 = new LearningModelBinding(session1);
LearningModelBinding binding2 = new LearningModelBinding(session2);

Next, we bind an input for the first model. We pass in an image that is located in the same path as the model. In this example, the image is called "fish_720.png".

//get the input descriptor
ILearningModelFeatureDescriptor input = model.InputFeatures().GetAt(0);
//load a SoftwareBitmap
hstring imagePath = L"path\\to\\fish_720.png";

// Get the image and bind it to the model's input
try
{
  StorageFile file = StorageFile::GetFileFromPathAsync(imagePath).get();
  IRandomAccessStream stream = file.OpenAsync(FileAccessMode::Read).get();
  BitmapDecoder decoder = BitmapDecoder::CreateAsync(stream).get();
  SoftwareBitmap softwareBitmap = decoder.GetSoftwareBitmapAsync().get();
  VideoFrame videoFrame = VideoFrame::CreateWithSoftwareBitmap(softwareBitmap);
  ImageFeatureValue image = ImageFeatureValue::CreateFromVideoFrame(videoFrame);
  binding1.Bind(input.Name(), image);
}
catch (...)
{
  printf("Failed to load/bind image\n");
}

//get the input descriptor
ILearningModelFeatureDescriptor input = model.InputFeatures[0];
//load a SoftwareBitmap
string imagePath = "path\\to\\fish_720.png";

// Get the image and bind it to the model's input
try
{
    StorageFile file = await StorageFile.GetFileFromPathAsync(imagePath);
    IRandomAccessStream stream = await file.OpenAsync(FileAccessMode.Read);
    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
    SoftwareBitmap softwareBitmap = await decoder.GetSoftwareBitmapAsync();
    VideoFrame videoFrame = VideoFrame.CreateWithSoftwareBitmap(softwareBitmap);
    ImageFeatureValue image = ImageFeatureValue.CreateFromVideoFrame(videoFrame);
    binding1.Bind(input.Name, image);
}
catch
{
    Console.WriteLine("Failed to load/bind image");
}

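For the next model in the chain to use the output of the first model's evaluation, we need to create an empty output tensor and bind it as the output, so that we have a marker to chain with: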

//get the output descriptor
ILearningModelFeatureDescriptor output = model.OutputFeatures().GetAt(0);
//create an empty output tensor 
std::vector<int64_t> shape = {1, 3, 720, 720};
TensorFloat outputValue = TensorFloat::Create(shape); 
//bind the (empty) output
binding1.Bind(output.Name(), outputValue);

//get the output descriptor
ILearningModelFeatureDescriptor output = model.OutputFeatures[0];
//create an empty output tensor 
List<long> shape = new List<long> { 1, 3, 720, 720 };
TensorFloat outputValue = TensorFloat.Create(shape);
//bind the (empty) output
binding1.Bind(output.Name, outputValue);
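To get a feel for what the empty output tensor has to hold, the sketch below works out the element count and byte size implied by the shape used above. The helper names `ElementCount` and `FloatTensorBytes` are hypothetical and exist only for this example. Reading the shape as batch, channels, height, width is an assumption based on the usual ONNX image convention.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Number of elements in a dense tensor: the product of its dimensions.
std::int64_t ElementCount(const std::vector<std::int64_t>& shape)
{
    std::int64_t count = 1;
    for (std::int64_t dim : shape)
    {
        assert(dim > 0 && "this sketch assumes fully specified, positive dimensions");
        count *= dim;
    }
    return count;
}

// Size in bytes of the same tensor when every element is a 32-bit float.
std::int64_t FloatTensorBytes(const std::vector<std::int64_t>& shape)
{
    return ElementCount(shape) * static_cast<std::int64_t>(sizeof(float));
}
```

For the shape {1, 3, 720, 720}, this gives 1 × 3 × 720 × 720 = 1,555,200 elements, or 6,220,800 bytes at 4 bytes per float. That is roughly the amount of data that would have to move between GPU and CPU for each unnecessary copy of the intermediate image.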

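Note

You must use the TensorFloat data type when binding the output. This prevents de-tensorization from occurring once the first model's evaluation completes, and therefore also avoids the additional GPU queuing needed for the second model's load and bind operations.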

Now we run the evaluation of the first model and bind its output to the next model's input:

//run session1 evaluation
session1.EvaluateAsync(binding1, L"");
//bind the output to the next model input
binding2.Bind(input.Name(), outputValue);
//run session2 evaluation
auto session2AsyncOp = session2.EvaluateAsync(binding2, L"");

//run session1 evaluation
await session1.EvaluateAsync(binding1, "");
//bind the output to the next model input
binding2.Bind(input.Name, outputValue);
//run session2 evaluation
LearningModelEvaluationResult results = await session2.EvaluateAsync(binding2, "");
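The pattern in the code above (bind an empty output before evaluating, then bind that same object as the next input) can be sketched in plain C++ as follows. This is only a conceptual model: `EvaluateInto` and `RunTwoStages` are hypothetical names, the "+ 1" transformation is a toy stand-in for a real model, and nothing here reflects how Windows ML manages GPU memory internally.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one evaluation: reads `input` and writes into the
// caller-provided `output` buffer instead of allocating a new one.
void EvaluateInto(const std::vector<float>& input, std::vector<float>& output)
{
    assert(input.size() == output.size());
    for (std::size_t i = 0; i < input.size(); ++i)
    {
        output[i] = input[i] + 1.0f; // toy transformation
    }
}

// Chains two evaluations through one preallocated intermediate buffer.
std::vector<float> RunTwoStages(const std::vector<float>& image)
{
    // "Bind" an empty output for stage 1 before anything runs.
    std::vector<float> intermediate(image.size());
    std::vector<float> finalOutput(image.size());

    EvaluateInto(image, intermediate);        // stage 1 fills the buffer
    EvaluateInto(intermediate, finalOutput);  // stage 2 reads the same buffer
    return finalOutput;
}
```

The point of the sketch is that the intermediate result is never copied or converted between the two stages: the object that receives the first stage's output is the very object the second stage reads from.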

Finally, let's retrieve the final output produced by running both models, using the following line of code.

auto finalOutput = session2AsyncOp.get().Outputs().First().Current().Value();

var finalOutput = results.Outputs.First().Value;

That's it! Both models can now execute in sequence while making the most of the available GPU resources.

Note

Use the following resources for help with Windows ML:

To ask or answer technical questions about Windows ML, use the windows-machine-learning tag on Stack Overflow.
To report a bug, file an issue on GitHub.
