Tutorial: Create a Windows Machine Learning Desktop application (C++)

Article
10/20/2023

You can use the Windows ML APIs to interact with machine learning models from C++ desktop (Win32) applications. An application adds machine learning capabilities in three steps: loading, binding, and evaluating.

Load -> Bind -> Evaluate

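In this tutorial, we'll create a somewhat simplified version of the SqueezeNet Object Detection sample, which is available on GitHub. You can download the complete sample if you want to see the finished result.

We'll use C++/WinRT to access the WinML APIs. See C++/WinRT for more information.

In this tutorial, you'll learn how to:

Load a machine learning model
Load an image as a VideoFrame
Bind the model's inputs and outputs
Evaluate the model and print meaningful results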

Prerequisites

Visual Studio 2019 (or Visual Studio 2017, version 15.7.4 or later)
Windows 10, version 1809 or later
Windows SDK, build 17763 or later
Visual Studio extension for C++/WinRT
1. In Visual Studio, select Tools > Extensions and Updates.
2. Select Online in the left pane, and search for "WinRT" using the search box on the right.
3. Select C++/WinRT, click Download, and close Visual Studio.
4. Follow the installation instructions, then reopen Visual Studio.
Windows-Machine-Learning GitHub repo (download it as a ZIP file or clone it to your machine)

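Create the project

First, we'll create the project in Visual Studio:

Select File > New > Project to open the New Project window.
In the left pane, select Installed > Visual C++ > Windows Desktop, and in the middle, select Windows Console Application (C++/WinRT).
Give the project a Name and Location, then click OK.
In the New Universal Windows Platform Project window, set both the Target and Minimum Versions to build 17763 or later, and click OK.
Make sure the dropdown menus in the top toolbar are set to Debug and to either x64 or x86, depending on your computer's architecture.
Press Ctrl+F5 to run the program without debugging. A terminal opens with some "Hello world" text. Press any key to close it.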

Load the model

Next, we'll load the ONNX model into the program using LearningModel.LoadFromFilePath:

In pch.h (in the Header Files folder), add the following include statements. They give us access to all the APIs we'll need:

#include <winrt/Windows.AI.MachineLearning.h>
#include <winrt/Windows.Foundation.Collections.h>
#include <winrt/Windows.Graphics.Imaging.h>
#include <winrt/Windows.Media.h>
#include <winrt/Windows.Storage.h>

#include <string>
#include <fstream>

#include <Windows.h>

In main.cpp (in the Source Files folder), add the following using statements:

using namespace Windows::AI::MachineLearning;
using namespace Windows::Foundation::Collections;
using namespace Windows::Graphics::Imaging;
using namespace Windows::Media;
using namespace Windows::Storage;

using namespace std;

Add the following variable declarations after the using statements:

// Global variables
hstring modelPath;
string deviceName = "default";
hstring imagePath;
LearningModel model = nullptr;
LearningModelDeviceKind deviceKind = LearningModelDeviceKind::Default;
LearningModelSession session = nullptr;
LearningModelBinding binding = nullptr;
VideoFrame imageFrame = nullptr;
string labelsFilePath;
vector<string> labels;

Add the following forward declarations after the global variables:

// Forward declarations
void LoadModel();
VideoFrame LoadImageFile(hstring filePath);
void BindModel();
void EvaluateModel();
void PrintResults(IVectorView<float> results);
void LoadLabels();

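In main.cpp, remove the "Hello world" code (everything in the main function after init_apartment).
Find the SqueezeNet.onnx file in your local clone of the Windows-Machine-Learning repo. It is located in \Windows-Machine-Learning\SharedContent\models.
Copy the file path and assign it to the modelPath variable defined at the top. Prefix the string with L to make it a wide-character string so that it works with hstring, and escape each backslash (\) with an additional backslash. For example: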
```
hstring modelPath = L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\models\\SqueezeNet.onnx";
```
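
If the doubled backslashes are hard to read, a C++11 raw wide string literal is one alternative. The sketch below is not part of the original sample. It only checks that the escaped spelling and the raw spelling of the example path produce the same characters. PathsMatch is a hypothetical helper name, and the path is the example location used above, so substitute your own.

```cpp
#include <string>

// Escaped form: every backslash in the path is doubled.
const std::wstring escapedPath =
    L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\models\\SqueezeNet.onnx";

// Raw wide string literal: backslashes are written once, as they appear in File Explorer.
const std::wstring rawPath =
    LR"(C:\Repos\Windows-Machine-Learning\SharedContent\models\SqueezeNet.onnx)";

// Hypothetical helper: returns true when both spellings yield the same characters.
bool PathsMatch()
{
    return escapedPath == rawPath;
}
```

Because a raw wide literal is still a wide string, it can be assigned to an hstring in the same way as the escaped form.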

First, we'll implement the LoadModel method. Add the following method after the main method. It loads the model and prints how long the load took:

void LoadModel()
{
     // load the model
     printf("Loading modelfile '%ws' on the '%s' device\n", modelPath.c_str(), deviceName.c_str());
     DWORD ticks = GetTickCount();
     model = LearningModel::LoadFromFilePath(modelPath);
     ticks = GetTickCount() - ticks;
     printf("model file loaded in %d ticks\n", ticks);
}

Finally, call this method from the main method:
```
LoadModel();
```
Run the program without debugging. You should see that the model loads successfully!
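
The "ticks" printed by LoadModel are milliseconds from GetTickCount, whose resolution is limited by the system timer (typically 10 to 16 milliseconds). If finer measurements are useful, one option is the standard library's steady clock. The following is a minimal sketch under that assumption. MeasureMilliseconds is a hypothetical helper, not part of the sample or of the Windows ML API.

```cpp
#include <chrono>
#include <utility>

// Hypothetical helper: runs `work` once and returns the elapsed time in milliseconds.
template <typename Work>
double MeasureMilliseconds(Work&& work)
{
    const auto start = std::chrono::steady_clock::now();
    std::forward<Work>(work)();   // the code being timed
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

For example, the body of LoadModel could be passed in as a lambda, and the returned value printed with the %f format specifier instead of %d.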

Load the image

Next, we'll load the image file into the program:

Add the following method. It loads the image from the given path and creates a VideoFrame from it:

VideoFrame LoadImageFile(hstring filePath)
{
    printf("Loading the image...\n");
    DWORD ticks = GetTickCount();
    VideoFrame inputImage = nullptr;

    try
    {
        // open the file
        StorageFile file = StorageFile::GetFileFromPathAsync(filePath).get();
        // get a stream on it
        auto stream = file.OpenAsync(FileAccessMode::Read).get();
        // Create the decoder from the stream
        BitmapDecoder decoder = BitmapDecoder::CreateAsync(stream).get();
        // get the bitmap
        SoftwareBitmap softwareBitmap = decoder.GetSoftwareBitmapAsync().get();
        // load a videoframe from it
        inputImage = VideoFrame::CreateWithSoftwareBitmap(softwareBitmap);
    }
    catch (...)
    {
        printf("failed to load the image file, make sure you are using fully qualified paths\r\n");
        exit(EXIT_FAILURE);
    }

    ticks = GetTickCount() - ticks;
    printf("image file loaded in %d ticks\n", ticks);
    // all done
    return inputImage;
}

Add a call to this method in the main method:
```
imageFrame = LoadImageFile(imagePath);
```
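Find the media folder in your local clone of the Windows-Machine-Learning repo. It is located in \Windows-Machine-Learning\SharedContent\media.
Choose one of the images in that folder, and assign its file path to the imagePath variable defined at the top. Prefix the string with L to make it a wide-character string, and escape each backslash with another backslash. For example: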
```
hstring imagePath = L"C:\\Repos\\Windows-Machine-Learning\\SharedContent\\media\\kitten_224.png";
```
Run the program without debugging. You should see that the image loads successfully!
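
LoadImageFile catches every exception with the same message, so a mistyped path and an undecodable image look identical. A quick check before the load can separate the two cases. This is only a sketch: CanOpenForReading is a hypothetical helper, and it takes a narrow std::string, so the hstring imagePath would first need converting (for example with winrt::to_string).

```cpp
#include <fstream>
#include <string>

// Hypothetical helper: returns true when the file at `path` can be opened for reading.
bool CanOpenForReading(const std::string& path)
{
    std::ifstream file(path, std::ios::binary);
    return file.is_open();
}
```

If the check fails, the problem is the path or file permissions rather than the image decoder.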

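Bind the input and output

Next, we'll create a session based on the model and bind the session's input and output using LearningModelBinding.Bind. For more information on binding, see Bind a model.

Implement the BindModel method. It creates a session based on the model and device, and a binding based on that session. It then binds the input and output by name to the values we create. We know ahead of time that the input feature is named "data_0" and the output feature is named "softmaxout_1". You can see these properties for any model by opening it in Netron, an online model visualization tool.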

void BindModel()
{
    printf("Binding the model...\n");
    DWORD ticks = GetTickCount();

    // now create a session and binding
    session = LearningModelSession{ model, LearningModelDevice(deviceKind) };
    binding = LearningModelBinding{ session };
    // bind the input image
    binding.Bind(L"data_0", ImageFeatureValue::CreateFromVideoFrame(imageFrame));
    // bind the output
    vector<int64_t> shape({ 1, 1000, 1, 1 });
    binding.Bind(L"softmaxout_1", TensorFloat::Create(shape));

    ticks = GetTickCount() - ticks;
    printf("Model bound in %d ticks\n", ticks);
}

Add a call to BindModel from the main method:
```
BindModel();
```
Run the program without debugging. The model's input and output should bind successfully. We're almost there!
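
The output shape { 1, 1000, 1, 1 } passed to TensorFloat::Create describes a tensor with 1000 elements, one per class that SqueezeNet can report. As a small illustration, the element count of any shape is the product of its dimensions. ElementCount below is a hypothetical helper, not part of the sample.

```cpp
#include <cstdint>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the number of elements in a tensor with the given dimensions.
int64_t ElementCount(const std::vector<int64_t>& shape)
{
    // Multiply all dimensions together, starting from 1.
    return std::accumulate(shape.begin(), shape.end(), int64_t{ 1 },
                           std::multiplies<int64_t>());
}
```

Calling ElementCount with the shape used in BindModel gives the number of probabilities to expect in the result.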

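Evaluate the model

We're now at the last step in the diagram at the beginning of this tutorial, Evaluate. We'll evaluate the model using LearningModelSession.Evaluate:

Implement the EvaluateModel method. It takes the session and evaluates it using the binding and a correlation ID. The correlation ID is a value you can use later to match a particular evaluation call to its output results. Once again, we know ahead of time that the output is named "softmaxout_1".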

void EvaluateModel()
{
    // now run the model
    printf("Running the model...\n");
    DWORD ticks = GetTickCount();

    auto results = session.Evaluate(binding, L"RunId");

    ticks = GetTickCount() - ticks;
    printf("model run took %d ticks\n", ticks);

    // get the output
    auto resultTensor = results.Outputs().Lookup(L"softmaxout_1").as<TensorFloat>();
    auto resultVector = resultTensor.GetAsVectorView();
    PrintResults(resultVector);
}
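
The output name suggests that the tensor holds softmax probabilities, so the values should lie between 0 and 1 and sum to roughly 1. A sanity check along those lines can catch a wrong output binding early. This is a sketch under that assumption. LooksLikeProbabilities and its default tolerance are hypothetical, and the values from the IVectorView<float> would first need copying into a std::vector.

```cpp
#include <cmath>
#include <numeric>
#include <vector>

// Hypothetical helper: returns true when every value is in [0, 1]
// and the values sum to 1 within `tolerance`.
bool LooksLikeProbabilities(const std::vector<float>& values, double tolerance = 1e-3)
{
    for (float value : values)
    {
        if (value < 0.0f || value > 1.0f)
        {
            return false;
        }
    }
    // Accumulate in double to limit rounding error over many small values.
    const double sum = std::accumulate(values.begin(), values.end(), 0.0);
    return std::fabs(sum - 1.0) <= tolerance;
}
```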

Now let's implement PrintResults. This method finds the three most probable objects in the image and prints them:

void PrintResults(IVectorView<float> results)
{
    // load the labels
    LoadLabels();
    // Find the top 3 probabilities
    vector<float> topProbabilities(3);
    vector<int> topProbabilityLabelIndexes(3);
    // SqueezeNet returns a list of 1000 options, with probabilities for each, loop through all
    for (uint32_t i = 0; i < results.Size(); i++)
    {
        // is it one of the top 3?
        for (int j = 0; j < 3; j++)
        {
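            if (results.GetAt(i) > topProbabilities[j])
            {
                // shift lower-ranked entries down so the displaced value keeps its place in the top 3
                for (int k = 2; k > j; k--)
                {
                    topProbabilityLabelIndexes[k] = topProbabilityLabelIndexes[k - 1];
                    topProbabilities[k] = topProbabilities[k - 1];
                }
                topProbabilityLabelIndexes[j] = i;
                topProbabilities[j] = results.GetAt(i);
                break;
            }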
        }
    }
    // Display the result
    for (int i = 0; i < 3; i++)
    {
        printf("%s with confidence of %f\n", labels[topProbabilityLabelIndexes[i]].c_str(), topProbabilities[i]);
    }
}
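
The top-three search can also be written with the standard library. The sketch below sorts indexes rather than values, so each probability stays tied to its label index. TopIndexes is a hypothetical helper, not part of the sample, and it assumes the results have been copied into a std::vector<float> (for example with a loop over GetAt).

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: returns the indexes of the `count` largest values, highest first.
std::vector<std::size_t> TopIndexes(const std::vector<float>& values, std::size_t count)
{
    // Start with the indexes 0, 1, 2, ... in order.
    std::vector<std::size_t> indexes(values.size());
    std::iota(indexes.begin(), indexes.end(), std::size_t{ 0 });

    // Never ask for more entries than exist.
    count = std::min(count, indexes.size());

    // Only the first `count` positions need to end up in sorted order.
    std::partial_sort(indexes.begin(), indexes.begin() + count, indexes.end(),
        [&values](std::size_t a, std::size_t b) { return values[a] > values[b]; });

    indexes.resize(count);
    return indexes;
}
```

Each returned index can then be used to look up both the probability and the matching entry in labels.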

We also need to implement LoadLabels. This method opens the labels file, which lists every object the model can recognize, and parses it:

void LoadLabels()
{
    // Parse labels from labels file.  We know the file's entries are already sorted in order.
    ifstream labelFile{ labelsFilePath, ifstream::in };
    if (labelFile.fail())
    {
        printf("failed to load the %s file.  Make sure it exists in the same folder as the app\r\n", labelsFilePath.c_str());
        exit(EXIT_FAILURE);
    }

    std::string s;
    while (std::getline(labelFile, s, ','))
    {
        int labelValue = atoi(s.c_str());
        if (labelValue >= labels.size())
        {
            labels.resize(labelValue + 1);
        }
        std::getline(labelFile, s);
        labels[labelValue] = s;
    }
}
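
Judging by the parsing code, each line of the labels file has the form index,label, and a label may itself contain commas (as in "tabby, tabby cat"). The sketch below parses the same format from any input stream, which makes the logic easy to try without a file. ParseLabels is a hypothetical helper. Skipping malformed lines and capping the index at six digits are illustrative choices, not behavior of the sample.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: parses "<index>,<label>" lines into a vector indexed by class number.
std::vector<std::string> ParseLabels(std::istream& input)
{
    std::vector<std::string> labels;
    std::string line;
    while (std::getline(input, line))
    {
        // Drop a trailing carriage return left by Windows line endings.
        if (!line.empty() && line.back() == '\r')
        {
            line.pop_back();
        }
        // Split at the first comma only; the label may contain further commas.
        const std::size_t comma = line.find(',');
        if (comma == std::string::npos)
        {
            continue;
        }
        const std::string number = line.substr(0, comma);
        // Skip lines whose index is empty, non-numeric, or implausibly long.
        if (number.empty() || number.size() > 6 ||
            number.find_first_not_of("0123456789") != std::string::npos)
        {
            continue;
        }
        const std::size_t index = std::stoul(number);
        if (index >= labels.size())
        {
            labels.resize(index + 1);
        }
        labels[index] = line.substr(comma + 1);
    }
    return labels;
}
```

Passing a std::istringstream exercises the parser in a test, and passing a std::ifstream reads the real file.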

Find the Labels.txt file in your local clone of the Windows-Machine-Learning repo. It is located in \Windows-Machine-Learning\Samples\SqueezeNetObjectDetection\Desktop\cpp.
Assign this file path to the labelsFilePath variable defined at the top. Escape each backslash with another backslash. For example:
```
string labelsFilePath = "C:\\Repos\\Windows-Machine-Learning\\Samples\\SqueezeNetObjectDetection\\Desktop\\cpp\\Labels.txt";
```
Add a call to EvaluateModel in the main method:
```
EvaluateModel();
```

Run the program without debugging. It should now correctly recognize what's in the image! Here is an example of what it might print:

Loading modelfile 'C:\Repos\Windows-Machine-Learning\SharedContent\models\SqueezeNet.onnx' on the 'default' device
model file loaded in 250 ticks
Loading the image...
image file loaded in 78 ticks
Binding the model...
Model bound in 15 ticks
Running the model...
model run took 16 ticks
tabby, tabby cat with confidence of 0.931461
Egyptian cat with confidence of 0.065307
Persian cat with confidence of 0.000193

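Next steps

You now have object detection working in a C++ desktop application! Next, you can try passing the model and image files as command-line arguments rather than hard-coding them, similar to the sample on GitHub. You can also try running the evaluation on a different device, such as the GPU, to see how the performance differs.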
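
As a starting point for the command-line idea, here is a minimal sketch of positional argument handling. The Options struct, the ParseArguments function, and the argument order (model path, image path, optional device name) are all assumptions made for illustration and may not match the GitHub sample.

```cpp
#include <string>

// Hypothetical container for the values the tutorial hard-codes.
struct Options
{
    std::string modelPath;
    std::string imagePath;
    std::string deviceName = "default";   // same default as the tutorial's deviceName
};

// Hypothetical parser: reads <modelPath> <imagePath> [deviceName] from the command line.
// Returns false when a required argument is missing.
bool ParseArguments(int argc, const char* const argv[], Options& options)
{
    if (argc < 3)
    {
        return false;
    }
    options.modelPath = argv[1];
    options.imagePath = argv[2];
    if (argc > 3)
    {
        options.deviceName = argv[3];
    }
    return true;
}
```

The paths arrive as narrow strings, so they would need converting before being assigned to the hstring variables (for example with winrt::to_hstring).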

Explore the other samples on GitHub and extend them however you like!

See also

Use the following resources for help with Windows ML:

To ask or answer technical questions about Windows ML, use the windows-machine-learning tag on Stack Overflow.
To report a bug, file an issue on GitHub.
