Example: Use the large-scale feature

This guide is an advanced article on how to scale up from existing PersonGroup and FaceList objects to LargePersonGroup and LargeFaceList objects, respectively. It demonstrates the migration process and assumes a basic familiarity with PersonGroup and FaceList objects, the Train operation, and the face recognition functions. To learn more about these subjects, see the Face recognition conceptual guide.

LargePersonGroup and LargeFaceList are collectively referred to as large-scale operations. A LargePersonGroup can contain up to 1 million persons, each with a maximum of 248 faces. A LargeFaceList can contain up to 1 million faces. The large-scale operations are similar to the conventional PersonGroup and FaceList but differ in some respects because of the new architecture.

The samples are written in C# by using the Azure Cognitive Services Face client library.

Note

To improve face search performance for Identification and FindSimilar at large scale, a Train operation was introduced to preprocess the LargeFaceList and LargePersonGroup. The training time varies from seconds to about half an hour, depending on the actual capacity. During training, Identification and FindSimilar can still be performed if an earlier Train operation succeeded. The drawback is that newly added persons and faces don't appear in the results until a new Train operation completes.

Step 1: Initialize the client object

When you use the Face client library, the subscription key is passed in through the constructor of the FaceClient class, and the subscription endpoint is set on the client. For example:

// Requires: using Microsoft.Azure.CognitiveServices.Vision.Face;
private const string SubscriptionKey = "<Subscription Key>";
// Use your own subscription endpoint corresponding to the subscription key.
private const string SubscriptionEndpoint = "https://westus.api.cognitive.microsoft.com";
private static readonly IFaceClient faceClient = new FaceClient(
            new ApiKeyServiceClientCredentials(SubscriptionKey),
            new System.Net.Http.DelegatingHandler[] { })
{
    // Set the endpoint when the client is constructed.
    Endpoint = SubscriptionEndpoint
};

To get the subscription key with its corresponding endpoint, go to the Azure Marketplace from the Azure portal. For more information, see Subscriptions.

Step 2: Code migration

This section focuses on how to migrate a PersonGroup or FaceList implementation to LargePersonGroup or LargeFaceList. Although LargePersonGroup and LargeFaceList differ from PersonGroup and FaceList in design and internal implementation, the API interfaces are similar for backward compatibility.

Data migration isn't supported. Re-create the LargePersonGroup or LargeFaceList instead.

Migrate a PersonGroup to a LargePersonGroup

Migration from a PersonGroup to a LargePersonGroup is simple. They share exactly the same group-level operations.

For a PersonGroup- or person-related implementation, it's necessary to change only the API paths or SDK class/module to LargePersonGroup and LargePersonGroup Person.

Add all of the faces and persons from the PersonGroup to the new LargePersonGroup. For more information, see Add faces.

Migrate a FaceList to a LargeFaceList

FaceList APIs    LargeFaceList APIs
Create           Create
Delete           Delete
Get              Get
List             List
Update           Update
-                Train
-                Get Training Status

The preceding table compares the list-level operations of FaceList and LargeFaceList. As shown, LargeFaceList comes with two new operations, Train and Get Training Status, when compared with FaceList. Training the LargeFaceList is a precondition of the FindSimilar operation. Training isn't required for FaceList. The following snippet is a helper function that triggers training of a LargeFaceList and waits for it to finish:

/// <summary>
/// Helper function to train LargeFaceList and wait for finish.
/// </summary>
/// <remarks>
/// The time interval can be adjusted considering the following factors:
/// - The training time which depends on the capacity of the LargeFaceList.
/// - The acceptable latency for getting the training status.
/// - The call frequency and cost.
///
/// Estimated training time for LargeFaceList at different scales:
/// -     1,000 faces take about  1 to  2 seconds.
/// -    10,000 faces take about  5 to 10 seconds.
/// -   100,000 faces take about  1 to  2 minutes.
/// - 1,000,000 faces take about 10 to 30 minutes.
/// </remarks>
/// <param name="largeFaceListId">The Id of the LargeFaceList for training.</param>
/// <param name="timeIntervalInMilliseconds">The time interval for getting training status in milliseconds.</param>
/// <returns>A task of waiting for LargeFaceList training finish.</returns>
private static async Task TrainLargeFaceList(
    string largeFaceListId,
    int timeIntervalInMilliseconds = 1000)
{
    // Trigger a train call.
    await faceClient.LargeFaceList.TrainAsync(largeFaceListId);

    // Wait for training to finish.
    while (true)
    {
        // Await the delay instead of blocking the thread with Wait().
        await Task.Delay(timeIntervalInMilliseconds);
        var status = await faceClient.LargeFaceList.GetTrainingStatusAsync(largeFaceListId);

        if (status.Status == TrainingStatusType.Succeeded)
        {
            break;
        }
        else if (status.Status == TrainingStatusType.Failed)
        {
            throw new Exception("The train operation failed.");
        }

        // Any other status means training hasn't finished yet; keep polling.
    }
}

Previously, a typical use of FaceList with added faces and FindSimilar looked like the following:

// Requires: using System.IO; using System.Linq; using System.Threading.Tasks;
// Create a FaceList.
const string FaceListId = "myfacelistid_001";
const string FaceListName = "MyFaceListDisplayName";
const string ImageDir = @"/path/to/FaceList/images";
await faceClient.FaceList.CreateAsync(FaceListId, FaceListName);

// Add Faces to the FaceList.
// Task.WhenAll waits for every upload; Parallel.ForEach with an async lambda would not.
await Task.WhenAll(
    Directory.GetFiles(ImageDir, "*.jpg").Select(
        async imagePath =>
        {
            using (Stream stream = File.OpenRead(imagePath))
            {
                await faceClient.FaceList.AddFaceFromStreamAsync(FaceListId, stream);
            }
        }));

// Perform FindSimilar.
const string QueryImagePath = @"/path/to/query/image";
var results = new List<IList<SimilarFace>>();
using (Stream stream = File.OpenRead(QueryImagePath))
{
    var faces = await faceClient.Face.DetectWithStreamAsync(stream);
    foreach (var face in faces)
    {
        // Pass the candidate limit by name so it isn't bound to another optional parameter.
        results.Add(await faceClient.Face.FindSimilarAsync(face.FaceId.Value, FaceListId, maxNumOfCandidatesReturned: 20));
    }
}

After migration to LargeFaceList, it becomes the following:

// Requires: using System.IO; using System.Linq; using System.Threading.Tasks;
// Create a LargeFaceList.
const string LargeFaceListId = "mylargefacelistid_001";
const string LargeFaceListName = "MyLargeFaceListDisplayName";
const string ImageDir = @"/path/to/FaceList/images";
await faceClient.LargeFaceList.CreateAsync(LargeFaceListId, LargeFaceListName);

// Add Faces to the LargeFaceList.
// Task.WhenAll waits for every upload; Parallel.ForEach with an async lambda would not.
await Task.WhenAll(
    Directory.GetFiles(ImageDir, "*.jpg").Select(
        async imagePath =>
        {
            using (Stream stream = File.OpenRead(imagePath))
            {
                await faceClient.LargeFaceList.AddFaceFromStreamAsync(LargeFaceListId, stream);
            }
        }));

// Train is a newly added operation for LargeFaceList.
// Call it before FindSimilarAsync() so that the newly added faces are searchable.
await TrainLargeFaceList(LargeFaceListId);

// Perform FindSimilar.
const string QueryImagePath = @"/path/to/query/image";
var results = new List<IList<SimilarFace>>();
using (Stream stream = File.OpenRead(QueryImagePath))
{
    var faces = await faceClient.Face.DetectWithStreamAsync(stream);
    foreach (var face in faces)
    {
        results.Add(await faceClient.Face.FindSimilarAsync(face.FaceId.Value, largeFaceListId: LargeFaceListId));
    }
}

As previously shown, the data management and the FindSimilar part are almost the same. The only exception is that a fresh preprocessing Train operation must complete on the LargeFaceList before FindSimilar works.

Step 3: Train suggestions

Although the Train operation speeds up FindSimilar and Identification, the training time grows with the amount of data, especially at large scale. The following table lists the estimated training time at different scales.

Scale for faces or persons    Estimated training time
1,000                         1-2 sec
10,000                        5-10 sec
100,000                       1-2 min
1,000,000                     10-30 min

To better utilize the large-scale feature, we recommend the following strategies.

Step 3.1: Customize time interval

As shown in TrainLargeFaceList(), a time interval in milliseconds sets the delay between checks in the training-status polling loop. For a LargeFaceList with more faces, using a larger interval reduces the call count and cost. Customize the time interval according to the expected capacity of the LargeFaceList.

The same strategy also applies to LargePersonGroup. For example, when you train a LargePersonGroup with 1 million persons, timeIntervalInMilliseconds might be 60,000, which is a 1-minute interval.
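
As a minimal sketch of this idea, the following hypothetical helper maps an expected capacity to a polling interval. The capacity thresholds follow the estimated training times listed in step 3. The interval values are illustrative assumptions rather than recommendations from the Face service, apart from the 60,000 ms value for about 1 million entries, which matches the example above. The names TrainingPolling and SuggestIntervalInMilliseconds are not part of the client library.

```csharp
using System;

public static class TrainingPolling
{
    // Hypothetical helper: picks a status-polling interval (in milliseconds)
    // from the expected number of faces or persons in the collection.
    // The interval values are assumptions chosen so that polling happens
    // a handful of times within the estimated training time.
    public static int SuggestIntervalInMilliseconds(int expectedCapacity)
    {
        if (expectedCapacity < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(expectedCapacity));
        }

        if (expectedCapacity <= 1_000) return 1_000;     // training takes about 1-2 sec
        if (expectedCapacity <= 10_000) return 2_000;    // about 5-10 sec
        if (expectedCapacity <= 100_000) return 10_000;  // about 1-2 min
        return 60_000;                                   // about 10-30 min
    }
}
```

The result can be passed as the timeIntervalInMilliseconds argument of TrainLargeFaceList(), for example TrainLargeFaceList(LargeFaceListId, TrainingPolling.SuggestIntervalInMilliseconds(500_000)).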

Step 3.2: Small-scale buffer

Persons or faces in a LargePersonGroup or a LargeFaceList are searchable only after being trained. In a dynamic scenario, new persons or faces are constantly added and must be immediately searchable, yet training might take longer than desired.

To mitigate this problem, use an extra small-scale LargePersonGroup or LargeFaceList as a buffer only for the newly added entries. This buffer takes less time to train because of its smaller size, so newly added entries should become searchable through it almost immediately. Combine this buffer with training on the master LargePersonGroup or LargeFaceList, and run the master training at a sparser interval, for example, in the middle of the night or once a day.

An example workflow:

  1. Create a master LargePersonGroup or LargeFaceList, which is the master collection. Create a buffer LargePersonGroup or LargeFaceList, which is the buffer collection. The buffer collection is only for newly added persons or faces.
  2. Add new persons or faces to both the master collection and the buffer collection.
  3. Train only the buffer collection at a short time interval to ensure that the newly added entries take effect.
  4. Call Identification or FindSimilar against both the master collection and the buffer collection. Merge the results.
  5. When the buffer collection size increases to a threshold or at a system idle time, create a new buffer collection. Trigger the Train operation on the master collection.
  6. Delete the old buffer collection after the Train operation finishes on the master collection.
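
The merge in step 4 of the workflow could be sketched as follows. This is an illustrative example, not part of the Face client library: Candidate and BufferedSearch are hypothetical types, and the sketch assumes that each hit has already been mapped to a key shared by both collections (for instance, an application-level identifier), because an entry added to both collections can be returned by both searches once the master collection is retrained.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one search hit. The client library's own result
// types are not used here so that the sketch stays self-contained.
public sealed class Candidate
{
    public Candidate(string id, double confidence)
    {
        Id = id;
        Confidence = confidence;
    }

    // Assumed to be a key that is shared by the master and buffer collections.
    public string Id { get; }
    public double Confidence { get; }
}

public static class BufferedSearch
{
    // Merges hits from the master and buffer collections: keeps the highest
    // confidence per Id, sorts by confidence (highest first), and returns
    // at most maxCandidates entries.
    public static IList<Candidate> Merge(
        IEnumerable<Candidate> masterHits,
        IEnumerable<Candidate> bufferHits,
        int maxCandidates)
    {
        return masterHits
            .Concat(bufferHits)
            .GroupBy(c => c.Id)
            .Select(g => g.OrderByDescending(c => c.Confidence).First())
            .OrderByDescending(c => c.Confidence)
            .Take(maxCandidates)
            .ToList();
    }
}
```

Keeping the best score per key is one possible policy. An application could instead prefer the master collection's score whenever both collections return the same entry.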

Step 3.3: Standalone training

If a relatively long latency is acceptable, it isn't necessary to trigger the Train operation right after you add new data. Instead, the Train operation can be split from the main logic and triggered at regular intervals. This strategy is suitable for dynamic scenarios with acceptable latency. It can also be applied to static scenarios to further reduce the Train frequency.

Suppose there's a TrainLargePersonGroup function similar to TrainLargeFaceList. A typical implementation of standalone training on a LargePersonGroup that uses the Timer class in System.Timers is:

private static void Main()
{
    // Create a LargePersonGroup.
    const string LargePersonGroupId = "mylargepersongroupid_001";
    const string LargePersonGroupName = "MyLargePersonGroupDisplayName";
    faceClient.LargePersonGroup.CreateAsync(LargePersonGroupId, LargePersonGroupName).Wait();

    // Set up standalone training at regular intervals.
    const int TimeIntervalForStatus = 1000 * 60; // 1-minute interval for getting training status.
    const double TimeIntervalForTrain = 1000 * 60 * 60; // 1-hour interval for training.
    var trainTimer = new Timer(TimeIntervalForTrain);
    trainTimer.Elapsed += (sender, args) => TrainTimerOnElapsed(LargePersonGroupId, TimeIntervalForStatus);
    trainTimer.AutoReset = true;
    trainTimer.Enabled = true;

    // Other operations like creating persons, adding faces, and identification, except for Train.
    // ...
}

private static void TrainTimerOnElapsed(string largePersonGroupId, int timeIntervalInMilliseconds)
{
    TrainLargePersonGroup(largePersonGroupId, timeIntervalInMilliseconds).Wait();
}
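
The TrainLargePersonGroup function that the sample assumes is not listed in this guide. As a rough sketch of what its core could look like, the following hypothetical helper runs the same trigger-then-poll loop as TrainLargeFaceList but takes the two operations as delegates, so it does not depend on any particular client library call. The names TrainingWaiter and TrainAndWaitAsync, and the convention of reporting the status as the strings "running", "succeeded", and "failed", are assumptions made for illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class TrainingWaiter
{
    // Hypothetical, library-agnostic helper.
    // startTraining triggers the Train operation.
    // getStatus returns the current training status as a string
    // ("succeeded", "failed", or anything else while still in progress).
    public static async Task TrainAndWaitAsync(
        Func<Task> startTraining,
        Func<Task<string>> getStatus,
        int timeIntervalInMilliseconds = 1000)
    {
        await startTraining();

        while (true)
        {
            // Wait without blocking the calling thread.
            await Task.Delay(timeIntervalInMilliseconds);
            string status = await getStatus();

            if (string.Equals(status, "succeeded", StringComparison.OrdinalIgnoreCase))
            {
                return;
            }

            if (string.Equals(status, "failed", StringComparison.OrdinalIgnoreCase))
            {
                throw new InvalidOperationException("The Train operation failed.");
            }

            // Any other status (for example "running") means keep polling.
        }
    }
}
```

A TrainLargePersonGroup function could then be a thin wrapper that passes delegates calling the client library's Train and Get Training Status operations for the given LargePersonGroup, converting the returned status to a string.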

For more information about data management and identification-related implementations, see Add faces.

Summary

In this guide, you learned how to migrate the existing PersonGroup or FaceList code, not data, to the LargePersonGroup or LargeFaceList:

  • LargePersonGroup and LargeFaceList work similarly to PersonGroup and FaceList, except that LargeFaceList requires the Train operation.
  • Choose an appropriate Train strategy for dynamic data updates on large-scale datasets.

Next steps

Follow a how-to guide to learn how to add faces to a PersonGroup or write a script to do the Identify operation on a PersonGroup.