Implementing the Azure blueprint for AI

Introduction

Healthcare organizations are realizing that AI (artificial intelligence) and ML (machine learning) can be valuable tools for many parts of their business, from improving patient outcomes to streamlining daily operations. Often, healthcare organizations do not have the technology staff to implement AI/ML systems. To improve this situation and get AI/ML solutions running on Azure quickly, Microsoft created the Azure healthcare AI blueprint. Using the blueprint, we show how to get started with AI/ML quickly in a safe, compliant, secure, and reliable way.

The health blueprint for AI bootstraps AI/ML into your organization using Azure. This article describes installing the blueprint, its components, and how to use it to run an AI/ML experiment that predicts a patient’s length of stay.

Benefits

The blueprint was created to give healthcare organizations guidance and a quick start on proper PaaS (Platform as a Service) architectures to support AI/ML in highly regulated healthcare environments, including ensuring the system upholds HIPAA and HITRUST compliance requirements.

Technology staff in healthcare organizations often have little time for new projects, especially those in which they must learn a new and complex technology. The blueprint can help technical staff become familiar with Azure and several of its services quickly, shortening the learning curve. After the blueprint is installed, technical staff can learn from it as a reference implementation and use that knowledge to extend its capabilities or create a new AI/ML solution patterned after the blueprint.

The blueprint gets your organization up and running with new AI/ML capabilities quickly. With AI and ML in place, technical staff are ready to run AI/ML experiments using data collected through various sources. For instance, data may already exist on previous instances of sepsis and many of the accompanying variables that were tracked for individual patients with the condition. Using this data in an anonymized format, technical staff can look for indicators of potential sepsis in patients and help change operational procedures to better avoid the condition.

The blueprint provides the data and example code for learning how to predict a patient’s length of stay. This is a sample use case that can be used for learning about the components of the AI/ML solution.

Platform or infrastructure as a service

Microsoft Azure offers both PaaS and IaaS offerings, and the right choice depends on the use case. The blueprint is designed to use PaaS services to predict a patient’s length of stay in hospital. The Azure healthcare AI blueprint provides everything needed to instantiate a secure and compliant AI/ML solution preconfigured for healthcare organizations. The PaaS model used by this blueprint installs and configures the blueprint as a complete solution.

PaaS option

Using a PaaS services model results in reduced total cost of ownership (TCO) because there is no hardware to manage. The organization doesn’t need to buy and maintain hardware or VMs. The blueprint uses PaaS services exclusively.

This reduces the cost of maintaining an on-premises solution and frees technical staff to focus on strategic initiatives instead of infrastructure. It can also move paying for computing and storage from capital expense budgets to operational expense budgets. The costs of running this blueprint scenario are driven by usage of the services plus the costs of data storage.

IaaS option

Although the blueprint and this article focus on the PaaS implementation, there is an open source extension to the blueprint that allows it to be used in an infrastructure as a service (IaaS) environment.

In an IaaS hosting model, customers pay for uptime of Azure-hosted VMs and their processing power. IaaS gives a higher level of control because the customer manages their own VMs, but typically at increased cost, because VMs are charged for uptime rather than usage. Further, the customer is responsible for maintaining the VMs by applying patches, guarding against malware, and so on.
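The difference between usage-based and uptime-based billing can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and every rate in it are hypothetical placeholders, not Azure prices.

```python
def usage_based_cost(compute_hours_used: float, rate_per_compute_hour: float,
                     storage_gb: float, rate_per_gb_month: float) -> float:
    """PaaS-style bill: pay for what is consumed plus the data stored."""
    return compute_hours_used * rate_per_compute_hour + storage_gb * rate_per_gb_month


def uptime_based_cost(vm_count: int, hours_up: float, rate_per_vm_hour: float) -> float:
    """IaaS-style bill: pay for every hour a VM is running, busy or idle."""
    return vm_count * hours_up * rate_per_vm_hour


# Hypothetical month (all figures invented for illustration):
# 40 hours of actual compute and 100 GB stored, versus two VMs
# left running for all 730 hours of the month.
paas = usage_based_cost(40, 0.50, 100, 0.10)
iaas = uptime_based_cost(2, 730, 0.20)
print(f"usage-based:  {paas:.2f}")
print(f"uptime-based: {iaas:.2f}")
```

With real workloads the comparison depends on how busy the VMs are; a VM that is fully used around the clock narrows the gap.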

The IaaS model is beyond the scope of this article, which focuses on a PaaS deployment of the blueprint.

The healthcare AI/ML blueprint

The blueprint creates a starting point for using this technology in a healthcare context. When the blueprint is installed to Azure, all resources, services, and several user accounts are created to support the AI/ML scenario with appropriate actors, permissions, and services.

The blueprint includes an AI/ML experiment to predict a patient’s length of stay, which can help in forecasting staffing, bed counts, and other logistics. The package includes installation scripts, example code, test data, security and privacy support, and more.

Blueprint technical resources

The resources below are all found in this GitHub repository.

Primary resources are:

  1. PowerShell scripts for deployment, configuration, and other tasks.
  2. Detailed instructions for installation, which include how to use the install script.
  3. A comprehensive FAQ.

Cross-cutting concerns for this model include identity and security, both of which are especially important when dealing with patient data. The components of the ML pipeline are shown in this graphic.

Figure: ML pipeline

The graphic below shows the Azure products that are installed. Each resource or service provides a component of the AI/ML processing solution, including the cross-cutting concerns of identity and security.

Figure: Component areas

Implementing a new system in a regulated healthcare environment is complex. For example, ensuring all aspects of the system are HIPAA compliant and HITRUST certifiable takes more than developing a lightweight solution. The blueprint installs identity and resource permissions to help with these complexities.

The blueprint also provides additional scripts and data used to simulate and study the results of admitting or discharging patients. These scripts allow staff to immediately begin to learn how to implement AI and ML using the solution in a safe, isolated scenario.

Additional blueprint resources

The blueprint provides exceptional guidance and instructions for technical staff and also includes artifacts to help create a fully functional installation. These other artifacts include:

  1. A threat model for use with the Microsoft Threat Modeling Tool. This threat model shows components of the solution, the data flows between them, and the trust boundaries. The tool can be used for threat modeling by those looking to extend the base blueprint or for learning about the system architecture from a security perspective.

  2. The HITRUST customer responsibility matrix, in an Excel workbook. This shows what you (the customer) must provide versus what Microsoft provides for each requirement in the matrix. More information about this responsibility matrix is in the Security and compliance > Blueprint responsibility matrix section of this article.

  3. The HITRUST health data and AI review whitepaper examines the blueprint through the lens of requirements to be met for HITRUST certification.

  4. The HIPAA health data and AI review whitepaper reviews the architecture with HIPAA regulations in mind.

These resources are here on GitHub.

Installing the blueprint

It takes little time to get up and running with this blueprint solution. A bit of PowerShell scripting knowledge is recommended, but step-by-step instructions are available to guide the installation, so technologists can deploy this blueprint successfully regardless of their scripting skills.

Technical staff with little Azure experience can expect to install the blueprint in 30 minutes to an hour.

The installation script

The blueprint provides exceptional guidance and instructions for installation. It also provides scripting for install and uninstall of the blueprint services and resources. Calling the PowerShell deployment script is simple. Before the blueprint is installed, certain data must be collected and used as arguments to the deploy.ps1 script as shown below.

# -tenantId:            also known as the AAD directory
# -globalAdminUsername: ID from your AAD account
# -deploymentPassword:  applied to all new users and service accounts
# -appInsightsPlan 1:   we want app insights set up
# Each backtick must be the last character on its line to continue the command.
.\deploy.ps1 -deploymentPrefix <prefix> `
            -tenantId <tenant id> `
            -tenantDomain <tenant domain> `
            -subscriptionId <subscription id> `
            -globalAdminUsername <user id> `
            -deploymentPassword <universal password> `
            -appInsightsPlan 1

The installation environment

Important! Do not install the blueprint from a machine outside of Azure. The install is much more likely to succeed if you create a clean Windows 10 VM (or other Windows VM) in Azure and run the install scripts from there. This technique uses a cloud-based VM to mitigate latency and help create a smooth installation.

During installation, the script calls out to other packages to load and use. When installing from a VM in Azure, the lag between the installation machine and the target resources will be much lower. However, some of the downloaded scripting packages are still vulnerable to latency because they live outside the Azure environment, which may lead to time-out failures.

Install failure! (don’t panic)

The installer downloads some external packages during installation. Sometimes, a script resource request will time out due to lag between the install machine and the package. When this happens, you have two choices:

  1. Run the install script again with no changes. The installer checks for already allocated resources and installs only those needed. While this technique can work, there is a risk the install script will try to allocate resources already in place. This can cause an error and the installation will fail.

  2. Run the deploy.ps1 script again, but pass different arguments to uninstall the blueprint services.

.\deploy.ps1 -clearDeploymentPrefix <prefix> `
             -tenantId <value> `
             -subscriptionId <value> `
             -tenantDomain <value> `
             -globalAdminUsername <value> `
             -clearDeployment

After the uninstall is done, change the prefix in the install script and try installing again. The latency issue may not occur again. If the installation fails while downloading script packages, run the uninstaller script and then the installer again.
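The recovery procedure above (re-run unchanged, then uninstall and reinstall with a new prefix) can be sketched as a small control flow. This is a minimal illustration, not part of the blueprint: `install` and `uninstall` are hypothetical callables that would wrap the two deploy.ps1 invocations, and the prefixes are made up.

```python
from typing import Callable, List, Tuple


def install_with_recovery(
    install: Callable[[str], bool],
    uninstall: Callable[[str], None],
    prefix: str,
    fallback_prefix: str,
) -> str:
    """Return the prefix that ended up installed, or raise if every path fails.

    install(prefix) returns True on success; uninstall(prefix) removes a
    partial deployment. Both are hypothetical wrappers around deploy.ps1.
    """
    # First attempt, then choice 1: run the install again with no changes.
    for _ in range(2):
        if install(prefix):
            return prefix
    # Choice 2: uninstall, change the prefix, and try installing again.
    uninstall(prefix)
    if install(fallback_prefix):
        return fallback_prefix
    raise RuntimeError("installation failed after uninstall and retry")


# Simulated run: the original prefix always times out, the new one succeeds.
calls: List[Tuple[str, str]] = []


def fake_install(p: str) -> bool:
    calls.append(("install", p))
    return p == "blue2"


def fake_uninstall(p: str) -> None:
    calls.append(("uninstall", p))


result = install_with_recovery(fake_install, fake_uninstall, "blue1", "blue2")
print(result, calls)
```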

After running the uninstall script, the following will be gone:

  • Users installed by the installer script
  • The resource groups and their respective services, including data storage
  • The application registered with AAD (Azure Active Directory)

Note that the Key Vault is held as a “soft delete”: although it no longer appears in the portal, it isn’t deallocated for 30 days. This enables reconstituting the Key Vault if needed. To learn more about the implications of this and how to handle it, see the Technical Issues > Key Vault section of this article.

Reinstall after an uninstall

If you need to reinstall the blueprint after an uninstall, you must change the prefix in the next deployment, because the soft-deleted Key Vault from the previous install will cause an error if the prefix is unchanged. More about this is covered in the Technical Issues > Key Vault section of this article.
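One way to avoid the prefix collision is to keep a record of when each prefix was uninstalled and check it before redeploying. The sketch below assumes the vault is purged once the 30-day retention period stated above has elapsed; the function name and the record format are hypothetical.

```python
from datetime import date, timedelta
from typing import Mapping

# Retention period stated in this article for a soft-deleted Key Vault.
SOFT_DELETE_RETENTION = timedelta(days=30)


def prefix_is_reusable(prefix: str, uninstalled_on: Mapping[str, date], today: date) -> bool:
    """True if the prefix was never uninstalled, or was uninstalled 30+ days ago.

    Assumption: the soft-deleted vault no longer blocks the name once the
    retention period has passed.
    """
    removed = uninstalled_on.get(prefix)
    return removed is None or today - removed >= SOFT_DELETE_RETENTION


# Hypothetical history of earlier uninstalls.
history = {"hc1": date(2024, 1, 10)}
print(prefix_is_reusable("hc1", history, date(2024, 1, 20)))
print(prefix_is_reusable("hc2", history, date(2024, 1, 20)))
```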

Required administrator roles

The person installing the blueprint must be in the Global Administrator role in AAD. The installing account must also be an Azure subscription administrator for the subscription being used. If the person doing the install is not in both of these roles, the install will fail.

Figure: Blueprint installer

Further, the install is not designed to work with MSDN subscriptions due to the tight integration with AAD. A standard Azure account must be used. If needed, get a free trial with credit to spend for installing the blueprint solution and running its demos.

Adding other resources

The Azure blueprint installation includes only the services needed to implement the AI/ML use case. However, more resources or services can be added to the Azure environment, making it a good test bed for additional initiatives, or a starting point for a production system. For instance, one might add other PaaS services or IaaS resources in the same subscription and AAD.

New resources, like Cosmos DB or a new Azure Functions app, may be added to the solution as more Azure capabilities are needed. When adding new resources or services, ensure they are configured to meet security and privacy policies to remain compliant with regulations and policy.

New resources and services may be created with Azure REST APIs, Azure PowerShell scripting, or by using the Azure portal.

Using machine learning with the blueprint

The blueprint was built to demonstrate an ML scenario in which a model uses a regression algorithm to predict a patient’s length of stay. This is a common prediction for healthcare providers to run because it helps in scheduling staffing and other operational decisions. Further, anomalies can be detected over time when the average length of stay for a given condition rises or declines.
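To give a feel for what a regression algorithm does, here is a minimal sketch of single-feature ordinary least squares. It is not the blueprint's model, which runs in Machine Learning Studio and uses many variables; the feature (number of prior admissions) and the data points are synthetic and chosen only for illustration.

```python
from statistics import mean
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predicted length of stay (days) for a feature value x."""
    return slope * x + intercept


# Synthetic training data: prior admissions -> length of stay in days.
prior_admissions = [0.0, 1.0, 2.0, 3.0, 4.0]
length_of_stay = [2.0, 3.0, 4.0, 5.0, 6.0]

slope, intercept = fit_line(prior_admissions, length_of_stay)
print(predict(slope, intercept, 5.0))
```

A production model would be trained on many features and evaluated on held-out data before its predictions inform staffing decisions.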

Ingesting training data

With the blueprint installed and all services working properly, the data to be analyzed can be ingested. A set of 100,000 patient records is available to ingest and use with the model. Ingesting patient records is the first step in using Azure Machine Learning Studio to run the patient length of stay experiment, as shown below.

Figure: Ingest

The blueprint includes an experiment and the necessary data to run an ML job in Machine Learning Studio (MLS). The example uses a trained model in an experiment to forecast patient length of stay based on many variables.

In this demonstration environment, the data ingested into the Azure SQL database is free from any defects or missing data elements. This data is clean. Often, unclean data is ingested and must be “cleaned” before it can be fed to a machine learning training algorithm or used in an ML job. Missing data or incorrect values in the data will negatively impact results of the ML analysis.
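As a rough illustration of what "cleaning" can mean, the sketch below drops records that have missing required fields or an impossible value. The field names (`age`, `length_of_stay`) and the sample records are hypothetical, and real pipelines often impute or correct values rather than simply discarding rows.

```python
from typing import Any, Dict, Iterable, List, Sequence

Record = Dict[str, Any]


def clean_records(records: Iterable[Record], required_fields: Sequence[str]) -> List[Record]:
    """Keep only records with every required field present and a valid stay."""
    cleaned = []
    for rec in records:
        # Missing data: a required field is absent, None, or an empty string.
        if any(rec.get(field) in (None, "") for field in required_fields):
            continue
        # Incorrect value: a length of stay cannot be negative.
        if rec["length_of_stay"] < 0:
            continue
        cleaned.append(rec)
    return cleaned


# Hypothetical raw records, invented for this example.
raw = [
    {"age": 54, "length_of_stay": 3},
    {"age": None, "length_of_stay": 2},   # missing age
    {"age": 71, "length_of_stay": -1},    # impossible value
    {"age": 38},                          # missing length_of_stay
    {"age": 0, "length_of_stay": 1},      # zero is a valid value, not missing
]
clean = clean_records(raw, ["age", "length_of_stay"])
print(len(clean))
```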

Azure Machine Learning Studio

Many healthcare organizations don’t have the technical staff to focus on ML projects. This often means valuable data is left unused or expensive consultants are brought in to create ML solutions.

AI/ML experts as well as those learning about AI/ML can use Azure Machine Learning Studio to design experiments. MLS is a web-based design environment used to create ML experiments. With MLS, you can create, train, evaluate, and score models in one place, saving the time otherwise spent using different tools to develop models.

MLS offers a complete toolset for ML workloads. This means people new to ML can get a jump start using the tool and produce results faster than with other ML tools. That lets your IT staff focus on providing value elsewhere, without bringing in an ML specialist. This capability in your own healthcare organization means various hypotheses can be tested and the resulting data analyzed for actionable insights, like patient interventions. MLS offers pre-written modules to be used on a drag-and-drop canvas, visually composing end-to-end data science workflows as experiments.

There are pre-written modules that encapsulate specific algorithms such as decision trees, decision forests, clustering, time series, anomaly detection, and others.

Custom modules can be added to any experiment. These are written in the R language or in Python. This allows using pre-built modules as well as custom logic to create a more sophisticated experiment.

MLS enables creating and using learning models, and also provides a set of pre-designed experiments for use in common applications. Additionally, new experiments can be added to MLS without changing any of the blueprint’s resources.

To save time, visit the Azure AI Gallery to find ready-to-use ML solutions for specific industries, including healthcare. For example, the gallery includes solutions and experiments for breast cancer detection and heart disease prediction.

Security and compliance

Security and compliance are two of the most important things to be mindful of when creating, installing, or managing software systems in a healthcare environment. The investment made in adopting a software system can be undercut by not meeting required security policies and certifications.

Although this article and the healthcare blueprint focus on technical security, other types of security are also important, including physical security and administrative security. These security topics are beyond the scope of this article, which focuses on the blueprint’s technical security.

Principle of least privilege

The blueprint installs named users with roles that support and limit their access to resources in the solution. This model is known as the “principle of least privilege,” an approach to resource access in system design. The principle states that service and user accounts should have access to only those systems and services needed for a legitimate purpose.

This security model supports the system’s compliance with HIPAA and HITRUST requirements, reducing risk to the organization.

Defense in depth

A system design that uses multiple layers of security controls is practicing defense in depth. Defense in depth provides security redundancy at multiple levels. It means you are not dependent on a single layer of defense. It ensures that user and service accounts have appropriate access to resources, services, and data. Azure provides security and monitoring resources at every level of system architecture to provide defense in depth for the entire landscape of technologies.

In a software system, like the one installed by the blueprint, a user may log in but not have permission to a specific resource. This example of defense in depth is provided by RBAC (Role Based Access Control) and AAD, supporting the principle of least privilege.

Two-factor authentication is also a form of technical defense in depth and may be optionally included when the blueprint is installed.

Azure Key Vault

The Azure Key Vault service is a container (a vault) used to store secrets, certificates, and other data used by applications. These include database connection strings, REST endpoint URLs, API keys, and other things developers don’t want to hard-code into an application or distribute in a .config file.

Additionally, vaults are accessible by application service identities or other accounts with AAD permissions. This allows secrets to be accessed at runtime by applications needing a vault’s contents.
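The pattern of fetching secrets at runtime instead of hard-coding them can be sketched without any Azure dependency. In the example below, `SecretStore`, `InMemorySecretStore`, and the secret names are all hypothetical; in a real deployment the store would be backed by a Key Vault client authorized through AAD, and the in-memory class exists only so the sketch runs on its own.

```python
from typing import Dict, Protocol


class SecretStore(Protocol):
    """Anything that can return a secret value by name."""

    def get(self, name: str) -> str: ...


class InMemorySecretStore:
    """Stand-in for a vault client so this sketch runs without Azure."""

    def __init__(self, secrets: Dict[str, str]) -> None:
        self._secrets = dict(secrets)

    def get(self, name: str) -> str:
        return self._secrets[name]


def build_connection_string(store: SecretStore) -> str:
    """Assemble a connection string at runtime; nothing sensitive is hard-coded."""
    # "sql-server" and "sql-password" are hypothetical secret names.
    server = store.get("sql-server")
    password = store.get("sql-password")
    return f"Server={server};Database=patientdb;Password={password}"


store = InMemorySecretStore({"sql-server": "example-host", "sql-password": "not-a-real-secret"})
connection_string = build_connection_string(store)
```

Because the application depends only on the small `SecretStore` interface, the same code can run against a test double locally and against a vault in production.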

Keys stored in a vault may be encrypted or signed, and key usage can be monitored for any security concerns.

If a Key Vault is deleted, it is not immediately purged from Azure. Implications of this are covered in the Technical Issues > Key Vault section of this article.

Application Insights

Healthcare organizations often have mission-critical and life-critical systems that must be reliable and resilient. Anomalies or disruptions in service must be detected and corrected as soon as possible. Application Insights is an Application Performance Management (APM) technology that monitors applications and sends alerts when something goes wrong. It monitors applications at runtime for errors or application anomalies. It is designed to work with multiple programming languages and provides a rich set of capabilities to help ensure applications are healthy and running smoothly.

For example, an application may have a memory leak. Application Insights can help find and diagnose issues like this through the rich reporting and KPIs it monitors. Application Insights is a robust APM service for application developers.

This interactive demo shows key features and capabilities of Application Insights, including a comprehensive monitoring dashboard that technologists in the health organization can use to monitor application state and health.

Azure Security Center

Real-time security and KPI monitoring is a necessity in mission-critical applications. Azure Security Center (ASC) helps ensure your Azure resources are secure and protected. ASC is a security management and advanced threat protection service. It can be used to apply security policies across your workloads, limit your exposure to threats, and detect and respond to attacks.

Security Center standard provides the following services.

  • Hybrid security – Get a unified view of security across all your on-premises and cloud workloads. This is especially helpful in hybrid cloud networks used by healthcare organizations with Azure.
  • Advanced threat detection – ASC uses advanced analytics and the Microsoft Intelligent Security Graph to get an edge over evolving cyber-attacks and mitigate them right away.
  • Access and application controls – Block malware and other unwanted applications by applying allowlisting recommendations that are tailored to your specific workloads and powered by machine learning.

In the context of the Health AI blueprint, ASC analyzes the system components and provides a dashboard showing vulnerabilities in services and resources in the subscription. Distinct dashboard elements provide visibility into a solution’s concerns as follows.

  • Policy and compliance
  • Resource security hygiene
  • Threat protection

Below is an example dashboard identifying 13 suggestions for reducing the system’s vulnerability to threats. It also shows a mere 46% compliance with HIPAA and policy.

Figure: Threat protection

Drilling into the high severity security problems shows what resources are affected and the remediation needed for each resource, as shown below.

Many hours can be spent by IT staff trying to manually secure all resources and networks. With ASC to identify vulnerabilities in a given system, time can be spent in other strategic pursuits. For many of the vulnerabilities identified, ASC can automatically apply the remediating action and secure the resource without an administrator having to dig deeply into the problem.

Figure: High risk

ASC does even more through its threat detection and alerting capabilities. Use ASC to monitor networks, machines, and cloud services for incoming attacks and post-breach activity to keep your environment secure. ASC automatically collects, analyzes, and integrates security information and logs from a variety of Azure resources.

The ML capabilities in ASC allow it to detect threats that manual approaches would not reveal. ASC shows a prioritized list of security alerts, together with the information needed to quickly investigate each problem and recommendations for how to remediate an attack.

RBAC security

Role Based Access Control (RBAC) provides or denies access to protected resources, sometimes with specific rights per resource. This ensures only appropriate users can access their designated system components. For example, a database administrator may have access to a database containing encrypted patient data, whereas a health care provider may only have access to the appropriate patients’ records through the application that displays them. This is typically an electronic medical record or electronic health record system. The nurse has no need to access the databases and the database administrator has no need to see a patient’s health record data.

To enable this, RBAC is part of Azure security and enables precisely focused access management for Azure resources. Fine-grained settings for each user enable security and systems administrators to be very exact in the rights they give each user.
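The database administrator and care provider example can be expressed as a tiny deny-by-default permission check. This is a conceptual sketch only: the role names and permission strings are hypothetical and do not correspond to Azure built-in roles or to the accounts the blueprint creates.

```python
from typing import Dict, FrozenSet

# Hypothetical role-to-permission mapping; each role gets only what it needs.
ROLE_PERMISSIONS: Dict[str, FrozenSet[str]] = {
    "database_admin": frozenset({"database:manage"}),
    "care_provider": frozenset({"patient_record:read"}),
}


def is_allowed(role: str, permission: str) -> bool:
    """Deny by default: unknown roles and unlisted permissions are refused."""
    return permission in ROLE_PERMISSIONS.get(role, frozenset())


# The administrator can manage the database but cannot read patient records;
# the care provider can read records but cannot touch the database.
print(is_allowed("database_admin", "database:manage"))
print(is_allowed("database_admin", "patient_record:read"))
print(is_allowed("care_provider", "patient_record:read"))
print(is_allowed("care_provider", "database:manage"))
```

Keeping the default answer at "no" means a newly added role or resource grants nothing until someone assigns a permission explicitly, which is the least-privilege behavior described earlier.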

蓝图责任矩阵Blueprint responsibility matrix

HITRUST 客户责任矩阵是一个 Excel 文档,用于支持客户实现和记录在 Azure 上构建的系统的安全控件。The HITRUST customer responsibility matrix is an Excel document that supports customers implementing and documenting security controls for systems built on Azure. 此工作簿列出了相关的 HITRUST 要求,并说明了 Microsoft 和客户的相互责任。The workbook lists the relevant HITRUST requirements and explains how Microsoft and the customer are responsible for meeting each one.

对于在 Azure 上构建系统的客户而言,了解在云环境中实现安全控件的共同承担责任至关重要。Understanding the shared responsibility for implementing security controls in a cloud environment is essential for customers building systems on Azure. 实现特定安全控件可以是 Microsoft 的责任、客户的责任或 Microsoft 与客户共同承担的责任。Implementing a specific security control may be the responsibility of Microsoft, the responsibility of customers, or a shared responsibility between Microsoft and customers. 不同的云实现会影响 Microsoft 和客户共同分担责任的方式。Different cloud implementations affect the way responsibilities are shared between Microsoft and customers.

请参阅以下责任表中的示例。See the responsibilities table below for examples.

Azure responsibilities | Customer responsibilities
Azure is responsible for implementation, management, and monitoring of information protection program methods and mechanisms in relation to its service provision environment. | The customer is responsible for implementation, configuration, management, and monitoring of information protection program methods and mechanisms for customer-controlled assets used to access and consume Azure services.
Azure is responsible for implementation, configuration, management, and monitoring of account management methods and mechanisms in relation to its service provision environment. | The customer is also responsible for account management of deployed Azure virtual machine instances and resident application components.

These are only two examples of the many responsibilities to consider when deploying cloud systems. The HITRUST customer responsibility matrix is designed to support an organization's HITRUST compliance for a system implemented on Azure.

Customization

It is common to customize the blueprint after it is installed. Reasons and techniques for customizing the environment vary.

The blueprint may be customized before installation by modifying the install scripts. While this is possible, it is advisable to create independent PowerShell scripts and run them after the initial install is complete. New services may also be added to the system through the portal once the initial installation has taken place.

Customizations may include any of the following, among others.

  • Adding new experiments to Machine Learning Studio
  • Adding additional, unrelated services to the environment
  • Modifying data ingestion and the ML experiment output to use a data source other than the Azure SQL patientdb database
  • Providing production data to the ML experiment
  • Cleaning any proprietary data being ingested so that it matches the data the experiment needs

Customizing the installation is no different from working with any Azure solution. Services or resources may be added or removed to provide new capabilities. When customizing the blueprint, take care not to alter the overall ML pipeline, so that the implementation continues working.

Technical issues

The following issues can cause the blueprint installation to fail or to install in an undesirable configuration.

Key Vault

Key vaults behave differently from other Azure resources when they are deleted: Azure retains deleted vaults for recovery purposes. Key vaults, like all other resources, are named using the prefix you provide to the install script. Accordingly, a different prefix must be passed to the install script each time it is run, or the install will fail because of a collision with the old vault name.

A key vault created by the installation script is retained in a "soft delete" state for 30 days. Although not currently accessible through the portal, soft-deleted key vaults are manageable from PowerShell, and may even be deleted manually.
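One way to avoid name collisions between runs is to generate a fresh prefix each time and to track when an old vault's retention ends. The following Python sketch is illustrative only: `new_prefix`, `retention_end`, and the four-character random suffix are assumptions and are not part of the blueprint's install scripts, and any length or character limits the scripts place on the prefix are not modeled here.

```python
import secrets
from datetime import date, timedelta

# Retention period for soft-deleted vaults, as stated in the text above.
SOFT_DELETE_RETENTION_DAYS = 30


def new_prefix(base, used_prefixes):
    """Return base plus a random 4-hex-character suffix not in used_prefixes."""
    while True:
        candidate = f"{base}{secrets.token_hex(2)}"
        if candidate not in used_prefixes:
            return candidate


def retention_end(deleted_on):
    """Return the date on which a vault deleted on deleted_on leaves soft delete."""
    return deleted_on + timedelta(days=SOFT_DELETE_RETENTION_DAYS)


# Hypothetical record of prefixes passed to earlier install runs.
previous_runs = {"bp0a1b", "bpffee"}
prefix = new_prefix("bp", previous_runs)
```

Keeping a record of previously used prefixes, as `previous_runs` does here, makes it easy to see which vault names may still be held in the soft-delete state.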

Azure Active Directory

It is strongly recommended that you install the blueprint in an empty Azure Active Directory (AAD) instance rather than in a production system. Create a new AAD instance and use its tenant ID during installs to avoid adding blueprint accounts to your live AAD instance.

Technologies presented

  • Learn more about the Azure Health Data and AI blueprint.
  • Download, clone, or fork the blueprint's GitHub repo.
  • Machine Learning Studio is the workspace and tool that data scientists use to create machine learning experiments. It supports built-in algorithms, special-purpose widgets, and Python and R scripts.
  • Secrets, certificates, and other private data are held in Azure Key Vault.
  • PowerShell, a scripting language, is instrumental in setting up the blueprint, although the needed commands are presented in the installation instructions.
  • Azure AI Gallery provides a recipe box of AI/ML solutions organized by customer industry. It includes several healthcare solutions published by data scientists and other experts.
  • Azure Security Center provides insights into your application's behavior, vulnerabilities, and mitigation techniques.

Wrapping up

The Azure Health Data and AI blueprint is a complete ML solution that technologists can use as a learning tool to better understand Azure and how to ensure systems conform to healthcare regulatory requirements. It can also serve as a starting point for a production system that uses Azure Machine Learning Studio as the focal point.

Download the blueprint to get started with your implementation in hours, not days or weeks. If you have problems with your install or questions about the blueprint, visit the FAQ page.

Download the supporting collateral to gain a better understanding of the blueprint implementation beyond the installation and ML experiment. This collateral includes the following.

  1. HITRUST customer responsibility matrix
  2. The comprehensive threat model
  3. HITRUST health data and AI review whitepaper
  4. HIPAA health data and AI review

Whether you use the blueprint for learning purposes or as the seed of an AI/ML solution for your organization, it provides a starting point for working with AI/ML in Azure with a focus on healthcare.