分析平台系统中的工作负荷管理Workload management in Analytics Platform System

SQL Server PDW 的工作负荷管理功能,用户和管理员可以将请求分配给预先设置的内存配置和并发性。SQL Server PDW's workload management capabilities allow users and administrators to assign requests to pre-set configurations of memory, and concurrency. 使用工作负荷管理,通过允许请求拥有适当的资源,而无需从而使任何请求,从而提高工作负荷的性能。Use workload management to improve performance of your workload, either consistent or mixed, by allowing requests to have the appropriate resources without starving any requests forever.

例如,通过 SQL Server PDW 中的工作负荷管理技术,可以:For example, with the workload management techniques in SQL Server PDW, you could:

  • 向加载作业分配大量资源。Allocate a large number of resources to a load job.

  • 指定更多用于生成列存储索引的资源。Specify more resources for building a columnstore index.

  • 排查速度较慢的哈希联接的问题,以查看它是否需要更多内存,并为其分配更多内存。Troubleshoot a slow-performing hash join to see if it needs more memory, and then give it more memory.

工作负荷管理基础知识Workload Management Basics

主要术语Key Terms

工作负载管理Workload Management
工作负荷管理 是了解和调整系统资源利用率以便实现并发请求的最佳性能的能力。Workload Management is the ability to understand and adjust system resource utilization in order to achieve the best performance for concurrent requests.

资源类Resource Class
在 SQL Server PDW 中, 资源类 是具有预先分配的内存和并发限制的内置服务器角色。In SQL Server PDW, a resource class is a built-in server role that has pre-assigned limits for memory and concurrency. SQL Server PDW 根据提交请求的登录名的资源类服务器角色成员身份,将资源分配给请求。SQL Server PDW allocates resources to requests according to the resource class server role membership of the login that submits the requests.

在计算节点上,资源类的实现使用 SQL Server 中的 Resource Governor 功能。On the Compute nodes, the implementation of resource classes uses the Resource Governor feature in SQL Server. 有关 Resource Governor 的详细信息,请参阅 MSDN 上的 Resource GovernorFor more information about Resource Governor, see Resource Governor on MSDN.

了解当前资源利用率Understand Current Resource Utilization

若要了解当前正在运行的请求的系统资源使用情况,请使用 SQL Server PDW 动态管理视图。To understand system resource utilization for the currently running requests, use the SQL Server PDW dynamic management views. 例如,你可以使用 Dmv 来了解运行速度较慢的大型哈希联接是否可以通过具有更多内存来受益。For example, you can use DMVs to understand if a slow-running large hash join could benefit by having more memory.

调整资源分配Adjust Resource Allocations

若要调整资源利用率,请更改正在提交请求的登录名的资源类成员身份。To adjust resource utilization, change the resource class membership of the login that is submitting the request. 资源类服务器角色命名为 mediumrclargercxlargercThe resource class server roles are named mediumrc, largerc, and xlargerc. 它们分别表示中等、大和超大的大型资源分配。They represent medium, large, and extra large resource allocations respectively.

例如,若要将大量系统资源分配给请求,请将提交请求的登录名添加到 largerc 服务器角色。For example, to allocate a large amount of system resources to a request, add the login that is submitting the request to the largerc server role. 以下 ALTER SERVER ROLE 语句将 login Anna 添加到 largerc 服务器角色。The following ALTER SERVER ROLE statement adds the login Anna to the largerc server role.

ALTER SERVER ROLE largerc ADD MEMBER Anna;  

资源类说明Resource Class Descriptions

下表描述了资源类及其系统资源分配。The following table describes the resource classes and their system resource allocations.

资源类Resource Class 请求重要性Request Importance 最大内存使用量 *Maximum Memory Usage* 并发槽 (最大值 = 32) Concurrency Slots (Maximum = 32) 说明Description
默认值default 中型Medium 400 MB400 MB 11 默认情况下,每个登录名都允许使用少量的内存和并发资源来处理其请求。By default, each login is allowed a small amount of memory, and concurrency resources for its requests.

将登录名添加到资源类后,新类优先。When a login is added to a resource class, the new class takes precedence. 从所有资源类中删除某个登录名后,该登录名将恢复为默认的资源分配。When a login is dropped from all resource classes, the login reverts back to the default resource allocation.
MediumRCMediumRC 中型Medium 1200 MB1200 MB 33 可能需要中型资源类的请求示例:Examples of requests that might need the medium resource class:

具有大型哈希联接的 CTAS 操作。CTAS operations that have large hash joins.

选择需要更多内存以避免缓存到磁盘的操作。SELECT operations that need more memory to avoid caching to disk.

将数据加载到聚集列存储索引。Loading data into clustered columnstore indexes.

为包含10-15 列的较小表生成、重新生成和重新组织聚集列存储索引。Building, rebuilding, and reorganizing clustered columnstore indexes for smaller tables that have 10-15 columns.
LargercLargerc High 2.8 GB2.8 GB 77 可能需要大型资源类的请求示例:Examples of requests that might need the large resource class:

非常大的 CTAS 操作,这些操作具有巨大的哈希联接,或包含大型聚合,如大的 ORDER BY 或 GROUP BY 子句。Very large CTAS operations that have huge hash joins, or contain large aggregations, such as large ORDER BY or GROUP BY clauses.

选择需要大量内存来执行哈希联接或聚合(如 ORDER BY 或 GROUP BY 子句)的操作SELECT operations that require very large amounts of memory for operations such as hash joins, or aggregations such as ORDER BY or GROUP BY clauses

将数据加载到聚集列存储索引。Loading data into clustered columnstore indexes.

为包含10-15 列的较小表生成、重新生成和重新组织聚集列存储索引。Building, rebuilding, and reorganizing clustered columnstore indexes for smaller tables that have 10-15 columns.
xlargercxlargerc High 8.4 GB8.4 GB 2222 特大资源类适用于在运行时可能需要额外大资源消耗的请求。The extra large resource class is for requests that could require extra large resource consumption at run time.

*最大内存使用率是近似值。*Maximum memory usage is an approximation.

请求重要性Request Importance

请求重要性映射到计算节点上运行的 SQL Server CPU 时间,将为请求分配。The request importance maps to the amount of CPU time that SQL Server, running on the Compute nodes, will give to the requests. 优先级较高的请求接收更多的 CPU 时间。Requests with higher priority receive more CPU time.

最大内存使用情况Maximum Memory Usage

最大内存使用率是请求可在每个处理空间内使用的最大可用内存量。Maximum memory usage is the maximum amount of available memory a request can use within each processing space. 例如,mediumrc 请求最多可使用 1200 MB 来处理每个分布。For example a mediumrc request can use up to 1200 MB for processing within each distribution. 务必要确保数据不会歪斜,以避免几个分发执行大部分工作。It is still important to ensure data is not skewed in order to avoid having a few distributions performing most of the work.

并发槽Concurrency Slots

分配1、3、7和22并发槽的目标是允许同时运行大型和小型进程,而不会在运行大型进程时阻止小型进程。The goal of allocating 1, 3, 7, and 22 concurrency slots is to allow both large and small processes to run at the same time, without blocking small process when a large process is running. 例如,SQL Server PDW 可以分配最多32并发槽,以运行1个特大请求 (22 个槽) ,1个大型请求 (7 个槽) ,1个中型请求 (3 个槽) 。For example, SQL Server PDW can allocate maximum of 32 concurrency slots to run 1 extra large request (22 slots), 1 large request (7 slots), and 1 medium request (3 slots) at the same time.

将最多32并发槽分配给并发请求的示例:Examples of allocating up to 32 concurrency slots to concurrent requests:

  • 28个槽 = 4 个大28 slots = 4 large

  • 30个槽 = 10 个中等30 slots = 10 medium

  • 32插槽 = 32 默认值32 slots = 32 default

  • 32槽 = 1 个特大 + 1 大 + 1 中型32 slots = 1 extra large + 1 large + 1 medium

  • 32插槽 = 2 大 + 4 中等 + 6 默认32 slots = 2 large + 4 medium + 6 default

假设有6个大请求提交到 SQL Server PDW,然后提交10个默认请求。Suppose 6 large requests are submitted to SQL Server PDW, and then 10 default requests are submitted. SQL Server PDW 将按优先级顺序处理请求,如下所示:SQL Server PDW will process the requests in priority order as follows:

  • 分配28个并发槽,以便在内存可用时开始处理4个大型请求,并将两个大型请求保存在队列中。Allocate 28 concurrency slots to start processing 4 large requests as memory becomes available, and keep 2 large requests in the queue.

  • 分配4个并发槽,开始处理4个默认请求并在等待队列中保留6个默认请求。Allocate 4 concurrency slots to start processing 4 default requests and keep 6 default requests in the wait queue.

当请求完成并且并发槽可用时,SQL Server PDW 会根据可用资源和优先级分配剩余请求。As requests finish and concurrency slots become available, SQL Server PDW will allocate the remaining requests according to available resources and priority. 例如,当打开7个并发槽时,等待大请求的优先级比等待中型请求的优先级高。For example, when there are 7 concurrency slots open, waiting large requests will have higher priority for the 7 slots than waiting medium requests. 如果打开了6个槽,则 SQL Server PDW 将分配其他6个默认大小的请求。If 6 slots open, then SQL Server PDW will allocate 6 more default-sized requests. 但是,在 SQL Server PDW 允许运行请求之前,内存和并发槽必须全部可用。However, memory and concurrency slots must all be available before SQL Server PDW allows a request to run.

在每个资源类中,请求先以 (FIFO) 顺序运行。Within each resource class, the requests run in first in first out (FIFO) order.

一般备注General Remarks

如果登录名是多个资源类的成员,则具有最多资源的类优先。If a login is a member of more than one resource class, the class with the most resources takes precedence.

当向资源类添加登录名或从中删除登录名时,更改会立即对所有将来的请求生效;当前正在运行或等待的请求不受影响。When a login is added to or dropped from a resource class, the change takes effect immediately for all future requests; current requests that are running or waiting are not affected. 登录无需断开连接和重新连接才能发生更改。The login does not need to disconnect and reconnect in order for the change to occur.

对于每个登录名,资源类设置将应用于单个语句和操作,而不是应用于会话。For each login, the resource class settings are applied to individual statements and operations, and not to the session.

在 SQL Server PDW 运行某个语句之前,它会尝试获取该请求所需的并发槽。Before SQL Server PDW runs a statement, it tries to acquire the concurrency slots needed for the request. 如果无法获取足够的并发槽,SQL Server PDW 会将请求移到等待执行的状态。If it cannot acquire enough concurrency slots, SQL Server PDW moves the request into a waiting-to-be-executed state. 已分配给请求的所有资源系统都将返回到系统。All resources system that were already allocated to the request are returned back to the system.

大多数 SQL 语句始终需要默认资源分配,因此不受资源类控制。Most of the SQL statements always need the default resource allocations, and therefore are not controlled by resource classes. 例如,CREATE LOGIN 只需要少量的资源,并且即使登录名调用 CREATE LOGIN 是资源类的成员,也会分配默认资源。For example, CREATE LOGIN only needs a small amount of resources, and is allocated the default resources even if the login calling CREATE LOGIN is a member of a resource class. 例如,如果 Anna 是 largerc 资源类的成员,而她提交了 CREATE LOGIN 语句,则 CREATE LOGIN 语句将使用默认的资源数来运行。For example, if Anna is a member of the largerc resource class and she submits a CREATE LOGIN statement, the CREATE LOGIN statement will run with the default number of resources.

由资源类控制的 SQL 语句和操作:SQL statements and operations governed by resource classes:

  • ALTER INDEX REBUILDALTER INDEX REBUILD

  • ALTER INDEX REORGANIZEALTER INDEX REORGANIZE

  • ALTER TABLE REBUILDALTER TABLE REBUILD

  • 创建聚集索引CREATE CLUSTERED INDEX

  • CREATE CLUSTERED COLUMNSTORE INDEXCREATE CLUSTERED COLUMNSTORE INDEX

  • CREATE TABLE AS SELECTCREATE TABLE AS SELECT

  • 创建远程表为 SELECTCREATE REMOTE TABLE AS SELECT

  • dwloader加载数据。Loading data with dwloader.

  • INSERT-SELECTINSERT-SELECT

  • UPDATEUPDATE

  • DELETEDELETE

  • 还原到具有更多计算节点的设备时还原数据库。RESTORE DATABASE when restoring into an appliance with more Compute nodes.

  • 选择,排除仅 DMV 查询SELECT, excluding DMV-only queries

限制和局限Limitations and Restrictions

资源类控制内存和并发分配。The resource classes govern memory and concurrency allocations. 它们不管理输入/输出操作。They do not govern input/output operations.

元数据Metadata

Dmv,其中包含有关资源类和资源类成员的信息。DMVs that contain information about resource classes and resource class members.

Dmv,其中包含请求的状态及其所需资源的信息:DMVs that contain information about the state of requests and the resources they require:

从计算节点上 SQL Server Dmv 公开的相关系统视图。Related system views exposed from the SQL Server DMVs on the Compute nodes. 请参阅 MSDN 上的 SQL Server 动态管理视图 以获取这些 dmv 的链接。See SQL Server Dynamic Management Views for links to these DMVs on MSDN.

  • sys.dm_pdw_nodes_resource_governor_resource_poolssys.dm_pdw_nodes_resource_governor_resource_pools

  • sys.dm_pdw_nodes_resource_governor_workload_groupssys.dm_pdw_nodes_resource_governor_workload_groups

  • sys.dm_pdw_nodes_resource_governor_resource_poolssys.dm_pdw_nodes_resource_governor_resource_pools

  • sys. dm_pdw_nodws_resource_governor_workload_groupssys.dm_pdw_nodws_resource_governor_workload_groups

  • sys.dm_pdw_nodes_exec_sessionssys.dm_pdw_nodes_exec_sessions

  • sys.dm_pdw_nodes_exec_requestssys.dm_pdw_nodes_exec_requests

  • sys.dm_pdw_nodes_exec_query_memory_grantssys.dm_pdw_nodes_exec_query_memory_grants

  • sys.dm_pdw_nodes_exec_query_resource_semaphoressys.dm_pdw_nodes_exec_query_resource_semaphores

  • sys.dm_pdw_nodes_os_memory_brokerssys.dm_pdw_nodes_os_memory_brokers

  • sys.dm_pdw_nodes_os_memory_cache_entriessys.dm_pdw_nodes_os_memory_cache_entries

  • sys.dm_pdw_nodes_exec_cached_planssys.dm_pdw_nodes_exec_cached_plans

工作负荷管理任务Workload Management Tasks