忙碌資料庫反模式Busy Database antipattern

對資料庫伺服器進行卸載處理,可能會導致它花費大部分時間執行程式碼,而不是回應儲存和擷取資料的要求。Offloading processing to a database server can cause it to spend a significant proportion of time running code, rather than responding to requests to store and retrieve data.

問題說明Problem description

許多資料庫系統可以執行程式碼。Many database systems can run code. 範例包括預存程序和觸發程序。Examples include stored procedures and triggers. 通常,鄰近著資料來執行這個處理更有效率,而不是將資料傳送到用戶端應用程式進行處理。Often, it's more efficient to perform this processing close to the data, rather than transmitting the data to a client application for processing. 不過,過度使用這些功能會降低效能,原因如下:However, overusing these features can hurt performance, for several reasons:

  • 資料庫伺服器可能會花費太多時間進行處理,而不接受新的用戶端要求及擷取資料。The database server may spend too much time processing, rather than accepting new client requests and fetching data.
  • 資料庫通常是共用的資源,所以會在高使用的期間成為瓶頸。A database is usually a shared resource, so it can become a bottleneck during periods of high use.
  • 如果資料存放區計量付費,執行階段成本可能過高。Runtime costs may be excessive if the data store is metered. 受控資料庫服務更是如此。That's particularly true of managed database services. 例如,資料庫交易單位 (DTU) 的 Azure SQL Database 費用。For example, Azure SQL Database charges for Database Transaction Units (DTUs).
  • 資料庫具有有限的相應增加容量,所以水平調整資料庫並不簡單。Databases have finite capacity to scale up, and it's not trivial to scale a database horizontally. 因此,最好是將處理移到可以輕易相應放大的運算資源,例如,VM 或 App Service 應用程式。Therefore, it may be better to move processing into a compute resource, such as a VM or App Service app, that can easily scale out.

此反模式會發生通常是因為:This antipattern typically occurs because:

  • 資料庫檢視為服務,而不是存放庫。The database is viewed as a service rather than a repository. 應用程式可能會使用資料庫伺服器來格式化資料 (例如,轉換成 XML)、操作字串資料,或執行複雜的計算。An application might use the database server to format data (for example, converting to XML), manipulate string data, or perform complex calculations.
  • 開發人員嘗試寫入查詢,其結果可以直接向使用者顯示。Developers try to write queries whose results can be displayed directly to users. 例如,查詢可能會根據地區設定合併欄位,或格式化日期、時間和貨幣。For example, a query might combine fields or format dates, times, and currency according to locale.
  • 開發人員藉由將計算推送至資料庫,嘗試更正沒有直接關聯的擷取Developers are trying to correct the Extraneous Fetching antipattern by pushing computations to the database.
  • 預存程序用來封裝商務邏輯,可能是因為被視為更容易維護和更新。Stored procedures are used to encapsulate business logic, perhaps because they are considered easier to maintain and update.

下列範例會擷取指定銷售領域的 20 個最有價值的訂單,並將結果格式化為 XML。The following example retrieves the 20 most valuable orders for a specified sales territory and formats the results as XML. 它會使用 Transact-SQL 函式來剖析資料,並將結果轉換成 XML。It uses Transact-SQL functions to parse the data and convert the results to XML. 您可以在這裡找到完整的範例。You can find the complete sample here.

SELECT TOP 20
  soh.[SalesOrderNumber]  AS '@OrderNumber',
  soh.[Status]            AS '@Status',
  soh.[ShipDate]          AS '@ShipDate',
  YEAR(soh.[OrderDate])   AS '@OrderDateYear',
  MONTH(soh.[OrderDate])  AS '@OrderDateMonth',
  soh.[DueDate]           AS '@DueDate',
  FORMAT(ROUND(soh.[SubTotal],2),'C')
                          AS '@SubTotal',
  FORMAT(ROUND(soh.[TaxAmt],2),'C')
                          AS '@TaxAmt',
  FORMAT(ROUND(soh.[TotalDue],2),'C')
                          AS '@TotalDue',
  CASE WHEN soh.[TotalDue] > 5000 THEN 'Y' ELSE 'N' END
                          AS '@ReviewRequired',
  (
  SELECT
    c.[AccountNumber]     AS '@AccountNumber',
    UPPER(LTRIM(RTRIM(REPLACE(
    CONCAT( p.[Title], ' ', p.[FirstName], ' ', p.[MiddleName], ' ', p.[LastName], ' ', p.[Suffix]),
    '  ', ' '))))         AS '@FullName'
  FROM [Sales].[Customer] c
    INNER JOIN [Person].[Person] p
  ON c.[PersonID] = p.[BusinessEntityID]
  WHERE c.[CustomerID] = soh.[CustomerID]
  FOR XML PATH ('Customer'), TYPE
  ),

  (
  SELECT
    sod.[OrderQty]      AS '@Quantity',
    FORMAT(sod.[UnitPrice],'C')
                        AS '@UnitPrice',
    FORMAT(ROUND(sod.[LineTotal],2),'C')
                        AS '@LineTotal',
    sod.[ProductID]     AS '@ProductId',
    CASE WHEN (sod.[ProductID] >= 710) AND (sod.[ProductID] <= 720) AND (sod.[OrderQty] >= 5) THEN 'Y' ELSE 'N' END
                        AS '@InventoryCheckRequired'

  FROM [Sales].[SalesOrderDetail] sod
  WHERE sod.[SalesOrderID] = soh.[SalesOrderID]
  ORDER BY sod.[SalesOrderDetailID]
  FOR XML PATH ('LineItem'), TYPE, ROOT('OrderLineItems')
  )

FROM [Sales].[SalesOrderHeader] soh
WHERE soh.[TerritoryId] = @TerritoryId
ORDER BY soh.[TotalDue] DESC
FOR XML PATH ('Order'), ROOT('Orders')

很明顯地,這是複雜的查詢。Clearly, this is complex query. 我們稍後將會看到,它轉為在資料庫伺服器上使用大量處理資源。As we'll see later, it turns out to use significant processing resources on the database server.

如何修正問題How to fix the problem

將處理從資料庫伺服器移到其他應用程式層。Move processing from the database server into other application tiers. 在理想情況下,您應該將資料庫限制為執行資料存取作業,僅使用資料庫已最佳化的功能,例如 RDBMS 中的彙總。Ideally, you should limit the database to performing data access operations, using only the capabilities that the database is optimized for, such as aggregation in an RDBMS.

例如,先前的 Transact-SQL 程式碼可以使用只擷取要處理的資料之陳述式來取代。For example, the previous Transact-SQL code can be replaced with a statement that simply retrieves the data to be processed.

SELECT
soh.[SalesOrderNumber]  AS [OrderNumber],
soh.[Status]            AS [Status],
soh.[OrderDate]         AS [OrderDate],
soh.[DueDate]           AS [DueDate],
soh.[ShipDate]          AS [ShipDate],
soh.[SubTotal]          AS [SubTotal],
soh.[TaxAmt]            AS [TaxAmt],
soh.[TotalDue]          AS [TotalDue],
c.[AccountNumber]       AS [AccountNumber],
p.[Title]               AS [CustomerTitle],
p.[FirstName]           AS [CustomerFirstName],
p.[MiddleName]          AS [CustomerMiddleName],
p.[LastName]            AS [CustomerLastName],
p.[Suffix]              AS [CustomerSuffix],
sod.[OrderQty]          AS [Quantity],
sod.[UnitPrice]         AS [UnitPrice],
sod.[LineTotal]         AS [LineTotal],
sod.[ProductID]         AS [ProductId]
FROM [Sales].[SalesOrderHeader] soh
INNER JOIN [Sales].[Customer] c ON soh.[CustomerID] = c.[CustomerID]
INNER JOIN [Person].[Person] p ON c.[PersonID] = p.[BusinessEntityID]
INNER JOIN [Sales].[SalesOrderDetail] sod ON soh.[SalesOrderID] = sod.[SalesOrderID]
WHERE soh.[TerritoryId] = @TerritoryId
AND soh.[SalesOrderId] IN (
    SELECT TOP 20 SalesOrderId
    FROM [Sales].[SalesOrderHeader] soh
    WHERE soh.[TerritoryId] = @TerritoryId
    ORDER BY soh.[TotalDue] DESC)
ORDER BY soh.[TotalDue] DESC, sod.[SalesOrderDetailID]

接著,應用程式會使用 .NET Framework System.Xml.Linq API,將結果格式化為 XML。The application then uses the .NET Framework System.Xml.Linq APIs to format the results as XML.

// Create a new SqlCommand to run the Transact-SQL query
using (var command = new SqlCommand(...))
{
    command.Parameters.AddWithValue("@TerritoryId", id);

    // Run the query and create the initial XML document
    using (var reader = await command.ExecuteReaderAsync())
    {
        var lastOrderNumber = string.Empty;
        var doc = new XDocument();
        var orders = new XElement("Orders");
        doc.Add(orders);

        XElement lineItems = null;
        // Fetch each row in turn, format the results as XML, and add them to the XML document
        while (await reader.ReadAsync())
        {
            var orderNumber = reader["OrderNumber"].ToString();
            if (orderNumber != lastOrderNumber)
            {
                lastOrderNumber = orderNumber;

                var order = new XElement("Order");
                orders.Add(order);
                var customer = new XElement("Customer");
                lineItems = new XElement("OrderLineItems");
                order.Add(customer, lineItems);

                var orderDate = (DateTime)reader["OrderDate"];
                var totalDue = (Decimal)reader["TotalDue"];
                var reviewRequired = totalDue > 5000 ? 'Y' : 'N';

                order.Add(
                    new XAttribute("OrderNumber", orderNumber),
                    new XAttribute("Status", reader["Status"]),
                    new XAttribute("ShipDate", reader["ShipDate"]),
                    ... // More attributes, not shown.

                    var fullName = string.Join(" ",
                        reader["CustomerTitle"],
                        reader["CustomerFirstName"],
                        reader["CustomerMiddleName"],
                        reader["CustomerLastName"],
                        reader["CustomerSuffix"]
                    )
                   .Replace("  ", " ") //remove double spaces
                   .Trim()
                   .ToUpper();

               customer.Add(
                    new XAttribute("AccountNumber", reader["AccountNumber"]),
                    new XAttribute("FullName", fullName));
            }

            var productId = (int)reader["ProductID"];
            var quantity = (short)reader["Quantity"];
            var inventoryCheckRequired = (productId >= 710 && productId <= 720 && quantity >= 5) ? 'Y' : 'N';

            lineItems.Add(
                new XElement("LineItem",
                    new XAttribute("Quantity", quantity),
                    new XAttribute("UnitPrice", ((Decimal)reader["UnitPrice"]).ToString("C")),
                    new XAttribute("LineTotal", RoundAndFormat(reader["LineTotal"])),
                    new XAttribute("ProductId", productId),
                    new XAttribute("InventoryCheckRequired", inventoryCheckRequired)
                ));
        }
        // Match the exact formatting of the XML returned from SQL
        var xml = doc
            .ToString(SaveOptions.DisableFormatting)
            .Replace(" />", "/>");
    }
}

注意

此程式碼較為複雜。This code is somewhat complex. 對於新的應用程式,您可能會想要使用序列化程式庫。For a new application, you might prefer to use a serialization library. 不過,這裡的假設是開發小組重構現有的應用程式,讓方法必須傳回與原始程式碼完全相同的格式。However, the assumption here is that the development team is refactoring an existing application, so the method needs to return the exact same format as the original code.

考量Considerations

  • 許多資料庫系統高度最佳化,以執行特定類型的資料處理,例如計算大型資料集的彙總值。Many database systems are highly optimized to perform certain types of data processing, such as calculating aggregate values over large datasets. 不要將這些類型的處理移出資料庫。Don't move those types of processing out of the database.

  • 如果重新定位處理會導致資料庫透過網路傳送更多資料,請不要這樣做。Do not relocate processing if doing so causes the database to transfer far more data over the network. 請參閱沒有直接關聯的擷取反模式See the Extraneous Fetching antipattern.

  • 如果您將處理移到應用程式層,該階層可能需要相應放大以處理其他工作。If you move processing to an application tier, that tier may need to scale out to handle the additional work.

如何偵測問題How to detect the problem

忙碌資料庫的徵兆包含在存取資料庫的作業中不成比例的拒絕輸送量和回應時間。Symptoms of a busy database include a disproportionate decline in throughput and response times in operations that access the database.

您可以執行下列步驟來協助識別此問題:You can perform the following steps to help identify this problem:

  1. 使用效能監視,以識別生產系統執行資料庫活動所花費的時間。Use performance monitoring to identify how much time the production system spends performing database activity.

  2. 檢查這些期間由資料庫執行的工作。Examine the work performed by the database during these periods.

  3. 如果您懷疑特定作業可能會導致資料庫活動過多,在受控制的環境中執行負載測試。If you suspect that particular operations might cause too much database activity, perform load testing in a controlled environment. 每個測試應該使用變數使用者負載來執行混合懷疑作業。Each test should run a mixture of the suspect operations with a variable user load. 檢查負載測試的遙測,來觀察資料庫的使用方式。Examine the telemetry from the load tests to observe how the database is used.

  4. 如果資料庫活動顯示大量處理,但是資料流量很少,檢閱原始程式碼,以判斷處理是否可以在其他位置獲得更好的執行。If the database activity reveals significant processing but little data traffic, review the source code to determine whether the processing can better be performed elsewhere.

如果資料庫活動量低或回應時間相對快速,則忙碌資料庫不太可能是效能問題。If the volume of database activity is low or response times are relatively fast, then a busy database is unlikely to be a performance problem.

範例診斷Example diagnosis

下列各節會將這些步驟套用到稍早所述的範例應用程式。The following sections apply these steps to the sample application described earlier.

監視資料庫活動量Monitor the volume of database activity

下圖顯示針對相同應用程式,使用最多 50 個並行使用者的步驟負載,執行負載測試的結果。The following graph shows the results of running a load test against the sample application, using a step load of up to 50 concurrent users. 要求數量會快速達到限制並且維持在該層級,而平均回應時間則穩定增加。The volume of requests quickly reaches a limit and stays at that level, while the average response time steadily increases. 針對這兩個計量使用對數範圍。A logarithmic scale is used for those two metrics.

Load-test results for performing processing in the database

這個折線圖顯示使用者負載、每秒要求數和平均回應時間。This line graph shows user load, requests per second, and average response time. 這個圖表顯示回應時間會隨著負載增加而增加。The graph shows that response time increases as load increases.

下一個圖表會將 CPU 使用率和 DTU 顯示為服務配額百分比。The next graph shows CPU utilization and DTUs as a percentage of service quota. DTU 會提供資料庫執行多少處理的量值。DTUs provide a measure of how much processing the database performs. 圖表會顯示即將達到 100% 的 CPU 和 DTU 使用率。The graph shows that CPU and DTU utilization both quickly reached 100%.

Azure SQL Database monitor showing the performance of the database while performing processing

這個折線圖顯示一段時間的 CPU 百分比和 DTU 百分比。This line graph shows CPU percentage and DTU percentage over time. 這個圖表顯示這兩個項目都快速達到 100%。The graph shows that both quickly reach 100%.

檢查由資料庫執行的工作Examine the work performed by the database

資料庫執行的工作可能是正版資料存取作業,而不是處理,因此請務必了解當資料庫忙碌時執行的 SQL 陳述式。It could be that the tasks performed by the database are genuine data access operations, rather than processing, so it is important to understand the SQL statements being run while the database is busy. 監控系統以擷取 SQL 流量,並且讓 SQL 作業與應用程式要求相互關聯。Monitor the system to capture the SQL traffic and correlate the SQL operations with application requests.

如果資料庫作業只是單純的資料存取作業,而沒有太多處理,則問題可能是沒有直接關聯的擷取If the database operations are purely data access operations, without a lot of processing, then the problem might be Extraneous Fetching.

實作解決方案並確認結果Implement the solution and verify the result

下圖顯示使用更新程式碼的負載測試。The following graph shows a load test using the updated code. 輸送量相當高,相較於先前的 12 個,每秒要求數超過 400 個。Throughput is significantly higher, over 400 requests per second versus 12 earlier. 平均回應時間也低非常多,相較於超過 4 秒,只要 0.1 秒。The average response time is also much lower, just above 0.1 seconds compared to over 4 seconds.

Graph showing load-test results for performing processing in the client application.

這個折線圖顯示使用者負載、每秒要求數和平均回應時間。This line graph shows user load, requests per second, and average response time. 這個圖表顯示在整個負載測試中回應時間大致維持不變。The graph shows that response time remains roughly constant throughout the load test.

CPU 和 DTU 使用率會顯示儘管增加了輸送量,系統花費較長的時間才達到飽和。CPU and DTU utilization shows that the system took longer to reach saturation, despite the increased throughput.

Azure SQL Database monitor showing the performance of the database while performing processing in the client application

這個折線圖顯示一段時間的 CPU 百分比和 DTU 百分比。This line graph shows CPU percentage and DTU percentage over time. 這個圖表顯示 CPU 和 DTU 需要較先前更久的時間來達到 100%。The graph shows that CPU and DTU take longer to reach 100% than previously.