Troubleshoot pipeline orchestration and triggers in Azure Data Factory

Applies to: Azure Data Factory, Azure Synapse Analytics

A pipeline run in Azure Data Factory defines an instance of a pipeline execution. For example, let's say you have a pipeline that runs at 8:00 AM, 9:00 AM, and 10:00 AM. In this case, there are three separate pipeline runs. Each pipeline run has a unique pipeline run ID. A run ID is a globally unique identifier (GUID) that defines that particular pipeline run.

Pipeline runs are typically instantiated by passing arguments to parameters that you define in the pipeline. You can run a pipeline either manually or by using a trigger. See Pipeline execution and triggers in Azure Data Factory for details.
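For illustration, a run can also be started programmatically. The following is a minimal Python sketch against the documented createRun REST endpoint; the subscription, resource group, factory, pipeline, and parameter names are placeholders, and it assumes the requests and azure-identity packages are installed:

import requests
from azure.identity import DefaultAzureCredential

# Acquire an ARM token for the caller's identity (placeholder setup).
token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token

url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
    "/factories/<factory-name>/pipelines/<pipeline-name>/createRun"
    "?api-version=2018-06-01"
)

# The request body binds arguments to the parameters defined in the pipeline.
response = requests.post(
    url,
    headers={"Authorization": f"Bearer {token}"},
    json={"inputPath": "raw/2020/11"},  # hypothetical pipeline parameter
)
print(response.json()["runId"])  # the GUID identifying this pipeline run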

Common issues, causes, and solutions

An Azure Functions app pipeline throws an error with private endpoint connectivity

You have Data Factory and an Azure Functions app running on a private endpoint. You're trying to run a pipeline that interacts with the function app. You've tried three different methods, but one returns the error "Bad Request," and the other two methods return "103 Error Forbidden."

Cause

Data Factory currently doesn't support a private endpoint connector for function apps. Azure Functions rejects the calls because it's configured to allow only connections from a private link.

Resolution

Create a PrivateLinkService endpoint and provide your function app's DNS.

A pipeline run is canceled but the monitor still shows progress status

Cause

When you cancel a pipeline run, pipeline monitoring often still shows the progress status. This happens because of a browser cache issue. You also might not have the correct monitoring filters.

Resolution

Refresh the browser and apply the correct monitoring filters.

You see a "DelimitedTextMoreColumnsThanDefined" error when copying a pipeline

Cause

If a folder you're copying contains files with different schemas, such as a variable number of columns, different delimiters, quote character settings, or some data issue, the Data Factory pipeline might throw this error:

Operation on target Copy_sks failed: Failure happened on 'Sink' side. ErrorCode=DelimitedTextMoreColumnsThanDefined, 'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException, Message=Error found when processing 'Csv/Tsv Format Text' source '0_2020_11_09_11_43_32.avro' with row number 53: found more columns than expected column count 27., Source=Microsoft.DataTransfer.Common,'

Resolution

Select the Binary Copy option while creating the Copy activity. This way, for bulk copies or migrations of data from one data lake to another, Data Factory won't open the files to read the schema. Instead, it treats each file as binary and copies it to the other location.
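As a hedged sketch of what this produces, the following Python dict outlines a Copy activity that references Binary datasets (the activity and dataset names are hypothetical, and a real Binary dataset also needs a linked service and store settings):

# Hypothetical Copy activity payload using Binary datasets, so the service
# copies each file byte-for-byte without parsing its schema.
copy_activity = {
    "name": "CopyLakeToLake",
    "type": "Copy",
    "inputs": [{"referenceName": "SourceBinaryDataset", "type": "DatasetReference"}],
    "outputs": [{"referenceName": "SinkBinaryDataset", "type": "DatasetReference"}],
    "typeProperties": {
        "source": {"type": "BinarySource"},
        "sink": {"type": "BinarySink"},
    },
}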

A pipeline run fails when you reach the capacity limit of the integration runtime for data flow

Issue

Error message:

Type=Microsoft.DataTransfer.Execution.Core.ExecutionException,Message=There are substantial concurrent MappingDataflow executions which is causing failures due to throttling under Integration Runtime 'AutoResolveIntegrationRuntime'.

Cause

You've reached the integration runtime's capacity limit. You might be running a large number of data flows on the same integration runtime at the same time. See Azure subscription and service limits, quotas, and constraints for details.

Resolution

  • Run your pipelines at different trigger times.
  • Create a new integration runtime, and split your pipelines across multiple integration runtimes.

A pipeline run error occurs while invoking the REST API in a Web activity

Issue

Error message:

Operation on target Cancel failed: {"error":{"code":"AuthorizationFailed","message":"The client '<client>' with object id '<object>' does not have authorization to perform action 'Microsoft.DataFactory/factories/pipelineruns/cancel/action' over scope '/subscriptions/<subscription>/resourceGroups/<resource group>/providers/Microsoft.DataFactory/factories/<data factory name>/pipelineruns/<pipeline run id>' or the scope is invalid. If access was recently granted, please refresh your credentials."}}

Cause

Pipelines can use the Web activity to call ADF REST API methods if and only if the Azure Data Factory managed identity is assigned the Contributor role. You must first add the Azure Data Factory managed identity to the Contributor security role.

Resolution

Before using the Azure Data Factory REST API in a Web activity's Settings tab, security must be configured. Azure Data Factory pipelines can use the Web activity to call ADF REST API methods if and only if the Azure Data Factory managed identity is assigned the Contributor role. Begin by opening the Azure portal and selecting the All resources link on the left menu. Add the ADF managed identity with the Contributor role by selecting the Add button in the Add a role assignment box and selecting the Azure Data Factory.
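Once the role is assigned, the Web activity can authenticate with the factory's managed identity. A hedged sketch of such an activity definition, expressed as a Python dict (the activity name and all URL segments are placeholders; in a real pipeline the run ID would typically come from a prior activity's output):

# Hypothetical Web activity that cancels a pipeline run using the factory's
# managed identity (MSI) against the ARM management endpoint.
web_activity = {
    "name": "CancelRun",
    "type": "WebActivity",
    "typeProperties": {
        "url": (
            "https://management.azure.com/subscriptions/<subscription-id>"
            "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
            "/factories/<factory-name>/pipelineruns/<run-id>/cancel"
            "?api-version=2018-06-01"
        ),
        "method": "POST",
        "body": {},
        "authentication": {
            "type": "MSI",  # authenticate as the factory's managed identity
            "resource": "https://management.azure.com/",
        },
    },
}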

How to check and branch on activity-level success and failure in pipelines

Cause

Azure Data Factory orchestration allows conditional logic and enables users to take different paths based on the outcome of a previous activity. It allows four conditional paths: Upon Success (default pass), Upon Failure, Upon Completion, and Upon Skip.

Azure Data Factory evaluates the outcome of all leaf-level activities. Pipeline results are successful only if all leaves succeed. If a leaf activity was skipped, we evaluate its parent activity instead.

Resolution
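As a minimal sketch of activity-level branching, the pipeline fragment below (expressed as a Python dict of the activities array; all activity names are made up) runs one activity only when the copy succeeds and another only when it fails, using the dependency conditions named above:

# Hypothetical activities array: dependencyConditions drive the branching.
activities = [
    {"name": "CopyData", "type": "Copy", "typeProperties": {}},
    {
        "name": "OnSuccessStep",
        "type": "WebActivity",
        "dependsOn": [
            {"activity": "CopyData", "dependencyConditions": ["Succeeded"]}
        ],
    },
    {
        "name": "OnFailureStep",
        "type": "WebActivity",
        "dependsOn": [
            {"activity": "CopyData", "dependencyConditions": ["Failed"]}
        ],
    },
]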

How to monitor pipeline failures at regular intervals

Cause

You might need to monitor failed Data Factory pipelines at regular intervals, say every 5 minutes. You can query and filter the pipeline runs from a data factory by using the pipeline runs query endpoint.

Resolution
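A minimal Python sketch that polls for failed runs in the last five minutes using the documented queryPipelineRuns REST endpoint (subscription, resource group, and factory names are placeholders; a scheduler around this loop is assumed):

from datetime import datetime, timedelta, timezone
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
now = datetime.now(timezone.utc)
url = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
    "/factories/<factory-name>/queryPipelineRuns?api-version=2018-06-01"
)
# Filter to runs that failed within the monitoring window.
body = {
    "lastUpdatedAfter": (now - timedelta(minutes=5)).isoformat(),
    "lastUpdatedBefore": now.isoformat(),
    "filters": [{"operand": "Status", "operator": "Equals", "values": ["Failed"]}],
}
runs = requests.post(url, headers={"Authorization": f"Bearer {token}"}, json=body).json()
for run in runs.get("value", []):
    print(run["runId"], run["pipelineName"], run["message"])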

Degree of parallelism increase doesn't result in higher throughput

Cause

The degree of parallelism in ForEach is actually a maximum degree of parallelism. We can't guarantee that a specific number of executions happen at the same time, but this parameter guarantees that we never go above the value that was set. You should see it as a limit, to be used when controlling concurrent access to your sources and sinks.

Known facts about ForEach

  • ForEach has a property called batch count (n), where the default value is 20 and the maximum is 50.
  • The batch count, n, is used to construct n queues.
  • Every queue runs sequentially, but you can have several queues running in parallel.
  • The queues are pre-created. This means there is no rebalancing of the queues during the runtime.
  • At any time, you have at most one item being processed per queue. This means at most n items are being processed at any given time.
  • The ForEach total processing time is equal to the processing time of the longest queue. This means that the ForEach activity depends on how the queues are constructed.

Resolution

  • You should not use a SetVariable activity inside a ForEach that runs in parallel.
  • Taking into consideration the way the queues are constructed, you can improve ForEach performance by splitting one ForEach into multiples, where each ForEach has items with similar processing time (see the sketch after this list).
  • This ensures that long runs are processed in parallel rather than sequentially.
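The following small Python simulation illustrates why queue construction matters. It assumes a simple round-robin assignment of items to queues, which is an assumption for illustration, not the documented partitioning scheme:

# Simulate ForEach timing: items are dealt into batch_count queues up front,
# each queue runs sequentially, and total time equals the longest queue.
def foreach_total_time(durations, batch_count=20):
    queues = [[] for _ in range(batch_count)]
    for i, duration in enumerate(durations):  # assumed round-robin assignment
        queues[i % batch_count].append(duration)
    return max(sum(queue) for queue in queues)

# One long item mixed with short ones stretches its whole queue:
print(foreach_total_time([60, 1, 1, 1, 1, 1], batch_count=2))  # 62
# Splitting long and short items into separate ForEach loops avoids the imbalance:
print(max(foreach_total_time([60], 2), foreach_total_time([1] * 5, 2)))  # 60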

Pipeline status is queued or stuck for a long time

Cause

This can happen for various reasons, such as hitting concurrency limits, service outages, network failures, and so on.

Resolution

  • Concurrency limit: If your pipeline has a concurrency policy, verify that there are no old pipeline runs in progress. The maximum pipeline concurrency allowed in Azure Data Factory is 10 pipelines.
  • Monitoring limits: Go to the ADF authoring canvas, select your pipeline, and determine whether it has a concurrency property assigned to it. If it does, go to the Monitoring view, and make sure there's nothing in the past 45 days that's in progress. If there is something in progress, you can cancel it and the new pipeline run should start (see the sketch after this list).
  • Transient issues: It's possible that your run was impacted by a transient network issue, credential failures, service outages, and so on. If this happens, Azure Data Factory has an internal recovery process that monitors all the runs and starts them when it notices something went wrong. This process runs every hour, so if your run is stuck for more than an hour, create a support case.
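A hedged Python sketch of the cleanup described above: it reuses the queryPipelineRuns pattern shown earlier to find in-progress runs, then cancels them through the documented cancel endpoint (URL segments are placeholders; decide carefully before cancelling real runs):

from datetime import datetime, timedelta, timezone
import requests
from azure.identity import DefaultAzureCredential

token = DefaultAzureCredential().get_token("https://management.azure.com/.default").token
headers = {"Authorization": f"Bearer {token}"}
base = (
    "https://management.azure.com/subscriptions/<subscription-id>"
    "/resourceGroups/<resource-group>/providers/Microsoft.DataFactory"
    "/factories/<factory-name>"
)
now = datetime.now(timezone.utc)
# Look back over the 45-day monitoring window for runs still in progress.
body = {
    "lastUpdatedAfter": (now - timedelta(days=45)).isoformat(),
    "lastUpdatedBefore": now.isoformat(),
    "filters": [{"operand": "Status", "operator": "Equals", "values": ["InProgress"]}],
}
runs = requests.post(f"{base}/queryPipelineRuns?api-version=2018-06-01",
                     headers=headers, json=body).json()
for run in runs.get("value", []):
    # Cancel each stale run so a new pipeline run can start.
    requests.post(f"{base}/pipelineruns/{run['runId']}/cancel?api-version=2018-06-01",
                  headers=headers)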

Longer start-up times for activities in ADF Copy and Data Flow

Cause

This can happen if you haven't implemented the time-to-live feature for Data Flow or optimized the self-hosted integration runtime (SHIR).

Resolution

  • If each copy activity takes up to 2 minutes to start, and the problem occurs primarily on a VNet join (versus Azure IR), this can be a copy performance issue. To review troubleshooting steps, go to Copy Performance Improvement.
  • You can use the time-to-live feature to decrease cluster start-up time for data flow activities. Please review Data Flow Integration Runtime; a sketch follows this list.
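As a hedged illustration only, the Python dict below outlines what a managed integration runtime definition with a data flow time-to-live might look like; the runtime name and sizing values are hypothetical, and the exact schema should be confirmed against the current Data Flow Integration Runtime documentation:

# Hypothetical managed IR definition: timeToLive keeps the data flow cluster
# warm between runs so subsequent activities skip the cold-start delay.
ir_definition = {
    "name": "DataFlowIrWithTtl",
    "properties": {
        "type": "Managed",
        "typeProperties": {
            "computeProperties": {
                "location": "AutoResolve",
                "dataFlowProperties": {
                    "computeType": "General",
                    "coreCount": 8,
                    "timeToLive": 10,  # minutes the cluster stays warm
                },
            }
        },
    },
}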

Hitting capacity issues in SHIR (self-hosted integration runtime)

Cause

This can happen if you haven't scaled up the SHIR to match your workload.

Resolution

  • If you encounter a capacity issue from the SHIR, upgrade the VM to increase the node count and balance the activities. If you receive an error message about a self-hosted IR general failure or error, a self-hosted IR upgrade, or self-hosted IR connectivity issues, which can generate a long queue, go to Troubleshoot self-hosted integration runtime.

Error messages due to long queues for ADF Copy and Data Flow

Cause

Long-queue-related error messages can appear for various reasons.

Resolution

Identify which cause applies, then follow the guidance in the preceding sections on SHIR capacity and activity start-up times to address the queue.

Next steps

For more troubleshooting help, try these resources: