External Tables

 

Important

This topic and its sub-topics provide the JSON format supported by older versions of Azure PowerShell. If you are using the July 2015 release of Azure PowerShell or later, see Policy Element for the latest JSON format. You can convert JSON from the old format to the new format by using the JSON Upgrade Tool.

External tables refer to tables that are not explicitly produced by the pipelines running in the Azure Data Factory. For example, the input data for a Copy Activity might be produced outside of the Azure Data Factory, and copied to Azure blob storage.

When defining an external table, use the waitOnExternal property in the table JSON definition. This enables data slices to be marked as Ready once the data is available in the store. All the properties in waitOnExternal are optional. To mark a table as external and use the default values, specify "waitOnExternal": {}. For an on-premises SQL Server database, only the dataDelay property is applicable.
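
For the on-premises SQL Server case, a minimal sketch of the availability section might set only dataDelay, since the other waitOnExternal properties are not applicable there. This is a fragment of a table definition, not a complete one, and the frequency, interval, and 10-minute delay are illustrative values rather than recommendations.

```json
"availability":
{
    "frequency": "Hour",
    "interval": 1,
    "waitOnExternal":
    {
        "dataDelay": "00:10:00"
    }
}
```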

The following example shows the usage of the waitOnExternal property.

Warning

Use "waitOnExternal": {} to use default values for all the properties in this tag.

{
    "name": "MyDemoBlob",
    "properties":
    {
        "location":
        {
            "type": "AzureBlobLocation",
            "folderPath": "MyContainer/MySubFolder/",
            "linkedServiceName": "MyLinkedService",
            "format":
            {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": ";"
            }
        },
        "availability":
        {
            "frequency": "Hour",
            "interval": 1,
            "waitOnExternal":
            {
                "dataDelay": "00:10:00",
                "retryInterval": "00:01:00",
                "retryTimeout": "00:10:00",
                "maximumRetry": 3
            }
        }
    }
}
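
For comparison, a minimal sketch of the same table that marks it as external but relies on the default value of every waitOnExternal property could look like this. The names and paths are carried over from the example above; only the waitOnExternal section changes.

```json
{
    "name": "MyDemoBlob",
    "properties":
    {
        "location":
        {
            "type": "AzureBlobLocation",
            "folderPath": "MyContainer/MySubFolder/",
            "linkedServiceName": "MyLinkedService",
            "format":
            {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": ";"
            }
        },
        "availability":
        {
            "frequency": "Hour",
            "interval": 1,
            "waitOnExternal": {}
        }
    }
}
```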

The following table lists the properties used in the waitOnExternal section of the JSON script.

| Name | Description | Default Value |
| --- | --- | --- |
| dataDelay | Time to delay the check on availability of the external data. For example, if the data is supposed to be available hourly, the check that the external data is actually available and the corresponding slice is Ready can be delayed by dataDelay. Only applies to the present time; for example, if it is 1:00 PM right now and this value is 10 minutes, the validation will start at 1:10 PM. This setting does not affect slices in the past: slices with Slice End Time + dataDelay < Now are processed without any delay. | 0 |
| retryInterval | The wait time between a failure and the next retry attempt. Applies to the present time; if the previous try failed, the service waits this long after the last try. For example, if it is 1:00 PM right now, the first try begins. If the first validation check takes 1 minute to complete and the operation fails, the next retry is at 1:00 PM + 1 minute (duration) + 1 minute (retry interval) = 1:02 PM. For slices in the past, there is no delay; the retry happens immediately. | 00:01:00 (1 minute) |
| retryTimeout | The timeout for each retry attempt. If this is set to 10 minutes, the validation must complete within 10 minutes. If it takes longer than 10 minutes to perform the validation (for example, there are 1,000,000 rows in Azure storage), the retry times out. If all attempts for the validation time out, the slice is marked as TimedOut. | 00:10:00 (10 minutes) |
| maximumRetry | Number of times to check for the availability of the external data. The allowed maximum value is 10. | 3 |
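
Reading the defaults from the table above, an empty waitOnExternal section should behave roughly like the following explicit fragment. This is a sketch rather than a confirmed equivalence: it assumes the default dataDelay of 0 can be written as the time span 00:00:00, matching the hh:mm:ss format used by the other values.

```json
"waitOnExternal":
{
    "dataDelay": "00:00:00",
    "retryInterval": "00:01:00",
    "retryTimeout": "00:10:00",
    "maximumRetry": 3
}
```

Spelling the values out this way can make it easier to see which single property to change, for example raising maximumRetry toward its allowed maximum of 10.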