Get metadata activity in Azure Data Factory

The GetMetadata activity retrieves metadata about data in Azure Data Factory. It is supported only in version 2 data factories. Use it in the following scenarios:

  • Validate the metadata of any data
  • Trigger a pipeline when data is ready or available

The following functionality is available in the control flow:

  • The output from the GetMetadata activity can be used in conditional expressions to perform validation.
  • A pipeline can be triggered when a condition is satisfied, by using Do-Until looping.
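
As an illustration of the validation case, the fragment below is a minimal sketch rather than a definitive pipeline. It assumes a GetMetadata activity named MyActivity (as in the Syntax section) has already run, and uses an If Condition activity to continue only when the reported size is greater than zero. The names CheckBlobNotEmpty and ProceedPlaceholder are hypothetical, and the Wait activity merely stands in for real downstream work.

```json
{
    "name": "CheckBlobNotEmpty",
    "type": "IfCondition",
    "description": "Hypothetical example: continue only if the blob is not empty",
    "dependsOn": [
        {
            "activity": "MyActivity",
            "dependencyConditions": [ "Succeeded" ]
        }
    ],
    "typeProperties": {
        "expression": {
            "value": "@greater(activity('MyActivity').output.size, 0)",
            "type": "Expression"
        },
        "ifTrueActivities": [
            {
                "name": "ProceedPlaceholder",
                "type": "Wait",
                "description": "Placeholder; replace with the real processing activity",
                "typeProperties": {
                    "waitTimeInSeconds": 1
                }
            }
        ]
    }
}
```

The expression reads the size field from the output of MyActivity, so MyActivity must request size in its fieldList for this check to be meaningful.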

The GetMetadata activity takes a dataset as a required input and returns the requested metadata as its output. Currently, only the Azure Blob dataset is supported. The supported metadata fields are size, structure, and lastModified.

Note

This article applies to version 2 of Data Factory, which is currently in preview. If you are using version 1 of the Data Factory service, which is generally available (GA), see Data Factory V1 documentation.

Syntax

Get Metadata Activity definition:

In the following example, the GetMetadata activity returns metadata about the data represented by the MyDataset.

{
    "name": "MyActivity",
    "type": "GetMetadata",
    "typeProperties": {
        "fieldList" : ["size", "lastModified", "structure"],
        "dataset": {
            "referenceName": "MyDataset",
            "type": "DatasetReference"
        }
    }
}

Dataset definition:

{
    "name": "MyDataset",
    "properties": {
        "type": "AzureBlob",
        "linkedService": {
            "referenceName": "StorageLinkedService",
            "type": "LinkedServiceReference"
        },
        "typeProperties": {
            "folderPath":"container/folder",
            "fileName": "file.json",
            "format":{
                "type":"JsonFormat",
                "nestingSeparator": ","
            }
        }
    }
}

Output

{
    "size": 1024,
    "structure": [
        {
            "name": "id",
            "type": "Int64"
        }
    ],
    "lastModified": "2016-07-12T00:00:00Z"
}
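
Each field in this output can be read in a pipeline expression through the activity's output, for example activity('MyActivity').output.size. As a hedged sketch of the Do-Until scenario, the fragment below polls the dataset until the reported size is greater than zero. WaitForData and PauseBeforeRetry are hypothetical names, the 30-second pause and 30-minute timeout are arbitrary choices, and the sketch assumes the GetMetadata activity succeeds on every iteration (what happens when the blob does not exist yet is not covered here).

```json
{
    "name": "WaitForData",
    "type": "Until",
    "description": "Hypothetical polling loop; names and intervals are illustrative",
    "typeProperties": {
        "expression": {
            "value": "@greater(activity('MyActivity').output.size, 0)",
            "type": "Expression"
        },
        "timeout": "00:30:00",
        "activities": [
            {
                "name": "MyActivity",
                "type": "GetMetadata",
                "typeProperties": {
                    "fieldList": [ "size" ],
                    "dataset": {
                        "referenceName": "MyDataset",
                        "type": "DatasetReference"
                    }
                }
            },
            {
                "name": "PauseBeforeRetry",
                "type": "Wait",
                "dependsOn": [
                    {
                        "activity": "MyActivity",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ],
                "typeProperties": {
                    "waitTimeInSeconds": 30
                }
            }
        ]
    }
}
```

The loop condition is evaluated after each pass through the inner activities, so the pause keeps the pipeline from polling the storage account continuously.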

Type properties

Currently, the GetMetadata activity can fetch the following types of metadata from an Azure Blob storage dataset.

Property: fieldList
Description: Lists the types of metadata information required. If empty, the activity returns all three supported metadata fields.
Allowed values:
  • size
  • structure
  • lastModified
Required: No

Property: dataset
Description: The reference dataset whose metadata is to be retrieved by the GetMetadata activity. Currently, the only supported dataset type is Azure Blob. It has two sub-properties:
  • referenceName: reference to an existing Azure Blob dataset
  • type: because the dataset is being referenced, it is of the type "DatasetReference"
Allowed values:
  • String
  • DatasetReference
Required: Yes
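
To make the fieldList property concrete, the sketch below requests a single field instead of all three. It reuses the MyActivity and MyDataset names from the Syntax section. That the output then contains only the lastModified field is an assumption based on the property description, not something this article states explicitly.

```json
{
    "name": "MyActivity",
    "type": "GetMetadata",
    "description": "Illustrative only: request just the lastModified field",
    "typeProperties": {
        "fieldList": [ "lastModified" ],
        "dataset": {
            "referenceName": "MyDataset",
            "type": "DatasetReference"
        }
    }
}
```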

Next steps

See other control flow activities supported by Data Factory: