Avro format in Azure Data Factory

Follow this article when you want to parse Avro files or write data in Avro format.

Avro format is supported for the following connectors: Amazon S3, Azure Blob, Azure Data Lake Storage Gen1, Azure Data Lake Storage Gen2, Azure File Storage, File System, FTP, Google Cloud Storage, HDFS, HTTP, and SFTP.

Dataset properties

For a full list of sections and properties available for defining datasets, see the Datasets article. This section provides a list of properties supported by the Avro dataset.

| Property | Description | Required |
| --- | --- | --- |
| type | The type property of the dataset must be set to Avro. | Yes |
| location | Location settings of the file(s). Each file-based connector has its own location type and supported properties under location. For details, see the "Dataset properties" section of the connector article. | Yes |
| avroCompressionCodec | The compression codec to use when writing to Avro files. When reading from Avro files, Data Factory automatically determines the compression codec based on the file metadata. Supported types are "none" (default), "deflate", and "snappy". | No |

Note

White space in column names is not supported for Avro files.

The following is an example of an Avro dataset on Azure Blob Storage:

{
    "name": "AvroDataset",
    "properties": {
        "type": "Avro",
        "linkedServiceName": {
            "referenceName": "<Azure Blob Storage linked service name>",
            "type": "LinkedServiceReference"
        },
        "schema": [ < physical schema, optional, retrievable during authoring > ],
        "typeProperties": {
            "location": {
                "type": "AzureBlobStorageLocation",
                "container": "containername",
                "folderPath": "folder/subfolder"
            },
            "avroCompressionCodec": "snappy"
        }
    }
}

Copy activity properties

For a full list of sections and properties available for defining activities, see the Pipelines article. This section provides a list of properties supported by the Avro source and sink.

Avro as source

The following properties are supported in the copy activity *source* section.

| Property | Description | Required |
| --- | --- | --- |
| type | The type property of the copy activity source must be set to AvroSource. | Yes |
| storeSettings | A group of properties on how to read data from a data store. Each file-based connector has its own supported read settings under storeSettings. For details, see the "Copy activity properties" section of the connector article. | No |
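
As a rough illustration of how these source properties fit into a copy activity, the sketch below reads from the AvroDataset defined earlier. The activity name, the output dataset reference, and the sink type are placeholders. The storeSettings type AzureBlobStorageReadSettings and its recursive flag are assumptions taken from the Azure Blob Storage connector; check the connector article for the read settings that apply to your store.

```json
{
    "name": "CopyFromAvro",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "AvroDataset",
            "type": "DatasetReference"
        }
    ],
    "outputs": [
        {
            "referenceName": "<output dataset name>",
            "type": "DatasetReference"
        }
    ],
    "typeProperties": {
        "source": {
            "type": "AvroSource",
            "storeSettings": {
                "type": "AzureBlobStorageReadSettings",
                "recursive": true
            }
        },
        "sink": {
            "type": "<sink type>"
        }
    }
}
```

Because storeSettings is optional, the source section can be reduced to the type property alone when the connector defaults are sufficient.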

Avro as sink

The following properties are supported in the copy activity *sink* section.

| Property | Description | Required |
| --- | --- | --- |
| type | The type property of the copy activity sink must be set to AvroSink. | Yes |
| storeSettings | A group of properties on how to write data to a data store. Each file-based connector has its own supported write settings under storeSettings. For details, see the "Copy activity properties" section of the connector article. | No |
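
A minimal sketch of the sink side, assuming the output dataset is an Avro dataset on Azure Blob Storage. The activity name, the input dataset reference, and the source type are placeholders. The storeSettings type AzureBlobStorageWriteSettings is an assumption taken from the Azure Blob Storage connector; other connectors define their own write settings.

```json
{
    "name": "CopyToAvro",
    "type": "Copy",
    "inputs": [
        {
            "referenceName": "<input dataset name>",
            "type": "DatasetReference"
        }
    ],
    "outputs": [
        {
            "referenceName": "AvroDataset",
            "type": "DatasetReference"
        }
    ],
    "typeProperties": {
        "source": {
            "type": "<source type>"
        },
        "sink": {
            "type": "AvroSink",
            "storeSettings": {
                "type": "AzureBlobStorageWriteSettings"
            }
        }
    }
}
```

The compression codec for the written files is not set on the sink. It comes from the avroCompressionCodec property of the output dataset.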

Data type support

Avro complex data types (records, enums, arrays, maps, unions, and fixed) are not supported.

Next steps