Loads images from Azure Blob storage into a dataset
Category: OpenCV Library Modules
Applies to: Machine Learning Studio
This content pertains only to Studio. Similar drag and drop modules have been added to the visual interface in Machine Learning service. Learn more in this article comparing the two versions.
This article describes how to use the Import Images module in Azure Machine Learning Studio to get multiple images from Azure Blob storage and create an image dataset from them.
When you use this module to load images from blob storage into your workspace, each image is converted to a series of numeric values for the red, green, and blue channels, together with the image file name. A dataset of such images consists of multiple rows in a table, each with a different set of RGB values and corresponding image file names. For instructions about how to prepare your images and connect to blob storage, see How to Import Images.
After you have converted all your images, you can then pass this dataset to the Score Model module, and connect a pre-trained image classification model to predict the image type.
You can import any kind of image used for machine learning. However, there are limitations, including the types and sizes of images that can be processed. For details, see the Technical notes section.
How to use Import Images
This example assumes that you have uploaded multiple images to your account in Azure Blob storage, in a container designated for that purpose only. As a rule, each image must be fairly small, and all images must have the same dimensions and color channels. For a detailed list of requirements that apply to images, see the Technical notes section.
Add the Import Images module to your experiment in Studio.
In the Import Images module, configure the location of the images, and provide the authentication method, private or public:
If the image set is in a blob container that has been configured for public access or for access through a shared access signature (SAS), type the URL of the container that holds the images.
If the images are stored in a private account in Azure storage, select Account, and then type the account name as it appears in the management portal. Then, paste in the primary or secondary account key.
For Path to container, type just the container name, and no other path elements.
Run the experiment.
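The configuration choices in the steps above can be summarized as a small validation sketch. This is only an illustration in Python, not part of Studio: the class `ImportImagesSettings`, the function `validate`, and the option strings `"PublicOrSAS"` and `"Account"` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ImportImagesSettings:
    """Hypothetical container for the module settings; not a Studio API."""
    authentication_type: str            # "PublicOrSAS" or "Account" (assumed labels)
    uri: Optional[str] = None           # used with "PublicOrSAS"
    account_name: Optional[str] = None  # used with "Account"
    account_key: Optional[str] = None   # used with "Account"
    container: Optional[str] = None     # used with "Account"


def validate(settings: ImportImagesSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings look complete."""
    problems = []
    if settings.authentication_type == "PublicOrSAS":
        if not settings.uri:
            problems.append("URI is required for public or SAS access")
    elif settings.authentication_type == "Account":
        for field in ("account_name", "account_key", "container"):
            if not getattr(settings, field):
                problems.append(f"{field} is required for account access")
        # The container field takes the container name only, no other path elements.
        if settings.container and "/" in settings.container:
            problems.append("container must be a bare name without path elements")
    else:
        problems.append(f"unknown authentication type: {settings.authentication_type}")
    return problems
```

For example, `validate(ImportImagesSettings("Account", account_name="a", account_key="k", container="images/train"))` reports that the container value contains path elements, which mirrors the instruction to type just the container name.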
Each row of the output dataset contains data from one image. The rows are sorted alphabetically by image name, and the columns contain the following information, in this order:
- The first column contains image names.
- All other columns contain flattened data from the red, green, and blue color channels, in that order.
- The transparency channel is ignored.
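The row layout described above can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: the function names are hypothetical, pixels are assumed to be given as rows of `(R, G, B)` or `(R, G, B, A)` tuples, and the row-major order of pixels within each channel is an assumption, because the article states only the order of the channels.

```python
def flatten_image(name, pixels):
    """Build one dataset row: [name, all red values, all green values, all blue values].

    `pixels` is a list of rows; each row is a list of (R, G, B) or (R, G, B, A) tuples.
    Pixels are read row by row (assumed order); any fourth (alpha) value is dropped.
    """
    flat = [px for row in pixels for px in row]
    red = [px[0] for px in flat]
    green = [px[1] for px in flat]
    blue = [px[2] for px in flat]
    return [name] + red + green + blue


def build_dataset(images):
    """`images` maps file name -> pixels. Rows come out sorted by image name.

    Python's default string ordering stands in for "alphabetical" here.
    """
    return [flatten_image(name, images[name]) for name in sorted(images)]


# Two tiny 1x2 RGBA images, deliberately given out of name order.
images = {
    "b.png": [[(1, 2, 3, 255), (4, 5, 6, 255)]],
    "a.png": [[(7, 8, 9, 255), (10, 11, 12, 255)]],
}
rows = build_dataset(images)
```

With this input, the first row is `["a.png", 7, 10, 8, 11, 9, 12]`: the name, then the two red values, the two green values, and the two blue values.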
Depending on the color depth of the image and the image format, there could be many thousands of columns for a single image. Therefore, to view the results of an experiment that scores the imported images, we recommend that you add the Select Columns in Dataset module, and select only these columns:
- Image Name
- Scored Labels
- Scored Probabilities
Technical notes
This section contains implementation details, tips, and answers to frequently asked questions.
Supported image formats
The Import Images module determines the type of an image by reading the first few bytes of the content, not by the file extension. Based on that information, it determines whether the image is in one of the following supported formats:
- Windows bitmap files: .bmp, .dib
- JPEG files: .jpeg, .jpg, .jpe
- JPEG 2000 files: .jp2
- Portable Network Graphics: .png
- Portable image format: .pbm, .pgm, .ppm
- Sun Raster: .sr, .ras
- TIFF files: .tiff, .tif
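To illustrate what detecting a format from the first few bytes looks like, here is a minimal Python sketch that uses the publicly documented file signatures of the formats listed above. It is not the module's actual detection logic, which this article does not describe, and the names `SIGNATURES` and `detect_format` are hypothetical.

```python
# Leading byte signatures ("magic numbers") for the formats listed above.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "Portable Network Graphics"),
    (b"\xff\xd8\xff", "JPEG"),
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "JPEG 2000"),
    (b"BM", "Windows bitmap"),
    (b"II*\x00", "TIFF"),          # little-endian TIFF
    (b"MM\x00*", "TIFF"),          # big-endian TIFF
    (b"\x59\xa6\x6a\x95", "Sun Raster"),
]


def detect_format(header: bytes):
    """Return a format name based on the leading bytes, or None if unrecognized."""
    for signature, name in SIGNATURES:
        if header.startswith(signature):
            return name
    # PBM, PGM, and PPM files start with "P1" through "P6" followed by whitespace.
    if (
        len(header) >= 3
        and header[:1] == b"P"
        and header[1:2] in b"123456"
        and header[2:3].isspace()
    ):
        return "Portable image format"
    return None
```

Because only the content is inspected, a PNG file saved with a `.jpg` extension is still identified as PNG, and a file with an image extension but unrecognized content returns `None`.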
The following requirements apply to images processed by the Import Images module:
- All images must be the same shape.
- All images must have the same color channels. For example, you cannot mix grayscale images with RGB images.
- There is a limit of 65536 pixels per image. However, the number of images is not limited.
- If you specify a blob container as the source, the container must not contain other types of data. Ensure that the container contains only images before running the module.
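Before uploading, it can help to check an image set against the shape, channel, and size requirements above. The following Python sketch is one way to do that, under stated assumptions: each image is described by a `(height, width, channels)` tuple that you have obtained separately, the function name `check_image_set` is hypothetical, and the 65536-pixel limit is treated as inclusive, which the article does not spell out.

```python
MAX_PIXELS = 65536  # per-image pixel limit stated above (assumed inclusive)


def check_image_set(images):
    """`images` maps file name -> (height, width, channels).

    Returns a list of problems; an empty list means the set meets the requirements.
    The first image in name order is used as the reference for shape and channels.
    """
    problems = []
    reference = None
    for name in sorted(images):
        height, width, channels = images[name]
        pixels = height * width
        if pixels > MAX_PIXELS:
            problems.append(f"{name}: {pixels} pixels exceeds the limit of {MAX_PIXELS}")
        if reference is None:
            reference = (height, width, channels)
            continue
        if (height, width) != reference[:2]:
            problems.append(
                f"{name}: shape {height}x{width} differs from {reference[0]}x{reference[1]}"
            )
        if channels != reference[2]:
            problems.append(f"{name}: {channels} channels differs from {reference[2]}")
    return problems
```

For example, a set of 256x256 RGB images passes, because 256 x 256 is exactly 65536 pixels, while adding a single grayscale or 128x128 image produces a reported problem.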
If you intend to use the Pretrained Cascade Image Classification module, be aware that it currently supports only recognition of faces in frontal view; other image classifiers are not yet available.
Module parameters
|Name||Range||Type||Default||Description|
|Please specify authentication type||List||AuthenticationType||Account||Public or Shared Access Signature (SAS) URI or user credentials|
|URI||Any||String||none||Uniform Resource Identifier with SAS or public access|
|Account name||Any||String||none||Name of the Azure Storage account|
|Account key||Any||SecureString||none||Key associated with the Azure Storage account|
|Path to container, directory or blob||Any||String||none||Path to blob or name of table|
Outputs
|Name||Type||Description|
|Results dataset||Data Table||Dataset with downloaded images|
Exceptions
|Exception||Description|
|Error 0003||Exception occurs if one or more inputs are null or empty.|
|Error 0029||Exception occurs when invalid URI is passed.|
|Error 0009||Exception occurs if the Azure storage account name or container name is specified incorrectly.|
|Error 0015||Exception occurs if the database connection has failed.|
|Error 0030||Exception occurs when it is not possible to download a file.|
|Error 0049||Exception occurs when it is not possible to parse a file.|
|Error 0048||Exception occurs when it is not possible to open a file.|
For a list of errors specific to Studio modules, see Machine Learning Error codes.
For a list of API exceptions, see Machine Learning REST API Error Codes.