DavoudianAli-5815 asked DavoudianAli-5815 answered

Training a TensorFlow model in Azure ML

I am following the link below for training a TensorFlow model in Azure ML:

https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/ml-frameworks/tensorflow/train-hyperparameter-tune-deploy-with-tensorflow/train-hyperparameter-tune-deploy-with-tensorflow.ipynb

However, my training dataset is in a container named "sample-datasets" in ADLS Gen2, so I changed the download code in that notebook to point to the paths in my data lake. I replaced code A (from the notebook) with code B (my code):

Code A:

urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 'train-images-idx3-ubyte.gz'))
urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/train-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'))
urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'))
urllib.request.urlretrieve('https://azureopendatastorage.blob.core.windows.net/mnist/t10k-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'))


Code B:

from azureml.core.dataset import Dataset
urllib.request.urlretrieve('https://lakehousestgenrichedzone.dfs.core.windows.net/sample-datasets/train-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 'train-images-idx3-ubyte.gz'))
urllib.request.urlretrieve('https://lakehousestgenrichedzone.dfs.core.windows.net/sample-datasets/train-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 'train-labels-idx1-ubyte.gz'))
urllib.request.urlretrieve('https://lakehousestgenrichedzone.dfs.core.windows.net/sample-datasets/t10k-images-idx3-ubyte.gz', filename=os.path.join(data_folder, 't10k-images-idx3-ubyte.gz'))
urllib.request.urlretrieve('https://lakehousestgenrichedzone.dfs.core.windows.net/sample-datasets/t10k-labels-idx1-ubyte.gz', filename=os.path.join(data_folder, 't10k-labels-idx1-ubyte.gz'))

But I receive the following error:

HTTPError: HTTP Error 401: Server failed to authenticate the request. Please refer to the information in the www-authenticate header.

Can you please let me know how I can train the model using my data, which is stored in the data lake? More precisely, how can my Python code copy the training dataset from my data lake into data_folder?

PS: Please note that I have already granted the Storage Blob Data Contributor role on my data lake storage account to the managed identity of my Azure ML workspace.

azure-machine-learning azure-data-lake-storage azure-machine-learning-studio-classic azure-machine-learning-inference

anonymous user Did the response below help you use the SDK to download the required files from ADLS?

romungi-MSFT answered

anonymous user I have not worked on ADLS scenarios with Azure ML, but I have added the ADLS tag to this thread so that others can chip in with their views.

Based on the documentation for the ADLS REST API, it supports Azure Active Directory (Azure AD), Shared Key, and shared access signature (SAS) authorization for the APIs that download files from its storage. So I think a direct download without authentication, such as urllib.request.urlretrieve on the bare URL, will not work in this case, which is consistent with the 401 error.

I think the easiest way to get your files locally from ADLS is to use the Python SDK and authenticate with an account key or Azure AD, as listed here.
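A minimal sketch of that approach, assuming the azure-identity and azure-storage-file-datalake packages are installed and that the identity running the code has a data-plane role on the storage account. The account, container, and file names are taken from the question; the helper names plan_downloads and download_from_adls are hypothetical and exist only for illustration.

```python
import os

# Values taken from the question; adjust them to your own storage account.
ACCOUNT_URL = "https://lakehousestgenrichedzone.dfs.core.windows.net"
FILE_SYSTEM = "sample-datasets"  # the ADLS Gen2 container
FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def plan_downloads(data_folder, files=FILES):
    """Pair each remote path with the local path it should be saved to."""
    return [(name, os.path.join(data_folder, name)) for name in files]


def download_from_adls(data_folder, credential=None):
    """Download FILES from ADLS Gen2 into data_folder using Azure AD auth."""
    # Imported lazily so this module loads even without the Azure SDK installed.
    from azure.identity import DefaultAzureCredential
    from azure.storage.filedatalake import DataLakeServiceClient

    # DefaultAzureCredential tries environment variables, managed identity,
    # Azure CLI login, and other sources in turn.
    credential = credential or DefaultAzureCredential()
    service = DataLakeServiceClient(account_url=ACCOUNT_URL, credential=credential)
    file_system = service.get_file_system_client(FILE_SYSTEM)

    os.makedirs(data_folder, exist_ok=True)
    for remote_path, local_path in plan_downloads(data_folder):
        file_client = file_system.get_file_client(remote_path)
        with open(local_path, "wb") as handle:
            # Stream the remote file straight into the local file.
            file_client.download_file().readinto(handle)
```

Calling download_from_adls(data_folder) would then take the place of the four urlretrieve calls in code B. Unlike urlretrieve, the SDK client attaches an Azure AD token to each request.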

If you have many files that need to be downloaded and referenced in your ML experiments, you may also consider using the Import Data module of the designer for designer experiments, or registering them as a dataset from the Datasets tab of ml.azure.com, which can then be referenced using the Azure ML SDK.
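The dataset route can also be done from code rather than the portal. The sketch below assumes the v1 Azure ML SDK (azureml-core) and identity-based data access, so no service principal secret is passed when the datastore is registered. The datastore name "adls_sample_datasets", the dataset name "mnist-files", and the helper names are hypothetical placeholders.

```python
FILES = [
    "train-images-idx3-ubyte.gz",
    "train-labels-idx1-ubyte.gz",
    "t10k-images-idx3-ubyte.gz",
    "t10k-labels-idx1-ubyte.gz",
]


def dataset_paths(datastore, files=FILES):
    """Build the (datastore, relative_path) tuples that from_files expects."""
    return [(datastore, name) for name in files]


def register_and_download(workspace, data_folder):
    """Register the ADLS Gen2 container as a datastore and fetch the files."""
    # Imported lazily so this module loads even without azureml-core installed.
    from azureml.core import Dataset, Datastore

    # Assumption: no tenant_id/client_id/client_secret is supplied, so access
    # relies on identity-based authentication rather than stored credentials.
    datastore = Datastore.register_azure_data_lake_gen2(
        workspace=workspace,
        datastore_name="adls_sample_datasets",  # hypothetical name
        filesystem="sample-datasets",
        account_name="lakehousestgenrichedzone",
    )
    dataset = Dataset.File.from_files(path=dataset_paths(datastore))
    dataset = dataset.register(
        workspace=workspace,
        name="mnist-files",  # hypothetical name
        create_new_version=True,
    )
    # Returns the list of local file paths that were written.
    return dataset.download(target_path=data_folder, overwrite=True)
```

Once registered, the dataset can be looked up by name in later runs instead of hard-coding storage URLs in the training script.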



DavoudianAli-5815 answered

I solved the problem by assigning a user-assigned managed identity to the target compute so that it can access my ADLS Gen2 account.
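For readers who want to use that identity from the training code, here is a rough sketch that assumes the azure-identity package and a compute target with a user-assigned managed identity attached. The environment variable name ADLS_IDENTITY_CLIENT_ID is a hypothetical placeholder for wherever you keep the identity's client ID. It is not something this thread confirms.

```python
import os


def credential_kwargs(env=None):
    """Return keyword arguments selecting the user-assigned identity, if set."""
    env = os.environ if env is None else env
    # Hypothetical variable name; supply the client ID however suits your setup.
    client_id = env.get("ADLS_IDENTITY_CLIENT_ID")
    return {"client_id": client_id} if client_id else {}


def make_credential(env=None):
    """Create a credential backed by the compute's managed identity."""
    # Imported lazily so this module loads even without azure-identity installed.
    from azure.identity import ManagedIdentityCredential

    # With client_id, the named user-assigned identity is used; without it,
    # the credential falls back to the default identity of the compute.
    return ManagedIdentityCredential(**credential_kwargs(env))
```

The resulting credential can be passed to an Azure Storage SDK client in place of an account key, so the download authenticates as the compute's identity.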
