Hi,
I have been following this tutorial:
https://docs.microsoft.com/en-us/azure/batch/tutorial-run-python-batch-azure-data-factory
I was able to execute this tutorial successfully earlier, but recently I added ~12 files (each at least 1 GB). Since then the pipeline has not run successfully, and even starting a run takes very long.
Once I deleted all the newly loaded files and kept only iris.csv and main.py, rerunning the pipeline still takes indefinitely long, even though the same code used to finish in seconds.
Also, the node is in an "unusable" state, with the message: "The VM disk is full. Delete jobs, tasks, or files on the node to free up space and then reboot the node."
PS: The pool in my Batch account uses a "standard_d2s_v3" VM size.
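For context, here is a rough back-of-the-envelope check (just a sketch: the 16 GiB temporary-disk figure for Standard_D2s_v3 is my reading of the Azure VM size docs, and my understanding is that Batch places task resource files on that disk by default):

```python
# Rough capacity check: Batch task/resource files land on the VM's
# temporary disk; a Standard_D2s_v3 has (assumed) a 16 GiB temp disk.
TEMP_DISK_GIB = 16    # assumed temp-disk size for Standard_D2s_v3
NUM_FILES = 12        # files I uploaded
MIN_FILE_GIB = 1.0    # each file is at least ~1 GiB

uploaded_gib = NUM_FILES * MIN_FILE_GIB
headroom_gib = TEMP_DISK_GIB - uploaded_gib

print(f"uploaded: >= {uploaded_gib:.0f} GiB, headroom: <= {headroom_gib:.0f} GiB")
# With task working directories and other overhead on the same disk,
# 12+ GiB of resource files could plausibly fill the node, which would
# match the "VM disk is full" error.
```

If that reading is right, it would explain why the node went unusable even though the code itself is unchanged.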
Does a job run cache all the files from the blob container on the node?
What is the solution to this? And how can I find out the limit on the size/number of files that can be loaded into storage for the pipeline to run efficiently?