We are in the middle of collecting data that requires processing in Matlab, and are looking to improve processing performance by running it on a multi-core VM with Matlab installed. The data is currently stored in SharePoint and shared via OneDrive synchronization from users' personal computers. The best way to access the data from the VM is not clear, so I am looking for some advice. We could feasibly migrate the files away from SharePoint and into some kind of Azure storage if there is a clear benefit to doing so. I'm currently exploring the following possibilities:
Sync the data to the VM's own virtual disk using a OneDrive client running on the VM. The VM runs Linux and Microsoft doesn’t produce a OneDrive client for Linux, so this would rely on the open-source Linux client. There would still be a download step before any data processing, but hopefully it would be much faster, since the transfer should stay within the Microsoft cloud backbone.
Set up an Azure Files share to be mounted by the VM and have users access it via SMB as a network drive. The data would be available to the VM as soon as it's uploaded, but we would also need to migrate about 200 GB of existing data from SharePoint.
Set up an Azure Files share to be mounted by the VM and have it synchronize from SharePoint via an Azure Logic App. This would let us keep the existing SharePoint/OneDrive-based workflow without requiring a separate synchronization step run from the VM.
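To compare file access speed between a locally synced folder and an SMB-mounted Azure Files share, one rough approach would be to time sequential reads of the same files from each location on the VM. The following is only a minimal sketch, assuming Python is available on the VM. The directory paths and the function names `time_reads` and `compare` are hypothetical placeholders, not anything that exists in our setup yet.

```python
import time
from pathlib import Path


def time_reads(root, chunk_size=1024 * 1024):
    """Read every file under `root` once; return (bytes_read, seconds)."""
    total = 0
    start = time.perf_counter()
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            with open(path, "rb") as fh:
                # Read in chunks so large files do not need to fit in memory.
                while True:
                    chunk = fh.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
    return total, time.perf_counter() - start


def compare(locations):
    """Return {label: MB/s} for each label -> directory entry that exists."""
    results = {}
    for label, root in locations.items():
        if not Path(root).is_dir():
            continue  # Skip locations that are not mounted or synced yet.
        nbytes, seconds = time_reads(root)
        results[label] = (nbytes / 1e6) / seconds if seconds > 0 else float("inf")
    return results


if __name__ == "__main__":
    # Hypothetical paths: adjust to wherever the sync folder and the
    # Azure Files share actually end up on the VM.
    locations = {
        "local sync": "/data/onedrive",
        "azure files (SMB)": "/mnt/azurefiles",
    }
    for label, rate in compare(locations).items():
        print(f"{label}: {rate:.1f} MB/s")
```

The numbers would only be indicative: a second pass over the same files is likely to be served from the operating system's page cache, so each location should be measured on a first read, and the result will also depend on file sizes and the storage tier chosen.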
I'm an Azure rookie and this is just what I've found from looking around. I was hoping there was a more direct way to connect from an Azure VM to SharePoint cloud storage. Is there a better option? How would file access performance differ between synchronizing the files onto the VM drive vs connecting to an Azure Files drive via SMB? Thanks in advance.