MS Azure Machine Learning: MemoryError: Unable to allocate 5.43 GiB for an array with shape (23847, 30582) and data type int64

Question

I am trying to extract pixel values from a raster image using the xarray module. I tried to "stack" the coordinates to get a third dimension, but I end up getting the error above. I created a compute instance with 56 GB of RAM, so I was wondering why the allocation fails at 5.43 GiB. I would have expected an error only when going beyond 56 GB, so the value seems off.

Thank you.

Answer

@PG Thanks for the question. Can you please add more details about the code you are running and the VM series of the compute instance? Some operations cause a peak in memory usage while they execute, so even when your dataframe fits in memory, the operation itself needs additional memory while it runs.
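
For reference, the 5.43 GiB in the message looks like the size of the single array that was being allocated, not the total memory in use. A quick check, as a sketch whose only inputs are the shape and data type quoted in the error message:

```python
import math

# Shape and dtype taken from the error message.
shape = (23847, 30582)
itemsize = 8  # an int64 element occupies 8 bytes

n_elements = math.prod(shape)
n_bytes = n_elements * itemsize
gib = n_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{n_elements} elements, {n_bytes} bytes, {gib:.2f} GiB")
```

This gives 5.43 GiB, matching the error. The allocation can fail whenever that much additional memory is not available on top of what the process already holds, which may happen well below the 56 GB of the instance.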

We recommend using the M series, a VM family we introduced recently for high-memory operations. There is also a known outage issue in storage, so please raise an Azure support ticket with the details.
Doc for M Series:
https://learn.microsoft.com/en-us/azure/virtual-machines/m-series?toc=/azure/virtual-machines/linux/toc.json&bc=/azure/virtual-machines/linux/breadcrumb/toc.json

You can get a summary of the memory used by a pandas DataFrame by calling df.info(memory_usage="deep")
docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html
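
A small illustration of that call, assuming pandas is installed. The DataFrame here is made-up sample data, not the raster from the question:

```python
import pandas as pd

# Hypothetical sample data, only to show the call.
df = pd.DataFrame({"pixel": range(1000), "label": ["a"] * 1000})

# Prints column dtypes plus a "memory usage" line; "deep" also counts
# the memory held by Python objects such as strings.
df.info(memory_usage="deep")

# Per-column byte counts (including the index), and their total.
per_column = df.memory_usage(deep=True)
total_bytes = int(per_column.sum())
print(per_column)
print(total_bytes)
```

Comparing the total with the free memory on the instance gives a rough idea of how much headroom an operation has before it needs to allocate temporary copies.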
