PG-6613 asked:

MS Azure Machine Learning: MemoryError: Unable to allocate 5.43 GiB for an array with shape (23847, 30582) and data type int64

I am trying to extract pixel values from a raster image using the xarray module. I tried to "stack" the coordinates to get a third dimension, but I end up with the error above. I created a compute instance with 56 GB of RAM, so I am wondering why an allocation of only 5.43 GiB fails. I would have expected the error only when going beyond 56 GB, so the value seems off.

Thank you.

azure-machine-learning

1 Answer

ramr-msft answered:

@PG-6613 Thanks for the question. Could you please add more details about the code you are running and the series of the compute instance? Some operations cause a peak in memory usage while they execute, so even when your dataframe fits in memory, the operation itself needs additional memory while it runs.
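As a rough check of where the 5.43 GiB figure comes from, the sketch below multiplies the shape from the error message by the size of one int64 element. The helper name array_gib is hypothetical and only used for illustration. The number in the error is the size of the single array NumPy failed to allocate, not the total memory in use on the instance, so it can be well below 56 GB if other arrays are already held in memory.

```python
import math

import numpy as np


def array_gib(shape, dtype):
    """Return the size in GiB of one dense array with this shape and dtype."""
    n_elements = math.prod(shape)           # total number of cells
    itemsize = np.dtype(dtype).itemsize     # bytes per element (8 for int64)
    return n_elements * itemsize / 2**30    # bytes -> GiB


# Shape and dtype copied from the error message in the question.
shape = (23847, 30582)
print(f"{array_gib(shape, 'int64'):.2f} GiB")  # prints "5.43 GiB"
```

If the pixel values fit in a smaller integer type, the same helper shows how much one such array would shrink, for example array_gib(shape, "uint8") is one eighth of the int64 size. Whether a smaller dtype is valid for this raster is an assumption that has to be checked against the data.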

We recommend using the M-series, a VM family we introduced recently for high-memory operations. There are also known outage issues in storage; please raise an Azure support ticket with the details.
Docs for the M-series:
https://docs.microsoft.com/en-us/azure/virtual-machines/m-series?toc=/azure/virtual-machines/linux/toc.json&bc=/azure/virtual-machines/linux/breadcrumb/toc.json

You can get a summary of the memory used by a pandas DataFrame by calling df.info(memory_usage="deep").
docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.info.html
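A minimal sketch of that check, using a small made-up DataFrame (the column names pixel and label are placeholders, not taken from the question). It also shows DataFrame.memory_usage(deep=True), which returns the per-column byte counts as numbers you can sum or compare.

```python
import numpy as np
import pandas as pd

# Hypothetical data: 1000 int64 values plus a string column.
df = pd.DataFrame({
    "pixel": np.arange(1000, dtype="int64"),
    "label": ["class_a"] * 1000,
})

# Prints a summary; "deep" also counts the Python string objects.
df.info(memory_usage="deep")

# The same information as numbers, in bytes.
per_column = df.memory_usage(deep=True)
total_bytes = int(per_column.sum())
pixel_bytes = int(df["pixel"].memory_usage(index=False))  # 1000 * 8 bytes

print(f"pixel column: {pixel_bytes} bytes, whole frame: {total_bytes} bytes")
```

Running this before and after an operation gives a lower bound on what the operation needs, since temporary copies made while it executes are not included.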
